The DSI utilities are a mishmash of classes accumulated during the last
ten years in projects developed at the DSI (Dipartimento di Scienze dell'Informazione,
i.e., Information Sciences Department) of the Università degli Studi di Milano.
They were originally distributed in several projects
(mainly in MG4J) but we finally decided to
gather all the material in a single place.
The DSI utilities are distributed under the GNU Lesser General Public License.
Highlights
The implementations available are a bit eclectic due to the particular kind of applications
we developed. Very broadly, we have:
- {@link it.unimi.dsi.lang.MutableString}, our answer to the Java {@link java.lang.String} class.
- {@link it.unimi.dsi.bits.BitVector} and its implementations—a high-performance but flexible set of bit vector classes.
- A {@link it.unimi.dsi.compression} package containing codecs for several types of encodings.
- {@link it.unimi.dsi.logging.ProgressLogger}, which reports the progress of the (many) classes
we use that require hours of computation.
- The {@link it.unimi.dsi.parser.BulletParser}, which we use to parse HTML and XML.
- The {@link it.unimi.dsi.io I/O package}, containing fast versions of several classes existing in {@link java.io}
and many useful classes for easily reading text data (e.g., {@link it.unimi.dsi.io.FileLinesCollection}).
- The {@link it.unimi.dsi.util} package, containing {@linkplain it.unimi.dsi.util.ImmutableBinaryTrie tries},
{@linkplain it.unimi.dsi.util.ImmutableExternalPrefixMap immutable prefix maps}, {@linkplain it.unimi.dsi.util.BloomFilter Bloom filters},
a very convenient {@link it.unimi.dsi.util.Properties} class, and more.
- {@link it.unimi.dsi.util.XorShiftStarRandom}, a replacement for {@link java.util.Random} that is, really,
a better mousetrap.
- Lots of utility methods in {@link it.unimi.dsi.Util} (have a look!).
- Big versions of I/O and utility classes
in {@link it.unimi.dsi.big.io} and {@link it.unimi.dsi.big.util}.
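To give a flavour of the kind of generator the name {@link it.unimi.dsi.util.XorShiftStarRandom} suggests, here is a minimal, standalone sketch of a xorshift64* generator written against the standard library only. It is not the library's code: the class name `XorShift64StarSketch`, the choice of the 64-bit variant, and the fallback seed constant are assumptions made for illustration, and the real class may use a different variant and a different API.

```java
// Hypothetical illustration only: NOT the DSI utilities implementation.
// A xorshift64* generator: three xorshift steps on a 64-bit state,
// followed by a multiplication that scrambles the output.
class XorShift64StarSketch {
    private long state;

    XorShift64StarSketch(long seed) {
        // The state must never be zero, or the generator would emit only zeros.
        // The fallback constant is an arbitrary nonzero value (an assumption).
        this.state = seed != 0 ? seed : 0x9E3779B97F4A7C15L;
    }

    /** Returns the next pseudorandom 64-bit value. */
    long nextLong() {
        long x = state;
        x ^= x >>> 12;
        x ^= x << 25;
        x ^= x >>> 27;
        state = x;
        // Multiplier commonly used for xorshift64*.
        return x * 2685821657736338717L;
    }

    /** Returns a pseudorandom double in [0, 1) built from the upper 53 bits. */
    double nextDouble() {
        return (nextLong() >>> 11) * 0x1.0p-53;
    }

    public static void main(String[] args) {
        XorShift64StarSketch r = new XorShift64StarSketch(42L);
        for (int i = 0; i < 3; i++) System.out.println(r.nextLong());
        System.out.println(r.nextDouble());
    }
}
```

Two instances created with the same seed produce the same sequence, which is the property one usually relies on when swapping a generator in for {@link java.util.Random} in reproducible experiments.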
Package Dependencies
The DSI utilities require Java ≥6 and use fastutil 6.4 or greater
for high-performance containers and algorithms. Command-line parsing requires JSAP.
They also use a number of useful libraries from the Jakarta Commons project,
including collections,
lang,
configuration and
io.
All logging is performed using log4j.