WebGraph is a framework to study the web graph. It provides simple ways to manage very large graphs, exploiting modern compression techniques. The big version is a fork of the original WebGraph that can handle more than 2³¹ nodes. For more details on WebGraph that are common between the standard and the big version, please see WebGraph.
If you are used to WebGraph, the main difference is that, of course, nodes are indexed by long integers. Correspondingly, iterators on nodes are {@link it.unimi.dsi.big.webgraph.LazyLongIterator}s, and all array-based methods (such as {@link it.unimi.dsi.big.webgraph.ImmutableGraph#successorBigArray(long)} or {@link it.unimi.dsi.big.webgraph.labelling.ArcLabelledImmutableGraph#labelBigArray(long)}) return {@linkplain it.unimi.dsi.fastutil.BigArrays big arrays}.
Some classes have not been ported, and will be ported on an “as-needed” basis.
If you want to port code written for WebGraph to the big version, the main nuisance is that {@link it.unimi.dsi.big.webgraph.ImmutableGraph#successorBigArray(long)} returns, as the name says, a {@linkplain it.unimi.dsi.fastutil.BigArrays big array}, which cannot be accessed like a standard Java array. Watch out in particular for accesses to the {@code length} field: they remain syntactically correct on a big array, which is an array of arrays, but they return the number of segments rather than the number of elements, so they must be replaced by calls to a suitable method (e.g., {@link it.unimi.dsi.fastutil.longs.LongBigArrays#length(long[][])}). In general, you must get accustomed to big-array methods before porting code.
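The pitfall with the {@code length} field can be illustrated without any library. The following is a minimal plain-Java sketch, not fastutil's implementation: the class BigArrayLengthSketch, its helpers length and get, and the tiny segment size of two are all hypothetical and chosen only for illustration. Real code should use the big-array methods provided by fastutil.

```java
public final class BigArrayLengthSketch {
    /** Hypothetical helper: total number of elements in a segmented (big) array. */
    public static long length(final long[][] bigArray) {
        long total = 0;
        for (final long[] segment : bigArray) total += segment.length;
        return total;
    }

    /**
     * Hypothetical helper: the element at a 64-bit index, assuming that every
     * segment except possibly the last one contains exactly segmentSize elements.
     */
    public static long get(final long[][] bigArray, final long index, final int segmentSize) {
        return bigArray[(int)(index / segmentSize)][(int)(index % segmentSize)];
    }

    public static void main(final String[] args) {
        // Five successors split into segments of two elements each
        // (an assumption for readability; real segments are far larger).
        final long[][] successors = { { 3, 5 }, { 8, 13 }, { 21 } };

        // Compiles without complaint, but counts segments, not elements.
        System.out.println(successors.length);     // prints 3
        // What ported code almost always wants instead.
        System.out.println(length(successors));    // prints 5
        // Element access needs a segment and a displacement, not a single subscript.
        System.out.println(get(successors, 4, 2)); // prints 21
    }
}
```

When porting, every occurrence of the {@code length} field and every subscript applied to a successor or label array is a candidate for this kind of silent error, because the compiler will not flag it.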
To simplify many mundane matters, such as unit tests, {@link it.unimi.dsi.big.webgraph.ImmutableGraph} provides two static wrapping methods ({@link it.unimi.dsi.big.webgraph.ImmutableGraph#wrap(it.unimi.dsi.webgraph.ImmutableGraph)} and {@link it.unimi.dsi.big.webgraph.ImmutableGraph#wrap(ImmutableGraph)}) that turn a standard {@link it.unimi.dsi.webgraph.ImmutableGraph} into a big {@link it.unimi.dsi.big.webgraph.ImmutableGraph} and vice versa. Thus, for instance, there is no big version of {@link it.unimi.dsi.webgraph.ArrayListMutableGraph}: instances are expected to be simply wrapped should you need to use them in the big framework.
The serialisation formats of the standard and big versions of {@link it.unimi.dsi.webgraph.BVGraph} are compatible (of course, you cannot load a graph with more than 2³¹ nodes using the standard version). The same graph loaded with instances of the two classes, however, will not be {@linkplain java.lang.Object#equals(Object) equal}. You must wrap one or the other (see above) to check for equality.
Note also that satellite data generated by various utilities (e.g., {@link it.unimi.dsi.big.webgraph.algo.StronglyConnectedComponents}) are usually written using formats that are not compatible between the standard and the big version.
WebGraph (big) requires Java ≥6, depends on the standard WebGraph distribution, and relies on fastutil 6.4 or greater for high-performance containers and algorithms, on the COLT distribution for statistics, on the DSI utilities for bit-level I/O, on Sux4J for succinct data structures, on JSAP for command-line parsing, and on log4j for logging.
Note that in principle the DSI utilities depend on a number of additional useful libraries from the Jakarta commons project, including collections, lang, configuration and io.