graph_gen_utils
Class NeoFromFile

java.lang.Object
  extended by graph_gen_utils.NeoFromFile

public class NeoFromFile
extends java.lang.Object

Provides a simple means of creating a Neo4j instance from various graph file formats, loading a Neo4j instance into an in-memory graph, calculating various graph metrics, and writing them to file.

Since:
2010-04-01
Author:
Alex Averbuch

Nested Class Summary
static class NeoFromFile.ChacoType
           
 
Constructor Summary
NeoFromFile()
           
 
Method Summary
static void appendMetricsCSV(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String metricsPath, java.lang.Long timeStep)
          Calculates graph metrics for the current Neo4j instance and appends the results to a comma separated metrics (.met) file.
static void applyPtnToNeo(org.neo4j.graphdb.GraphDatabaseService transNeo, graph_gen_utils.partitioner.Partitioner partitioner)
          Allocates nodes of a Neo4j instance to clusters/partitions.
static void applyPtnToNeo(org.neo4j.graphdb.GraphDatabaseService transNeo, graph_gen_utils.partitioner.Partitioner partitioner, java.util.Map<java.lang.String,java.lang.Object> props)
          Allocates nodes of a Neo4j instance to clusters/partitions.
static void main(java.lang.String[] args)
           
static graph_gen_utils.memory_graph.MemGraph readMemGraph(org.neo4j.graphdb.GraphDatabaseService transNeo)
          Loads the current Neo4j instance into an in-memory graph.
static graph_gen_utils.memory_graph.MemGraph readMemGraph(org.neo4j.graphdb.GraphDatabaseService transNeo, java.util.Set<java.lang.String> nodeProps, java.util.Set<java.lang.String> relProps)
           
static void removeDuplicateRelationships(org.neo4j.graphdb.GraphDatabaseService transNeo, org.neo4j.graphdb.Direction direction)
          Deletes duplicate Relationships (all but one Relationship between any two Nodes).
static void removeOrphanNodes(org.neo4j.graphdb.GraphDatabaseService transNeo)
          Deletes all Nodes that do not have at least one Relationship from the Neo4j instance.
static void removeRelationshipsByType(org.neo4j.graphdb.GraphDatabaseService transNeo, java.util.HashSet<java.lang.String> relTypes)
          Deletes all Relationships of the specified RelationshipType values (as given by the String values in relTypes parameter) from the Neo4j instance.
static void writeChaco(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String chacoPath, NeoFromFile.ChacoType chacoType)
          Creates a Chaco file and populates it with the adjacency list representation of the current Neo4j instance.
static void writeChacoAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String chacoPath, NeoFromFile.ChacoType chacoType, java.lang.String ptnPath)
          Creates a Chaco (.graph) file and a partition (.ptn) file, then populates them with the representation of the current Neo4j instance.
static void writeGMLBasic(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String gmlPath)
          Creates a GML (.gml) file and populates it with the representation of the current Neo4j instance.
static void writeGMLFull(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String gmlPath)
          Creates a GML (.gml) file and populates it with the representation of the current Neo4j instance.
static void writeMetrics(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String metricsPath)
          Calculates graph metrics for the current Neo4j instance, creates a metrics (.met) file and populates it.
static void writeMetricsCSV(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String metricsPath)
          Calculates graph metrics for the current Neo4j instance, creates a comma separated metrics (.met) file and populates it.
static void writeNeoFromChaco(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String graphPath)
          Creates a Neo4j instance and populates it from the contents of a Chaco (.graph) file.
static void writeNeoFromChacoAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String graphPath, graph_gen_utils.partitioner.Partitioner partitioner)
          Creates a Neo4j instance, populates it from the contents of a Chaco (.graph) file, then allocates Nodes to partitions/clusters.
static void writeNeoFromChacoAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String graphPath, java.lang.String ptnPath)
          Creates a Neo4j instance, populates it from the contents of a Chaco (.graph) file, then allocates Nodes to partitions/clusters.
static void writeNeoFromChacoAndPtnBatch(java.lang.String dbDir, java.lang.String graphPath, graph_gen_utils.partitioner.Partitioner partitioner)
          Creates a Neo4j instance using the BatchInserter, populates it from the contents of a Chaco (.graph) file, then allocates Nodes to partitions/clusters.
static void writeNeoFromGML(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String gmlPath)
          Creates a Neo4j instance and populates it from the contents of a GML (.gml) file.
static void writeNeoFromGMLAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String gmlPath, graph_gen_utils.partitioner.Partitioner partitioner)
          Creates a Neo4j instance and populates it from the contents of a GML (.gml) file, then allocates Nodes to partitions/clusters.
static void writeNeoFromTopology(org.neo4j.graphdb.GraphDatabaseService transNeo, graph_gen_utils.reader.topology.GraphTopology topology)
          Creates a Neo4j instance and populates it according to a generated graph topology.
static void writeNeoFromTopologyAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo, graph_gen_utils.reader.topology.GraphTopology topology, graph_gen_utils.partitioner.Partitioner partitioner)
          Creates a Neo4j instance, populates it according to a generated graph topology, then allocates Nodes to partitions/clusters.
static void writeNeoFromTwitterDataset(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String twitterPath)
          Creates a Neo4j instance and populates it from the contents of a dataset with a proprietary binary file format, which contains user follows/following connectivity data from Twitter.
static void writeNeoFromTwitterDatasetAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo, java.lang.String twitterPath, graph_gen_utils.partitioner.Partitioner partitioner)
          Creates a Neo4j instance and populates it from the contents of a dataset with a proprietary binary file format, which contains user follows/following connectivity data from Twitter.
static void writeNeoFromTwitterDatasetBatch(java.lang.String dbDir, java.lang.String twitterPath)
          Creates a Neo4j instance using the BatchInserter, then populates it from the contents of a dataset with a proprietary binary file format, which contains user follows/following connectivity data from Twitter.
static p_graph_service.PGraphDatabaseService writePNeoFromNeo(java.lang.String pdbPath, org.neo4j.graphdb.GraphDatabaseService transNeo)
          Moved from neo4j_partitioned_api.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NeoFromFile

public NeoFromFile()
Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception

writePNeoFromNeo

public static p_graph_service.PGraphDatabaseService writePNeoFromNeo(java.lang.String pdbPath,
                                                                     org.neo4j.graphdb.GraphDatabaseService transNeo)
Moved from neo4j_partitioned_api. Takes a normal Neo4j instance GraphDatabaseService as input, creates a new partitioned version PGraphDatabaseService in the specified directory, then copies all data from the input instance into the new instance. Nodes must have a Consts.COLOR attribute as this is used to decide which partition each Node is stored in.

Parameters:
transNeo - GraphDatabaseService representing the regular Neo4j instance
pdbPath - String specifying the directory where the partitioned Neo4j instance should be created
Returns:
PGraphDatabaseService

applyPtnToNeo

public static void applyPtnToNeo(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                 graph_gen_utils.partitioner.Partitioner partitioner)
Allocates nodes of a Neo4j instance to clusters/partitions. The allocation scheme is defined by the Partitioner parameter. The method writes the Consts.COLOR property to all nodes of an existing Neo4j instance. The Consts.NODE_LID, Consts.NODE_GID, Consts.LATITUDE, and Consts.LONGITUDE properties are also written to all nodes.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
partitioner - implementation of Partitioner that defines cluster/partition allocation scheme

applyPtnToNeo

public static void applyPtnToNeo(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                 graph_gen_utils.partitioner.Partitioner partitioner,
                                 java.util.Map<java.lang.String,java.lang.Object> props)
Allocates nodes of a Neo4j instance to clusters/partitions. The allocation scheme is defined by the Partitioner parameter. The method writes the Consts.COLOR property to all nodes of an existing Neo4j instance. The Consts.NODE_GID property is also written to all Nodes. The user may also specify additional properties (e.g. Consts.LATITUDE) and default values. These will be written to all Nodes.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
partitioner - implementation of Partitioner that defines cluster/partition allocation scheme
props - Map of additional property names to their default values, written to all Nodes

writeNeoFromTopology

public static void writeNeoFromTopology(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                        graph_gen_utils.reader.topology.GraphTopology topology)
Creates a Neo4j instance and populates it according to a generated graph topology. Examples of possible topologies are random (GraphTopologyRandom) and fully connected (GraphTopologyFullyConnected) graphs.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
topology - instance of GraphTopology that defines the generated topology

writeNeoFromTopologyAndPtn

public static void writeNeoFromTopologyAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                              graph_gen_utils.reader.topology.GraphTopology topology,
                                              graph_gen_utils.partitioner.Partitioner partitioner)
Creates a Neo4j instance, populates it according to a generated graph topology, then allocates Nodes to partitions/clusters. Examples of possible topologies are random (GraphTopologyRandom) and fully connected (GraphTopologyFullyConnected) graphs. Allocation scheme is defined by the Partitioner parameter.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
topology - instance of GraphTopology that defines the generated topology
partitioner - implementation of Partitioner that defines cluster/partition allocation scheme

writeNeoFromChaco

public static void writeNeoFromChaco(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                     java.lang.String graphPath)
Creates a Neo4j instance and populates it from the contents of a Chaco (.graph) file. Chaco files are basically persistent adjacency lists.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
graphPath - String representing path to .graph file
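
To make "persistent adjacency list" concrete, the following is a minimal sketch of reading such a file into memory. It assumes the plain, unweighted Chaco layout: a header line with the node and edge counts, followed by one line per node listing its neighbours with 1-based ids. The class ChacoReadSketch and its parse method are hypothetical names used only for illustration; this is not the parser used by NeoFromFile, and it ignores the weighted variants selected by NeoFromFile.ChacoType.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of graph_gen_utils.
class ChacoReadSketch {
    // Parses Chaco-style text: a header line "<nodeCount> <edgeCount>",
    // then one line per node listing that node's neighbours (1-based ids).
    // Assumes an unweighted file; a blank line means a node with no neighbours.
    static List<List<Integer>> parse(String text) {
        String[] lines = text.split("\n", -1);
        String[] header = lines[0].trim().split("\\s+");
        int nodeCount = Integer.parseInt(header[0]);
        List<List<Integer>> adjacency = new ArrayList<>();
        for (int i = 1; i <= nodeCount; i++) {
            List<Integer> neighbours = new ArrayList<>();
            String line = i < lines.length ? lines[i].trim() : "";
            if (!line.isEmpty()) {
                for (String token : line.split("\\s+")) {
                    neighbours.add(Integer.parseInt(token));
                }
            }
            adjacency.add(neighbours);
        }
        return adjacency;
    }
}
```

For example, the text "3 2", "2 3", "1", "1" (one item per line) describes a three-node graph in which node 1 is connected to nodes 2 and 3.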

writeNeoFromChacoAndPtn

public static void writeNeoFromChacoAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                           java.lang.String graphPath,
                                           java.lang.String ptnPath)
Creates a Neo4j instance, populates it from the contents of a Chaco (.graph) file, then allocates Nodes to partitions/clusters. Chaco files are basically persistent adjacency lists. Partition/cluster allocation is defined by the contents of a .ptn file. This method is only included for convenience/ease of use. writeNeoFromChacoAndPtn(GraphDatabaseService, String, Partitioner) can achieve the same thing.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
graphPath - String representing path to .graph file
ptnPath - String representing path to .ptn file

writeNeoFromChacoAndPtn

public static void writeNeoFromChacoAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                           java.lang.String graphPath,
                                           graph_gen_utils.partitioner.Partitioner partitioner)
Creates a Neo4j instance, populates it from the contents of a Chaco (.graph) file, then allocates Nodes to partitions/clusters. Chaco files are basically persistent adjacency lists. Allocation scheme is defined by the Partitioner parameter.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
graphPath - String representing path to .graph file
partitioner - implementation of Partitioner that defines cluster/partition allocation scheme

writeNeoFromChacoAndPtnBatch

public static void writeNeoFromChacoAndPtnBatch(java.lang.String dbDir,
                                                java.lang.String graphPath,
                                                graph_gen_utils.partitioner.Partitioner partitioner)
Creates a Neo4j instance using the BatchInserter, populates it from the contents of a Chaco (.graph) file, then allocates Nodes to partitions/clusters. Chaco files are basically persistent adjacency lists. Allocation scheme is defined by the Partitioner parameter.

Parameters:
dbDir - String representing the path to a Neo4j instance
graphPath - String representing path to .graph file
partitioner - implementation of Partitioner that defines cluster/partition allocation scheme

writeNeoFromGML

public static void writeNeoFromGML(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                   java.lang.String gmlPath)
Creates a Neo4j instance and populates it from the contents of a GML (.gml) file. GML (Graph Modelling Language) is a plain-text format of nested key-value lists; it is distinct from the XML-based GraphML format.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
gmlPath - String representing path to .gml file
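
As a rough picture of what a .gml file contains, the sketch below emits a minimal GML document made of "node [ id ... ]" and "edge [ source ... target ... ]" entries inside a "graph [ ... ]" list. The class GmlSketch and its toGml method are hypothetical and exist only for this illustration; the properties that NeoFromFile actually reads or writes (for example Consts.COLOR) are not shown.

```java
// Hypothetical illustration only; not part of graph_gen_utils.
class GmlSketch {
    // Builds a minimal GML document for nodes 0..nodeCount-1 and the given
    // directed edges, where each edge is a {source, target} pair.
    static String toGml(int nodeCount, int[][] edges) {
        StringBuilder sb = new StringBuilder("graph [\n  directed 1\n");
        for (int id = 0; id < nodeCount; id++) {
            sb.append("  node [ id ").append(id).append(" ]\n");
        }
        for (int[] edge : edges) {
            sb.append("  edge [ source ").append(edge[0])
              .append(" target ").append(edge[1]).append(" ]\n");
        }
        return sb.append("]\n").toString();
    }
}
```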

writeNeoFromGMLAndPtn

public static void writeNeoFromGMLAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                         java.lang.String gmlPath,
                                         graph_gen_utils.partitioner.Partitioner partitioner)
Creates a Neo4j instance and populates it from the contents of a GML (.gml) file, then allocates Nodes to partitions/clusters. GML (Graph Modelling Language) is a plain-text format of nested key-value lists; it is distinct from the XML-based GraphML format. The allocation scheme is defined by the Partitioner parameter.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
gmlPath - String representing path to .gml file
partitioner - implementation of Partitioner that defines cluster/partition allocation scheme

writeNeoFromTwitterDataset

public static void writeNeoFromTwitterDataset(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                              java.lang.String twitterPath)
Creates a Neo4j instance and populates it from the contents of a dataset with a proprietary binary file format, which contains user follows/following connectivity data from Twitter. The file was obtained by crawling Twitter for 300 hours.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
twitterPath - String representing path to Twitter dataset

writeNeoFromTwitterDatasetAndPtn

public static void writeNeoFromTwitterDatasetAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                                    java.lang.String twitterPath,
                                                    graph_gen_utils.partitioner.Partitioner partitioner)
Creates a Neo4j instance and populates it from the contents of a dataset with a proprietary binary file format, which contains user follows/following connectivity data from Twitter, then allocates Nodes to partitions/clusters. The file was obtained by crawling Twitter for 300 hours. The allocation scheme is defined by the Partitioner parameter.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
twitterPath - String representing path to Twitter dataset
partitioner - implementation of Partitioner that defines cluster/partition allocation scheme

writeNeoFromTwitterDatasetBatch

public static void writeNeoFromTwitterDatasetBatch(java.lang.String dbDir,
                                                   java.lang.String twitterPath)
Creates a Neo4j instance using the BatchInserter, then populates it from the contents of a dataset with a proprietary binary file format, which contains user follows/following connectivity data from Twitter. The file was obtained by crawling Twitter for 300 hours.

Parameters:
dbDir - String representing the path to a Neo4j instance
twitterPath - String representing path to Twitter dataset

writeChaco

public static void writeChaco(org.neo4j.graphdb.GraphDatabaseService transNeo,
                              java.lang.String chacoPath,
                              NeoFromFile.ChacoType chacoType)
Creates a Chaco file and populates it with the adjacency list representation of the current Neo4j instance. Chaco files are assumed to be undirected, so each edge is written in both directions.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
chacoPath - String representing path to .graph file
chacoType - NeoFromFile.ChacoType specifying whether node and/or edge weights are written to the Chaco file
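
The duplication of edges in each direction can be sketched as follows. This is an illustration under assumptions, not the writeChaco implementation: it assumes an unweighted file with a "<nodeCount> <edgeCount>" header, 1-based node ids, and an input in which each undirected edge appears exactly once. The class ChacoWriteSketch and its render method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; not part of graph_gen_utils.
class ChacoWriteSketch {
    // Renders edges (1-based node ids) as Chaco-style text. Each edge is
    // recorded on the lines of both of its endpoints, so the file reads as
    // an undirected graph. Assumes each undirected edge is passed once.
    static String render(int nodeCount, int[][] edges) {
        List<TreeSet<Integer>> adjacency = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            adjacency.add(new TreeSet<>());
        }
        for (int[] e : edges) {
            adjacency.get(e[0] - 1).add(e[1]); // forward direction
            adjacency.get(e[1] - 1).add(e[0]); // reverse direction
        }
        StringBuilder sb = new StringBuilder();
        sb.append(nodeCount).append(' ').append(edges.length).append('\n');
        for (TreeSet<Integer> neighbours : adjacency) {
            StringBuilder line = new StringBuilder();
            for (int n : neighbours) {
                if (line.length() > 0) {
                    line.append(' ');
                }
                line.append(n);
            }
            sb.append(line).append('\n');
        }
        return sb.toString();
    }
}
```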

writeChacoAndPtn

public static void writeChacoAndPtn(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                    java.lang.String chacoPath,
                                    NeoFromFile.ChacoType chacoType,
                                    java.lang.String ptnPath)
Creates a Chaco (.graph) file and a partition (.ptn) file, then populates them with the representation of the current Neo4j instance. Chaco files are assumed to be undirected, so each edge is written in both directions.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
chacoPath - String representing path to .graph file
chacoType - NeoFromFile.ChacoType specifying whether node and/or edge weights are written to the Chaco file
ptnPath - String representing path to .ptn file

writeGMLFull

public static void writeGMLFull(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                java.lang.String gmlPath)
Creates a GML (.gml) file and populates it with the representation of the current Neo4j instance. All Node and Relationship properties are written to the GML file.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
gmlPath - String representing path to .gml file

writeGMLBasic

public static void writeGMLBasic(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                 java.lang.String gmlPath)
Creates a GML (.gml) file and populates it with the representation of the current Neo4j instance. Only certain Node and Relationship properties are written to the GML file: Consts.COLOR, Consts.WEIGHT, and Consts.NODE_GID.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
gmlPath - String representing path to .gml file

writeMetrics

public static void writeMetrics(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                java.lang.String metricsPath)
Calculates graph metrics for the current Neo4j instance, creates a metrics (.met) file and populates it.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
metricsPath - String representing path to .met file

writeMetricsCSV

public static void writeMetricsCSV(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                   java.lang.String metricsPath)
Calculates graph metrics for the current Neo4j instance, creates a comma separated metrics (.met) file and populates it.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
metricsPath - String representing path to .met file

appendMetricsCSV

public static void appendMetricsCSV(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                    java.lang.String metricsPath,
                                    java.lang.Long timeStep)
Calculates graph metrics for the current Neo4j instance and appends the results to a comma separated metrics (.met) file.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
metricsPath - String representing path to .met file
timeStep - Long representing the time-step/iteration related to these metrics

readMemGraph

public static graph_gen_utils.memory_graph.MemGraph readMemGraph(org.neo4j.graphdb.GraphDatabaseService transNeo)
Loads the current Neo4j instance into an in-memory graph. Normalizes edge weights to the range [0,1].

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
Returns:
MemGraph
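
This page does not state how the weights are normalized. One common way to map values into [0,1] is min-max scaling, sketched below purely as an assumption; readMemGraph may use a different scheme (for example, dividing by the maximum weight). The class WeightNormalizeSketch and its normalize method are hypothetical names.

```java
// Hypothetical illustration only; not part of graph_gen_utils.
class WeightNormalizeSketch {
    // Min-max scales the given weights into [0,1]. If every weight is the
    // same, there is no range to scale by, so each result is set to 1.0
    // (an arbitrary choice made for this sketch).
    static double[] normalize(double[] weights) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double w : weights) {
            min = Math.min(min, w);
            max = Math.max(max, w);
        }
        double range = max - min;
        double[] out = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            out[i] = range == 0.0 ? 1.0 : (weights[i] - min) / range;
        }
        return out;
    }
}
```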

readMemGraph

public static graph_gen_utils.memory_graph.MemGraph readMemGraph(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                                                 java.util.Set<java.lang.String> nodeProps,
                                                                 java.util.Set<java.lang.String> relProps)

removeOrphanNodes

public static void removeOrphanNodes(org.neo4j.graphdb.GraphDatabaseService transNeo)
Deletes all Nodes that do not have at least one Relationship from the Neo4j instance.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
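
The orphan rule can be illustrated on a plain in-memory edge list, without Neo4j: a node is kept only if it is an endpoint of at least one edge. The class OrphanSketch and its withoutOrphans method are hypothetical names for this sketch, which does not reflect how removeOrphanNodes traverses or deletes from the database.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of graph_gen_utils.
class OrphanSketch {
    // Returns the node ids that are an endpoint of at least one edge.
    // Each edge is a {start, end} pair; the input set is not modified.
    static Set<Integer> withoutOrphans(Set<Integer> nodes, int[][] edges) {
        Set<Integer> connected = new TreeSet<>();
        for (int[] e : edges) {
            connected.add(e[0]);
            connected.add(e[1]);
        }
        Set<Integer> kept = new TreeSet<>(nodes);
        kept.retainAll(connected);
        return kept;
    }
}
```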

removeRelationshipsByType

public static void removeRelationshipsByType(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                             java.util.HashSet<java.lang.String> relTypes)
Deletes all Relationships of the specified RelationshipType values (as given by the String values in relTypes parameter) from the Neo4j instance.

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
relTypes - HashSet of String values naming the RelationshipTypes of the Relationships that should be deleted

removeDuplicateRelationships

public static void removeDuplicateRelationships(org.neo4j.graphdb.GraphDatabaseService transNeo,
                                                org.neo4j.graphdb.Direction direction)
Deletes duplicate Relationships (all but one Relationship between any two Nodes). Useful for reducing the size of a Neo4j instance while maintaining the same basic connectivity/structure. Direction is considered when identifying duplicates. RelationshipType is NOT considered (they are all regarded as equal).

Parameters:
transNeo - GraphDatabaseService representing a Neo4j instance
direction - Direction which defines what a duplicate is. If direction equals Direction.OUTGOING then all but one outgoing Relationship from one Node to another is deleted. In this case two Relationships, one in each direction, may remain between a pair of Nodes. If direction equals Direction.BOTH then all but one Relationship of any direction between any two Nodes is deleted.
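
The difference between the two modes can be sketched on a plain edge list. In the sketch below, the boolean ignoreDirection stands in for the choice between Direction.BOTH (true) and Direction.OUTGOING (false). The class DedupSketch and its dedup method are hypothetical names, and which duplicate survives in the real method is not stated on this page; the sketch simply keeps the first one it sees.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of graph_gen_utils.
class DedupSketch {
    // Keeps the first edge seen for each node pair. Each edge is a
    // {start, end} pair. When ignoreDirection is true, a->b and b->a count
    // as the same pair; otherwise they are treated as distinct.
    static List<int[]> dedup(int[][] edges, boolean ignoreDirection) {
        Set<String> seen = new HashSet<>();
        List<int[]> kept = new ArrayList<>();
        for (int[] e : edges) {
            int a = e[0];
            int b = e[1];
            if (ignoreDirection && a > b) {
                int t = a; // order the pair so both directions share one key
                a = b;
                b = t;
            }
            if (seen.add(a + "->" + b)) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

With the edges 1->2, 1->2 and 2->1, the direction-aware mode keeps two edges (one per direction), while the direction-agnostic mode keeps only one.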