DEX

License

There are two different licenses. A license code is required; see the Configuration section.

Installation

Requirements

DEX requires:

How to install

Just add jdex.jar to the Java classpath. For example, to run the Java program myprog with the DEX library installed in the path ./dex:

$JAVA_HOME/bin/java -cp ./dex/jdex.jar myprog

Introduction

DEX is a high-performance graph database that allows for efficient storage and management of very large graphs. It is implemented as a Java library that provides persistent manipulation and querying of graph-like data. It fulfills the conditions of a graph database model since (i) its data representation is in the form of a large graph, (ii) the query operations are based on graph operations or extensions to graph operations, (iii) query results are also in the form of new graphs, and (iv) there are constraints based on node and edge types, explicit and implicit relationships, and attribute domains.

Data model

The basic logical data structure in DEX is a labeled and directed attributed multigraph, where:
Labeled graph
A graph which has a label for each node and edge which denotes the node or edge type.
Directed graph
A graph which allows for edges with a fixed direction from the tail or source node to the head or destination node.
Attributed graph
A graph which allows a variable list of attributes for each node or edge, where an attribute is a value associated with a string which identifies the attribute.
Multigraph
A graph which allows multiple edges between two nodes even if those edges have the same label (that is, they belong to the same edge type) and direction.
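
The four properties above can be illustrated with a small in-memory model. This is only a sketch in plain Java and does not use the DEX API; the class and method names (MultigraphSketch, addNode, addEdge, countEdges) are hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative in-memory model only; this is not the DEX API. */
class MultigraphSketch {
    /** A labeled, attributed node. */
    static final class Node {
        final String label; // the node type
        final Map<String, Object> attributes = new HashMap<>();
        Node(String label) { this.label = label; }
    }

    /** A labeled, attributed, directed edge from tail to head. */
    static final class Edge {
        final String label; // the edge type
        final Node tail;
        final Node head;
        final Map<String, Object> attributes = new HashMap<>();
        Edge(String label, Node tail, Node head) {
            this.label = label;
            this.tail = tail;
            this.head = head;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    // A list, not a set keyed by endpoints, so that parallel edges are kept.
    final List<Edge> edges = new ArrayList<>();

    Node addNode(String label) {
        Node n = new Node(label);
        nodes.add(n);
        return n;
    }

    Edge addEdge(String label, Node tail, Node head) {
        Edge e = new Edge(label, tail, head);
        edges.add(e);
        return e;
    }

    /** Counts the edges with the given label that go from tail to head. */
    long countEdges(String label, Node tail, Node head) {
        return edges.stream()
                .filter(e -> e.label.equals(label) && e.tail == tail && e.head == head)
                .count();
    }

    public static void main(String[] args) {
        MultigraphSketch g = new MultigraphSketch();
        Node john = g.addNode("PERSON");
        john.attributes.put("NAME", "JOHN");
        Node mary = g.addNode("PERSON");
        mary.attributes.put("NAME", "MARY");
        // Two edges with the same label and direction: allowed in a multigraph.
        g.addEdge("PHONES", john, mary).attributes.put("WHEN", "4pm");
        g.addEdge("PHONES", john, mary).attributes.put("WHEN", "5pm");
        System.out.println(g.countEdges("PHONES", john, mary)); // prints 2
        System.out.println(g.countEdges("PHONES", mary, john)); // prints 0
    }
}
```

Keeping the edges in a list is what makes this a multigraph: two PHONES edges between the same pair of nodes remain distinct objects, each with its own attributes.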

Graph construction

DEX, Session and GraphPool

A {@link edu.upc.dama.dex.core.DEX} instance is a graph database management system that manages one or more graph databases. Each graph database is managed by an instance of the {@link edu.upc.dama.dex.core.GraphPool} class which is responsible for all the memory and I/O management of a graph database.

The persistent graph database is stored in a single DEX data file. Temporary data generated during the activity of a DEX application, such as {@link edu.upc.dama.dex.core.RGraph}s or {@link edu.upc.dama.dex.core.Objects}, is stored in a different temporary file which is destroyed when the {@link edu.upc.dama.dex.core.GraphPool} is closed.

Thus, a new empty {@link edu.upc.dama.dex.core.GraphPool} can be created ({@link edu.upc.dama.dex.core.DEX#create(File, String)}), or an existing {@link edu.upc.dama.dex.core.GraphPool} can be opened from a persistent DEX data file ({@link edu.upc.dama.dex.core.DEX#open(File)}).

Moreover, all activity with the graph database must be done through a user session, which is created with the {@link edu.upc.dama.dex.core.GraphPool#newSession} method. A {@link edu.upc.dama.dex.core.Session} keeps and manages all temporary data, which is exclusive to the session.

Graph

All DEX graphs belong to a {@link edu.upc.dama.dex.core.Session} and are valid only while that {@link edu.upc.dama.dex.core.Session} is active, that is, until it has been closed.

Also, the persistent {@link edu.upc.dama.dex.core.DbGraph} as well as all temporary {@link edu.upc.dama.dex.core.RGraph}s are specializations of the {@link edu.upc.dama.dex.core.Graph} class which follows the logical graph model defined previously.

Types

All nodes and edges (the graph objects) in a {@link edu.upc.dama.dex.core.Graph} are typed (labeled). Each object (node or edge) type has a unique numeric identifier and a unique type name.
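
As a rough illustration of the rule that each type has both a unique numeric identifier and a unique name, the following plain-Java sketch keeps two maps in step. It does not show how DEX stores its types; TypeRegistrySketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not how DEX stores its type catalog. */
class TypeRegistrySketch {
    private final Map<String, Integer> idByName = new HashMap<>();
    private final Map<Integer, String> nameById = new HashMap<>();
    private int nextId = 1;

    /** Registers a new type and returns its numeric identifier. */
    int newType(String name) {
        if (idByName.containsKey(name)) {
            throw new IllegalArgumentException("Type name already in use: " + name);
        }
        int id = nextId++;
        idByName.put(name, id);
        nameById.put(id, name);
        return id;
    }

    /** Returns the identifier for a type name, or -1 if the name is unknown. */
    int findType(String name) {
        return idByName.getOrDefault(name, -1);
    }

    /** Returns the name for a type identifier, or null if the identifier is unknown. */
    String typeName(int id) {
        return nameById.get(id);
    }
}
```

Rejecting a duplicate name keeps the mapping one-to-one in both directions, so a type can be looked up either by name or by identifier.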

Nodes

Each node in a DEX {@link edu.upc.dama.dex.core.Graph} has the following properties:

Edges

Each edge in a {@link edu.upc.dama.dex.core.Graph} has the following properties:

There are different kinds of edge types:

Virtual edges

Virtual edges are a special kind of edge that is not materialized. A virtual edge exists only for navigational purposes. It is defined between two attributes, which must both belong to the same domain, in such a way that a virtual edge exists between two nodes if they have the same value for the attributes that define the virtual edge. In this respect, virtual edges are similar to a foreign key in a relational model. Thus, a virtual edge:
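
As a rough illustration of the matching rule described above, the following plain-Java sketch derives the neighbors of a node from attribute values on demand instead of storing edges. It is not the DEX implementation; VirtualEdgeSketch, tailAttribute and headAttribute are hypothetical names, and attribute values are simplified to strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the DEX implementation of virtual edges. */
class VirtualEdgeSketch {
    /** Maps a node identifier to its value for the tail-side attribute. */
    final Map<Long, String> tailAttribute = new HashMap<>();
    /** Maps a node identifier to its value for the head-side attribute. */
    final Map<Long, String> headAttribute = new HashMap<>();

    /**
     * Returns the nodes reachable from the given node through the virtual edge:
     * every node whose head-side attribute equals this node's tail-side attribute.
     * Nothing is stored; the neighbors are computed from the attribute values.
     */
    List<Long> neighbors(long node) {
        List<Long> result = new ArrayList<>();
        String value = tailAttribute.get(node);
        if (value == null) {
            return result; // no value, hence no virtual edge
        }
        for (Map.Entry<Long, String> entry : headAttribute.entrySet()) {
            if (value.equals(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

This mirrors a foreign-key join: the relationship exists as long as the two attribute values match, and it disappears as soon as one of them changes.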

Attributes

All objects (nodes or edges) in a {@link edu.upc.dama.dex.core.Graph} can have attributes. An attribute has the following properties:

Class {@link edu.upc.dama.dex.core.Value} allows for setting or getting attribute values of nodes and edges.

Global attributes are those defined for all node and edge types. That is, all node or edge objects (no matter which type they belong to) can set and get values for that attribute identifier. To do that, {@link edu.upc.dama.dex.core.Graph#GLOBAL_TYPE} must be used when creating the attribute.

Basic operations

Basic example

The following figure shows an example of a directed multigraph.

[IMG]Graph-DB

Query

DEX queries are implemented as a combination of low-level graph-oriented operations, which are highly optimized to get the most out of the underlying data structures.

These operations can be grouped in different areas.

Selection

Navigation

Collections

The {@link edu.upc.dama.dex.core.Objects} class is used to manage collections of object (node or edge) identifiers, which are the result of most query methods. These collections can be traversed by means of {@link edu.upc.dama.dex.core.Objects.Iterator}s.

Additionally, users can build their own temporary collections to store object identifiers. These collections can be updated by adding or removing elements, or by using different methods to combine them.
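
Combining collections of identifiers amounts to ordinary set algebra. The sketch below shows this with java.util sets of long identifiers; it does not use the {@link edu.upc.dama.dex.core.Objects} class, and the method names union and difference are assumptions made for illustration (only an intersection method appears in the examples of Appendix A).

```java
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch using java.util sets; it does not use the DEX Objects class. */
class CombineSketch {
    /** Returns a new set with the identifiers present in both inputs. */
    static Set<Long> intersection(Set<Long> a, Set<Long> b) {
        Set<Long> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    /** Returns a new set with the identifiers present in either input. */
    static Set<Long> union(Set<Long> a, Set<Long> b) {
        Set<Long> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    /** Returns a new set with the identifiers of a that are not in b. */
    static Set<Long> difference(Set<Long> a, Set<Long> b) {
        Set<Long> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }
}
```

Each method copies its first argument before modifying it, so the input sets are left unchanged and can be reused in later combinations.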

Collections consume significant resources from the core memory and from the temporary storage. Thus, it is important to close ({@link edu.upc.dama.dex.core.Objects#close()}) unused collections as soon as possible to recover their memory and resources.
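
One way to make sure a collection is always closed is to release it in a finally block. The sketch below shows the pattern with a hypothetical stand-in class, TrackedCollection, which only counts how many instances are open; it is not the {@link edu.upc.dama.dex.core.Objects} class.

```java
import java.util.function.Consumer;

/** Illustrative sketch; TrackedCollection is a hypothetical stand-in, not the DEX Objects class. */
class CloseSketch {
    /** Number of stand-in collections that are currently open. */
    static int openCount = 0;

    /** Hypothetical stand-in for a collection that holds resources until closed. */
    static final class TrackedCollection {
        TrackedCollection() { openCount++; }
        void close() { openCount--; }
    }

    /** Runs the given work on a collection and always closes it, even if the work fails. */
    static void withCollection(Consumer<TrackedCollection> work) {
        TrackedCollection collection = new TrackedCollection();
        try {
            work.accept(collection);
        } finally {
            collection.close(); // release resources as soon as the work is done
        }
    }
}
```

With this pattern an exception thrown while processing the collection still reaches the caller, but the collection has been closed by the time it does.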

Combination

Temporary graphs

A {@link edu.upc.dama.dex.core.GraphPool} can manage several temporary graphs, or {@link edu.upc.dama.dex.core.RGraph}s.

{@link edu.upc.dama.dex.core.RGraph}s can inherit or copy nodes from another {@link edu.upc.dama.dex.core.Graph} by means of the {@link edu.upc.dama.dex.core.RGraph#addNode(edu.upc.dama.dex.core.Graph, long, boolean)} method. When a node is added, its node type and the attributes defined over its node type are added too. In the case of a node from a {@link edu.upc.dama.dex.core.DbGraph}, the {@link edu.upc.dama.dex.core.RGraph} keeps only a reference to it, which is called node inheritance. In the case of a node from another {@link edu.upc.dama.dex.core.RGraph}, the {@link edu.upc.dama.dex.core.RGraph} makes a complete copy of it.

Configuration

Some runtime configuration parameters can be set for a DEX instance. This configuration can be set in two different ways:

One of these configuration parameters is the license code. The user's license restricts some capabilities of the technology, such as the maximum number of nodes or the maximum number of concurrent Sessions.

Modules

Some additional modules make it possible to perform different kinds of tasks.

Graph dump

There are some methods to get information from a {@link edu.upc.dama.dex.core.GraphPool}. These methods are:

It is also possible to get some of these {@link edu.upc.dama.dex.core.GraphPool.Statistics} by means of the method {@link edu.upc.dama.dex.core.GraphPool#getStatistics}.

Graph export

It is possible to export a {@link edu.upc.dama.dex.core.Graph} to different formats in order to visualize it. To do that it is necessary to implement the {@link edu.upc.dama.dex.core.Export} interface and call the {@link edu.upc.dama.dex.core.Graph#export(PrintWriter, edu.upc.dama.dex.core.Export.Type, edu.upc.dama.dex.core.Export)} method.

Currently it is possible to export a {@link edu.upc.dama.dex.core.Graph} to:

Scripting

Package {@link edu.upc.dama.dex.script} makes it possible to create and populate a graph database from CSV files by means of a scripting language. This way it is not necessary to write any code.

Loading facilities

Package {@link edu.upc.dama.dex.io} makes it easy to create and populate a graph database from CSV files or relational databases by means of a set of classes. In this case it is necessary to write code that uses classes from this package.

Shell

Package {@link edu.upc.dama.dex.shell} contains an interactive shell to query an existing graph database, that is, an image file where a {@link edu.upc.dama.dex.core.GraphPool} was previously created.

Algorithms

Package {@link edu.upc.dama.dex.algorithms} makes it possible to run useful algorithms on graphs in order to solve the following well-known problems:

Architecture

[IMG]Architecture

DEX has been built using a two-layer architecture, as shown in the figure above.

Java applications must be built on top of the JDEX Java public API. Also, the public library must be on the Java classpath (the native library is included in the JDEX Java public library and is loaded automatically from there).

Performance

Some well known databases were loaded for performance evaluation:

Benchmark    #Nodes    #Edges    Raw (GB)    Dex (GB)    Load
IMDB13338793222598691.52.421m 8s
Social network908730176635462261132298m
Scopus1510748411329431578115m 50s
Wikipedia194904891802304535.57.6129m 53s
Xmark_25497016298488121.9215m

Appendix A: Examples

Creating a DbGraph

Create and get a DbGraph, the instance which represents the persistent graph database.

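// Start the DEX graph database management system.
DEX dex = new DEX();
// Create a new, empty graph database stored in the given DEX data file.
GraphPool gpool = dex.create("C:/image.dex");
// All activity with the graph database goes through a session.
Session sess = gpool.newSession();
// Get the persistent graph of the database.
DbGraph dbg = sess.getDbGraph();
...
...
// Close everything in reverse order of creation.
sess.close();
gpool.close();
dex.close();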

Creating node types

Create new node types and their attributes. After that, create new node objects and set a Value for their attributes.

...
sess.beginTx();
DbGraph dbg = sess.getDbGraph();
// Create the PERSON node type and its NAME and AGE attributes.
int person = dbg.newNodeType("PERSON");
long name = dbg.newAttribute(person, "NAME", STRING);
long age = dbg.newAttribute(person, "AGE", INT);
// Create three PERSON nodes and set their attribute values.
long p1 = dbg.newNode(person);
dbg.setAttribute(p1, name, "JOHN");
dbg.setAttribute(p1, age, 18);
long p2 = dbg.newNode(person);
dbg.setAttribute(p2, name, "KELLY");
long p3 = dbg.newNode(person);
dbg.setAttribute(p3, name, "MARY");
sess.commitTx();
...

Creating edge types

Create new edge (directed and undirected) types and their attributes. After that, create new edge objects and set a Value for their attributes.

...
sess.beginTx();
DbGraph dbg = sess.getDbGraph();
// Undirected edge type FRIEND with a SINCE attribute.
int friend = dbg.newUndirectedEdgeType("FRIEND");
long since = dbg.newAttribute(friend, "SINCE", INT);
long e1 = dbg.newEdge(p1, p2, friend);
dbg.setAttribute(e1, since, 2000);
long e2 = dbg.newEdge(p2, p3, friend);
dbg.setAttribute(e2, since, 1995);
...
// Directed edge type LOVES, without attributes.
int loves = dbg.newEdgeType("LOVES");
long e3 = dbg.newEdge(p1, p3, loves);
sess.commitTx();
...

...
sess.beginTx();
DbGraph dbg = sess.getDbGraph();
// Directed edge type PHONES with a WHEN timestamp attribute.
int phones = dbg.newEdgeType("PHONES");
long when = dbg.newAttribute(phones, "WHEN", TIMESTAMP);
// 4pm, 5pm and 6pm stand for timestamp values (pseudocode).
// e4 and e5 are two edges of the same type between the same nodes.
long e4 = dbg.newEdge(p1, p3, phones);
dbg.setAttribute(e4, when, 4pm);
long e5 = dbg.newEdge(p1, p3, phones);
dbg.setAttribute(e5, when, 5pm);
long e6 = dbg.newEdge(p3, p2, phones);
dbg.setAttribute(e6, when, 6pm);
sess.commitTx();
...

Selecting objects

Select all objects of a specific node type and then iterate over them.

...
sess.beginTx();
DbGraph dbg = sess.getDbGraph();
// Select all nodes of type PERSON.
Objects persons = dbg.select(person);
Objects.Iterator it = persons.iterator();
while (it.hasNext()) {
   long p = it.next();
   // Read the NAME attribute of each person.
   String personName = dbg.getAttribute(p, name);
}
// Close the iterator and the collection to release their resources.
it.close();
persons.close();
sess.commitTx();
...

Operating with objects

Select some objects and combine them.

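...
DbGraph dbg = sess.getDbGraph();
...
// Select the objects whose WHEN attribute is 5pm or later
// (>= and 5pm are pseudocode for the comparison and the timestamp value).
Objects objs1 = dbg.select(when, >=, 5pm);
...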

...
// Get the outgoing PHONES edges of node p1.
Objects objs2 = dbg.explode(p1, phones, OUT);
...

...
// Keep only the objects present in both collections.
Objects objs = objs1.intersection(objs2);
...
// Close all collections to release their resources.
objs.close();
objs1.close();
objs2.close();
...

Appendix B: Academic license details

Software end-user license agreement
===================================

NOTICE TO THE USER: BY COPYING, INSTALLING OR USING THIS SOFTWARE OR PART
OF THIS SOFTWARE, YOU AGREE TO THE TERMS AND CONDITIONS OF THIS AGREEMENT
AS IF IT WERE A WRITTEN AGREEMENT NEGOTIATED AND SIGNED BY YOU.  THIS
AGREEMENT IS ENFORCEABLE AGAINST YOU AND ANY OTHER LEGAL PERSON ACTING ON
YOUR BEHALF.

IF, AFTER READING THE TERMS AND CONDITIONS HEREIN, YOU DO NOT AGREE TO
THEM, YOU MAY NOT INSTALL THIS SOFTWARE ON YOUR COMPUTER.

SPARSITY, S.L. (hereinafter "The licensor") IS A SPANISH SPIN-OUT COMPANY
OF UNIVERSITAT POLITECNICA DE CATALUNYA (UPC) AND IS THE EXCLUSIVE
LICENSOR OF ALL THE INTELLECTUAL PROPERTY OF THE SOFTWARE AND ONLY
AUTHORIZES YOU TO USE THE SOFTWARE IN ACCORDANCE WITH THE TERMS SET OUT
IN THIS AGREEMENT.

1. Definitions
==============

"Software" means (a) all the information provided in this agreement,
including but not limited to (i) software files and other computer
information belonging to the licensor or third parties; (ii) written material
and explanatory files ("Documentation"); and (b) any modified versions
and copies of this information, such as improvements, updates and
additions to the information provided to you by the licensor at any time,
insofar as it is not the object of another agreement (collectively,
"Updates").

"User" means the physical or legal person who accepts the terms and
conditions of this agreement.

"Computer" means a computer device that accepts information in digital
or similar form and processes that information to achieve a specific
result based on a sequence of instructions.

2. Intellectual Property
========================

The Software and the authorized copies are property of UPC and Sparsity S.L
is the exclusive licensor. This agreement shall not imply the waiver or
transfer, in whole or in part, of this ownership neither by UPC nor the
licensor. The Software is protected by law, including, but not limited to,
the laws of Spain and other countries on intellectual property, and by the
applicable international agreements.

Except to the extent expressly stipulated here, this agreement does not
grant the user any intellectual property rights over the Software. All
rights not expressly granted are reserved for the licensor.

3. License
==========

The licensor grants the user a free and non-exclusive license to use the
Software. The user may

1) Install and use a copy of the Software for personal evaluation use

2) Develop software using the Software licensed for personal evaluation use

This agreement does not authorize the user to do any of the following:

1) Use or copy the Software in any way other than that specified in
this agreement;

2) Disassemble, decompile or translate the Software, or modify it in
any way other than that permitted by the specific Spanish law on
intellectual property;

3) Sublicense, lease, rent, sell, transfer or transmit his or her
right to the use of the Software.

4) Authorize the total or partial copying of the Software onto the
computer of another natural or legal person;

5) Use the Software as the basis for software developed by third
parties.

Should the user wish to license software developed using the licensed
Software to third parties, he or she should contact the licensor.

4. Limited Guarantee
====================

The licensor does not guarantee the uninterrupted or error-free operation
of a program or that any defects that may occur will be corrected.
The user is responsible for the results obtained through the use of
the software.

5. Liability
============

The licensor shall not be liable for any claims for damages, including
partial or total noncompliance, negligence, falsification or any other
claim

6. Authorization for the use of personal data
=============================================

The user accepts that the licensor uses their personal data for the
following purposes only:

- Communication on DEX updates

- Information about the licensor courses, events and any other licensor
product updates

In order to withdraw the authorization, the user shall send an e-mail to
the following e-mail address:   info@sparsity-technologies.com

7. Other Provisions
===================

7.1 No legally recognized rights of the user shall be subject to
elimination or limitation in virtue of this agreement.

7.2 The licensor may terminate this license if the user fails to comply with
the conditions of this agreement. If the licensor decides to terminate the
license, this decision also terminates the authorization of the user to
make use of the Software.

7.3 The user may not bring any action which may arise from this agreement
more than two years after the cause of said action, unless otherwise
established by Spanish law without the possibility of contractual waiver
or limitation.

7.4 This license is indefinite and failure to comply with it by the user
may give rise to legal action by the licensor at any time.

7.5 The licensor shall not be liable for noncompliance with their obligations
when said noncompliance is due to force majeure.

7.6 This agreement is governed by Spanish law and the courts of Barcelona
shall be the authority competent to resolve any conflicts that may arise
from this agreement.



Barcelona, October 1st 2010