A distributed data store
table may hold millions or billions of rows, so avoiding full table scans is
essential to query performance. CloudGraph™ leverages
all available scan mechanisms for a particular data store but gives priority to
the most performant API. With Apache HBase, for example, the partial-key-scan facility is
extremely fast and is therefore given first priority. All graph queries are transformed
into a full or partial-key-scan whenever possible, based on the field
literals found in the query. Failing that, a fuzzy-row-key filter scan is used.
Finally, if the expressions comprising a query are too complex for either
approach, a filter hierarchy is assembled.
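
The priority order described above can be sketched as a small decision function. This is an illustration only, not CloudGraph's actual planner: the names `ScanKind`, `choose_scan`, `key_fields`, `literals`, and `is_complex` are hypothetical, and the sketch assumes a composite row key built from an ordered list of fields, where a query literal on a leading key field allows a key-range scan and a literal on a non-leading key field allows only a fuzzy match.

```python
from enum import Enum


class ScanKind(Enum):
    FULL_KEY = "full-key scan"
    PARTIAL_KEY = "partial-key scan"
    FUZZY_ROW_KEY = "fuzzy-row-key filter scan"
    FILTER_HIERARCHY = "filter hierarchy"


def choose_scan(key_fields, literals, is_complex=False):
    """Pick a scan kind in priority order (hypothetical sketch).

    key_fields: ordered names of the fields composing the row key.
    literals:   dict mapping field name -> literal value found in the query.
    is_complex: True when the query expressions cannot be reduced to
                simple field literals (assumed to be decided elsewhere).
    """
    if is_complex:
        return ScanKind.FILTER_HIERARCHY

    # For each row-key field, record whether the query supplies a literal.
    known = [field in literals for field in key_fields]

    # Every key field is known: the complete row key can be built.
    if known and all(known):
        return ScanKind.FULL_KEY

    # At least the first key field is known: a key prefix bounds the scan.
    if known and known[0]:
        return ScanKind.PARTIAL_KEY

    # Only non-leading key fields are known: match them positionally.
    if any(known):
        return ScanKind.FUZZY_ROW_KEY

    # No usable key literals: fall back to assembling filters.
    return ScanKind.FILTER_HIERARCHY
```

For example, with a hypothetical row key composed of `org`, `type`, and `id`, a query that supplies a literal only for `org` would select a partial-key scan, while a query that supplies only `id` would fall back to a fuzzy-row-key filter scan.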