HBase

 

Apache HBase is arguably the leader in this rapidly evolving market, and numerous best practices have emerged from the open-source ecosystem surrounding it. Many of these practices target specific strengths of HBase, while others work around its weaknesses, such as limited support for ACID transactions. HBase guarantees ACID semantics only within a single row, not across multiple rows or tables. A common best practice is therefore to group potentially large segments of related data, for example an entire user profile, within a single HBase row so that updates to it remain atomic. Other critical best practices concern the use of column families and, in particular, the format and design of composite row and column keys. Composite row key design involves decisions that determine which queries a table can serve now and in the future and, more generally, its performance and how evenly its data is distributed across the regions of a cluster.
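To make the row key trade-offs above concrete, the following is a minimal sketch of one common composite key layout, written in plain Java with no HBase dependency. It is not CloudGraph's actual key format: the class name CompositeRowKey, the bucket count, and the field order (salt, user id, separator, reversed timestamp) are all assumptions chosen for illustration. The sketch relies only on the fact that HBase orders rows by the lexicographic, unsigned byte order of their keys.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase or CloudGraph.
final class CompositeRowKey {
    // Assumed number of salt buckets; in practice this is tuned to the cluster.
    static final int BUCKETS = 16;

    // Layout: [1-byte salt][user id bytes][0x00 separator][8-byte reversed timestamp]
    static byte[] build(String userId, long timestampMillis) {
        byte[] id = userId.getBytes(StandardCharsets.UTF_8);
        // A salt derived from the id spreads different users across key ranges
        // (and so across regions) while keeping one user's rows contiguous.
        byte salt = (byte) Math.floorMod(Arrays.hashCode(id), BUCKETS);
        // Subtracting from Long.MAX_VALUE makes newer rows sort first.
        long reversed = Long.MAX_VALUE - timestampMillis;
        return ByteBuffer.allocate(1 + id.length + 1 + Long.BYTES)
                .put(salt)
                .put(id)
                .put((byte) 0) // separator; assumes ids never contain a zero byte
                .putLong(reversed) // big-endian, so byte order matches numeric order
                .array();
    }

    // Lexicographic comparison of unsigned bytes, the order in which HBase sorts row keys.
    static int compare(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xff) - (y[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }
}
```

With this layout, a scan over the prefix formed by the salt and user id returns that user's rows newest first, while a query for a time range across all users has to scan every bucket. Query capabilities are fixed by key design in exactly this way, which is why the field order deserves care up front.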

 

The CloudGraph™ implementation encapsulates many HBase best practices in each of these areas and provides a framework for incorporating future best practices as they evolve. The complexity of generating terse, efficient physical row and column keys is completely hidden. The client is instead given rich configuration capabilities and a generated, standards-based API derived from one or more domain-specific business models.