- Released in 2008 as Hadoop sub-project
- Open Source
- Written in Java
- Implementation of Google’s Bigtable
“A Bigtable is a sparse, distributed, persistent multidimensional sorted map.”
“Bigtable: A Distributed Storage System for Structured Data”
Chang et al. (2006)
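
The "multidimensional sorted map" in the quote can be pictured as nested sorted maps: row key, then column, then timestamp, then an uninterpreted value. The following is a minimal in-memory sketch of that idea in plain Java, not HBase code. The class name `SortedMapSketch` and its methods are hypothetical, and it uses `String` keys for readability (HBase itself orders keys as raw bytes).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of (row, column, timestamp) -> value; illustration only. */
public class SortedMapSketch {
    // row key -> column ("family:qualifier") -> timestamp (newest first) -> value.
    // Sparse: a row stores only the columns that were actually written.
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, byte[]>>> rows =
            new TreeMap<>();

    /** Stores one versioned cell. */
    public void put(String row, String column, long timestamp, byte[] value) {
        rows.computeIfAbsent(row, r -> new TreeMap<String, NavigableMap<Long, byte[]>>())
            .computeIfAbsent(column, c -> new TreeMap<Long, byte[]>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    /** Returns the newest version of a cell, or null if the cell was never written. */
    public byte[] get(String row, String column) {
        NavigableMap<String, NavigableMap<Long, byte[]>> columns = rows.get(row);
        if (columns == null) {
            return null;
        }
        NavigableMap<Long, byte[]> versions = columns.get(column);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    /** Because rows are kept sorted, a range scan is just a sub-map view. */
    public List<String> scanRowKeys(String startInclusive, String stopExclusive) {
        return new ArrayList<>(rows.subMap(startInclusive, true, stopExclusive, false).keySet());
    }

    public static void main(String[] args) {
        SortedMapSketch table = new SortedMapSketch();
        table.put("com.example/b", "anchor:home", 1L, new byte[] {1});
        table.put("com.example/a", "contents:html", 1L, new byte[] {2});
        table.put("com.example/a", "contents:html", 2L, new byte[] {3});
        table.put("org.example/a", "contents:html", 1L, new byte[] {4});

        System.out.println(table.scanRowKeys("com", "d"));
        System.out.println(table.get("com.example/a", "contents:html")[0]);
    }
}
```

The sketch leaves out the "distributed" and "persistent" parts of the definition entirely; it only shows why sorted row keys make range scans cheap and why missing cells cost nothing.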
- Column-oriented store (columns grouped into column families)
- Big Data: billions of rows × millions of columns
- Distributed on commodity hardware (no Exadata needed)
- Strict consistency
- No data types. Everything is byte[].
“Seven Databases in Seven Weeks”
Redmond & Wilson (2012), Pragmatic Programmers
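
Since every key and value is a byte[], the application has to encode and decode its own types. The sketch below shows one way to do that using only the JDK. The class `BytesSketch` and its method names are hypothetical; the HBase client library ships its own `Bytes` utility class for this purpose, which is not used here so the example stays self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hand-rolled byte[] conversions; illustration only. */
public class BytesSketch {
    /** Encodes a string as UTF-8 bytes. */
    public static byte[] fromString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes UTF-8 bytes back into a string. */
    public static String toText(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    /** Encodes a long as 8 bytes, big-endian (ByteBuffer's default order). */
    public static byte[] fromLong(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    /** Decodes 8 big-endian bytes back into a long. */
    public static long toLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    public static void main(String[] args) {
        byte[] rowKey = fromString("user-42");
        byte[] counter = fromLong(1234L);

        // The store would keep only the raw bytes; the caller must remember
        // which decoder applies to which column.
        System.out.println(toText(rowKey));
        System.out.println(toLong(counter));
        System.out.println(counter.length);
    }
}
```

Nothing in the stored bytes records whether they hold a string or a number, so decoding a value with the wrong method silently yields garbage rather than an error.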
| Name     | Connection Method |
|----------|-------------------|
| Shell    | Direct            |
| Java API | Direct            |
| Thrift   | Binary protocol   |
| REST     | HTTP              |
| Avro     | Binary protocol   |
- Made to work with Hadoop
- Really big data
- Big queries
- High volume of writes
Three modes
- Stand-alone mode
- Pseudodistributed mode
- Fully distributed mode
Recommendation is no fewer than five nodes.