|Developer(s)||Apache Software Foundation|
|Stable release||1.2.2 / 13 July 2016|
|License||Apache License 2.0|
HBase is an open-source, non-relational, distributed database modeled after Google's Bigtable and written in Java. It is developed as part of the Apache Software Foundation's Apache Hadoop project and runs on top of HDFS (Hadoop Distributed File System), providing Bigtable-like capabilities for Hadoop. That is, it provides a fault-tolerant way of storing large quantities of sparse data: small amounts of information caught within a large collection of empty or unimportant data, such as finding the 50 largest items in a group of 2 billion records, or finding the non-zero items representing less than 0.1% of a huge collection.
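As a rough illustration of what storing sparse data means, the sketch below models a table as nested sorted maps (row key, then column, then timestamp) in which only cells that were actually written occupy space. It is a toy in-memory model, not the HBase client API; the class `SparseTableSketch` and its methods are hypothetical names chosen for this example.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy in-memory model of a sparse, sorted, multi-versioned table.
// Hypothetical names; this is NOT the HBase client API.
class SparseTableSketch {
    // row key -> column ("family:qualifier") -> timestamp -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // Stores one cell version; rows and columns are created only on demand.
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>())
            .put(timestamp, value);
    }

    // Returns the newest version of a cell, or null if the cell was never written.
    String get(String row, String column) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            return null;
        }
        NavigableMap<Long, String> versions = columns.get(column);
        if (versions == null) {
            return null;
        }
        return versions.lastEntry().getValue();
    }

    // Counts the cell versions that are physically stored; empty cells cost nothing.
    int storedCells() {
        int count = 0;
        for (NavigableMap<String, NavigableMap<Long, String>> columns : rows.values()) {
            for (NavigableMap<Long, String> versions : columns.values()) {
                count += versions.size();
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        table.put("user#1", "info:name", 1L, "Ada");
        table.put("user#1", "info:name", 2L, "Ada L.");
        table.put("user#2", "stats:logins", 1L, "7");
        System.out.println(table.get("user#1", "info:name"));
        System.out.println(table.get("user#2", "info:name"));
        System.out.println(table.storedCells());
    }
}
```

In this model a row that lacks a column simply has no entry for it, so a table with billions of possible cells only pays for the few that hold data.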
HBase features compression, in-memory operation, and Bloom filters on a per-column-family basis, as outlined in the original Bigtable paper. Tables in HBase can serve as the input and output for MapReduce jobs run in Hadoop, and may be accessed through the Java API as well as through REST, Avro or Thrift gateway APIs. HBase is a column-oriented key-value data store and has been widely adopted because of its lineage with Hadoop and HDFS. It is well suited to fast read and write operations on large datasets, with high throughput and low input/output latency.
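The Bloom filters mentioned above can be illustrated with a minimal sketch. A Bloom filter answers "definitely absent" or "possibly present" for a key, which lets a store avoid reading a file that cannot contain the requested row. The code below is a generic textbook construction, not HBase's own implementation; the class name `BloomFilterSketch`, the FNV-1a-style hashing, and the parameters `size` and `hashCount` are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Generic Bloom filter sketch (hypothetical; not HBase's implementation).
class BloomFilterSketch {
    private final BitSet bits;
    private final int size;       // number of bits in the filter
    private final int hashCount;  // number of bit positions set per key

    BloomFilterSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // FNV-1a-style hash over the key bytes, starting from the given seed.
    private static int hash(byte[] data, int seed) {
        int h = seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }

    // Derives the i-th bit position from two base hashes (double hashing).
    private int position(String key, int i) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        int h1 = hash(data, 0x811C9DC5);
        int h2 = hash(data, 0x01000193) | 1; // force odd so successive probes differ
        return Math.floorMod(h1 + i * h2, size);
    }

    // Records a key by setting all of its bit positions.
    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false: the key was definitely never added. true: it may have been added.
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BloomFilterSketch filter = new BloomFilterSketch(1024, 3);
        filter.add("row-0001");
        System.out.println(filter.mightContain("row-0001")); // always true for added keys
        System.out.println(filter.mightContain("row-9999")); // usually false; false positives are possible
    }
}
```

The design trade-off is that a small bit array can produce false positives (an unnecessary read) but never false negatives (a missed row), which is why such filters are safe to use as a pre-check before disk access.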
HBase is not a direct replacement for a classic SQL database; however, the Apache Phoenix project provides a SQL layer for HBase as well as a JDBC driver that can be integrated with various analytics and business intelligence applications. The Apache Trafodion project provides a SQL query engine that uses HBase as a storage engine, with ODBC and JDBC drivers and distributed ACID transaction protection across multiple statements, tables and rows.
HBase now serves several data-driven websites, including Facebook's Messaging Platform. Unlike traditional relational databases, HBase does not support SQL scripting; the equivalent logic is instead written in Java, much like a MapReduce application.
In the parlance of Eric Brewer’s CAP Theorem, HBase is a CP type system.
Use cases & production deployments
Enterprises that use HBase
The following is a list of notable enterprises that have used or are using HBase:
- Amadeus IT Group, as its main long-term storage DB.
- Facebook uses HBase for its messaging platform.
- Sophos, for some of their back-end systems.
- Spotify uses HBase as base for Hadoop and machine learning jobs.
- Tuenti uses HBase for its messaging platform.
See also
- Wide column store
- Apache Cassandra
- Cask (company)
- Oracle NoSQL
- Apache Accumulo
- Project Voldemort
- Apache Phoenix
Further reading
- Dimiduk, Nick; Khurana, Amandeep (28 November 2012). HBase in Action (1st ed.). Manning Publications. p. 350. ISBN 978-1617290527.
- George, Lars (20 September 2011). HBase: The Definitive Guide (1st ed.). O'Reilly Media. p. 556. ISBN 978-1449396107.
- Jiang, Yifeng (16 August 2012). HBase Administration Cookbook (1st ed.). Packt Publishing. p. 332. ISBN 978-1849517140.