Cassandra vs HBase
Cassandra | HBase |
Based on Amazon's Dynamo design (with a Bigtable-style data model). | Based on Google's Bigtable. |
Many Cassandra deployments use Cassandra + Storm (which uses ZooKeeper) and/or Cassandra + Hadoop. | Uses the Hadoop infrastructure (ZooKeeper, NameNode, HDFS). |
Uses a single node type, with every node performing the same functions. | Uses several “moving parts” (ZooKeeper, NameNode, HBase Master, and data nodes), each performing a different function. |
Does not support range-based row scans when random partitioning is used. | Supports range-based row scans. |
Random partitioning allows a single row to be replicated across a WAN. | Supports asynchronous replication of an entire HBase cluster across a WAN. |
Officially supports ordered partitioning, but production users avoid it because of its limitations. | Supports ordered partitioning. |
The practical limit on row size in Cassandra is tens of MB when data is stored in columns to support range scans. | Scales horizontally with ease thanks to ordered partitioning, while still supporting row-key range scans. |
Does not support atomic compare-and-set. | Supports atomic compare-and-set. |
Supports read load balancing against a single row. | Does not support read load balancing against a single row. |
Uses Bloom filters for key lookup. | Bloom filters can be used as another form of indexing. |
Does not support coprocessor-like functionality. | The coprocessor capability supports triggers. |
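
The partitioning rows in the table explain why range scans are natural with ordered partitioning and awkward with random partitioning. The sketch below is a toy model in Python, not the code of either system: the four-node cluster size, the split points, and the function names (random_partition, ordered_partition, nodes_for_range) are all assumptions made for illustration. It shows that adjacent row keys stay together under ordered partitioning, while a hash of the key gives no such guarantee.

```python
import bisect
import hashlib

NODES = 4  # hypothetical cluster size, chosen only for this example


def random_partition(row_key: str) -> int:
    """Pick a node from a hash of the key (toy stand-in for random partitioning)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NODES


# Hypothetical split points: node 0 holds keys below "g",
# node 1 holds ["g", "n"), node 2 holds ["n", "t"), node 3 holds the rest.
SPLIT_POINTS = ["g", "n", "t"]


def ordered_partition(row_key: str) -> int:
    """Pick a node by where the key falls in sorted order."""
    return bisect.bisect_right(SPLIT_POINTS, row_key)


def nodes_for_range(keys, start, stop, partition):
    """Return the set of nodes that hold any key in the half-open range [start, stop)."""
    return {partition(k) for k in keys if start <= k < stop}


# Sample row keys: a0, a1, a2, b0, ..., z2
keys = sorted(f"{c}{i}" for c in "abcdefghijklmnopqrstuvwxyz" for i in range(3))

if __name__ == "__main__":
    # Ordered partitioning: the whole range lives on one node.
    print("ordered:", nodes_for_range(keys, "a", "c", ordered_partition))
    # Hash partitioning: the same keys may land on any of the nodes.
    print("random: ", nodes_for_range(keys, "a", "c", random_partition))
```

In this model a scan over keys from "a" up to "c" touches only node 0 under ordered partitioning, so it can be served as one sequential read. Under the hash-based scheme the same keys can land on any node, which spreads load evenly but means a range scan may have to contact every node.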