Apache Cassandra is a distributed, horizontally scalable NoSQL database, in the same broad family as MongoDB. It started life as a "column-oriented" store that could hold thousands of columns per row with little fixed structure, but it has since grown SQL-like tables and a SQL-like query language called CQL. It has some unusual properties. Its on-disk data files are immutable: every write, including an update or a delete, is recorded as new data rather than changing what is already there, and a delete is stored as a marker called a tombstone. Replication across data centers is built in, and its "ring" design needs no single coordinating server that could become a point of failure.
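To make the "ring" idea concrete, here is a minimal sketch of how any node could work out which nodes own a given key without asking a coordinator. It is a toy model, not Cassandra's implementation: the `Ring` class, the `token` helper, and the node names are all hypothetical, MD5 stands in for Cassandra's own partitioner hash, and virtual nodes and data-center awareness are left out.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring.

    MD5 is an assumption made for illustration; it is not the hash
    Cassandra uses by default.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: each node owns one position (no virtual nodes)."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Sort nodes by their token so the list order is the ring order.
        self._entries = sorted((token(name), name) for name in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, partition_key: str):
        """Walk clockwise from the key's token and collect distinct nodes."""
        count = min(self.replication_factor, len(self._entries))
        start = bisect.bisect_left(self._tokens, token(partition_key))
        size = len(self._entries)
        return [self._entries[(start + i) % size][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")
print(owners)
```

Because the placement depends only on the key and the list of nodes, every node that knows the cluster membership arrives at the same answer, which is why no central lookup server is needed.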
Cassandra is definitely a specialized tool that needs careful planning, but our experience with it has been a good one. For those used to SQL databases like MySQL or PostgreSQL, it requires some unlearning. There are no JOINs, sorting is limited to columns in the primary key, and any query that takes more than a few seconds will simply time out. On the other hand, INSERTs are typically much faster than in a traditional SQL database, denormalization is a feature rather than an antipattern, and replication is a matter of spinning up a new node and letting it join the cluster. In fact, one finds that most of its shortcomings are present in traditional SQL databases too; the difference is that SQL databases often provide enough abstraction to give the illusion that a poorly optimized data design will work, at least in the short term.
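The "denormalization is a feature" point is easiest to see with a small model. The sketch below uses plain Python dictionaries, not a Cassandra driver, to imitate the usual pattern of keeping one table per query: the same message is written twice so that each read is a single lookup with no JOIN and the rows are already in sorted order. The `MessageStore` class and its field names are hypothetical and chosen only for illustration.

```python
import bisect
from collections import defaultdict


class MessageStore:
    """Toy model of two denormalized 'tables' holding the same messages."""

    def __init__(self):
        # Partition key: conversation_id. Rows are kept sorted by sent_at.
        self.by_conversation = defaultdict(list)
        # Partition key: recipient. Rows are kept sorted by sent_at.
        self.by_recipient = defaultdict(list)

    def insert(self, conversation_id, recipient, sent_at, body):
        """Write the message once per query pattern (two cheap writes)."""
        bisect.insort(self.by_conversation[conversation_id],
                      (sent_at, recipient, body))
        bisect.insort(self.by_recipient[recipient],
                      (sent_at, conversation_id, body))

    def conversation(self, conversation_id):
        """Read one partition; rows come back already ordered by time."""
        rows = self.by_conversation.get(conversation_id, [])
        return [body for _, _, body in rows]

    def inbox(self, recipient):
        """Same data, different partition key, still no join needed."""
        rows = self.by_recipient.get(recipient, [])
        return [body for _, _, body in rows]


store = MessageStore()
store.insert("conv-1", "alice", 2, "second")
store.insert("conv-1", "alice", 1, "first")
store.insert("conv-2", "alice", 3, "third")
print(store.conversation("conv-1"))
print(store.inbox("alice"))
```

The trade-off is deliberate: storage and write volume go up, but every read touches exactly one partition and needs no sorting at query time.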
Cassandra is good for write-heavy workloads in which the data rarely changes after it is written and in which it is not critical that every member of the cluster has the same data immediately after an update. We used it for a messaging system, but it is also a good fit for analytics summaries and logging.
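How much staleness a workload can tolerate can be reasoned about with simple arithmetic. With a given number of replicas per row, a read is guaranteed to reach at least one replica that holds the latest acknowledged write only when the number of replicas that acknowledged the write plus the number consulted by the read exceeds the replica count. The helper names below are hypothetical, and this is a sketch of that rule rather than anything taken from Cassandra's code.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def read_sees_latest_write(replication_factor: int,
                           write_acks: int,
                           read_acks: int) -> bool:
    """True when the write set and the read set must share a replica."""
    return write_acks + read_acks > replication_factor


rf = 3
# Majority writes plus majority reads always overlap on some replica.
print(read_sees_latest_write(rf, quorum(rf), quorum(rf)))
# One-replica writes and reads are fast but may briefly return stale data.
print(read_sees_latest_write(rf, 1, 1))
```

For logging or analytics summaries, the cheaper single-replica setting is often acceptable; for something like a messaging inbox, majority reads and writes may be worth the extra latency.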
The experimental nature of Cassandra can make it difficult to install on some systems and to integrate with some projects. It is not necessarily a good fit for everything, but it can solve some problems that nothing else can.