I gave a talk at a recent Cassandra meetup on the data structure that Cassandra's read/write path is conceptually based on. A solid understanding of it is critical for debugging and for designing an appropriate data model for Cassandra. DataStax Academy has a couple of highly recommended courses that cover this in a lot more detail. Something they don't mention much is the actual data structure it is based on: Log-Structured Merge Trees (LSM-trees).

An LSM-tree is composed of two or more tree-like components, each optimized for its type of storage. In the case of Cassandra, that means a small in-memory tree and one or more on-disk trees. LSM-trees are used in Cassandra, HBase, LevelDB, Google Bigtable, SQLite4, and more.
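To make the two-component idea concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not Cassandra's actual implementation: the class name `TinyLSM`, the `memtable_limit` threshold, and the method names are all made up for illustration, a plain dict stands in for the in-memory tree, and sorted in-memory lists stand in for the on-disk trees. Deletes are left out entirely.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: one in-memory buffer plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # stands in for the in-memory tree
        self.runs = []                        # stand-ins for on-disk trees, oldest first

    def put(self, key, value):
        # Writes only ever touch the in-memory component.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the in-memory contents out as one sorted, immutable run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Check memory first, then the runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))  # binary search on the sorted run
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; newer values overwrite older ones.
        merged = {}
        for run in self.runs:  # oldest first
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # second entry hits the limit and triggers a flush
db.put("a", 3)   # newer value for "a" now lives only in memory
print(db.get("a"), db.get("b"))
```

The point of the sketch is the shape of the two paths: a write is a cheap in-memory update, while a read may have to consult several components until a merge collapses them into one.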

I walked through some examples of how it works at a high level during my talk. You can see more at:

