Last time we visited Cassandra we talked about how compactions work and how to use write-survey mode to safely test configuration changes and see their impact in production. This time we are going to move into a slightly more in-depth topic.
As a reminder here is the way Cassandra handles writes from my last post:
New data is written first to the commit log on disk and then to an in-memory structure called a memtable. When the memtable is full it is in turn flushed to disk as an immutable data structure called an SSTable. If a new write happens on the same row after the SSTable has been created, that data will end up in a different SSTable.
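To make the write path concrete, here is a minimal toy model in Python. It is only a sketch: the class `ToyTable`, its `memtable_limit` threshold (counted in rows) and the row keys are all made up for illustration, and none of this reflects how Cassandra is actually implemented.

```python
class ToyTable:
    """Toy model of the write path: commit log -> memtable -> immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only log kept for durability
        self.memtable = {}     # in-memory: row key -> {column: value}
        self.sstables = []     # flushed tables, never modified after creation
        self.memtable_limit = memtable_limit  # hypothetical threshold, in rows

    def write(self, key, columns):
        # Record the mutation in the log, then apply it to the memtable.
        self.commit_log.append((key, dict(columns)))
        self.memtable.setdefault(key, {}).update(columns)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Turn the memtable into a new SSTable; existing SSTables are untouched.
        if self.memtable:
            self.sstables.append({k: dict(v) for k, v in self.memtable.items()})
            self.memtable = {}
            self.commit_log.clear()

    def sstables_containing(self, key):
        # Indexes of every SSTable that holds a fragment of this row.
        return [i for i, table in enumerate(self.sstables) if key in table]


table = ToyTable(memtable_limit=2)
table.write("user:1", {"name": "Ada"})
table.write("user:2", {"name": "Bob"})               # memtable full, flushed
table.write("user:1", {"email": "ada@example.com"})  # same row, written later
table.write("user:3", {"name": "Cy"})                # memtable full, flushed

print(table.sstables_containing("user:1"))
print(table.sstables_containing("user:2"))
```

In this toy run the row that was written twice ends up with fragments in two SSTables, while the row written once lives in exactly one.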
The most interesting word there is “immutable”. Let’s look into why.
Immutability is good
The most expensive operations in the Cassandra write path are compactions. The more often a particular row is updated, and the bigger those updates are, the more work compactions have to do. To see why, let’s imagine the exact opposite scenario: a row written once and never again. That row will appear once in a memtable, then it will be flushed to a single SSTable, and since the SSTable is immutable Cassandra will never have to do anything more with that row except read it (unless some other rows in the same SSTable have to be compacted). As we said last time, Cassandra achieves its good write performance by “cheating” and delaying work until the compaction phase. Since this row does not have to be compacted, writing it becomes a very cheap operation.
Conversely, imagine a row that gets mutated often and/or where every mutation is quite large. In this scenario fragments of the row, possibly quite large ones, will exist in multiple SSTables, and those SSTables will have to be compacted fairly often in order to ensure a reasonable upper bound on read latency. The work that Cassandra tried to avoid doing in the first place now has to take place, and depending on the number of SSTables and the size of the mutations in them it will require a fair amount of system resources. In an extreme scenario this could end up overloading your cluster to the point where it either can’t keep up with the build-up of pending compactions or where, if compaction throughput is set too high, read performance suffers. Keep in mind that due to SSTable immutability deletions are not done in place, but rather by writing a special tombstone entry. Those tombstones also have to be compacted, so a row that gets deleted and rewritten often will experience similar problems.
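The difference between the two kinds of row can be sketched with a toy model in Python. Everything here is an assumption made for illustration: SSTables are plain dictionaries ordered oldest first, `TOMBSTONE` is a stand-in marker for a deletion entry, and `compact` merges everything and drops tombstones immediately, which is a simplification of what a real compaction does.

```python
TOMBSTONE = object()  # stand-in marker for a deletion entry


def read_row(sstables, key):
    """Merge every fragment of a row, oldest SSTable first.

    Returns the merged row and the number of SSTables that had to be consulted.
    """
    merged, touched = {}, 0
    for sstable in sstables:
        fragment = sstable.get(key)
        if fragment is None:
            continue
        touched += 1
        if fragment is TOMBSTONE:
            merged = {}              # the deletion wipes older fragments
        else:
            merged.update(fragment)  # newer columns overwrite older ones
    return merged, touched


def compact(sstables):
    """Merge all SSTables into a single new one; the inputs are not modified."""
    merged = {}
    for sstable in sstables:
        for key, fragment in sstable.items():
            if fragment is TOMBSTONE:
                merged[key] = TOMBSTONE
            elif key not in merged or merged[key] is TOMBSTONE:
                merged[key] = dict(fragment)
            else:
                merged[key].update(fragment)
    # Simplification: deleted rows are dropped as soon as everything is merged.
    return [{k: v for k, v in merged.items() if v is not TOMBSTONE}]


sstables = [
    {"a": {"x": 1}, "b": {"x": 1}},  # "a" is written once, "b" keeps changing
    {"b": {"y": 2}},
    {"b": TOMBSTONE},
    {"b": {"x": 3}},
]

print(read_row(sstables, "a"))   # one fragment to look at
print(read_row(sstables, "b"))   # four fragments to merge on every read
compacted = compact(sstables)
print(read_row(compacted, "b"))  # one fragment again, after the merge work
```

The write-once row costs a single lookup before and after compaction. The frequently mutated row needs four fragments merged on every read until compaction has done the work of collapsing them.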
So what does all this mean? Let us state the obvious first. Cassandra, much like every other piece of technology out there, is not a silver bullet. It is a great tool with substantial capabilities, but it is intended to be used in a specific way, and if you stray too far from the expected use cases you can expect behaviour to degrade. In other words, Cassandra is software, not magic.
To get the most out of it you should structure your data model with the aim of reducing the volume of writes as much as possible, by using immutable data and fine-grained updates. The latter means that if you know that only a single column of data has changed, you should write a statement that updates only that column, not the entire row. This will minimise Cassandra’s workload when writing and improve the cluster’s performance.
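As a rough illustration of why fine-grained updates help, the sketch below compares the size of a mutation that rewrites a whole row with one that touches a single column. The `mutation_size` helper, the `users` table and its columns are hypothetical, and counting UTF-8 bytes of names and values is only a crude proxy for what Cassandra actually writes.

```python
def mutation_size(columns):
    """Crude size estimate: UTF-8 bytes of every column name and value."""
    return sum(
        len(name.encode()) + len(str(value).encode())
        for name, value in columns.items()
    )


# A row in a hypothetical "users" table, including one large column.
row = {
    "name": "Ada Lovelace",
    "email": "ada@example.com",
    "bio": "x" * 1000,
}
changed = {"email": "ada@newmail.example"}

# Rewriting the whole row, roughly:
#   UPDATE users SET name = ?, email = ?, bio = ? WHERE id = ?
full_rewrite = mutation_size({**row, **changed})

# Updating only the column that changed, roughly:
#   UPDATE users SET email = ? WHERE id = ?
single_column = mutation_size(changed)

print(full_rewrite, single_column)
```

Both statements leave the row in the same state, but the single-column version puts far less data into the memtable, into the resulting SSTable, and eventually through compaction.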
Hopefully this will prove useful to those of you working on Cassandra or considering using it in your tech stack. See you next time.