Hamann Distributed

Making mistakes at scale so you don't have to.

Careful With Cassandra Upserts

A nice thing about Cassandra is its easily understandable data model: there are just upserts – an insert will automatically update / overwrite old rows. However, this does not hold true in every case when using dynamic columns, because Cassandra does not have the same concept of a “row” as a traditional database.

Essentially, a Cassandra “row” is just a double hashmap: the outer layer maps the row key to the exact server the row lives on, and the inner layer maps each column key to the column’s location on that server. This very flexible concept can lead to problems later on, though, when some of the columns differ between writes.
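
To make that concrete, here is a minimal plain-Python sketch of the double-hashmap shape – an illustration of the concept, not actual Cassandra internals:

# Outer hashmap: row key -> row (the row key also determines which
# node in the cluster stores the row).
# Inner hashmap: column key -> column value within that row.
employees = {
    "599": {
        "name": "Larry Page",
        "age": 46,
    },
}

# Looking up a column takes two hops: first the row key, then the column key.
print(employees["599"]["age"])  # 46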

Here’s an entry in the “Employees” ColumnFamily:

employee_id: 599 (KEY)
name: "Larry Page"
age: 46

Now for various reasons, we have to update employee 599 with another denormalized person:

employee_id: 599 (KEY)
name: "Sylvie Stone"
devices: ["MacBook Pro"]

Sylvie didn’t tell us her age (she’s a lady after all!), and for new employees we’re also tracking the devices we handed them. When we upsert employee 599, a lot of people with a SQL or document-oriented database background expect to find the second entry in the database. Unfortunately, that’s not true at all – what we will actually find is this:

employee_id: 599 (KEY)
name: "Sylvie Stone"
devices: ["MacBook Pro"]
age: 46
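
In effect, Cassandra merges the two writes column by column. For an intuition about the result above, Python’s dict.update behaves the same way – a rough analogy only, not driver code:

old = {"employee_id": 599, "name": "Larry Page", "age": 46}
new = {"employee_id": 599, "name": "Sylvie Stone", "devices": ["MacBook Pro"]}

# Columns present in the new write overwrite their old counterparts;
# columns absent from it (here: age) simply survive untouched.
old.update(new)
print(old)
# {'employee_id': 599, 'name': 'Sylvie Stone', 'age': 46, 'devices': ['MacBook Pro']}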

Welcome to the world of column-oriented databases – and before you think “WTF”, think about it for a moment. This is expected behaviour and part of Cassandra’s “independent columns” paradigm. Even if it looks like it in CQL, you never actually overwrite rows – you overwrite the columns behind them.

So how do you avoid this? You need to model your data properly or navigate around the behaviour. Since Cassandra columns are far smarter than columns in other databases, there is a way to correct for this effect in case it’s needed. How? Look under the hood. What Cassandra really stores is this:

"599": [
    {name:employee_id, value:599, timestamp: 1340385863990010, ttl: 0},
    {name:name, value:"Sylvie Stone", timestamp: 1340385863990010, ttl: 0},
    {name:devices, value:["MacBook Pro"], timestamp: 1340385863990010, ttl: 0},
    {name:age, value:46, timestamp: 1340133763990010, ttl: 0}
]

As you’ll have imagined, Sylvie is relieved that she’s not really 46… her age column is simply older than the rest, but it was neither deleted nor overwritten!

Every decent driver for Cassandra can expose the timestamps and TTLs as well – and there’s your solution for cleaning up the mess within the “eventually consistent” paradigm the database follows: if a column doesn’t carry the same timestamp as the key, simply discard it (you’re free to delete it as well).
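
Here’s what that cleanup rule looks like as a minimal plain-Python sketch, assuming your driver hands you the raw columns with their timestamps as in the listing above (the dict layout is made up for illustration):

# The employee_id column acts as the key marker: it is rewritten on every
# upsert, so its timestamp identifies the latest write to the row.
columns = [
    {"name": "employee_id", "value": 599, "timestamp": 1340385863990010, "ttl": 0},
    {"name": "name", "value": "Sylvie Stone", "timestamp": 1340385863990010, "ttl": 0},
    {"name": "devices", "value": ["MacBook Pro"], "timestamp": 1340385863990010, "ttl": 0},
    {"name": "age", "value": 46, "timestamp": 1340133763990010, "ttl": 0},
]

key_ts = next(c["timestamp"] for c in columns if c["name"] == "employee_id")

# Keep only the columns written together with the key; 'age' is discarded here.
current = [c for c in columns if c["timestamp"] == key_ts]

In CQL, the WRITETIME() and TTL() functions expose the same per-column metadata.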

And don’t worry: this added hassle in handling something that would be a no-brainer with more conventional databases is more than worth the flexibility gained with Cassandra’s independent columns. More on advanced data modelling leveraging this power will follow!