Monday, February 21, 2011

Cassandra’s data model as records and lists

I have to admit I’ve never really been happy with Cassandra’s data model, or to be more precise, I’ve never really been happy with my understanding of the model. However, I’ve realized that if we think of two use cases for column families then things become a bit clearer. For me, column families can be used in one of two ways: either as a record or as an ordered list.

Columns used as a record
If the name/value pairs we place under a key hold different attributes, then we can consider the columns as a classic database record. So if we want to store the details of a user, the columns might be:

Name (key)

  • Email: user@example.com
  • Twitter: TwitterUser
  • Phone: 01 000 345678


In this schema the order of the columns is not important because the names are not related. However, unlike a relational database, there is no definition of the “fields” in the record; we define them at runtime in the application. This gives us the flexibility to add new fields, provided our application can handle missing “fields”.
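The record idea can be sketched with plain Python dictionaries. This is only an in-memory stand-in for a column family, not a Cassandra client; the names `users`, `put_record` and `get_field`, and the second row, are hypothetical and exist purely for illustration.

```python
# In-memory stand-in for a column family (an assumption, not a real client):
# row key -> {column name: value}
users = {}

def put_record(row_key, **columns):
    """Add or overwrite columns on a row; no schema is declared up front."""
    users.setdefault(row_key, {}).update(columns)

def get_field(row_key, name, default=None):
    """Read one "field", tolerating rows that never stored it."""
    return users.get(row_key, {}).get(name, default)

put_record("Name", Email="user@example.com", Twitter="TwitterUser",
           Phone="01 000 345678")
# A second, hypothetical row that only has an email address.
put_record("Other", Email="other@example.com")

print(get_field("Name", "Twitter"))              # -> TwitterUser
print(get_field("Other", "Twitter", "unknown"))  # -> unknown
```

The `default` argument is where the application handles missing “fields”: nothing stops one row from having columns that another row lacks.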

Columns used as a list
If each of the name/value pairs holds the same kind of attribute, then we can consider the columns as an ordered list. In this use case the ordering of the columns is important, and the ordering type needs to be carefully thought out. For example, if we want to store messages from a user and be able to get the most recent, then we would store them as:

Author (key)

  • Timeuuid: Message
  • Timeuuid: Message
  • Timeuuid: Message


This ordered list can be thought of as an index of records. The records would be stored in another column family.
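As a rough sketch of the list use case, the snippet below keeps messages under time-based (version 1) UUID column names and reads back the most recent ones by sorting on the timestamp embedded in each UUID. It is an in-memory model, not Cassandra itself; `timeuuid`, `add_message` and `latest` are hypothetical helpers, and the tick values are made up so that the ordering is easy to follow.

```python
import uuid

def timeuuid(ticks, node=0x123456789ABC):
    """Build a version-1 UUID from a raw timestamp (illustrative values only)."""
    fields = (ticks & 0xFFFFFFFF,       # time_low
              (ticks >> 32) & 0xFFFF,   # time_mid
              (ticks >> 48) & 0x0FFF,   # time_hi (version bits set below)
              0, 0, node)               # clock sequence and node
    return uuid.UUID(fields=fields, version=1)

# row key (author) -> {timeuuid column name: message}
messages = {}

def add_message(author, column_name, text):
    messages.setdefault(author, {})[column_name] = text

def latest(author, count):
    """Return the newest `count` messages, ordered by the UUID timestamp."""
    row = messages.get(author, {})
    newest_first = sorted(row, key=lambda u: u.time, reverse=True)
    return [row[u] for u in newest_first[:count]]

# Insert out of order to show that the column name decides the order.
add_message("Author", timeuuid(1000), "first")
add_message("Author", timeuuid(3000), "third")
add_message("Author", timeuuid(2000), "second")

print(latest("Author", 2))  # -> ['third', 'second']
```

Here the sort happens at read time for simplicity; the point of choosing the ordering type carefully is that the store keeps the columns in that order, so “most recent” becomes a slice from one end of the row.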

Supercolumns as a list of records
Even better, we can use supercolumns to create a combination of lists and records. Normally we would make the supercolumns the ordered list and the columns the record. In our messaging system, we want to get the latest messages from a user:

Author (key)

  • Timeuuid: (Supercolumn name)
    • Message: Message Text
    • Time: Time of message
    • Picture: Binary picture data
  • Timeuuid: (Supercolumn name)
    • Message: Message Text
    • Time: Time of message
    • Picture: Binary picture data

The supercolumns are ordered by time; the order of the columns under each one is not important.
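The combination of list and record can be modelled as one more level of nesting. Again this is a plain Python sketch under the same assumptions as before: the structure lives in memory, and `timeuuid`, `post` and `latest_records` are hypothetical names rather than any real client API.

```python
import uuid

def timeuuid(ticks, node=0x123456789ABC):
    """Build a version-1 UUID from a raw timestamp (illustrative values only)."""
    fields = (ticks & 0xFFFFFFFF, (ticks >> 32) & 0xFFFF,
              (ticks >> 48) & 0x0FFF, 0, 0, node)
    return uuid.UUID(fields=fields, version=1)

# row key (author) -> {supercolumn name (timeuuid): {column name: value}}
timeline = {}

def post(author, supercolumn_name, message, time, picture=b""):
    """Store one record (the inner columns) under a time-ordered supercolumn."""
    timeline.setdefault(author, {})[supercolumn_name] = {
        "Message": message,
        "Time": time,
        "Picture": picture,
    }

def latest_records(author, count):
    """The outer level is the ordered list; each item is a whole record."""
    row = timeline.get(author, {})
    names = sorted(row, key=lambda u: u.time, reverse=True)[:count]
    return [row[name] for name in names]

post("Author", timeuuid(1000), "older message", "09:00")
post("Author", timeuuid(2000), "newer message", "09:05", picture=b"\x89PNG")

newest = latest_records("Author", 1)[0]
print(newest["Message"])  # -> newer message
```

Only the outer keys are sorted; the inner dictionary is read by column name, so its order does not matter.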

As ever I look forward to comments about this post.
