Columns used as a record
If the name/value pairs we place under the key hold different attributes, then we can treat the columns as a classic database record. So if we want to store the details of a user, the columns might be:
Name (key)
- Email: user@example.com
- Twitter: TwitterUser
- Phone: 01 000 345678
In this schema the order of the columns is not important because the names are not related. However, unlike a relational database, there is no definition of the “fields” in the record; we define them at runtime in the application. This gives us the flexibility to add new fields, provided our application can handle missing “fields”.
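To make the "missing fields" point concrete, here is a minimal sketch that models the row as a plain Python dictionary rather than talking to a real cluster. The names `users` and `get_field`, and the "Website" column, are made up for illustration; they are not part of any client library.

```python
# Hypothetical in-memory model of one row: key -> {column name: value}.
users = {
    "Name": {
        "Email": "user@example.com",
        "Twitter": "TwitterUser",
        "Phone": "01 000 345678",
    }
}


def get_field(rows, key, field, default=None):
    """Return one column value, or a default if the row or column is absent."""
    return rows.get(key, {}).get(field, default)


# A column that exists is returned as stored.
email = get_field(users, "Name", "Email")

# "Website" was never written for this row, so the application supplies a default.
website = get_field(users, "Name", "Website", default="(not set)")
```

Because nothing declares the "fields" up front, every read has to tolerate a column that is simply not there, which is what the default argument stands in for.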
Columns used as a list
If each of the name/value pairs holds the same kind of attribute, then we can treat the columns as an ordered list. In this use case the ordering of the columns is important, and the ordering type needs to be carefully thought out. For example, if we want to store messages from a user and be able to get the most recent ones, then we will store them as:
Author (key)
- Timeuuid: Message
- Timeuuid: Message
- Timeuuid: Message
This ordered list can be thought of as an index of records. The records would be stored in another column family.
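The idea of a time-ordered list can be sketched in plain Python, again using a dictionary as a stand-in for the row instead of a real client. The helper names `add_message` and `latest_messages` are assumptions for this example, and sorting by the timestamp of a version 1 UUID only imitates what a time-based column ordering would do on the server.

```python
import uuid

# Hypothetical in-memory row: author key -> {timeuuid column name: message}.
messages = {}


def add_message(rows, author, text, column_name=None):
    """Store a message under a time-based (version 1) UUID column name."""
    column_name = column_name or uuid.uuid1()
    rows.setdefault(author, {})[column_name] = text
    return column_name


def latest_messages(rows, author, count):
    """Return up to `count` messages, newest first, by UUID timestamp."""
    columns = rows.get(author, {})
    newest_first = sorted(columns, key=lambda u: u.time, reverse=True)
    return [columns[u] for u in newest_first[:count]]


add_message(messages, "Author", "first message")
add_message(messages, "Author", "second message")
recent = latest_messages(messages, "Author", 1)
```

The column name carries the ordering and the value carries the payload, so asking for "the most recent" is just a matter of reading from the newest end of the row.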
Supercolumns as a list of records
Even better, we can use supercolumns to create a combination of lists and records. Normally we would make the supercolumns the ordered list and the columns the record. In our messaging system, we want to get the latest messages from a user:
Author (key)
- Timeuuid: (Supercolumn name)
  - Message: Message Text
  - Time: Time of message
  - Picture: Binary picture data
- Timeuuid: (Supercolumn name)
  - Message: Message Text
  - Time: Time of message
  - Picture: Binary picture data
The supercolumns are ordered by time; the order of the columns under each one does not matter.
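As a rough sketch of the list-of-records layout, the nesting can be modelled as a dictionary of dictionaries. This is an in-memory illustration only; `post_message` and `latest_records` are hypothetical helpers, not calls from any Cassandra client, and the sort by version 1 UUID timestamp stands in for the time ordering of the supercolumns.

```python
import uuid

# Hypothetical model: author key -> {timeuuid supercolumn name: {column: value}}.
timeline = {}


def post_message(rows, author, text, time_text, picture=None, name=None):
    """Add one record (a supercolumn) keyed by a time-based UUID."""
    name = name or uuid.uuid1()
    record = {"Message": text, "Time": time_text}
    if picture is not None:
        # The picture column is optional, like any other "field".
        record["Picture"] = picture
    rows.setdefault(author, {})[name] = record
    return name


def latest_records(rows, author, count):
    """Return up to `count` records, newest first, by supercolumn timestamp."""
    supercolumns = rows.get(author, {})
    newest_first = sorted(supercolumns, key=lambda u: u.time, reverse=True)
    return [supercolumns[u] for u in newest_first[:count]]


post_message(timeline, "Author", "Hello", "09:00")
post_message(timeline, "Author", "Hello again", "09:05", picture=b"\x89PNG")
newest = latest_records(timeline, "Author", 1)
```

The outer level behaves like the ordered list from the previous section and the inner level behaves like the record from the first one, so a single read of the row returns the latest messages complete with their details.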
As ever I look forward to comments about this post.