Experimental Cassandra Lock·
Cassandra is my favourite column oriented data store, I’ve decided to create a simple Cassandra lock system to be able to do synchronised updates on Cassandra, I hear you say there are several DLMs like Zookeeper or Memcached can be implemented as a simple locking system, but I like a solution purely based on Cassandra itself. After lots of surfing on the net I came up with nothing! absolutely nothing! well, I created one (better to say experimented).
Disclaimer normally comes at the end of article but, I put it here because, this is PURELY EXPERIMENTAL IMPLEMENTATION AND THIS SOLUTION WAS NOT USED IN PRODUCTION, THIS IS JUST A CRAZY IDEA INTO ACTION. Actually I’m not sure it’ll work at all :-) - sigh - If you are looking for something that works please stop reading here and go back to Google otherwise you are welcome to test it and improve it.
The original idea is not mine and you can find it on Apache Cassandra web site and it uses Lamport’s Bakery locking algorithm. You can find more information about that in Wikipedia.
Lamport envisioned a bakery with a numbering machine at its entrance so each customer is given a unique number. Numbers increase by one as customers enter the store. A global counter displays the number of the customer that is currently being served. All other customers must wait in a queue until the baker finishes serving the current customer and the next number is displayed. When the customer is done shopping and has disposed of his or her number, the clerk increments the number, allowing the next customer to be served. That customer must draw another number from the numbering machine in order to shop again. – Wikipedia
I use two
Choosing. Whenever a process wants to enter a critical section
- first it must get a new number, in order to do that it must look into
truevalue regardless of column names to make sure that nobody is withdrawing new number at the moment.
- Then informs other processes by puting its
PIDas column name (in actual code PID is optional and any other
unique IDcan be used) into
CFwith value of
true, meaning that it’s going to get new number from
- Then the process withdraws new number from
CFby fetching the last number which is registered in
CF, and incrementing it by one.
- Then puts that new number into
CFalong with its
PIDas column name.
- Releases the
Choosinglock by changing the
falsethen in its own column (identified by
- At this point the process waits for all other processes to get their number again by looking into
- When all process got their numbers it waits for process with lower numbers to leave critical section and remove their PID from Numbers CF.
- As we can not guarantee any two processes don’t get the same number, we use both chosen number and PID (unique id) to find the higher priority process to enter critical section.
- When the process finds that there is no one with higher priority in the list then it will enter the critical section.
- Lastly when a process wants to leave critical section simply removes its PID from both
Talk is cheap, show me the code – Torvalds
PHP to implement this concept and PHPCassa library is used to communicate with Cassandra. The column families in Cassandra (ie.
Choosing) must be created with consistency level of QUORUM. Both column families have row key and column name as integers and column values as boolean defined. The
PHP code is available on GitHub.
Well, that’s all for now, thank you for reading this up to here :-) and please leave comments if you have any interesting idea about this. All patch, fixes, improvements are welcomed only on GitHub.