« A world of weblogs | Main | Optimism is paying off (and other stories) »

April 20, 2005

Transactions

I have been working (with Burak's help) on writing a simple library providing atomic transactions in Scala. To do this, the for comprehension is diverted from its original use to be considered as a monad instead. In practice, here is how an atomic exchange of the value of two variables would look like:

val x = new atomic.Variable[Int];
val y = new atomic.Variable[Int];
val z = new atomic.Variable[Int];
x.value = 1; y.value = 2;
(for (
  val a <- x.get();
  val _ <- z.put(a);
  val b <- y.get();
  val _ <- x.put(b);
  val c <- z.get();
  val d <- y.put(c);
) yield d) run

The problem now is that the for loop as a monad is something that I find very difficult to get an intuition about. I have a very simple version that locks down the entire program when evaluating the atomic bloc, but this is of course hopelessly inefficient. But when I try to improve from this basic version, I am having troubles grasping how things interact around this comprehension/monad, and getting the right intuition on how to do it. Please, read on for a small explanation on the locking mechanisms I intend to add.

Both locking mechanism below should considerably improve the performance when compared with the first one. But both are optimistic so could actually also decrease performance in the worst case.

  • Locking variables. With this method, the locking would be done on the variable itself. A variable must be locked as soon as it is read or written to guarantee a consistent state during the entire transaction. When a variable that is already locked by another transaction is reached, the transaction is aborted and restarted to prevent deadlocks. Restarting means reverting all variables that have been changed to their pre-change value. The easiest here is to write all changes to a buffer in the variable and commit the buffers when the transaction is completed or leave them alone when the transaction is aborted.
  • Versioning variables. When a variable is read, its version number is remembered. When a variable is written, its version number is incremented and a function that will undo the write is generated (this is not easy). At the end, the used variables are locked and their current version numbers are compared with the version numbers remembered. If they concur, the transaction is confirmed. If they don't, it means another thread has been messing with the variables; the transaction is restarted after the undo functions are applied to the variables.

Comments

Have you looked at Tim Harris's work on atomic blocks? http://research.microsoft.com/~tharris/.
I think there are implementations around for Java and Haskell.