Wednesday, February 13, 2013

Deadlock in Jetty or Be Careful while Synchronizing

About nine months ago I reported a bug to the Jetty community that session timeout doesn't work properly. The bug was fixed quite quickly, but nine months later I have discovered that the fix leads to a deadlock in some scenarios.

Deadlock in Jetty illustrates an interesting coding guideline that you must follow while writing your code.

So what happened in Jetty?

Consider a class A that carries state and lives in a multi-threaded environment. Obviously this class must be synchronized.
Consider that you can subscribe to events of class A. So let's say class you must implement an interface I that will be notified when something important in class A happens.
Let's assume that the method in which class A invokes instances of I is synchronized (A carries state, remember?)
Let's also assume that your implementation of I also carries state and must be synchronized as well.

And now let's see what happens:

Thread-1: Some event on A occurs. It's wants to notify I, but first acquires LOCK_A and then invokes method of I. Method of I tries to change state of I, so it tries to acquire LOCK_I, but it was already acquired by Thread-2.

Thread-2: Runs on I. It changes the state of I, so it acquires the LOCK_I. During the change it needs some information of A. It tries to get it, but LOCK_A was already acquired by class A.

And here we have a deadlock.

So what is wrong here?
The most wrong part is of class A: it invokes method of some other class while it is locked. BAD! Finish the lock before calling someone else! And when I say "someone else" I include the other methods of the class! (What really happened in Jetty is that in class A method f1() was synchronized. Method f1() called to f2(), which called to f3(), which called to f4(), which called to I. It was clear in f4() that no synchronization is needed. But the mistake is actually in f1()!)
So you have some member to change? acquire the lock, change them and release the lock.

In addition, the situation could be a little improved if Read-Write lock was used instead of synchronized: most of the access to class are to read data. May be if LOCK_A was split to READ_LOCK_A and WRITE_LOCK_A; and LOCK_I was split to READ_LOCK_I and WRITE_LOCK_I, it was not causing the deadlock. But this is not about preventing the situation, but about improving.

Summary

The main point of my post is that when synchronizing, find the critical section and synchronize it only! Do not call other methods (even if they are of the same class) from the critical section: gather all information before and notify everyone else after.

No comments: