Avoiding Read/Write Locks
Intuitively, read/write locks seem like an ideal solution. They prevent the read/write and write/write race conditions while ensuring that multiple read routines don’t block one another. So as long as the write operations are optimized to be as fast as possible, the overall throughput should be high.
In most cases this is true, but there is a subtlety to read/write locks that can end up causing them to be considerably more expensive than they initially appear.
Consider the following:
Thread 1 requests read access to a shared read/write lock.
At this point in time no one else has requested the lock for either read or write access so a read lock is granted to Thread 1.
Thread 2 requests read access to the same read/write lock.
Since the lock is currently only read locked by Thread 1, Thread 2 is also given read access.
Thread 3 requests write access to the same read/write lock.
Since the lock has been locked for read access by both Thread 1 and Thread 2, Thread 3 is put on a wait list for write access until Thread 1 and Thread 2 release their read locks.
Thread 4 requests read access to the same read/write lock.
What happens here?
A first guess might be to grant Thread 4 the read lock alongside Threads 1 and 2, since Thread 3 has not yet obtained the write lock, but let’s consider the fallout if we allow this. Imagine a system where every read operation takes a minute to execute and read requests arrive every 30 seconds. If every read request is granted a read lock so long as no thread holds the write lock, new read requests would always arrive before previous ones finished, ensuring that the lock is never free (i.e., never unlocked for reading). In this situation the write operation would be blocked from ever acquiring write access. This is known as lock starvation.
In order to prevent lock starvation, as soon as a write lock is requested every subsequent read lock request must be blocked from obtaining its lock until the thread requesting the write lock has acquired the write lock, executed, and released it.
This means that as soon as a write lock is requested, any read threads currently running will block all new read requests. So a read/write lock is, in the worst case, as “expensive” as the time it takes to execute the write operation plus the worst-case time it takes to execute a read operation.
Therefore, even if you optimize your system so the write operations are as fast as possible, your overall throughput may still be limited by your slowest-running read operation.
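Go’s sync.RWMutex implements exactly this starvation-prevention policy: once a writer is waiting, new readers block even though only read locks are currently held. The sketch below plays out the four-thread scenario above (the timings are illustrative; they just give each goroutine time to queue up):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// demonstrate reports whether a brand-new reader is admitted while a
// writer is already queued behind an existing reader.
func demonstrate() string {
	var mu sync.RWMutex

	mu.RLock() // Thread 1: a long-running read holds the lock

	// Thread 3: a writer arrives and must wait for the reader to finish.
	go func() {
		mu.Lock()
		mu.Unlock()
	}()
	time.Sleep(50 * time.Millisecond) // let the writer queue up

	// Thread 4: a new reader now requests read access.
	admitted := make(chan struct{})
	go func() {
		mu.RLock()
		mu.RUnlock()
		close(admitted)
	}()

	var result string
	select {
	case <-admitted:
		result = "new reader admitted immediately"
	case <-time.After(100 * time.Millisecond):
		result = "new reader blocked behind the pending writer"
	}

	mu.RUnlock() // Thread 1 finishes; the writer, then the reader, proceed
	<-admitted
	return result
}

func main() {
	fmt.Println(demonstrate())
}
```

Running this prints that the new reader is blocked behind the pending writer, confirming that a slow reader plus a queued writer stalls all subsequent readers.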
What Read/Write Locks Are Good For
Read/write locks are useful when you need a memory barrier and every thread in your service must see the state of some memory either before or after a write operation, but never both. That is, it is not acceptable for some threads to see the memory’s value from before the write operation while other threads running at the same time see its value from after the write operation.
But, as we saw above, this comes at a cost and in many cases it may not be needed.
For example, suppose we have a data structure that holds the account balance for an individual. Suppose also that this account receives a constant stream of read requests that, for whatever reason, take several minutes to calculate. If we were to use a traditional read/write lock here, then every time there is a deposit or withdrawal (a write operation), all new read operations will be blocked until all running read operations finish and the write executes.
Although the write operation may be quick, the long execution time of the read operations is going to make the system appear blocked while it waits for the write lock. But if we can tolerate concurrent threads seeing two different values, then we can remove the blocking altogether and achieve higher overall throughput.
To achieve this we first need to make the data structure immutable. Updates would instead be handled by the writing thread copying the immutable data structure locally, modifying the copy (copy-on-write), and then using an atomic operation to swap in the new data structure.
Read operations running against the old data structure would continue to run and would report the old value, while any new read operation would run over the new data structure and report the new value. Eventually all old read operations would finish and the only view would be of the new data structure, but for a time it would be possible for different threads to report different values.
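This copy-and-swap scheme can be sketched with Go’s atomic.Pointer (Go 1.19+); the Account type and the balance values here are illustrative, not part of any real API:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Account is treated as immutable: readers only ever dereference a
// complete, already-published snapshot, never a partial update.
type Account struct {
	Balance int64
}

var current atomic.Pointer[Account]

// Deposit copies the current snapshot, modifies the copy, and
// atomically swaps in the new version. Note: as written this is only
// safe with a single writer; concurrent writers could lose updates,
// which is why the text later reintroduces a lock for writers only.
func Deposit(amount int64) {
	old := current.Load()
	current.Store(&Account{Balance: old.Balance + amount})
}

// Balance reads whatever snapshot is current when the read starts; a
// long-running read keeps using its old snapshot, unaffected by writes.
func Balance() int64 {
	return current.Load().Balance
}

func main() {
	current.Store(&Account{Balance: 100})
	Deposit(50)
	fmt.Println(Balance()) // 150
}
```

The atomic store is the only synchronization point: readers never block, and a reader that loaded the old pointer simply keeps reporting the old balance until it finishes.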
Destroying the old data structure can be handled either automatically, by letting the garbage collector pick it up, or manually, by adding read counters to objects that tell us when an object is no longer being accessed and can be deleted.
We could also re-introduce a simple lock for just the write operations to ensure that multiple write operations always happen in sequence, so they can never overlap and never cause inconsistent results.
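One way to sketch that arrangement is to pair the atomic pointer with a plain mutex that only writers take; readers remain lock-free (again, the names and values are illustrative):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type Account struct{ Balance int64 }

var (
	current atomic.Pointer[Account]
	writeMu sync.Mutex // serializes writers; readers never take it
)

// update applies fn under the write mutex so that concurrent writers
// cannot lose each other's updates; readers continue to load the
// current snapshot without ever blocking.
func update(fn func(Account) Account) {
	writeMu.Lock()
	defer writeMu.Unlock()
	next := fn(*current.Load())
	current.Store(&next)
}

func main() {
	current.Store(&Account{Balance: 100})

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			update(func(a Account) Account { a.Balance += 10; return a })
		}()
	}
	wg.Wait()

	fmt.Println(current.Load().Balance) // 200
}
```

Because the mutex is held only for the brief copy-and-swap, write operations serialize among themselves without ever making readers wait.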
So while read/write locks often appear, at first blush, like a panacea, they can actually carry a substantial hidden cost. And in some cases the functionality provided by a read/write lock may not even be what you really need.