Reading and writing, part 2: files and databases

In part 1 we introduced the Location type, which provides operations for reading and writing a value, and saw how it occurs inside programs as they manage the data they hold in memory. This led us towards some strategies for dealing with the safety problems this type creates in concurrent code, including the use of object orientation and mutexes to manage access to data. We will now apply this concept to data stored outside program memory – on disk, in databases, and other external services.

Let’s start with maybe the simplest form of external storage: a single file. We’ll modify our counter example to store the counter’s value in a file named counter.txt, whose initial content is 0.

filename = "counter.txt"

threads = (1..4).map do
  Thread.new do
    100.times do
      tmp = File.read(filename).strip.to_i      # read file
      content = "#{tmp + 1}\n"
      File.write(filename, content)             # write file
    end
  end
end

threads.each(&:join)
puts File.read(filename)

We have the same Location pattern here that we saw in previous examples: we read from the counter.txt file, generate some new value based on the data we read, and then write that value back to the same file. Ignoring the fact that file reads and writes to disk can fail, these are occurrences of the read() -> T and write(T) operations. If you run this, you’ll see that almost all the updates get lost – this happens because most of the read calls are producing the same value, as they don’t wait for any write calls currently in progress to finish.

This is a really common operation in command-line programs, for example when editing a configuration file, or automatically formatting source code. If this operation is done concurrently, one task can have its data overwritten by another running at the same time, giving us a lost update bug. To make sure every task sees the changes made by all others, the tasks must be made to run sequentially.
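
Within a single process, the Mutex from part 1 is enough to achieve this. Here's a sketch that serialises the read-modify-write on the file (I'm using a temporary directory so the example is self-contained):

```ruby
require "tmpdir"

filename = File.join(Dir.mktmpdir, "counter.txt")
File.write(filename, "0\n")

mutex = Mutex.new

threads = (1..4).map do
  Thread.new do
    100.times do
      # Holding the lock across both the read and the write means
      # no thread can read a value that's about to be overwritten
      mutex.synchronize do
        tmp = File.read(filename).strip.to_i
        File.write(filename, "#{tmp + 1}\n")
      end
    end
  end
end

threads.each(&:join)
puts File.read(filename)   # prints 400
```

This only helps within one process: the mutex lives in this program's memory, so a second process editing the same file knows nothing about it.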

When dealing with data in memory, we wrapped the data we needed to protect in abstractions to control the program’s access to it. When dealing with files on disk, we can’t wrap data in the same way: every process running on the machine has the same access to the filesystem, and one process can’t insert an abstraction that affects every other program’s access. But the filesystem does offer some operations to help programs co-ordinate their use of files, and make concurrent file access safe.

The open() system call accepts a set of flags that control whether the file is opened for reading or writing or both, whether to create the file if it doesn’t exist, whether to truncate or append to it, and so on. By combining the flags O_RDWR, O_CREAT and O_EXCL, we can ask the kernel to open a file for reading and writing, and to create the file only if it doesn’t exist; if the file already exists then open() will fail.
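
Ruby exposes these flags as constants on File, so we can watch the exclusive-create behaviour directly: the first open creates the file, and a second open of the same path with the same flags raises Errno::EEXIST.

```ruby
require "tmpdir"

path  = File.join(Dir.mktmpdir, "demo.lock")
flags = File::RDWR | File::CREAT | File::EXCL

first = File.open(path, flags)     # succeeds: the file did not exist before

second_error = nil
begin
  File.open(path, flags)           # fails: the file now exists
rescue Errno::EEXIST => e
  second_error = e
end

puts second_error.class            # prints Errno::EEXIST
first.close
```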

Let’s rewrite the loop body of the above program to make use of this. Before trying to read counter.txt, we open counter.txt.lock with the above flags. This will only succeed if that file does not exist, so if we manage to open the file, that means nobody else can open it until we delete it again – the file acts like a mutex. If we fail to open the file, we retry until we’re successful; this is like a mutex’s lock() method blocking until the lock is acquired. Once the lockfile is open, we can read the counter.txt file to get the current value. To store the new value, we write to the lockfile, and then rename the lockfile to the name of the file we want to update, i.e. we rename counter.txt.lock to counter.txt. This achieves two things: first, because rename() is atomic, it means counter.txt never exists in a partially complete state, where not all the new data has finished being written to it. Second, it effectively removes the lockfile so another task can claim it.

      begin
        flags = File::RDWR | File::CREAT | File::EXCL
        lock = File.open("#{filename}.lock", flags)
      rescue Errno::EEXIST
        retry
      end
      tmp = File.read(filename).strip.to_i
      lock.puts("#{tmp + 1}")
      lock.close
      File.rename("#{filename}.lock", filename)

This is a much safer way to update a file, which avoids race conditions and partially-complete data. It’s how Git updates its index, branch pointers and config files, to make sure that no two processes try to update the same file at the same time and end up overwriting each other’s changes. In part 1, we introduced the Mutex type, which gates the data access operations behind a successful call to lock():

Mutex<T> = {
    lock() -> Lock<T>;
}

Lock<T> = {
    read() -> T;
    write(T);
    unlock();
}

The File API gives us something very similar, but its open() method can fail rather than blocking indefinitely:

FileSystem = {
    open() -> Option<File>;
}

File = {
    read() -> String;
    write(String);
    close();
}

open() returns Option<File>, which means it returns a File on success, but may also return nothing when it fails. We can’t call read() or write() until we have a File, and open() makes sure we can’t get one until it’s safe. Letting this operation fail gives us the choice of giving up rather than waiting indefinitely; we can regain lock() semantics by calling open() in a loop until it succeeds, as we did above.
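
To make the protocol harder to get wrong, we can package it up as a helper. This is my own sketch, not a standard API — the name update_file and the millisecond back-off are both inventions — but it follows the open/write/rename steps above exactly:

```ruby
require "tmpdir"

# Open the lockfile exclusively (retrying until we succeed), write
# the new content into it, then atomically rename it over the
# original file, which also releases the lock.
def update_file(filename)
  flags = File::RDWR | File::CREAT | File::EXCL
  lock  = begin
    File.open("#{filename}.lock", flags)
  rescue Errno::EEXIST
    sleep 0.001                    # back off briefly instead of spinning
    retry
  end

  new_content = yield File.read(filename)
  lock.write(new_content)
  lock.close
  File.rename("#{filename}.lock", filename)
end

counter = File.join(Dir.mktmpdir, "counter.txt")
File.write(counter, "0\n")

threads = (1..4).map do
  Thread.new do
    10.times do
      update_file(counter) { |text| "#{text.strip.to_i + 1}\n" }
    end
  end
end

threads.each(&:join)
puts File.read(counter)            # prints 40
```

A real implementation would also clean up the lockfile if the block raises, otherwise a crashed process leaves the lock held forever — which is why Git occasionally asks you to delete a stale `.lock` file by hand.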

So we have seen that the Unix file system provides tools for programs to use it concurrently and safely co-ordinate their activity – we don’t need to rely on any special mechanism in our programming languages to manage file access. But while files are the bedrock of the Unix system, many applications don’t deal with them directly. They instead deal with databases and other services that provide powerful ways to store, index and query their data. Here we see a rich variety of tools for handling concurrency safely, but it’s also where we see a lot of concurrency errors occur in the wild.

The paper Feral Concurrency Control catalogs the problems with how Ruby on Rails in particular manages data integrity. Many other frameworks have similar problems, but Rails is what I’m most familiar with in the web application space and serves as a good example of how developers are discouraged from using the tools their database provides to handle concurrency. We’ll use our counter example to demonstrate a few ways that a database can give us safe concurrency, before talking about how these problems tend to come up in web applications.

For these examples, I’ll be using PostgreSQL with the following schema and model: a Counter has a name and a value, which must not be null and defaults to 0.

ActiveRecord::Schema.define do
  create_table :counters, force: true do |t|
    t.string :name, null: false
    t.integer :value, null: false, default: 0
  end
end

class Counter < ActiveRecord::Base
end

As usual, we start up four threads, and have each of them attempt to increment the counter repeatedly. In our first implementation of inc_counter, we do this by calling Counter.find_by to read the counter record from the database, and then Counter#update to write a new value, based on the one we initially read.

Counter.create(name: "my-counter")

def inc_counter
  ctr = Counter.find_by(name: "my-counter")
  ctr.update(value: ctr.value + 1)
end

threads = (1..4).map do
  Thread.new do
    100.times { inc_counter }
  end
end

threads.each(&:join)
p Counter.find_by(name: "my-counter")

As you can no doubt guess by now, this example does not work. On my machine, the counter ends up having a value somewhere between 100 and 200, so a majority of the updates this program makes are lost. We can see why by examining the SQL that is logged when we run this code:

SELECT * FROM counters WHERE name = 'my-counter' LIMIT 1;
BEGIN;
UPDATE counters SET value = 2 WHERE id = 1;
COMMIT;

We SELECT the record, loading a copy of it into memory in the ctr variable. Calling ctr.update triggers the BEGIN/UPDATE/COMMIT queries that update the record, writing the new value. Rails automatically wraps all updates in a transaction, so that any validations and life-cycle callbacks triggered by the update are run atomically with it. But the initial SELECT is not inside this transaction and is done without holding a lock on the record. That means two concurrent tasks can both run the SELECT query, both get an identical copy of its current state, and then both UPDATE the record with the same new value.

Task 1                          | Task 2
--------------------------------+--------------------------------
SELECT * FROM counters ...      |
    --> value = 1               |
                                | SELECT * FROM counters ...
                                |     --> value = 1
BEGIN                           |
                                | BEGIN
UPDATE counters SET value = 2   |
COMMIT                          |
                                | UPDATE counters SET value = 2
                                | COMMIT

The only protection the transaction adds is that (with the default PostgreSQL transaction configuration) it forces the UPDATE in Task 2 to happen after the transaction in Task 1 has committed; when Task 1 runs UPDATE the database automatically places a write lock against the affected records which prevents any other transaction modifying them until Task 1 commits, releasing its locks. It does not prevent the broader race condition caused by concurrent reads that determine the values in the ensuing writes. Further co-ordination is needed to make this work correctly.

Now, this particular example can be implemented quite straightforwardly in SQL, by making the database increment the value rather than copying the record into application memory and incrementing it there:

UPDATE counters SET value = value + 1 WHERE name = 'my-counter';

In Rails, this query can be accomplished with this code:

def inc_counter
  Counter.where(name: "my-counter").update_all("value = value + 1")
end

However, most application-level problems in the wild aren’t so simple, and we usually want to load a record into memory, run some business logic against it, update some values, validate the result, and save it back. This pattern is how most Rails controller actions for updating a record from a form work:

  def update
    thing = Thing.find(params[:id])
    thing.update(params[:thing])
  end

There’ll be some additional logic in here, like access control, validation, generating the next page to display, and so on, but this idiom is at the core of how most Rails apps update records, and just like our counter example, it’s unsafe unless your server only handles a single request at a time. How do we spot this unsafe pattern? We can see a Location type here: we read a record out of the database, making a copy of it in memory, we change the details of that copy by incorporating some form data into it, and then we write that copy back to the same record in the database, with no other control mechanism around how that update happens.

We’ll examine some real-world phenomena that happen with form-based updates in more detail later. First, let’s talk about what methods databases like PostgreSQL provide for concurrency control, and how they’re exposed in the Rails framework.

One technique you can use is the with_lock method. If we call ctr.update inside a block passed to ctr.with_lock, then the program works correctly and leaves the counter’s value at 400:

def inc_counter
  ctr = Counter.find_by(name: "my-counter")
  ctr.with_lock do
    ctr.update(value: ctr.value + 1)
  end
end

with_lock works using the SQL SELECT FOR UPDATE mechanism. It begins a transaction, and then reloads the ctr object by issuing a SELECT for the record it represents, with FOR UPDATE appended to the query. This makes the database place a row-level exclusive lock against the record, meaning that no concurrent transaction is allowed to update, delete or lock it until this transaction completes. (Plain SELECTs are still allowed under PostgreSQL’s multi-version concurrency control, but any competing SELECT FOR UPDATE will block.)

This program generates the following queries. The first SELECT comes from the initial Counter.find_by call. BEGIN and SELECT ... FOR UPDATE are produced by calling ctr.with_lock. The UPDATE query comes from calling ctr.update, and COMMIT happens when the with_lock block exits.

SELECT * FROM counters WHERE name = 'my-counter' LIMIT 1;
BEGIN;
SELECT * FROM counters WHERE id = 1 LIMIT 1 FOR UPDATE;
UPDATE counters SET value = 2 WHERE id = 1;
COMMIT;

This makes the program work correctly because it ensures that each logical increment task accesses the counter record sequentially – the SELECT FOR UPDATE query will be blocked until any current transaction completes, releasing the lock on that record. Above we saw that an UPDATE query cannot be processed until any other transaction currently holding a write lock for the affected record is committed. Adding FOR UPDATE means that the SELECT query also requires a write lock, so it too is blocked until all other transactions are complete. By signalling that we’re planning on updating the selected record, we’re preemptively locking it to stop anyone else modifying it, so the result of our read remains valid.

The program now produces the right result, but it is wasteful, since we’re immediately reloading the record after loading it the first time, and this creates a subtle opportunity to break the program. When we discussed mutexes, we saw that a mutex lock must cover all the reads and writes involved in a task. It is no good to read something, then take a lock before writing it. In this case, it’s possible to grab values off the ctr object before it is reloaded, and then use them to increment the counter:

def inc_counter
  ctr = Counter.find_by(name: "my-counter")
  tmp = ctr.value

  ctr.with_lock do
    ctr.update(value: tmp + 1)
  end
end

tmp is read before the record is reloaded with FOR UPDATE inside the transaction, so it may be stale: the record may be updated between our first SELECT, and the SELECT FOR UPDATE. That makes the value we stored in tmp invalid, and we end up losing updates just like our original counter example did. In our first with_lock example, we access ctr.value inside the block, once the record has been safely reloaded, whereas in the above example we access it outside the block and its result may not be valid by the time it’s used to update the record.

Using values you acquired before locking something is quite a common cause of concurrency bugs; in fact the above is equivalent to how most Rails update actions work. with_lock makes it quite easy to do this without realising, as visually it only appears to wrap the update operation – it’s not obvious that it’s implicitly reloading the record so you have the latest version in memory.

This problem can be recognised by the fact that Rails allows the read, write and lock methods to be called on the same type, and in any order; there is nothing to force you to lock the object before interacting with it. As we’ve seen, constructing the program so that we know we have a locked object before accessing it will make it safe.
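
We can sketch that construction in plain Ruby. The class names here are invented; the point is the shape of the API: the value is only reachable through a handle that lock yields, so there is no way to read or write without holding the lock.

```ruby
# Guarded owns the value; Handle is the only type with read/write,
# and you can only obtain a Handle inside a lock block.
class Guarded
  class Handle
    def initialize(box)
      @box = box
    end

    def read
      @box[:value]
    end

    def write(new_value)
      @box[:value] = new_value
    end
  end

  def initialize(value)
    @mutex = Mutex.new
    @box   = { value: value }
  end

  def lock
    @mutex.synchronize { yield Handle.new(@box) }
  end
end

counter = Guarded.new(0)

threads = (1..4).map do
  Thread.new do
    100.times do
      counter.lock { |c| c.write(c.read + 1) }
    end
  end
end

threads.each(&:join)
counter.lock { |c| puts c.read }   # prints 400
```

Handle plays the role of Lock<T> from part 1: a determined caller could still smuggle the handle out of the block, but the shape of the interface makes the safe usage the obvious one.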

Ultimately with_lock is a convenience method. An approach that’s less wasteful and harder to accidentally break is to explicitly start a transaction yourself, and use Counter.lock.find_by to load the record.

def inc_counter
  ActiveRecord::Base.transaction do
    ctr = Counter.lock.find_by(name: "my-counter")
    ctr.update(value: ctr.value + 1)
  end
end

This does effectively the same thing as the with_lock code, but without the wasted initial SELECT. Counter.lock.find_by performs a SELECT FOR UPDATE query, so we can safely use its result to update the counter’s value. This code generates the following queries:

BEGIN;
SELECT * FROM counters WHERE name = 'my-counter' LIMIT 1 FOR UPDATE;
UPDATE counters SET value = 2 WHERE id = 1;
COMMIT;

A couple of things to note here: first, the lock created by the FOR UPDATE query is scoped to a transaction, so you can only use this inside a transaction. with_lock takes a block, which means it can start a transaction, run the block, and then commit the transaction for you. Counter.lock doesn’t work like this, so you need to supply the transaction for it to run in. This might be annoying but it does make your code visually clearer in terms of signalling the scope of the lock, and which queries are covered by it. It’s harder to make the mistake of using stale read results using this construction, because the Counter instance we obtain is in a locked state for its entire lifetime. We’ve made the code work like the Mutex type, in that it’s only possible to acquire a Counter object once it’s safe. Counter.lock.find_by plays the role of Mutex::lock(), giving us an object that’s safe to read and write by restricting concurrent access to it.

Second, the use of FOR UPDATE is critical, and merely putting the SELECT and the UPDATE in a transaction together is not sufficient. This implementation, without the lock call, will lose updates:

def inc_counter
  ActiveRecord::Base.transaction do
    ctr = Counter.find_by(name: "my-counter")
    ctr.update(value: ctr.value + 1)
  end
end

This does not lock the counter record while it’s in use, which means multiple transactions can run concurrently, for example producing the following sequence of queries:

Task 1                          | Task 2
--------------------------------+--------------------------------
BEGIN                           |
                                | BEGIN
SELECT * FROM counters ...      |
    --> value = 1               |
                                | SELECT * FROM counters ...
                                |     --> value = 1
UPDATE counters SET value = 2   |
COMMIT                          |
                                | UPDATE counters SET value = 2
                                | COMMIT

Task 2 is still allowed to SELECT the record before Task 1 commits, because Task 1’s SELECT does not lock the record. Only once Task 1 performs an UPDATE is a lock placed on the record, delaying Task 2’s UPDATE until Task 1 commits.

The only way this becomes safe is if you configure PostgreSQL to run transactions using the repeatable read isolation level. This causes the database to reject any transaction that attempts to update a record that another concurrent transaction has already written, and requires application-level exception handling:

def inc_counter
  ActiveRecord::Base.transaction(isolation: :repeatable_read) do
    ctr = Counter.find_by(name: "my-counter")
    ctr.update(value: ctr.value + 1)
  end
rescue ActiveRecord::SerializationFailure
  retry
end

This implementation results in the following queries being sent to PostgreSQL. The UPDATE query will block if another pending transaction has already updated the record, and will return an error once that other transaction commits.

BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM counters WHERE name = 'my-counter' LIMIT 1;
UPDATE counters SET value = 2 WHERE id = 1;
COMMIT;

If you run these queries in two separate psql sessions, you’ll see the following behaviour:

Task 1                          | Task 2
--------------------------------+--------------------------------
BEGIN ... REPEATABLE READ       |
                                | BEGIN ... REPEATABLE READ
SELECT * FROM counters ...      |
    --> value = 1               |
                                | SELECT * FROM counters ...
                                |     --> value = 1
UPDATE counters SET value = 2   |
                                | UPDATE counters SET value = 2
                                |     --> blocks...
COMMIT                          |
                                |     --> ... and fails

The UPDATE in Task 2 blocks as usual, because Task 1 is holding an exclusive lock on the record after its own UPDATE. The difference in REPEATABLE READ mode is that once Task 1 commits, the UPDATE in Task 2 is rejected, because it conflicts with a write made in Task 1.

Note that you do not need to use Counter.lock to get a SELECT FOR UPDATE here, because repeatable read isolation means the database will automatically prevent concurrent transactions from modifying the same record, without you explicitly locking them. In PostgreSQL the default is the weaker read committed level, which allows the UPDATE in Task 2 above to succeed, even though it’s delayed until Task 1 commits.

This has made a very subtle change to the type of the interaction with the counter record: the UPDATE can now fail, which we need to represent in the type somehow. The simplest possible representation that the operation can succeed or fail is to make write(T) return a boolean.

NonLossy<T> = {
    read() -> T;
    write(T) -> bool;
}

This tiny change to the type means that it’s possible (though not guaranteed) for it to prevent lost updates.

It’s important to note here that although isolation level terms like read committed, repeatable read and serializable are part of the SQL standard, their definitions are open to interpretation and database implementations may have differing semantics. The above results apply specifically to PostgreSQL; in MySQL, the default isolation level is also called “repeatable read”, but it does not prevent the above race condition and both UPDATE queries will succeed. Running this task under the stronger serializable isolation level causes MySQL to detect a deadlock and roll one of the transactions back. If you want to know how this task works on your database, you should test it yourself. In the case of MySQL, this example is best handled by using the Rails lock method to send a SELECT FOR UPDATE query under any isolation level.

We’ve scratched the surface of how databases handle concurrency here, and if you’re interested there is substantial documentation of concurrency control in PostgreSQL. There are many different kinds of locks for different use cases, but the key point here is that, to get consistent changes to records, you need to read and write them while holding some kind of transaction-scoped lock that prevents competing tasks from overwriting each other’s changes.

This kind of concurrency control is called pessimistic: to prevent conflicting changes, we defensively lock resources before interacting with them. The use of mutexes in the previous article is a form of pessimistic control too. It makes the use of a Location safer by preventing bare read/write access to it, and imposing a control mechanism that only allows one task to use the Location at a time.

There’s another kind of control known as optimistic, in which you don’t use any locks: you just attempt the desired update, and the update fails if some other task modified the resource at the same time. It’s optimistic because you assume there usually won’t be other tasks competing with you for control of a resource, so it would be wasteful to do the work of locking the resource for every single operation.

Optimistic control often takes the form of a compare-and-swap (CAS) operation, where a write is made conditional on the current value, or on some secondary value that tracks the resource’s current version. In SQL, it might look like a query to set the counter’s value if it has not changed since we read it, for example we can set value = 2 if and only if the record still has value = 1. This check is implemented as an additional WHERE condition following the id clause.

UPDATE counters SET value = 2 WHERE id = 1 AND value = 1;

UPDATE queries return the number of rows affected, so we know whether our query had any effect and can retry our task if not. The database executes each query atomically, so this doesn’t need any transaction around it if we’re only performing a single update. In Rails, we can implement this workflow as follows:

def inc_counter
  loop do
    ctr   = Counter.find_by(name: "my-counter")
    rows  = Counter.where(id: ctr.id, value: ctr.value)
    count = rows.update_all(value: ctr.value + 1)

    break if count > 0
  end
end

In a loop, we retrieve the record, and issue an update that’s conditional on the value still being the one we just read. If this affects a non-zero number of rows, then we’ve successfully updated the record without any other task invalidating our changes, so we can stop retrying.

Rails will do this transparently if you add a lock_version to your table schema:

ActiveRecord::Schema.define do
  create_table :counters, force: true do |t|
    t.integer :lock_version, null: false
    t.string :name, null: false
    t.integer :value, null: false, default: 0
  end
end

Now, Rails will take care of the CAS operation, using lock_version as an incrementing number to check if the record has been updated since we read it. The implementation code then has to rescue the exception that Rails throws if the record is stale – you can retry indefinitely, or some fixed number of times, or just fail immediately.

def inc_counter
  ctr = Counter.find_by(name: "my-counter")
  ctr.update(value: ctr.value + 1)
rescue ActiveRecord::StaleObjectError
  retry
end

This produces the correct result and doesn’t lose any updates, and does not require the SELECT to be in a transaction with the UPDATE, or to acquire any locks. The transaction in the generated SQL is only there because Rails wraps all updates in transactions; since the SELECT is outside the transaction it’s treated as an independent operation by the database.

SELECT * FROM counters WHERE name = 'my-counter' LIMIT 1;
BEGIN;
UPDATE counters SET value = 2, lock_version = 2
    WHERE id = 1 AND lock_version = 1;
COMMIT;

If another task has updated the record’s lock_version since we read lock_version = 1 in our SELECT, this will affect zero rows and Rails converts this result into a StaleObjectError.

In terms of our Location type, CAS is a subtle modification to the write(T) operation:

CompareAndSwap<T> = {
    read() -> T;
    write(T, T) -> bool;
}

There’s no control mechanism wrapping the resource, so we can still read() -> T from it and get a value whenever we want. The difference is that write(T, T) -> bool takes two values: the new value, and our copy of the last value we read. If that second argument does indeed match the current state, then write() succeeds and returns true, otherwise it returns false. In the Location type, write(T) returns nothing because it always succeeds, so there’s no result to report. In CompareAndSwap, it can fail, so it needs to return a representation of that.
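
A plain-Ruby model makes this concrete. CasCell is an invented name, and the internal Mutex is only there to make the compare-and-set step itself atomic — crucially, callers never hold a lock across their whole read-modify-write cycle:

```ruby
# CasCell models CompareAndSwap<T>: anyone may read at any time, but
# a write must present the value the caller last read.
class CasCell
  def initialize(value)
    @mutex = Mutex.new
    @value = value
  end

  def read
    @mutex.synchronize { @value }
  end

  # write(new, expected) -> bool: succeeds only if the current value
  # still matches the caller's copy
  def write(new_value, expected)
    @mutex.synchronize do
      return false unless @value == expected
      @value = new_value
      true
    end
  end
end

cell = CasCell.new(0)

threads = (1..4).map do
  Thread.new do
    100.times do
      loop do
        tmp = cell.read
        break if cell.write(tmp + 1, tmp)   # on failure, retry with a fresh read
      end
    end
  end
end

threads.each(&:join)
puts cell.read   # prints 400
```

Note that the loop retries the whole read-modify-write, not just the write — the same discipline the lock_version example above relies on.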

This type hints at some additional benefits of optimistic control. It doesn’t require locks, and thus allows more work to happen concurrently under the assumption that operations will conflict infrequently. In particular, it allows reads to happen freely; when using pessimistic control, a locking mechanism often blocks writes and reads, so you can’t read a resource that somebody else is currently modifying. With optimistic control, any task can read at any time, and writes are controlled by sending additional input, to confirm you’ve seen the latest version.

Some databases embrace optimistic concurrency control as part of their core mechanics. In the document database CouchDB, all documents have a field called _rev that indicates their current version. Any attempt to update a document needs to supply this value, and the update only succeeds if that version is still the latest one. For example, let’s PUT a new counter document whose content is { "value": 0 } into the database:

$ curl -si http://localhost:5984/examples/counter -X PUT -d '{ "value": 0 }'
HTTP/1.1 201 Created
Content-Type: application/json
ETag: "1-7e97409c987eac3a99385a17ad4cbabe"

{"ok":true,"id":"counter","rev":"1-7e97409c987eac3a99385a17ad4cbabe"}

This gives us a successful result, and tells us the _rev that the document has been given – it’s a combination of an incrementing integer and a hash of the document’s content.

If we GET this document, we see its current state, including its _rev field.

$ curl -si http://localhost:5984/examples/counter -X GET
HTTP/1.1 200 OK
Content-Type: application/json
ETag: "1-7e97409c987eac3a99385a17ad4cbabe"

{"_id":"counter","_rev":"1-7e97409c987eac3a99385a17ad4cbabe","value":0}

If we try to update it using PUT, the request will fail with a 409 status if we don’t supply a _rev value that matches the document’s current version.

$ curl -si http://localhost:5984/examples/counter -X PUT -d '{ "value": 1 }'
HTTP/1.1 409 Conflict
Content-Type: application/json

{"error":"conflict","reason":"Document update conflict."}

For our write to succeed, we either have to include a _rev field in the replacement document:

$ curl -si http://localhost:5984/examples/counter \
      -X PUT -d '{ "value": 1, "_rev": "1-7e97409c987eac3a99385a17ad4cbabe" }'
HTTP/1.1 201 Created
Content-Type: application/json
ETag: "2-610dfd70da9e1879c0addba97d2abbb9"

{"ok":true,"id":"counter","rev":"2-610dfd70da9e1879c0addba97d2abbb9"}

Or, we can pass it in the HTTP headers. HTTP has a built-in optimistic control feature called ETags, and if a GET for a resource includes an ETag header in the response, we can make a conditional PUT for the resource using the If-Match header.

$ curl -si http://localhost:5984/examples/counter \
      -X PUT -d '{ "value": 2 }' \
      -H 'If-Match: "2-610dfd70da9e1879c0addba97d2abbb9"'
HTTP/1.1 201 Created
Content-Type: application/json
ETag: "3-23fc9ec7ebedcb1eb8d49391bd9a95e8"

{"ok":true,"id":"counter","rev":"3-23fc9ec7ebedcb1eb8d49391bd9a95e8"}

The PUT response includes the new _rev value in the body and in the ETag header, so we immediately know the document’s new version and can update it again if required. Making a GET for the document shows that our value updates have indeed been saved.

$ curl -si http://localhost:5984/examples/counter -X GET
HTTP/1.1 200 OK
Content-Type: application/json
ETag: "3-23fc9ec7ebedcb1eb8d49391bd9a95e8"

{"_id":"counter","_rev":"3-23fc9ec7ebedcb1eb8d49391bd9a95e8","value":2}

Just like we did in Rails, we can use this to implement a safe counter by retrying requests when they fail with a conflict. This first implementation won’t work, because the put call will raise an exception when a conflict happens:

DB = "http://admin:admin@localhost:5984/examples"
CouchRest.put("#{DB}/counter", "value" => 0)

def inc_counter
  counter = CouchRest.get("#{DB}/counter")
  counter["value"] += 1
  CouchRest.put("#{DB}/counter", counter)
end

threads = (1..4).map do
  Thread.new do
    100.times { inc_counter }
  end
end

threads.each(&:join)

(The client library I’m using is CouchRest, which provides a thin wrapper over CouchDB’s HTTP and JSON interface.) To make this work, all we need to do is retry the operation when it fails:

def inc_counter
  counter = CouchRest.get("#{DB}/counter")
  counter["value"] += 1
  CouchRest.put("#{DB}/counter", counter)
rescue CouchRest::Conflict
  retry
end

It’s important to note that when using CAS, we do not just retry the write, we retry the whole task. The write fails because we passed in a version argument that’s now stale, so we need to read the resource again to get its latest state and try our modification procedure again. This protocol forces the client to prove it’s seen the latest version of a resource before being allowed to overwrite it.

Some HTTP-based APIs use this technique, which is effective when it would be inappropriate to let the client hold a long-lived lock over a resource. For example, Dropbox uses ETag and If-Match in its API to prevent two clients causing lost updates, just like CouchDB does, which is especially effective because someone editing a file may have it open for minutes, hours or even days between first loading it and trying to save their changes. However, in my experience it’s relatively rare for HTTP services, either public-facing or internal, to provide any form of concurrency control, and such services are vulnerable to all the problems discussed here.

In our CompareAndSwap type above, it’s implied that read() -> T and write(T, T) -> bool are independent operations, whereas the example of CouchDB shows that actually one can only write after a successful read for an existing resource. So the type could more accurately be given as:

CompareAndSwap<T> = {
    read() -> Version<T>;
}

Version<T> = {
    get() -> T;
    replace(T) -> bool;
}

Reading from a CompareAndSwap gives us a Version. Once we have a Version, we can either get() the current value out of it, or replace() it with a new one. replace() implicitly combines the resource’s current value (to get its _rev value, for example) with the new value, and sends off an update request that may fail, prompting us to start over with another read() call. One of the nice things about turning protocols into types is it can show us how to design object interfaces so that you can’t perform illegal sequences of operations with them; you can’t call replace() in this design without calling read() first.
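
Here's a sketch of that refined design in plain Ruby (all names invented). replace is only defined on Version, so the only way to attempt a write is to have performed a read first, and a stale Version simply returns false:

```ruby
# VersionedCell models the refined type: read() -> Version<T>. A
# Version carries the stamp it was read at, like CouchDB's _rev.
class VersionedCell
  class Version
    def initialize(cell, value, stamp)
      @cell  = cell
      @value = value
      @stamp = stamp
    end

    def get
      @value
    end

    def replace(new_value)
      @cell.try_replace(new_value, @stamp)
    end
  end

  def initialize(value)
    @mutex = Mutex.new
    @value = value
    @stamp = 0
  end

  def read
    @mutex.synchronize { Version.new(self, @value, @stamp) }
  end

  # Called by Version#replace; succeeds only if the stamp is current
  def try_replace(new_value, stamp)
    @mutex.synchronize do
      return false unless stamp == @stamp
      @value = new_value
      @stamp += 1
      true
    end
  end
end

cell = VersionedCell.new(0)

ours = cell.read
cell.read.replace(5)            # another task gets in first
puts ours.replace(ours.get + 1) # prints false: our version is stale
puts cell.read.get              # prints 5
```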

If it’s common for operations to conflict, then optimistic concurrency control ends up spending a lot of time retrying, and in that situation it may perform better to preemptively lock resources during modification. Before we move on to talk about some more application scenarios, I want to mention one final kind of lock that PostgreSQL and other SQL databases offer: advisory locks. These locks aren’t directly related to database operations – the database ignores them when deciding how to schedule or reject transactional reads and writes. Instead they’re enforced by your application, so you can protect some of your logic using a mutex that’s stored in the database rather than in program memory. That means the mutex protects all your concurrent server processes from each other, not just operations inside a single process.

We can use pg_advisory_lock(id) to implement inc_counter. We’ll call it before we try to update the counter record, and PostgreSQL will block if this lock is already held by another connection. When pg_advisory_lock() returns, that means we now hold the lock and can go ahead with our logic, before releasing the lock with pg_advisory_unlock(id). Rails automatically sets up a separate database connection for each running thread, so this protects threads within the same process from one another, as well as separate processes running the same logic.

def inc_counter
  Counter.connection.execute("SELECT pg_advisory_lock(1)")
  ctr = Counter.find_by(name: "my-counter")
  ctr.update(value: ctr.value + 1)
ensure
  Counter.connection.execute("SELECT pg_advisory_unlock(1)")
end

This lock only has meaning to the application; the database won’t prevent other processes from reading or writing the affected records while this code is running. What it does let you do is selectively co-ordinate around some data, for example to make sure only one background job at a time is allowed to run against a record, while allowing your user-facing application to keep reading and writing it. In this situation, using database-level locks would make data unavailable to end users while your background job is using it, which may be undesirable if you have long-running jobs.

We’re now well-armed with a knowledge of the tools databases provide for tackling race conditions, and we can move on to examine data races that occur in web applications – that’s what we’ll do in the next instalment.