Race conditions in DAML Apps?

The application in question is a Java Spring Boot API server (using the Java bindings rather than the HTTP JSON API). Our issue is that certain commands that exercise consuming choices race with each other, causing one or more commands to fail, and that cannot be tolerated once this system moves into production.
Our main blocker is that not all commands race. We could implement something like a queue to ensure that one command only begins after another has finished, but we are concerned about limiting the system's performance by enforcing that wait on commands that do not need it.


One option for handling this is some kind of locking, e.g., in the form of a contract key that you check first. You can do that at the granularity that you deem acceptable for your application. I wrote up a full example of how you can do something like this recently at GitHub - digital-asset/ex-daml-contention.


Contention is inherent to distributed systems. If you could eliminate all contention from a system, you would no longer need a distributed ledger since consistency would come for free. So you have to accept some contention. The question is how we can minimise it.

Reducing contention in DAML

If you think of DAML in terms of databases, where templates are tables and contracts are rows, DAML does row-level locking. I.e., if you write to a row, that row gets locked and other attempts to write to it are rejected. Design your DAML with that in mind. E.g., imagine you have an address book on the ledger:

-- assuming these standard-library imports at the top of the module
import DA.Map (Map, insert, keys, lookup)
import DA.Optional (isSome)

template AddressBook
  with
    admin : Party
    addresses : Map Party Address
  where
    signatory admin
    observer (keys addresses)

    choice ChangeAddress : ContractId AddressBook
      with
        party : Party
        newAddress : Address
      controller party
      do
         assert (isSome (lookup party addresses))
         create this with
           addresses = insert party newAddress addresses

The entire address book is stored in one database row. Any address change locks the entire book. Decouple the addresses by using a nonconsuming choice and storing addresses in individual contracts.

-- assuming this standard-library import at the top of the module
import DA.Set (Set)

template AddressBook
  with
    admin : Party
    parties : Set Party
  where
    signatory admin
    observer parties

    nonconsuming choice ChangeAddress : ContractId AddressEntry
      with
        party : Party
        newAddress : Address
      controller party
      do
         exerciseByKey @AddressEntry (admin, party) ChangeAddressEntry with newAddress

template AddressEntry
  with
    admin : Party
    party : Party
    address : Address
  where
    signatory admin
    key (admin, party) : (Party, Party)
    maintainer key._1
    controller party can
       ChangeAddressEntry : ContractId AddressEntry
         with
           newAddress : Address
         do
            create this with address = newAddress

Now different parties can update their addresses simultaneously. Reads like fetch and nonconsuming choices do not contend with each other.

Reducing contention client-side

Let’s say we have a simple transferable token, and Alice and Bob run automation that sends a large number of them back and forth.

template T
  with
    o : Party
  where
    signatory o

    controller o can
      Transfer : ContractId T
        with
          newOwner : Party
        do
          create this with o = newOwner

Alice implements her automation as follows:

  1. Query the JSON API
  2. For every T, asynchronously exercise transfer to Bob
  3. Repeat

Let’s say Alice starts with 1000 Ts. On the first iteration, Alice fires off 1000 Transfer commands and immediately re-queries the JSON API. At best a few of the 1000 transactions will have been committed so the JSON API will still return almost 1000 of the contracts. Alice fires off more commands for those. And again, and again. All of them will fail because they contend with the first batch of commands.

One solution is to go synchronous. Wait for commands to return. But that will slow things down a lot. So instead, Alice’s automation needs to track what’s in-flight.

  1. Start with an empty “pending set” of Contract IDs
  2. Query the JSON API
  3. For every T not in the pending set, asynchronously exercise transfer to Bob and write the cid to the pending set. When the asynchronous call returns, remove the cid from the pending set again.
  4. Repeat from 2.
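The pending-set loop above can be sketched in plain Java. The class and method names here are hypothetical, and the actual query and exercise calls are abstracted behind a `Supplier` and a `Function` so the sketch stays self-contained:

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

// Tracks in-flight contract IDs so the automation never submits a second
// command against a contract it has already tried to consume.
class PendingSetAutomation {
    private final Set<String> pending = ConcurrentHashMap.newKeySet();

    /**
     * One iteration of the loop: query active contracts, then exercise
     * on every contract that is not already in the pending set.
     *
     * @param queryActiveContracts stands in for the JSON API / bindings query
     * @param exerciseTransfer     stands in for the asynchronous exercise call
     * @return the number of commands submitted in this iteration
     */
    public int runOnce(Supplier<List<String>> queryActiveContracts,
                       Function<String, CompletableFuture<Void>> exerciseTransfer) {
        int submitted = 0;
        for (String cid : queryActiveContracts.get()) {
            // add() returns false if the cid was already pending: skip it.
            if (pending.add(cid)) {
                submitted++;
                exerciseTransfer.apply(cid)
                    // Whether the command succeeds or fails, stop tracking it:
                    // either way the contract ID is no longer in flight.
                    .whenComplete((ok, err) -> pending.remove(cid));
            }
        }
        return submitted;
    }
}
```

Re-querying while commands are still in flight now submits nothing new, so the tight query loop no longer produces contending duplicates.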

Now Alice will not send repeat commands, so there is no contention anymore.

Here the task is easy because it is clear which commands contend with which others. In practice, the developer has to have a handle on that and devise a mechanism that reduces such contention. That means you need to know which contracts each command could archive.
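One way to act on that knowledge, sketched in Java (a hypothetical design, not something prescribed by DAML): serialize commands that share a "contention key", such as the contract ID or contract key they may archive, while letting commands with different keys run in parallel. This also answers the earlier concern about a global queue slowing everything down: only genuinely contending commands wait.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Serializes command submission per contention key (e.g. the contract a
// command may archive). Commands with different keys still run in parallel,
// so unrelated submissions are not slowed down.
// (Cleanup of completed tails is omitted for brevity.)
class PerKeyCommandQueue {
    private final Map<String, CompletableFuture<?>> tails = new ConcurrentHashMap<>();

    /** Runs the command after every earlier command with the same key has finished. */
    public <T> CompletableFuture<T> submit(String contentionKey,
                                           Supplier<CompletableFuture<T>> command) {
        CompletableFuture<T> result = new CompletableFuture<>();
        tails.compute(contentionKey, (k, tail) -> {
            CompletableFuture<?> prev =
                tail == null ? CompletableFuture.completedFuture(null) : tail;
            // Chain the new command onto the previous one for this key; a
            // failure of an earlier command does not block later ones.
            return prev.handle((ok, err) -> null)
                .thenCompose(ignored -> command.get())
                .whenComplete((v, err) -> {
                    if (err != null) result.completeExceptionally(err);
                    else result.complete(v);
                });
        });
        return result;
    }
}
```

A command's `Supplier` is only invoked once the previous command on the same key has completed, so two commands aimed at the same contract can never be in flight at the same time.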
