We have a choice Foo on an instance of a template A that looks up by key (lookupByKey) a singleton instance of a template B. It then exercises a choice on B which recycles the instance of B (with new data but the same contract key). We submit two Foo commands back to back over gRPC, one on each of two distinct instances of template A.
Sometimes we notice the following exception:
io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Reason: [InvalidLookup(com.digitalasset.daml.lf.transaction.Node$GlobalKey@1d5be94,None,Some(AbsoluteContractId(#29:7))), DuplicateKey(com.digitalasset.daml.lf.transaction.Node$GlobalKey@1d5be94)]
and we see that only one of the two commands has successfully gone through.
Is this a race condition we are noticing?
I would expect the code to be racy, but the losing command should just look up the updated instance of contract B and never report a
The behavior you are seeing derives from the fact that you have two commands being interpreted concurrently, competing to resolve a contract key to a contract id.
If either command is interpreted before the other one’s result is committed to the ledger, they will resolve the contract key to the same contract id (the one of the contract originally having the key).
After the interpretation has processed your command and turned it into a transaction, those two transactions must be sequenced to check that the state they ultimately produce is consistent.
When this happens, though, the first command to be committed will have effectively changed the contract id to which the contract key resolves, making the other transaction inconsistent and causing it to be rejected.
Note that this behavior is not deterministic: at times the transaction caused by one of the two commands is committed before the other one performs the lookup by key, in which case both commands succeed.
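The two outcomes described above can be reproduced with a toy model. This is only a sketch under stated assumptions: `Ledger`, `interpret`, and `commit` are hypothetical names invented for illustration, not part of any DAML or ledger API, and the consistency check is reduced to "does the key still resolve to the contract id seen during interpretation".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Ledger:
    """Hypothetical toy ledger: maps a contract key to the active contract id."""
    key_to_cid: Dict[str, str] = field(default_factory=dict)
    next_id: int = 0

    def create(self, key: str) -> str:
        # Creating a contract under a key makes it the new resolution of that key.
        cid = f"#{self.next_id}"
        self.next_id += 1
        self.key_to_cid[key] = cid
        return cid


@dataclass
class Transaction:
    key: str
    resolved_cid: Optional[str]  # what the lookup by key returned at interpretation time


def interpret(ledger: Ledger, key: str) -> Transaction:
    """Resolve the key against the ledger state visible right now."""
    return Transaction(key, ledger.key_to_cid.get(key))


def commit(ledger: Ledger, tx: Transaction) -> bool:
    """Sequencing step: accept only if the key still resolves to the same contract id."""
    if ledger.key_to_cid.get(tx.key) != tx.resolved_cid:
        return False  # the lookup is no longer consistent, so the transaction is rejected
    ledger.create(tx.key)  # recycle B: a new contract takes over the same key
    return True


def run_concurrent() -> List[bool]:
    """Both commands are interpreted before either one is committed."""
    ledger = Ledger()
    ledger.create("B")
    tx1 = interpret(ledger, "B")
    tx2 = interpret(ledger, "B")
    return [commit(ledger, tx1), commit(ledger, tx2)]


def run_sequential() -> List[bool]:
    """The first command is committed before the second one is interpreted."""
    ledger = Ledger()
    ledger.create("B")
    first = commit(ledger, interpret(ledger, "B"))
    second = commit(ledger, interpret(ledger, "B"))
    return [first, second]
```

In this model `run_concurrent()` accepts the first transaction and rejects the second, while `run_sequential()` accepts both, which matches the non-deterministic behavior seen on the real ledger depending on timing.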
What you are experiencing is @Ratko_Veprek’s main criticism of Contract Keys: they look and feel like they reduce contention, but in reality they just hide it, leading to unintuitive errors. Exercising consuming choices leads to contention; there is no way around that. With Contract Keys, we have merely moved the resolution of CIDs from the client application to the Participant Node doing the command interpretation.
The UX would be much better if we did so.
The race condition arises because we perform the lookup by key before the exact position of the command on the ledger is known. In effect, the lookup is evaluated on a potentially outdated version of the ledger.
We could fix that by moving the evaluation of “lookup by key” to the conflict detection phase. As a result (omitting some explanations here), we would also have to perform DAML interpretation as part of conflict detection.
Unfortunately, conflict detection is already a bottleneck and by adding “lookup by key” (and consequently DAML interpretation), we would make it substantially worse.
So we are sacrificing UX to get a higher throughput.
Note that the same happens if you configure a lower transaction isolation level on a database.
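To make the trade-off concrete, here is a minimal sketch of the alternative design discussed above, in which the key lookup is resolved inside the serial sequencing step. Everything in it is an assumption for illustration: `sequence_with_late_lookup` and `recycle` are hypothetical names, and the callable stands in for the DAML interpretation that would have to run as part of conflict detection.

```python
from typing import Callable, Dict, List


def sequence_with_late_lookup(
    key_to_cid: Dict[str, str],
    key: str,
    commands: List[Callable[[str], str]],
) -> List[str]:
    """Process commands strictly one at a time.

    Each command resolves the key only when its position in the sequence is
    known, so it always sees the result of every earlier commit. The price is
    that the command body (the stand-in for interpretation) runs inside this
    serial loop.
    """
    seen = []
    for run_choice in commands:
        current = key_to_cid[key]              # lookup by key at commit time
        key_to_cid[key] = run_choice(current)  # interpretation on the serial path
        seen.append(current)
    return seen


def recycle(cid: str) -> str:
    """Hypothetical choice body: archive the old contract, create a successor."""
    return cid + "'"
```

With `state = {"B": "b0"}`, calling `sequence_with_late_lookup(state, "B", [recycle, recycle])` lets both commands succeed: the second one sees the contract produced by the first. No command is rejected, but nothing in the loop can run in parallel, which is the throughput cost described above.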