CONTRACT_NOT_FOUND with Daml Script across multiple nodes

Background

I am running a single Daml script which submits commands against two different participant nodes. I am using open source Canton 2.7.6 and three Docker containers (participant1, participant2, and mydomain).

I route each party to its participant with the party_participants field of a participants.config file, which I pass to daml script via --participant-config.
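
For readers who have not used this option, a participant config for a setup like this might look roughly like the sketch below. The hosts and ports are assumptions and not taken from the original setup. On Canton, the keys under party_participants generally need to be the full party identifiers rather than the short names shown here.

```json
{
  "default_participant": {"host": "localhost", "port": 5011},
  "participants": {
    "participant1": {"host": "localhost", "port": 5011},
    "participant2": {"host": "localhost", "port": 5021}
  },
  "party_participants": {
    "alice": "participant1",
    "bob": "participant2"
  }
}
```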

The setup works great, except…

Problem

I am getting a CONTRACT_NOT_FOUND error when Bob tries to submit his proposal:

  debug "Alice creating a bid request..."
  bidRequest <- submit alice do createCmd BidRequest with owner = alice, painter = bob

  debug "Bob submitting a proposal..."
  bid <- submit bob do exerciseCmd bidRequest Propose with amount = 1000.0

  debug "Alice accepting the proposal..."
  contract <- submit alice do exerciseCmd bid Accept

I believe it is a race condition. Inserting sleep (seconds 5) between the submits avoids the CONTRACT_NOT_FOUND exceptions.

I expected that, by the time participant1’s Ledger API returns a contract id, I could submit a command using that contract id against participant2’s Ledger API.

Questions

  • What is causing this race condition?
  • What can make this script work today? For example, is there a Canton config setting that would remove this race condition?
  • Is this something that we should “fix” to better support Daml scripting across multiple nodes?

I just found this post by @cocreature.

Note that there is no automatic synchronisation between participants however. So if you submit a transaction on one participant it will not block until that transaction is visible on the other participant. You can do things like poll for a contract id created in that transaction to appear on the other participant via queryContractId to synchronize manually.

waitForCid : Template t => Party -> ContractId t -> Script ()
waitForCid p cid = do
  r <- queryContractId p cid
  case r of
    None -> do
      sleep (seconds 1)
      waitForCid p cid
    Some _ -> pure ()
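
One caveat with the helper above is that it polls forever if the contract never becomes visible to the party, for example because the party is not a stakeholder. A bounded variant could look like the sketch below. The name waitForCidWithin, the module name, and the attempt limit are hypothetical and not part of Daml Script. The type signature mirrors waitForCid.

```daml
module WaitForCid where -- hypothetical module name

import Daml.Script
import DA.Time (seconds)

-- Poll once per second until the party can see the contract,
-- and fail the script once the attempts are used up.
-- ASSUMPTION: waitForCidWithin is an illustrative helper, not a library function.
waitForCidWithin : Template t => Int -> Party -> ContractId t -> Script ()
waitForCidWithin attemptsLeft p cid
  | attemptsLeft <= 0 =
      abort "contract did not become visible within the allowed attempts"
  | otherwise = do
      r <- queryContractId p cid
      case r of
        Some _ -> pure ()
        None -> do
          sleep (seconds 1)
          waitForCidWithin (attemptsLeft - 1) p cid
```

Calling waitForCidWithin 30 bob bidRequest would then give up after roughly 30 seconds instead of hanging.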

I opened issue #10618 on digital-asset/daml, "Easier multi-participant synchronization in Daml Script", to improve the UX for synchronization in Daml Script.

What is causing this race condition?

In your example, you submit against participant1. The submit call returns once the transaction has been committed and participant1 has processed it. However, participant1 and participant2 are not synchronized with each other, so participant2 might not have fully processed the transaction yet when your next submit reaches it.

What can make this script work today? For example, is there a Canton config setting that would remove this race condition?

I think waitForCid is your best option for making this work in Daml Script.
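
Applied to the script from the question, that could look roughly like the sketch below. The template definitions are assumptions, reconstructed only from the fields and choices that appear in the question, since the real ones are not shown in this thread. The module name and bidWorkflow are hypothetical too.

```daml
module MultiParticipantBid where -- hypothetical module name

import Daml.Script
import DA.Time (seconds)

-- ASSUMPTION: minimal stand-ins for the templates used in the question.
template BidRequest
  with
    owner : Party
    painter : Party
  where
    signatory owner
    observer painter

    choice Propose : ContractId Bid
      with
        amount : Decimal
      controller painter
      do create Bid with owner = owner, painter = painter, amount = amount

template Bid
  with
    owner : Party
    painter : Party
    amount : Decimal
  where
    signatory painter
    observer owner

    choice Accept : ()
      controller owner
      do pure ()

-- Polling helper quoted earlier in this thread.
waitForCid : Template t => Party -> ContractId t -> Script ()
waitForCid p cid = do
  r <- queryContractId p cid
  case r of
    None -> do
      sleep (seconds 1)
      waitForCid p cid
    Some _ -> pure ()

-- alice is hosted on participant1 and bob on participant2.
bidWorkflow : Party -> Party -> Script ()
bidWorkflow alice bob = do
  bidRequest <- submit alice do createCmd BidRequest with owner = alice, painter = bob
  -- Wait until bob's participant has ingested the BidRequest.
  waitForCid bob bidRequest
  bid <- submit bob do exerciseCmd bidRequest Propose with amount = 1000.0
  -- Wait until alice's participant has ingested the Bid.
  waitForCid alice bid
  submit alice do exerciseCmd bid Accept
```

Each waitForCid call queries as the party that is about to act, so the query goes to the participant that will receive the next submit.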

Is this something that we should “fix” to better support Daml scripting across multiple nodes?

I’d be slightly cautious here: this asynchronous behavior is fundamental to distributed setups, and you need to account for it. What I’ve found to work quite well is that once you’re actually working with multiple participants, you drop Daml Script (which is great for testing but not really suited to orchestrating complex distributed deployments) and instead do initialization from your app backends, where each backend talks to only one participant. That makes the synchronization explicit, and you can add retries wherever they are needed.

Would the situation be any different if the two parties were both signatories?

No. In that case both participants have to confirm the transaction before it is committed, but that still does not guarantee that the other participant has ingested the transaction into the right store by the time the submission completes on the submitting participant.

Thanks, @cocreature, for the help!

I have distilled what I have learned in an article, "Multi-participant Daml Scripts".