Background
I am running a single Daml script which submits commands against two different participant nodes. I am using open source Canton 2.7.6 and three Docker containers (`participant1`, `participant2`, and `mydomain`).
I am doing this with the `party_participants` field of a `participants.config`, using `daml script --participant-config`.
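For reference, a minimal `participants.config` for this kind of setup might look like the sketch below. The hosts and ports are assumptions for the three-container setup described above, not the actual values used:

```json
{
  "default_participant": {"host": "localhost", "port": 5011},
  "participants": {
    "participant1": {"host": "localhost", "port": 5011},
    "participant2": {"host": "localhost", "port": 5021}
  },
  "party_participants": {
    "alice": "participant1",
    "bob": "participant2"
  }
}
```

With this mapping, Daml Script routes Alice's submissions to `participant1` and Bob's to `participant2`.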
The setup works great, except…
Problem
I am getting a `CONTRACT_NOT_FOUND` error when Bob tries to submit his proposal:
```daml
debug "Alice creating a bid request..."
bidRequest <- submit alice do createCmd BidRequest with owner = alice, painter = bob
debug "Bob submitting a proposal..."
bid <- submit bob do exerciseCmd bidRequest Propose with amount = 1000.0
debug "Alice accepting the proposal..."
contract <- submit alice do exerciseCmd bid Accept
```
I believe it is a race condition: inserting `sleep (seconds 5)` between the submits avoids the `CONTRACT_NOT_FOUND` exceptions.
I expected that by the time participant1's Ledger API returned a contract id, I could submit that contract id against participant2's Ledger API.
Questions
- What is causing this race condition?
- What can make this script work today? For example, is there a Canton config setting that would remove this race condition?
- Is this something that we should “fix” to better support Daml scripting across multiple nodes?
I just found this post by @cocreature.
Note that there is no automatic synchronisation between participants, however. So if you submit a transaction on one participant, it will not block until that transaction is visible on the other participant. You can do things like poll for a contract id created in that transaction to appear on the other participant via `queryContractId` to synchronize manually.
```daml
waitForCid : Template t => Party -> ContractId t -> Script ()
waitForCid p cid = do
  r <- queryContractId p cid
  case r of
    None -> do
      sleep (seconds 1)
      waitForCid p cid
    Some _ -> pure ()
```
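With a helper like that, the original script can synchronize explicitly before each cross-participant submit. This is a sketch assuming the same `BidRequest`/`Propose`/`Accept` workflow as above, and that each polling party is a stakeholder on the contract (otherwise `queryContractId` would never return it):

```daml
bidRequest <- submit alice do createCmd BidRequest with owner = alice, painter = bob
-- Block until participant2 (Bob's participant) has ingested the new contract
waitForCid bob bidRequest
bid <- submit bob do exerciseCmd bidRequest Propose with amount = 1000.0
-- Block until participant1 (Alice's participant) has ingested Bob's bid
waitForCid alice bid
contract <- submit alice do exerciseCmd bid Accept
```

Unlike a fixed `sleep (seconds 5)`, this waits only as long as necessary and does not fail if ingestion happens to take longer.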
I opened Easier multi-participant synchronization in Daml Script · Issue #10618 · digital-asset/daml · GitHub to improve the UX for synchronization in Daml Script.
What is causing this race condition?
In your example, you submit against participant1. The `submit` call returns once the transaction has been committed and participant1 has processed it. However, participant1 and participant2 are not synchronized, so participant2 might not have fully processed the transaction yet.
What can make this script work today? For example, is there a Canton config setting that would remove this race condition?
I think `waitForCid` is your best option for making this work in Daml Script.
Is this something that we should “fix” to better support Daml scripting across multiple nodes?
I’d be slightly cautious here: this asynchronous behavior is a fundamental property of distributed setups that you need to pay attention to. What I’ve found to work quite well is that once you’re actually working with multiple participants, you drop Daml Script (which is great for testing but not really suited for orchestrating complex distributed deployments) and instead do initialization from your app backends, where each backend only talks to one participant. That way the synchronization becomes very explicit, and you can add retries as needed.
Would the situation be any different if the two parties were both signatories?
No. Both participants have to confirm the transaction in that case for it to be committed, but that still does not ensure that the participant has ingested the transaction into the right store by the time the submission on the other participant completes.
Thanks, @cocreature, for the help!
I have distilled what I have learned into an article: Multi-participant Daml Scripts.