Completed commands remain in-flight

Hi Discuss,

I'm facing an issue with the bookkeeping of in-flight commands in what I believe is a valid use case. Affected Daml version: 1.18.1. Consider the following code:

template PostalService
  with
    letters : [(Address, Letter)]
  where
    choice TryDeliverLetters : ()
     do
       result <- tryDeliverLetters letters
       case getUnsuccessful result of
         [] -> pure ()
         someLetters -> create $ DeliveryPending someLetters
          
template DeliveryPending
  with
    letters : [(Address, Letter)]
  where
   -- exercised by trigger when the global address store is updated
   -- archives itself conditionally
   nonconsuming choice RetryDeliverLetters : ()
    do
      result <- tryDeliverLetters letters
      case getUnsuccessful result of
        [] -> archive self
        someLetters 
          | length someLetters < length letters -> do
              archive self
              create $ DeliveryPending someLetters
          | otherwise -> pure () -- to avoid the trigger picking up the contract again and again

---- trigger:

trigger = Trigger {
   ...
   , registeredTemplates = [ registeredTemplate @GlobalAddressStore, registeredTemplate @DeliveryPending ]
   , rule = const retryDelivery
   ...
  }

retryDelivery =  do
  pendingDeliveries <- query @DeliveryPending
  forA_ pendingDeliveries \(id, _) -> dedupExercise id RetryDeliverLetters

Whether letters can be delivered depends on another contract, which holds the list of all known addresses (say, of a country).
RetryDeliverLetters is exercised by a trigger whenever the global address store is updated. Most of the time, RetryDeliverLetters doesn’t change the ledger, and thus the exercise command sent by the trigger remains in-flight indefinitely. As a result, the exercise command is sent once, and after that dedupExercise becomes a no-op.

I believe this behavior relates to the following bug report: "[BUG] Triggers show already completed commands as in-flight" (Issue #12233, digital-asset/daml on GitHub).

My questions: is this an anti-pattern with respect to how triggers are supposed to be used? Or is it a use case not yet covered by the Daml runtime?

We have 2 workarounds for the behavior of dedupExercise and how in-flight commands are handled:

  1. use emitCommands instead of dedupExercise
  2. pass a dummy argument to RetryDeliverLetters so that dedupExercise doesn’t realize that we are actually sending the same command:
    nonconsuming choice RetryDeliverLetters : ()
    with
       dummyArgToDistinguishCalls : ContractId GlobalAddressStore
    do
      result <- tryDeliverLetters letters
      case getUnsuccessful result of
        [] -> archive self
        someLetters 
          | length someLetters < length letters -> do
              archive self
              create $ DeliveryPending someLetters
          | otherwise -> pure () -- to avoid the trigger picking up the contract again and again
    

In both cases, the in-flight commands container retains commands indefinitely and keeps growing (leaking memory).

The fact that they never stop being marked as in-flight is a bug, as pointed out in the issue you found. I’ve reached out to the team and hopefully we can get this addressed soonish.

However, that doesn’t mean your trigger has no issues:
Generally triggers try to transform the current ACS to some target state.

So there must be some condition in the current ACS (e.g. the existence of a contract) that results in the trigger sending a command.

However, if your command does not change the ACS your trigger will just submit the same command again. The commands in flight temporarily stop that from happening until the command has completed but afterwards it will trigger again.

So a trigger emitting commands that do not change the ACS in some form is pretty much always problematic, unless you know that a command will modify the ACS eventually (thereby stopping the trigger from resubmitting it).

How you work around that depends somewhat on your app. How often do you want to resubmit? If the answer is every 5 minutes, write a contract on the ledger every time you submit. Then check that 5 minutes have passed before you resubmit. That way every submission does change the ACS by modifying that contract.
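The "write a contract every time you submit" idea could be sketched roughly as follows. This is only an illustration under assumptions: the RetryMarker template, its RecordSubmission choice, the operator party, and the retryDue helper are hypothetical names that do not appear in the original code, and the 5-minute interval is taken from the example above.

```daml
module RetryThrottle where

import DA.Time (addRelTime, minutes)

-- Hypothetical marker contract (not part of the original post):
-- records when the operator last submitted a retry.
template RetryMarker
  with
    operator : Party
    lastSubmitted : Time
  where
    signatory operator

    -- Consuming choice: archives this marker and creates a fresh one,
    -- so every submission changes the ACS.
    choice RecordSubmission : ContractId RetryMarker
      controller operator
      do
        now <- getTime
        create this with lastSubmitted = now

-- Hypothetical pure helper: True once 5 minutes have passed since the
-- last recorded submission.
retryDue : Time -> RetryMarker -> Bool
retryDue now marker =
  now >= addRelTime marker.lastSubmitted (minutes 5)
```

A trigger rule would then query RetryMarker, compare against the current time with retryDue, and submit the retry together with RecordSubmission only when it returns True. Since nothing on the ledger changes while time passes, the trigger would presumably also need a heartbeat configured so that the rule gets re-evaluated.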

So there must be some condition in the current ACS (e.g. the existence of a contract) that results in the trigger sending a command.

The condition is supposed to be “updating” a GlobalAddressStore contract (via a consuming choice). We only have one active GlobalAddressStore contract at all times.

However, if your command does not change the ACS your trigger will just submit the same command again.

I added a couple of debug calls to the trigger to see what makes it react. The application seemed to submit one command per DeliveryPending contract, as intended, when GlobalAddressStore is updated. Am I missing something?

The condition is supposed to be “updating” a GlobalAddressStore contract (via a consuming choice). We only have one active GlobalAddressStore contract at all times.

The conditions should be derived from the ACS not from events. Triggers are more state based than event based. This is important to support restarts.

I added a couple of debug calls to the trigger to see what makes it react. The application seemed to submit one command per DeliveryPending contract, as intended, when GlobalAddressStore is updated. Am I missing something?

In general, you’re best off assuming that the rule can trigger at any point and view the fact that it only triggers at certain points in time as an optimization. The specific case that is going to cause issues here is that the rule gets retriggered after the trigger saw a (potentially empty) transaction. So in your example, it sends a command, it gets an empty transaction for that command (assuming the ledger bug is fixed) and then if the state has not changed it will resubmit the same command.
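One possible way to make the retry condition derivable from the ACS is to record on each pending contract which address store it was last tried against. The sketch below is not the original code: AddressStore and Pending are simplified stand-ins for GlobalAddressStore and DeliveryPending, the lastTriedAgainst field and the needsRetry helper are hypothetical, and delivery is reduced to a plain address lookup.

```daml
module StateBasedRetry where

-- Simplified stand-in for GlobalAddressStore.
template AddressStore
  with
    operator : Party
    addresses : [Text]
  where
    signatory operator

-- Simplified stand-in for DeliveryPending, extended with a
-- hypothetical field naming the store it was last retried against.
template Pending
  with
    operator : Party
    letters : [(Text, Text)] -- (address, letter body)
    lastTriedAgainst : Optional (ContractId AddressStore)
  where
    signatory operator

    -- Consuming choice: the contract is always archived, and recreated
    -- with an updated lastTriedAgainst if letters remain undelivered.
    -- Every exercise therefore changes the ACS.
    choice Retry : ()
      with
        storeCid : ContractId AddressStore
      controller operator
      do
        store <- fetch storeCid
        let undelivered =
              filter (\(addr, _) -> not (elem addr store.addresses)) letters
        case undelivered of
          [] -> pure ()
          _ -> do
            _ <- create this with
              letters = undelivered
              lastTriedAgainst = Some storeCid
            pure ()

-- The trigger's condition, computed purely from the current ACS:
-- retry only contracts not yet tried against the current store.
needsRetry : ContractId AddressStore -> Pending -> Bool
needsRetry currentStore pending =
  pending.lastTriedAgainst /= Some currentStore
```

With this shape, the rule can run at any time, including after a restart: it queries the single active store and all pending contracts, and exercises Retry only where needsRetry holds. Once a contract has been tried against the current store, the condition is False until the store is replaced by a new contract, so the rule does not resubmit the same command.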