Asynchronous event processing failed for event batch with sequencer counters

Question: How do we reconnect a participant node when something like the following happens?


We have an enterprise participant node (v2.7.3) that disconnected from the domain after running healthily for nearly a day. Two similar participant nodes are still connected to the domain node and running fine, and the domain node itself seems healthy.

When we try to reboot the participant node, the bootstrap script tries to connect to the domain and we see the following error in the logs:

ERROR: Asynchronous event processing failed for event batch
with sequencer counters 3657 to 3756.

Here are more surrounding warnings and error messages:

WARN: "SYNC_SERVICE_ALARM(5,21cf290d):
Request RequestId(2024-02-29T20:51:42.811465Z)
with failed activeness check is approved.",

ERROR: "A task failed with an exception. FinalizeRequest(
   timestamp = 2024-02-29T20:51:43.349214Z,
   sequencerCounter = 3658,
   requestTimestamp = 2024-02-29T20:51:42.811465Z,
   rc = 1157,
   commitTime = 2024-02-29T20:51:43.349214Z)",

ERROR: "A task failed with an exception. CheckActivenessAndLock(
    timestamp = 2024-02-29T20:51:43.651595Z,
    sequencerCounter = 3659,
    rc = 1158)",

ERROR: "Transaction: Failed to process result",

ERROR: "Asynchronous event processing failed for event batch
with sequencer counters 3657 to 3756.",

WARN: "Closing resilient sequencer subscription due to error:
  HandlerError(ApplicationHandlerException(
    first sequencer counter = 3657,
    last sequencer counter = 3756,
    java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z)
    with failed activeness check is approved.
    at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313)))",

ERROR: "SYNC_SERVICE_DOMAIN_DISCONNECTED(4,14dc0f16):
   Domain 'company-domain' fatally disconnected
   because of handler returned error:
   ApplicationHandlerException(
      first sequencer counter = 3657,
      last sequencer counter = 3756,
      java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z)
      with failed activeness check is approved. 
      at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$an...",

WARN: "Timeout 10 seconds expired, but tasks still running.
Shutting down forcibly",

WARN: "Participant 'qa-machine' registered: true",

WARN: "SYNC_SERVICE_ALARM(5,21cf290d):
Request RequestId(2024-02-29T20:51:42.811465Z)
with failed activeness check is approved.",

ERROR: "A task failed with an exception. CheckActivenessAndLock(
    timestamp = 2024-02-29T20:51:43.651595Z,
    sequencerCounter = 3659,
    rc = 1158)",

ERROR: "A task failed with an exception.  FinalizeRequest(
    timestamp = 2024-02-29T20:51:43.349214Z,
    sequencerCounter = 3658,
    requestTimestamp = 2024-02-29T20:51:42.811465Z,
    rc = 1157,
    commitTime = 2024-02-29T20:51:43.349214Z)",

ERROR: "Transaction: Failed to process result",

ERROR: "Asynchronous event processing failed for event batch
with sequencer counters 3657 to 3756.",

ERROR: "Sequencer subscription failed",

ERROR: "SYNC_SERVICE_INTERNAL_ERROR(4,1cfe4a4e):
The domain failed to startup due to an internal error",

WARN: "Timeout 10 seconds expired, but tasks still running.
Shutting down forcibly",

ERROR: "Request c.d.c.p.a.v.DomainConnectivityService/ConnectDomain by /127.0.0.1:39040:
failed with INTERNAL/An error occurred.
Please contact the operator and inquire about
   the request 1cfe4a4eb3f76e163502e84241db1f63
   with tid 1cfe4a4eb3f76e163502e84241db1f63",

ERROR: "Request failed for qa-machine. 
    GrpcServerError: INTERNAL/An error occurred.
    Please contact the operator and inquire about
       the request 1cfe4a4eb3f76e163502e84241db1f63
       with tid 1cfe4a4eb3f76e163502e84241db1f63
    Request: ConnectDomain(Domain 'company-domain',false)
       Command ParticipantAdministration$domains$.connect
       invoked from bootstrap.canton:22",

ERROR: "Bootstrap script terminated with an error: 
com.digitalasset.canton.console.CommandFailure:
Command execution failed.",
Even more details:
[
  {
    "message": "SYNC_SERVICE_ALARM(5,21cf290d): Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.",
    "logger": "com.digitalasset.canton.participant.protocol.TransactionProcessingSteps:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-123",
    "level": "WARN"
  },
  {
    "message": "A task failed with an exception. FinalizeRequest(timestamp = 2024-02-29T20:51:43.349214Z, sequencerCounter = 3658, requestTimestamp = 2024-02-29T20:51:42.811465Z, rc = 1157, commitTime = 2024-02-29T20:51:43.349214Z)",
    "logger": "com.digitalasset.canton.data.TaskScheduler:participant=qa-machine/domain-alias=company-domain/NaiveRequestTracker",
    "thread": "canton-env-execution-context-133",
    "level": "ERROR",
    "stackTrace": "java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313) "
  },
  {
    "message": "A task failed with an exception. CheckActivenessAndLock(timestamp = 2024-02-29T20:51:43.651595Z, sequencerCounter = 3659, rc = 1158)",
    "logger": "com.digitalasset.canton.data.TaskScheduler:participant=qa-machine/domain-alias=company-domain/NaiveRequestTracker",
    "thread": "canton-env-execution-context-134",
    "level": "ERROR",
    "stackTrace": "java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313)"
  },
  {
    "message": "Transaction: Failed to process result",
    "logger": "com.digitalasset.canton.participant.protocol.TransactionProcessor:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-60",
    "level": "ERROR",
    "stackTrace": "java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313) "
  },
  {
    "message": "Asynchronous event processing failed for event batch with sequencer counters 3657 to 3756.",
    "logger": "com.digitalasset.canton.sequencing.client.SequencerClientImpl:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-60",
    "level": "ERROR",
    "stackTrace": "java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313) "
  },
  {
    "message": "Closing resilient sequencer subscription due to error: HandlerError(ApplicationHandlerException( first sequencer counter = 3657, last sequencer counter = 3756, java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313)))",
    "logger": "com.digitalasset.canton.sequencing.client.ResilientSequencerSubscription:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-134",
    "level": "WARN"
  },
  {
    "message": "SYNC_SERVICE_DOMAIN_DISCONNECTED(4,14dc0f16): Domain 'company-domain' fatally disconnected because of handler returned error: ApplicationHandlerException( first sequencer counter = 3657, last sequencer counter = 3756, java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$an...",
    "logger": "com.digitalasset.canton.participant.sync.CantonSyncService:participant=qa-machine",
    "thread": "canton-env-execution-context-60",
    "level": "ERROR"
  },
  {
    "message": "Timeout 10 seconds expired, but tasks still running. Shutting down forcibly",
    "logger": "com.digitalasset.canton.participant.protocol.TransactionProcessor:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-60",
    "level": "WARN"
  },
  {
    "message": "Participant 'qa-machine' registered: true",
    "logger": "console",
    "thread": "main",
    "level": "WARN"
  },
  {
    "message": "SYNC_SERVICE_ALARM(5,21cf290d): Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.",
    "logger": "com.digitalasset.canton.participant.protocol.TransactionProcessingSteps:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-60",
    "level": "WARN"
  },
  {
    "message": "A task failed with an exception.  CheckActivenessAndLock(timestamp = 2024-02-29T20:51:43.651595Z, sequencerCounter = 3659, rc = 1158)",
    "logger": "com.digitalasset.canton.data.TaskScheduler:participant=qa-machine/domain-alias=company-domain/NaiveRequestTracker",
    "thread": "canton-env-execution-context-60",
    "level": "ERROR",
    "stackTrace": "java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313) "
  },
  {
    "message": "A task failed with an exception.  FinalizeRequest(timestamp = 2024-02-29T20:51:43.349214Z, sequencerCounter = 3658, requestTimestamp = 2024-02-29T20:51:42.811465Z, rc = 1157, commitTime = 2024-02-29T20:51:43.349214Z)",
    "logger": "com.digitalasset.canton.data.TaskScheduler:participant=qa-machine/domain-alias=company-domain/NaiveRequestTracker",
    "thread": "canton-env-execution-context-60",
    "level": "ERROR",
    "stackTrace": "java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313) "
  },
  {
    "message": "Transaction: Failed to process result",
    "logger": "com.digitalasset.canton.participant.protocol.TransactionProcessor:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-135",
    "level": "ERROR",
    "stackTrace": "java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313) "
  },
  {
    "message": "Asynchronous event processing failed for event batch with sequencer counters 3657 to 3756.",
    "logger": "com.digitalasset.canton.sequencing.client.SequencerClientImpl:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-123",
    "level": "ERROR",
    "stackTrace": "java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313) "
  },
  {
    "message": "Sequencer subscription failed",
    "logger": "com.digitalasset.canton.sequencing.client.SequencerClientImpl:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-123",
    "level": "ERROR",
    "stackTrace": "com.digitalasset.canton.sequencing.client.SequencerClientSubscriptionException: Handling of sequencer event failed with error: ApplicationHandlerException( first sequencer counter = 3657, last sequencer counter = 3756, java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313) "
  },
  {
    "message": "SYNC_SERVICE_INTERNAL_ERROR(4,1cfe4a4e): The domain failed to startup due to an internal error",
    "logger": "com.digitalasset.canton.participant.sync.CantonSyncService:participant=qa-machine",
    "thread": "canton-env-execution-context-123",
    "level": "ERROR",
    "stackTrace": "com.digitalasset.canton.sequencing.client.SequencerClientSubscriptionException: Handling of sequencer event failed with error: ApplicationHandlerException( first sequencer counter = 3657, last sequencer counter = 3756, java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.  at com.digitalasset.canton.participant.protocol.validation.TransactionValidationResult.commitSet(TransactionValidationResult.scala:75) at com.digitalasset.canton.participant.protocol.TransactionProcessingSteps.$anonfun$getCommitSetAndContractsToBeStoredAndEventApproveConform$1(TransactionProcessingSteps.scala:1313) "
  },
  {
    "message": "Timeout 10 seconds expired, but tasks still running. Shutting down forcibly",
    "logger": "com.digitalasset.canton.participant.protocol.TransactionProcessor:participant=qa-machine/domainId=domain::1220ccc407dc",
    "thread": "canton-env-execution-context-123",
    "level": "WARN"
  },
  {
    "message": "Request c.d.c.p.a.v.DomainConnectivityService/ConnectDomain by /127.0.0.1:39040: failed with INTERNAL/An error occurred. Please contact the operator and inquire about the request 1cfe4a4eb3f76e163502e84241db1f63 with tid 1cfe4a4eb3f76e163502e84241db1f63",
    "logger": "com.digitalasset.canton.networking.grpc.ApiRequestLogger:participant=qa-machine",
    "thread": "canton-env-execution-context-133",
    "level": "ERROR"
  },
  {
    "message": "Request failed for qa-machine.  GrpcServerError: INTERNAL/An error occurred. Please contact the operator and inquire about the request 1cfe4a4eb3f76e163502e84241db1f63 with tid 1cfe4a4eb3f76e163502e84241db1f63 Request: ConnectDomain(Domain 'company-domain',false) Command ParticipantAdministration$domains$.connect invoked from bootstrap.canton:22",
    "logger": "com.digitalasset.canton.console.EnterpriseConsoleEnvironment",
    "thread": "main",
    "level": "ERROR"
  },
  {
    "message": "Bootstrap script terminated with an error: com.digitalasset.canton.console.CommandFailure: Command execution failed.",
    "logger": "com.digitalasset.canton.ServerRunner",
    "thread": "main",
    "level": "ERROR"
  }
]
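
Since the structured entries above repeat the same exception many times, a small script can help confirm that every failure points back to a single request. The following is only a sketch in Python, not a Canton tool: the `summarize` helper and the `SAMPLE` variable are names made up for illustration, and `SAMPLE` holds just two entries copied from the log above (with the stack trace cut down to its first line). In practice you would load the full JSON array from wherever your logs are stored.

```python
import json
import re
from collections import Counter

# Two entries copied from the log above; the stack trace is shortened
# to its first line to keep the example small.
SAMPLE = """[
  {"message": "SYNC_SERVICE_ALARM(5,21cf290d): Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved.",
   "level": "WARN"},
  {"message": "A task failed with an exception. FinalizeRequest(timestamp = 2024-02-29T20:51:43.349214Z, sequencerCounter = 3658, requestTimestamp = 2024-02-29T20:51:42.811465Z, rc = 1157, commitTime = 2024-02-29T20:51:43.349214Z)",
   "level": "ERROR",
   "stackTrace": "java.lang.RuntimeException: Request RequestId(2024-02-29T20:51:42.811465Z) with failed activeness check is approved."}
]"""

# Matches the timestamp inside "RequestId(...)".
REQUEST_ID = re.compile(r"RequestId\(([^)]+)\)")


def summarize(entries):
    """Count entries per log level and per request ID they mention."""
    levels = Counter(entry["level"] for entry in entries)
    request_ids = Counter()
    for entry in entries:
        # A request ID can appear in the message, the stack trace, or both;
        # count it once per entry.
        text = entry.get("message", "") + " " + entry.get("stackTrace", "")
        request_ids.update(set(REQUEST_ID.findall(text)))
    return levels, request_ids


levels, request_ids = summarize(json.loads(SAMPLE))
print(dict(levels))
print(request_ids.most_common(1))
```

If only one request ID shows up across all WARN and ERROR entries, as it does in the excerpt here, the participant is most likely stuck replaying the same event batch on every reconnect attempt rather than hitting several unrelated problems.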

The solution was to roll back the participant’s underlying Postgres database to a previous state. Some maintenance work on that database had placed the participant in an invalid state.