When is RESOURCE_EXHAUSTED returned by the command submission/completion services?

When you submit-and-wait for a command, the ledger API explicitly declares that RESOURCE_EXHAUSTED may be returned in certain circumstances.

service CommandService {

  // Submits a single composite command and waits for its result.
  // Returns ``RESOURCE_EXHAUSTED`` if the number of in-flight commands reached the maximum (if a limit is configured).
  // Propagates the gRPC error of failed submissions including DAML interpretation errors.
  rpc SubmitAndWait (SubmitAndWaitRequest) returns (google.protobuf.Empty);

No similar circumstances are documented for the pair of command submission and completion services.

// 1) ``INVALID_PARAMETER`` gRPC error on malformed payloads and missing required fields.
// 2) Failure communicated in the gRPC error.
// 3) Failure communicated in a Completion.
// 4) A Checkpoint with ``record_time`` > command ``mrt`` arrives through the Completion Stream, and the command's Completion was not visible before. In this case the command is lost.
  // Identifies the exact type of the error.
  // For example, malformed or double spend transactions will result in an ``INVALID_ARGUMENT`` status.
  // Transactions with invalid time windows (which may be valid at a later date) will result in an ``ABORTED`` error.
  // Optional
  google.rpc.Status status = 2;

However, @cocreature pointed out that there seems to be a chain of code in Sandbox that would yield RESOURCE_EXHAUSTED from the command submission service.

  private def handleSubmissionResult(result: Try[SubmissionResult])(
      implicit loggingContext: LoggingContext,
  ): Try[Unit] = result match {
    case Success(Acknowledged) =>
      logger.debug("Success")
      Success(())

    case Success(Overloaded) =>
      logger.info("Back-pressure")
      Failure(Status.RESOURCE_EXHAUSTED.asRuntimeException)

Yet that is only used from deduplicateAndRecordOnLedger, which would seem to imply much more evaluation than the command submission service should, at minimum, carry out. @cocreature reasonably suggested that I’m reading too much into it, but I think it is at least equally likely that Overloaded is an impossible state in the above function, handled to satisfy the exhaustiveness checker.

In a sense, that is all beside the point, because my goal is to be a well-behaved ledger API client, not one entangled with the foibles of this particular server. That yields two questions:

(1) When is it permissible for these services to return RESOURCE_EXHAUSTED? May a Ledger API server yield this from the submission call, as an element in the completion stream, or from either at its discretion?

(2) What are these services required to return with respect to RESOURCE_EXHAUSTED? For example, if a service accepts a command submission, is it then constrained in this respect with regard to related responses from the command completion service?

For context, the above arose during #7820.

This is how ledger API services work with respect to RESOURCE_EXHAUSTED errors, to the best of my knowledge:

Command submission service

If the submission service returns RESOURCE_EXHAUSTED, it means the command submission was not accepted by the backing ledger due to back-pressure. There will therefore be no element in the completion stream. The client should back off exponentially and retry.
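As a rough illustration of "back off exponentially and retry", here is a minimal client-side sketch. It is not taken from any Daml client library: `SubmitOutcome`, `backoffDelay`, and `submitWithRetry` are hypothetical names, and a real client would inspect the gRPC status code of the failed call rather than a hand-rolled outcome type.

```scala
import scala.annotation.tailrec
import scala.concurrent.duration._

object BackoffSketch {
  // Hypothetical stand-in for the result of a submission call.
  sealed trait SubmitOutcome
  case object Accepted extends SubmitOutcome
  case object ResourceExhausted extends SubmitOutcome
  final case class OtherError(description: String) extends SubmitOutcome

  // Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
  // The arithmetic is done on doubles so that large attempts saturate instead of overflowing.
  def backoffDelay(attempt: Int, base: FiniteDuration, max: FiniteDuration): FiniteDuration = {
    val scaled = base.toNanos.toDouble * math.pow(2.0, attempt.toDouble)
    if (scaled >= max.toNanos.toDouble) max else scaled.toLong.nanos
  }

  // Calls `submit` up to `maxAttempts` times, retrying only on ResourceExhausted.
  // Any other outcome (success or a different error) is returned immediately.
  @tailrec
  def submitWithRetry(
      submit: () => SubmitOutcome,
      maxAttempts: Int,
      base: FiniteDuration,
      max: FiniteDuration,
      sleep: FiniteDuration => Unit = d => Thread.sleep(d.toMillis),
      attempt: Int = 0,
  ): SubmitOutcome =
    submit() match {
      case ResourceExhausted if attempt + 1 < maxAttempts =>
        sleep(backoffDelay(attempt, base, max))
        submitWithRetry(submit, maxAttempts, base, max, sleep, attempt + 1)
      case outcome => outcome
    }
}
```

Retrying is safe here only because RESOURCE_EXHAUSTED from the submission call means the command was not accepted, so no completion will ever arrive for the rejected attempt. In practice you would probably also add random jitter to the delay.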

For reference, the submission service does:

  1. Validate the gRPC request
  2. Deduplicate the command (i.e., reject it if the “same” command was already submitted through this participant)
  3. Execute the DAML command (including choosing a transaction time)
  4. Send the resulting transaction to the backing ledger

The ledger API server does not apply back-pressure of its own when too many commands pile up in the first three steps (although it should). However, the backing ledger can return Overloaded in the last step, in which case the submission service returns RESOURCE_EXHAUSTED.
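To make the four steps and the Overloaded mapping concrete, here is a heavily simplified, synchronous model. It is not the actual ledger API server code: `Command`, `Outcome`, `RejectedAt`, and this `SubmissionService` class are hypothetical, and validation and interpretation are reduced to trivial checks. The sketch also assumes that an Overloaded command is forgotten by the deduplication step so that a retry is not rejected as a duplicate.

```scala
import scala.collection.mutable

object SubmissionPipelineSketch {
  // Simplified stand-ins for what the backing ledger can report in step 4.
  sealed trait SubmissionResult
  case object Acknowledged extends SubmissionResult
  case object Overloaded extends SubmissionResult

  // Simplified stand-ins for what the client sees from the submission call.
  sealed trait Outcome
  case object AcceptedByLedger extends Outcome
  case object ResourceExhausted extends Outcome
  final case class RejectedAt(step: String) extends Outcome // some other gRPC error

  final case class Command(commandId: String, payload: String)

  final class SubmissionService(writeToLedger: Command => SubmissionResult) {
    private val seen = mutable.Set.empty[String]

    def submit(cmd: Command): Outcome =
      if (cmd.commandId.isEmpty) RejectedAt("validation") // step 1
      else if (!seen.add(cmd.commandId)) RejectedAt("deduplication") // step 2
      else if (cmd.payload.isEmpty) RejectedAt("interpretation") // step 3 (toy check)
      else
        writeToLedger(cmd) match { // step 4
          case Acknowledged => AcceptedByLedger
          case Overloaded =>
            // Assumption of this sketch: forget the command so a retry can pass step 2.
            seen -= cmd.commandId
            ResourceExhausted
        }
  }
}
```

The only path to `ResourceExhausted` in this model is the backing ledger reporting `Overloaded` in step 4; the first three steps never produce it.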

See WriteService.submitTransaction() and SubmissionResult.Overloaded

Command completion service

The completion stream does not contain RESOURCE_EXHAUSTED elements.

See CompletionFromTransaction

Command service

The command service returns RESOURCE_EXHAUSTED when:

  • The command service itself is overloaded (too many concurrent commands)
  • The submission service returned RESOURCE_EXHAUSTED (the command service uses the command submission service and forwards any error returned by the submission service)

In both cases, the client should back off exponentially and retry.
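The two cases can be pictured with a small synchronous sketch. This is only a model under simplifying assumptions, not the real command service (which is asynchronous): `LimitedCommandService`, `maxInFlight`, and the `submit` function are hypothetical, and the in-flight limit is modelled with a plain semaphore.

```scala
import java.util.concurrent.Semaphore

object InFlightLimitSketch {
  // Simplified stand-ins for the result seen by the caller of submit-and-wait.
  sealed trait Result
  case object Completed extends Result
  case object ResourceExhausted extends Result

  // `submit` stands in for the underlying submission plus waiting for the completion.
  final class LimitedCommandService(maxInFlight: Int, submit: String => Result) {
    private val permits = new Semaphore(maxInFlight)

    def submitAndWait(commandId: String): Result =
      if (!permits.tryAcquire()) {
        // Case 1: the command service itself has too many commands in flight.
        ResourceExhausted
      } else {
        // Case 2: whatever the submission returns is forwarded unchanged,
        // including ResourceExhausted coming from the submission service.
        try submit(commandId)
        finally permits.release()
      }
  }
}
```

From the client's point of view both cases look the same, which is why the same back-off-and-retry handling applies to both.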

See HandleOfferResult

Note that the documentation of ledger API error codes is being improved right now: #7844. Feel free to comment on the PR if you have any suggestions.

Very useful, thanks @Robert_Autenrieth.

@Robert_Autenrieth Thanks for sharing this. Can you share a bit more information regarding this point?

* The command service itself is overloaded (too many concurrent commands)

In this case, what error will the user observe?