Canton - Sequencer time [..] Exceeding max-sequencing-time / Response message for request X timed out

Hi team,

I was running performance tests on a Canton setup, with my participant nodes submitting thousands of transactions over multiple runs.

My participant nodes started to get stuck, and the transactions were not completely processed.

On the domain side I see these warnings:

WARN  c.d.c.d.s.s.WritePayloadsFlow$:sequencer=sequencer1 tid:392ca77f7f3b794316b4207f4b3e5d8a - The sequencer time [2022-02-23T15:45:33.834257Z] has exceeded the max-sequencing-time of the send request [2022-02-23T15:45:33.695600Z]: deliver[message-id:af486a0b-bf71-4e40-9eba-206b42b1694a]
WARN  c.d.c.d.s.s.WritePayloadsFlow$:sequencer=sequencer1 tid:3b704311cf86d004554deb2e1100cf98 - The sequencer time [2022-02-23T15:45:33.836715Z] has exceeded the max-sequencing-time of the send request [2022-02-23T15:45:33.714168Z]: deliver[message-id:eef7bc75-0b6e-43fd-92ef-f5d64649845a]

On the participant node side I see:

WARN  c.d.c.p.p.TransactionProcessor:participant=pNode/domain=domain tid:c4c1e7341370e106417340637fccbc36 - Response message for request [4002] timed out at 2022-02-23T15:45:34.130792Z
WARN  c.d.c.p.p.TransactionProcessor:participant=pNode/domain=domain tid:d7a0816af224ad7068e8ff410da77a58 - Response message for request [4001] timed out at 2022-02-23T15:45:34.130792Z

I’m not sure whether this is caused by the sequencer timing out or by something on the participant node side.

Should I increase the timeout on the sequencer side, on the participant node side, or on both? Do you have a code example showing how to increase the timeout on the sequencer and on the participant node? If the root cause is something else, what would be a good approach to solve this?

Thanks for your continuous support,

Jean-Paul


You can do the following:

  1. Increase timeouts at the domain. Here you can read how to increase the participantResponseTimeout: Operational Processes — Canton 1.0.0-SNAPSHOT documentation
    You probably also want to increase the mediatorReactionTimeout; that works similarly.
  2. Configure some resource limits at the participant: Canton Console — Canton 1.0.0-SNAPSHOT documentation

For our own performance tests, we have developed a performance runner (with two fixed workflows). The performance runner measures the time to process a command and increases or decreases the load accordingly. Depending on what you want to test, you may be able to reuse our performance runner.
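The feedback idea behind such a runner can be sketched roughly as follows. This is not the Canton performance runner itself: the `LoadController` name, its parameters, and the "halve on overload, add a fixed step otherwise" policy are all assumptions made for illustration.

```scala
// Hypothetical sketch of latency-driven load control; not Canton code.
final case class LoadController(
    targetLatencyMs: Long, // latency per command that we are willing to tolerate
    minRate: Int,          // lower bound on commands per second
    maxRate: Int,          // upper bound on commands per second
    step: Int              // additive increase applied while latency is acceptable
) {
  /** Returns the next submission rate given the current rate and the observed latency. */
  def nextRate(currentRate: Int, observedLatencyMs: Long): Int = {
    val proposed =
      if (observedLatencyMs > targetLatencyMs) currentRate / 2 // back off quickly
      else currentRate + step                                   // probe for more throughput
    // Keep the result inside the configured bounds.
    math.max(minRate, math.min(maxRate, proposed))
  }
}

object LoadControllerDemo {
  def main(args: Array[String]): Unit = {
    val controller = LoadController(targetLatencyMs = 1000, minRate = 1, maxRate = 500, step = 10)
    val afterFastCommand = controller.nextRate(currentRate = 100, observedLatencyMs = 400)
    val afterSlowCommand = controller.nextRate(afterFastCommand, observedLatencyMs = 2500)
    println(s"rate after a fast command: $afterFastCommand")
    println(s"rate after a slow command: $afterSlowCommand")
  }
}
```

In a real test the observed latency would come from measuring the time between submitting a command and seeing its completion, and the returned rate would drive the submitter.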

Please reach out if you have further questions.
Matthias


Thanks a lot for the various resources!

I just attempted to set the participant and mediator timeouts:

@ domain.service.update_dynamic_parameters(
    _.copy(
      participantResponseTimeout = TimeoutDuration.ofSeconds(60),
      mediatorReactionTimeout = TimeoutDuration.ofSeconds(60),
    )
  )
cmd0.sc:1: value update_dynamic_parameters is not a member of object ammonite.predef.ArgsPredef.domain.service
val res0 = domain.service.update_dynamic_parameters(

Can you please give an example of how you run it?

Cheers,
Jean-Paul

Dynamic configuration of domain parameters is very recent. If it does not yet exist, you need to set this through the static config of the domain:

canton {
  domain-managers {
    mydomain {
      domain-parameters {
        participant-response-timeout = 60s
        mediator-reaction-timeout = 60s
      }
    }
  }
}

This is perfect, thanks!

As I continue testing, I see that my domain keeps showing this sequencer warning and is unable to accept additional requests.

Is there a way to stop or “flush” the sequencer queue?
What should we do when this happens?

Thanks and regards,
Jean-Paul

I suppose that by sequencer warning, you mean:
The sequencer time ... has exceeded the max-sequencing-time of the send request ..., right?

You should try to avoid that situation, because the participant is still consuming resources to prepare the request, but the request will never result in a successful command execution. So the system is wasting resources, resulting in an unnecessarily low throughput.
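To get a feel for how far past the deadline these requests were, the two timestamps in each warning can simply be subtracted. The sketch below does that for the two sequencer warnings quoted earlier in this thread; the `SequencingLateness` object and its `lateness` helper are hypothetical names, and only `java.time` from the standard library is used.

```scala
import java.time.{Duration, Instant}

object SequencingLateness {
  /** How far the sequencer time was past the request's max-sequencing-time. */
  def lateness(sequencerTime: String, maxSequencingTime: String): Duration =
    Duration.between(Instant.parse(maxSequencingTime), Instant.parse(sequencerTime))

  def main(args: Array[String]): Unit = {
    // Timestamps copied from the two sequencer warnings in this thread.
    val first  = lateness("2022-02-23T15:45:33.834257Z", "2022-02-23T15:45:33.695600Z")
    val second = lateness("2022-02-23T15:45:33.836715Z", "2022-02-23T15:45:33.714168Z")
    println(s"first request late by ${first.toNanos / 1000} microseconds")
    println(s"second request late by ${second.toNanos / 1000} microseconds")
  }
}
```

For these two warnings the difference is roughly 139 ms and 123 ms, so both requests reached the sequencer a little over a tenth of a second after their deadline.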

If your application submits commands at a lower rate, the problem should go away.
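One way to lower the rate on the application side is to pace submissions before they reach the participant. The following is a minimal sketch under stated assumptions: `SubmissionThrottle` is a hypothetical helper, not part of Canton or the Ledger API, and the command passed to `submit` stands in for whatever call your application uses to send a command.

```scala
/** Minimal pacing throttle: allows at most `ratePerSecond` submissions per second.
  * The clock is injectable so the pacing logic can be checked without sleeping.
  */
final class SubmissionThrottle(ratePerSecond: Int, nowNanos: () => Long = () => System.nanoTime()) {
  require(ratePerSecond > 0, "ratePerSecond must be positive")

  private val intervalNanos: Long = 1000000000L / ratePerSecond
  private var nextSlot: Long = nowNanos()

  /** Reserves the next free slot and returns how many nanoseconds the caller should wait. */
  def reserve(): Long = synchronized {
    val now = nowNanos()
    val slot = math.max(now, nextSlot) // never schedule in the past
    nextSlot = slot + intervalNanos
    slot - now
  }

  /** Blocks until the reserved slot, then runs the submission. */
  def submit[A](command: => A): A = {
    val waitNanos = reserve()
    if (waitNanos > 0) Thread.sleep(waitNanos / 1000000L, (waitNanos % 1000000L).toInt)
    command
  }
}
```

With something like `new SubmissionThrottle(50)`, wrapping each submission in `throttle.submit { ... }` spaces the calls about 20 ms apart; the right rate depends on what your domain and participants can sustain and has to be found by measurement.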
