Canton DB migration timeout DEADLINE_EXCEEDED

Matheus · October 21, 2021, 3:17pm

Hi,

When Canton starts for the first time on an empty DB, it usually runs the migrations for the public and ledger_api schemas.

Occasionally this takes a bit longer, and just before the ledger_api schema migrations finish it fails with an error complaining that the deadline of 53s was exceeded:

INFO  o.f.c.i.c.DbMigrate - Migrating schema "ledger_api" to version "42 - Convert hash indices"
INFO  o.f.c.i.c.DbMigrate - Migrating schema "ledger_api" to version "43 - explicit compression"
INFO  o.f.c.i.c.DbMigrate - Migrating schema "ledger_api" to version "44 - offset as text"
INFO  c.d.c.n.g.ApiRequestLogger:participant=participantIssuer tid:f26c9bb5b43388b9d57168c68fed13aa - Request c.d.c.i.a.v.InitializationService/InitId by /10.131.2.220:46390: cancelled
ERROR c.d.c.c.EnterpriseConsoleEnvironment - Request failed for participantIssuer.
  GrpcClientGaveUp: DEADLINE_EXCEEDED/deadline exceeded after 53.999618057s. [remote_addr=/0.0.0.0:5012]
  Request: InitId(di-execution,1220a6851a09e8fea4016f03caf33dc3aff2f3057614ae7c91e683852d18632e1649)
ERROR c.d.c.ServerRunner - Command execution failed.

ERROR c.d.c.ServerRunner - Unexpected error while running server:
INFO  c.d.c.ServerRunner - Exception causing error is:
com.digitalasset.canton.console.CommandFailure:

Can we somehow extend this timeout? It looks like a gRPC thing, but the documentation I found online was limited.

Thanks in advance.

Matheus

MatthiasSchmalz · October 21, 2021, 4:49pm

You can change timeouts either statically or at runtime.

Statically: Change the config parameter parameters.timeouts.console = 10m.

At runtime: Call console.set_command_timeout(10.minutes) to change the timeout.

If that does not work for you, please get back to me with the exact commands you are running.

Matheus · October 25, 2021, 8:37am

Hi @MatthiasSchmalz ,

Thanks for the reply. I tried to change it statically, but I got an error complaining that parameters.timeouts.console expects an object but found a string (10m).

Then I went with the runtime approach, and added console.set_command_timeout(10.minutes) in the bootstrap script.

It seems to accept it, but since we’re talking about DB migrations, I wonder whether that call is invoked before the migration, or whether the migration runs before the participant is first started?

MatthiasSchmalz · October 25, 2021, 11:31am

Hi @Matheus

Right, the config parameter should actually be:
parameters.timeouts.console.bounded = 10m

Sorry about the mixup!
The documentation of our config parameters can be found here:
https://www.canton.io/docs/dev/scaladoc/com/digitalasset/canton/config/CantonCommunityConfig.html

The DB migration is applied for each node individually and it is applied when the node is started. Note that nodes may be started automatically before the bootstrap script is applied. To avoid that, you need to:

1. Start Canton with --manual-start=true.
2. Start nodes in the bootstrap script. (E.g., nodes.local.start().)
3. Reconnect participants to domains in the bootstrap script. (Something like myParticipant.domains.reconnect().)
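
Put together, the steps above might look roughly like the following bootstrap script. This is only a sketch built from the calls mentioned in this thread: it assumes Canton was launched with --manual-start=true, `myParticipant` is a placeholder for whatever your participant is called in the config, and the exact name and arguments of the reconnect call may differ between Canton versions.

```scala
// Bootstrap script sketch (Canton console, Scala).
// Assumes Canton was started with --manual-start=true, so no node has
// been started (and no DB migration has run) before this script executes.

// Raise the console command timeout first, so that commands issued
// while a node is still migrating are given more time.
console.set_command_timeout(10.minutes)

// Start the local nodes; each node applies its own DB migration on start.
nodes.local.start()

// Reconnect the participant to its domains.
// `myParticipant` is a placeholder name; the call is quoted from the
// reply above ("something like"), so check it against your version.
myParticipant.domains.reconnect()
```

Setting the timeout as the first statement matters here: with automatic start disabled, the migration only begins once the script starts the nodes, so the longer timeout is already in effect by then.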

jaypeeda · April 19, 2022, 12:56pm

Hi,

I encountered the same issue.
Is the solution to shut down the pNode and execute a recovery on that pNode to its state?

Cheers,
Jean-Paul

Rafael_Guglielmetti · April 19, 2022, 1:07pm

Hi @jaypeeda ,
I am not sure I understand why you want to execute a recovery and how it is related to the original question (since the original question mentions an issue at startup).

In which context are you seeing the timeout?
Have you tried Matthias's solution above (extending the timeouts and disabling auto-starting of the nodes)?

Best,

Rafael

jaypeeda · April 19, 2022, 1:33pm

Hi Rafael,

I created a new topic so it’s more relevant:

Cheers,

Jean-Paul
