Hi Team,
Checking the Canton documentation here, I’ve been able to enable health checks for multiple Canton Participant nodes.
However, the documentation does not mention health checks for Canton Domains. Checking the Scaladocs for the health monitoring API (see here), it seems we only have configuration for checking participant nodes. I’ve worked around this by utilising netcat to check that the public port is available on the Canton Domain node.
However, it would be useful to expose a configurable HTTP endpoint as part of a Canton Domain setup.
Brian
Hi Brian,
The health check uses a “ping” Daml transaction (if configured to use a ping-based check, like in the docs). All Daml transactions run by Canton have to go through a Canton domain, so if you’re only using one domain then the ping is going to check the health of that domain.
“I’ve worked around this by utilising netcat to check that the public port is available on the Canton Domain node.”
I’d be interested to know why you need to check the public port is available. Is it for debugging, to check that a running system is still healthy, or perhaps for coordinating a deployment? There might be something else to help.
Hi Phoebe,
Thanks for your reply.
In my use case we are running Canton in a Kubernetes environment, with the domain and multiple participant nodes each running in its own container, and each container in its own dedicated pod. Kubernetes needs to know the health of a container in order to maintain the target running state for a given configured service. As the domain runs separately from the participants, I need a way to let Kubernetes know whether each running container is healthy. Kubernetes uses what’s called a livenessProbe configuration to check whether a container is healthy, and I’ve configured this to run netcat -z <domain_host> <domain_public_port>. I took the view that the public port is more critical to monitor for a running Canton network than the admin port, as participants connect to the public port.
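For illustration, the netcat probe described above amounts to a plain TCP connect against the domain's public port. A minimal Python sketch of the same check follows; the function name `port_is_open` and the default timeout are hypothetical choices for this example, not part of Canton or Kubernetes.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Roughly what `netcat -z <host> <port>` does: connect, send nothing, disconnect.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Like the netcat version, this only shows that something is listening on the port. It says nothing about whether the domain is processing transactions correctly.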
This domain configuration is running as expected so far. It’s just that the Canton solution provided for participants is very convenient and exposes a very useful HTTP endpoint:
monitoring.health {
  server {
    address = 0.0.0.0
    port = 10019
  }
  check {
    type = ping
    participant = hubparticipant1
    interval = 30s
  }
}
Hi Brian,
The health endpoint is associated with a Canton process rather than with any specific Canton node. The docs you linked show configuring a ping-based health check, which requires a participant node, but you can also configure an “always-healthy health check” that will return a 200 as long as the health check service is up (docs).
You should be able to add a health endpoint to your domain deployment, as long as you use the always-healthy check. This doesn’t give you much more information than checking whether the public API port is up, but it might make your deployment a bit simpler.
The config for the always-healthy check should look like this:
canton {
  monitoring.health {
    server {
      port = 7000
    }
    check {
      type = always-healthy
    }
  }
}
Is this the kind of thing that you’re looking for?
I don’t think there’s a more comprehensive way to check that domain components are behaving correctly in isolation at the moment – the standard check is performing a ping from a participant.
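As a rough sketch of how a probe could consume that endpoint, the Python function below issues a GET and treats only an HTTP 200 as healthy. The function name `health_endpoint_ok` is hypothetical, and the URL path is an assumption: `/health` is used in the usage note purely for illustration, so check the Canton docs for the actual path.

```python
import http.client
import urllib.request

# Connect directly, ignoring any HTTP proxy settings in the environment.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def health_endpoint_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True only if a GET on `url` answers with HTTP 200 within `timeout` seconds."""
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Non-2xx responses raise HTTPError (a subclass of OSError);
        # refused connections and timeouts are OSError as well.
        return False
```

With the config above, a call might look like `health_endpoint_ok("http://<domain_host>:7000/health")`, where the host placeholder and the path are both assumptions. In Kubernetes itself, an HTTP-based livenessProbe could do the same job without any script.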
Thanks a lot @Phoebe_Nichols - I think the always-healthy check will solve my problem.
I do think it would be rather useful to be able to check the health of individual components in isolation, as I can envisage production deployments separating domains and participants into their own containers to improve HA and allow for horizontal scaling (e.g. of the domain’s sequencer).