Hi Team,
I am trying to configure the HTTP health check for an HA domain manager setup (new 2.5.0 version). I configured it according to the release notes:
monitoring {
  health {
    server {
      address = 0.0.0.0
      port = 8000
    }
    check.type = is-active
  }
}
But after starting the domain manager I got the following error:
IsActive health check must be configured with the participant name to check as there are many participants configured for this environment
Could you please advise what might be wrong? It would be great if you could provide an example health check config for domain manager 2.5.0.
Thanks.
You probably have a config like this:
canton {
  monitoring {
    health {
      server {
        address = 0.0.0.0
        port = 8000
      }
      check.type = is-active
    }
  }
  participants {
    participant1 {
      ...
    }
    participant2 {
      ...
    }
  }
}
It’s complaining because having two nodes configured means you need two health endpoints configured. So try this
canton {
  participants {
    participant1 {
      ...
      monitoring {
        health {
          server {
            address = 0.0.0.0
            port = 8000
          }
          check.type = is-active
        }
      }
    }
    participant2 {
      ...
      monitoring {
        health {
          server {
            address = 0.0.0.0
            port = 8001
          }
          check.type = is-active
        }
      }
    }
  }
}
@bernhard thanks for the reply, but you provided configs for participants. I asked about the domain manager in version 2.5.0+, where the health check mechanism was implemented for the enterprise version (according to the release notes).
Thanks.
Oh, sorry, but it’s the same deal. Rather than having the monitoring block directly under canton, move it under the domains.mydomain block. If you could share more of your domain manager config, I could be more specific.
@bernhard
we have, for example, three nodes for the domain manager (1 active, 2 passive), and each of them has the following config:
canton {
  domain-managers {
    domainmanager {
      storage = ${_shared.storage}
      storage.config.properties.databaseName = "db"
      init.domain-parameters.unique-contract-keys = false
      replication.enabled = true
      admin-api {
        port = 4001
        address = 0.0.0.0
      }
    }
  }
  monitoring {
    health {
      server {
        address = 0.0.0.0
        port = 8001
      }
      check.type = is-active
    }
  }
}
And you said that we need to move the monitoring section into the domainmanager section, if we are talking about this example?
Yes. It should be
canton {
  domain-managers {
    domainmanager {
      storage = ${_shared.storage}
      storage.config.properties.databaseName = "db"
      init.domain-parameters.unique-contract-keys = false
      replication.enabled = true
      admin-api {
        port = 4001
        address = 0.0.0.0
      }
      monitoring {
        health {
          server {
            address = 0.0.0.0
            port = 8001
          }
          check.type = is-active
        }
      }
    }
  }
}
@bernhard
it does not work.
I am getting an error:
Unknown key monitoring
2023-01-27 14:42:58,580 [main] INFO c.d.canton.CantonEnterpriseApp$ - Starting Canton version 2.5.1
2023-01-27 14:42:59,976 [main] INFO c.d.canton.CantonEnterpriseApp$ - Config field at storage.max-connections is deprecated. Please use storage.parameters.max-connections instead.
2023-01-27 14:43:00,148 [main] ERROR c.d.canton.CantonEnterpriseApp$ - GENERIC_CONFIG_ERROR(8,0): Cannot convert configuration to a config of class com.digitalasset.canton.config.CantonEnterpriseConfig. Failures are:
at 'canton.domain-managers.domainmanager.monitoring':
- (/canton/data/domainmanagers/domainmanager.conf: 13) Unknown key.
err-context:{location=CantonConfig.scala:1467}
2023-01-27 14:43:00,168 [main] ERROR c.d.canton.CantonEnterpriseApp$ - An error occurred after parsing a config file that was obtained by merging multiple config files. The resulting merged-together config file, for which the error occurred, was written to '/tmp/canton-config-error-8203431665000491271.conf'.
Hi @Maksym_Zhovanyk, I was actually way off about how this works. Monitoring in general is done per process, not per node, which is why it sits directly below the canton namespace. But the is-active setting is per node, since it reflects whether the node is active or not in an HA setup.
A setup where multiple nodes of a single HA deployment run in the same process is currently not well supported. You can still set the is-active check, but you have to specify which of the nodes in the process to report on, e.g.:
canton {
  monitoring {
    health {
      server {
        address = 0.0.0.0
        port = 8001
      }
      check {
        type = is-active
        node = mydomain
      }
    }
  }
  participants {
    participant1 {
      storage.type = memory
      admin-api.port = 5012
      ledger-api.port = 5011
      monitoring {
        health {
          server {
            address = 0.0.0.0
            port = 8002
          }
          check {
            type = is-active
          }
        }
      }
    }
    participant2 {
      storage.type = memory
      admin-api.port = 5022
      ledger-api.port = 5021
      monitoring {
        health {
          server {
            address = 0.0.0.0
            port = 8003
          }
          check {
            type = is-active
          }
        }
      }
    }
  }
  mediators {
    mymediator {
      admin-api {
        address = localhost
        port = 3031
      }
    }
  }
  sequencers {
    mysequencer {
      public-api {
        address = localhost
        port = 3010
      }
      admin-api {
        address = localhost
        port = 3011
      }
    }
  }
  domain-managers {
    mydomain {
      admin-api {
        address = localhost
        port = 3001
      }
    }
  }
}
Note, though, that the activeness of the participants, mediator, domain manager, and sequencer is independent for each node; they can fail over as individual nodes, so the activeness endpoint won't be that useful.
We are working on improving the health checks in general, but in the meantime you should run one node per process for HA setups.
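For reference, here is a minimal sketch of what a one-node-per-process setup could look like for the domain manager discussed above, with the monitoring block at the canton level and the check pointing at that single node. It reuses the node name, ports, and storage reference from the configs earlier in this thread, so treat it as illustrative rather than a verified 2.5.x configuration:
canton {
  monitoring {
    health {
      server {
        address = 0.0.0.0
        port = 8001
      }
      check {
        type = is-active
        # with only one node in this process the choice is unambiguous,
        # but naming it explicitly avoids the ambiguity error from the first post
        node = domainmanager
      }
    }
  }
  domain-managers {
    domainmanager {
      storage = ${_shared.storage}
      storage.config.properties.databaseName = "db"
      init.domain-parameters.unique-contract-keys = false
      replication.enabled = true
      admin-api {
        address = 0.0.0.0
        port = 4001
      }
    }
  }
}
Each passive replica would then run the same config in its own process, with one health endpoint exposed per process.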