We’re setting up monitoring for a Canton domain, and part of the job is to create dashboards based on the metrics scraped from Canton components. Looking at all the metrics listed in the docs, I’m wondering whether there is a recommended set of the most valuable metrics to include?
Thanks for any suggestions!
Yes, we know. We are working on an improvement that will label the “key metrics” you should observe, so that you can distinguish them from metrics that are more useful for debugging detailed behaviour.
Which metrics you want to monitor depends on your taste. The ones I find important are:
- canton..sequencer-client.delay and load
But if you are a bit patient, then @simon might point you to a new piece of documentation that gives concrete recommendations.
Hi Piotr. Metrics are important for monitoring, and there are two levels.
The first level is infrastructure-related metrics such as CPU utilization, virtual memory paging, disk IO rates and latencies, JVM garbage collection, etc. These first-level metrics are key to monitoring capacity and health.
The second level of metrics includes what @Ratko_Veprek mentions, but these are very fine-grained. We are working on metrics that support the SRE Golden Signals methodology (aka “REDS”). Development work is progressing quickly, and I was wondering whether you would be willing to evaluate the Beta version and provide feedback?
I gathered the infrastructure-level metrics from Canton’s jvm_metrics, plus node_exporter for host metrics. The more fine-grained signals such as requests, errors, and saturation are what I’m after. It seems you’re already working on that by following RED/Golden Signals, so I’d gladly evaluate your Beta version.
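For anyone following along, here is a rough Python sketch of what the RED part of those signals (rate, errors, duration) boils down to when computed over one time window. It does not use any Canton API or metric name: the `Request` record, its fields, and the `red_summary` function are all hypothetical, and the 95th percentile uses the simple nearest-rank method rather than whatever a real metrics backend would do.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request (not a Canton type)."""
    duration: float  # time taken to handle the request, in seconds
    failed: bool     # True if the request ended in an error


def red_summary(requests, window_seconds):
    """Summarise one observation window as rate, error ratio and p95 duration."""
    total = len(requests)
    if total == 0:
        # Nothing observed: rate is zero and there is no duration to report.
        return {"rate": 0.0, "error_ratio": 0.0, "p95_duration": None}

    errors = sum(1 for r in requests if r.failed)
    durations = sorted(r.duration for r in requests)

    # Nearest-rank percentile: the smallest value with at least 95% of
    # the samples at or below it.
    rank = math.ceil(0.95 * total)
    p95 = durations[rank - 1]

    return {
        "rate": total / window_seconds,   # Rate: requests per second
        "error_ratio": errors / total,    # Errors: fraction of failed requests
        "p95_duration": p95,              # Duration: 95th percentile latency
    }


if __name__ == "__main__":
    sample = [
        Request(duration=0.12, failed=False),
        Request(duration=0.30, failed=True),
        Request(duration=0.08, failed=False),
        Request(duration=0.45, failed=False),
    ]
    print(red_summary(sample, window_seconds=60))
```

In a dashboard you would normally let the monitoring backend compute these from counters and histograms; the sketch only shows which three numbers a RED panel is meant to surface. Saturation, the fourth signal mentioned above, depends on a resource-specific capacity figure and is left out here.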