Recommendations for using a custom Canton logback.xml file during production

Hi all,

Is there a general recommendation on creating a custom logback.xml file for usage in a Canton application during production?

I’d like to implement a custom encoder pattern without missing out on any information which might be required for debugging in the event of an error.

Referencing the default logback.xml normally used by Canton, I notice that certain information such as

  • tid
  • context
  • err-context

is defined in the entity entityCorrelationIdTrailingSpaceReplace within the logback.xml file, as seen below:

<!ENTITY entityCorrelationIdTrailingSpaceReplace "&#x0025;replace(tid:&#x0025;mdc{trace-id} ){'tid: ', ''}- &#x0025;msg&#x0025;replace(, context: &#x0025;marker){', context: $', ''}&#x0025;replace( err-context:&#x0025;mdc{err-context} ){' err-context: ', ''}&#x0025;n"> ]>

Question:
Would utilizing the entityCorrelationIdTrailingSpaceReplace entity in a custom encoder pattern be sufficient to ensure complete capture of information relevant for Canton troubleshooting purposes in all Canton nodes (Participant, Sequencer, Mediator and Domain-manager)?
Are there any additional configurations that have to be included in the custom logback.xml file?
The main goal is to ensure that all important debugging information is captured within the produced operation logs.

Example of intended custom logback appender:

<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>log/${canton-node}-log.log</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
    <fileNamePattern>logback-${canton-node}.%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
    <maxFileSize>10MB</maxFileSize>
    <maxHistory>30</maxHistory>
    <totalSizeCap>10GB</totalSizeCap>
  </rollingPolicy>
  <encoder>
    <pattern> %d{"yyyy-MM-dd'T'HH:mm:ss,SSSXXX",UTC} [%logger{10}] &entityCorrelationIdTrailingSpaceReplace; [%thread] %level [%file:%line] -%kvp- %msg%n</pattern>
  </encoder>
</appender>

Thank you

Thank you for your questions and for providing more information offline, Julius.

Via testing we have confirmed that the entityCorrelationIdTrailingSpaceReplace specification does indeed preserve the trace-id that is essential for troubleshooting.

In summary, the following fields are important to include for troubleshooting:

  • datetime (for example “2024-01-30 18:09:11,016”). The format can be reconfigured, but the full date and time up to millisecond precision should be present.
  • thread (“[pool-7-thread-1]”)
  • log level (“INFO”). Ideally production should run at INFO level for the com.digitalasset and com.daml loggers
  • logger name (“c.d.c.p.s.c.StateCache:participant=participant1”). Ideally the logger name should not be truncated.
  • trace-id (“tid:35b7b82f801ca9d83a14731b4bc13511”)
  • log message (“Updated cache with a batch of [ContractId(…) → Active(…)] at Offset(Bytes(…))”). Even though the example includes a few “…”, the log message should ideally also not be truncated.
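As an illustration only, the fields listed above could be combined into one encoder pattern roughly as follows. This is a minimal sketch and not Canton's shipped configuration: the appender name STDOUT, the use of a console appender, the field order and the root level are all assumptions made for the example. The entity definition is copied from the question. Because that entity already ends with %msg, the context fields and %n, the sketch places it last in the pattern and does not repeat %msg or %n after it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE configuration [
<!-- Entity copied from the question: trace-id, message, context, err-context, newline -->
<!ENTITY entityCorrelationIdTrailingSpaceReplace "&#x0025;replace(tid:&#x0025;mdc{trace-id} ){'tid: ', ''}- &#x0025;msg&#x0025;replace(, context: &#x0025;marker){', context: $', ''}&#x0025;replace( err-context:&#x0025;mdc{err-context} ){' err-context: ', ''}&#x0025;n">
]>
<configuration>
  <!-- Hypothetical appender for illustration; a rolling file appender takes the same encoder -->
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <!-- Date with milliseconds (quoted because the pattern contains a comma),
           thread, level, full logger name (no length limit), then the entity -->
      <pattern>%d{"yyyy-MM-dd HH:mm:ss,SSS"} [%thread] %level %logger &entityCorrelationIdTrailingSpaceReplace;</pattern>
    </encoder>
  </appender>

  <!-- INFO for the two logger trees named in the list above -->
  <logger name="com.digitalasset" level="INFO"/>
  <logger name="com.daml" level="INFO"/>

  <!-- Root level is an assumption; adjust as needed -->
  <root level="INFO">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>
```

Using %logger without a length argument keeps the logger name untruncated, and %msg inside the entity emits the full message.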

Thank you,
– Oliver
