Demo pruning with shorter time span

nycnewman · January 22, 2024, 11:23pm

Is there an example configuration for demonstrating pruning with a (very) short pruning interval (tens of minutes, not hours)? There seem to be a number of parameters relating to deduplication and contract acknowledgement that control the minimum period that can be used.

The ideal would be a demo setup with a domain and one or more participants, where contracts can be created and archived, and pruning can then be demonstrated to show their removal from the PCS.

Judy_Wu · January 29, 2024, 2:46am

I have a similar question. I am trying to demonstrate pruning a participant node with the manual pruning gRPC API endpoint (called from the Java bindings against the sandbox).

I am trying to cope with two main problems:

1. I cannot set the pruneUpTo offset to the offset of the last transaction (last-1 seems to be allowed).
2. I get the error "Participant cannot prune at specified offset due to max deduplication duration of 10s" after changing the config participants.sandbox.init.ledger-api.max-deduplication-duration from the default of 164h to 10 seconds.

Wonder if anyone can provide me some pointers!
Thanks so much in advance

Judy_Wu · January 30, 2024, 3:45am

Thanks @oliverse and Cristina for the offline session.
I’d like to summarise the issues that I encountered and the Canton configs that need to be set in order to prune relatively new data on the ledger, for testing or other purposes.

Below are example short-term pruning configs provided by @oliverse

1. pruning Participant (canton config)

# ensure that the participants don't need to keep history for more than 5m for duplicate request detection
# the default is 7 days, so for short-term pruning this setting needs to be overridden
canton.participants.<participant node name>.init.ledger-api.max-deduplication-duration = <a time span shorter than (pruning timestamp - tx populated timestamp)>
# for example
canton.participants.sandbox.init.ledger-api.max-deduplication-duration = 5m

2. pruning Sequencer and Mediator (canton config)


# to be able to prune sequencers, have each sequencer client request a time proof every 5 minutes
# note that if there is a risk that any of the participants does not produce domain traffic within 10 minutes,
# ensure that they specify time-tracker.min-observation-duration of 5 minutes when connecting to the domain
# so that participants also request regular time proofs and don't block sequencer pruning.
canton.domains.<domain name>.time-tracker.min-observation-duration = 5m
# for example
canton.domains.local.time-tracker.min-observation-duration = 5m
# have the sequencer clients from all nodes acknowledge more frequently to ensure that the sequencer can prune.
canton.participants.<participant name>.sequencer-client.acknowledgement-interval = 5m
canton.domains.<domain name>.sequencer-client.acknowledgement-interval = 5m

# cause the database sequencer to perform frequent checkpoints, so that lack of checkpoints doesn't prevent pruning.
canton.domains.<domain name>.sequencer.reader.checkpoint-interval = 5m

# cut off unauthenticated participants after 5 minutes as they only use unauthenticated requests during onboarding;
# otherwise the sequencer would only cut them off after 24 hours by default which would prevent pruning for the first day.
canton.domains.<domain name>.sequencer.pruning.unauthenticated-member-retention = 5m

Note that the sequencer pruning config cannot be set without the following Canton domain config settings:
storage.type = postgres and sequencer.type = database

3. possible errors encountered

Error 1

Exception in thread "main" io.grpc.StatusRuntimeException: FAILED_PRECONDITION: OFFSET_OUT_OF_RANGE(9,d8a6eb30): prune_up_to needs to be before ledger end <ledger end offset>

This error indicates that the prune_up_to offset we set was too large: it has to be before the ledger end offset.
I ran into this error when I tried to prune at the end/last offset on the ledger.
According to @oliverse this is an existing issue and there’s a fix in progress.

Error 2

Participant cannot prune at specified offset due to max deduplication duration of <your dedup time in canton by setting canton.participants.<participant node name>.init.ledger-api.max-deduplication-duration>

The above error indicates that you are trying to prune transactions that are still within the dedup duration, so they cannot be pruned yet.

During my testing, I shortened this setting to 1 second and pruned long after 1 second had passed, yet the same error still occurred. As it turns out, the participant’s “time/clock” does not tick if there are no events. The solution for my testing setup was simply to create some more transactions/events after the target transaction, once the dedup duration had elapsed.
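To make the "clock does not tick without events" behaviour concrete, here is a small toy model in plain Java. It is only a sketch of the behaviour described above and not how Canton implements it: the class `DedupWindowModel`, its methods, and the assumption that the participant's notion of "now" is simply the timestamp of the last event it observed are all made up for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Toy model only; NOT Canton's implementation. All names here are hypothetical. */
final class DedupWindowModel {
    private final Duration maxDedupDuration;
    // Assumption: the participant's "clock" is the timestamp of the latest event it has seen.
    private Instant participantClock;

    DedupWindowModel(Duration maxDedupDuration, Instant start) {
        this.maxDedupDuration = maxDedupDuration;
        this.participantClock = start;
    }

    /** The clock only moves forward when a new event (transaction) is observed. */
    void observeEvent(Instant eventTime) {
        if (eventTime.isAfter(participantClock)) {
            participantClock = eventTime;
        }
    }

    /** A transaction is prunable once it is older than the dedup window, measured on the participant clock. */
    boolean canPrune(Instant txTime) {
        return !txTime.plus(maxDedupDuration).isAfter(participantClock);
    }
}
```

In this model, waiting in wall-clock time changes nothing: `canPrune` for the target transaction stays false until `observeEvent` is called with a later timestamp, which matches the workaround of submitting another transaction after the dedup duration has elapsed.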

Error 3

Participant cannot prune at specified offset due to no suitable offset for domain <your domain name>

This error popped up once in a while during my testing and was somehow resolved after the following steps:

1. Put the domain on a Postgres database.
2. Set the participant dedup duration config to a lower value (as described above).
3. Create other transactions after the intended prune offset, once the dedup duration has elapsed.

Error 4

GENERIC_CONFIG_ERROR(8,0): Cannot convert configuration to a config of class com.digitalasset.canton.config.CantonEnterpriseConfig. Failures are:
  at 'canton.domains.local.sequencer':
    - (simple-topology.conf: 39) Key not found: 'type'.
 err-context:{location=CantonConfig.scala:1603}

This error occurs when setting the sequencer pruning config described above while the domain storage is set up in memory. Moving the domain onto a database resolves the issue.

Error 5

PARTICIPANT_PRUNED_DATA_ACCESSED(9,8b9b8a46): Transactions request from  to <your request end offset> precedes pruned offset <last pruned offset>

The above indicates that we are trying to read pruned transactions by giving the wrong offset range when requesting the transaction stream. It happened to me when I was trying to read the transaction stream from Ledger Begin with the code below:

// Fails on a pruned ledger: LedgerBegin precedes the pruned offset.
ledgerClient.getTransactionsClient()
        .getTransactions(LedgerOffset.LedgerBegin.getInstance(),
                new FiltersByParty(Map.of(someParty, NoFilter.instance)), false);

The proper way to get the transaction stream from a pruned ledger is to read the ACS for a snapshot and then subscribe to transactions from the offset returned by the read-ACS call (basically the offset at which the ACS state ends).
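The "ACS first, then subscribe" pattern can be sketched in plain Java as follows. This is not the Java bindings API: `AcsPage` and the `subscribeFrom` callback are hypothetical stand-ins so the example is self-contained, and it assumes (as I understand the Ledger API) that the ACS read arrives in chunks and only the final chunk carries the offset.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Sketch of "read the ACS, then subscribe from its offset". All types here are hypothetical stand-ins. */
final class AcsThenSubscribe {

    /** One chunk of an ACS read; the offset is assumed to be present only on the last chunk. */
    static final class AcsPage {
        final List<String> contractIds;
        final Optional<String> offset;

        AcsPage(List<String> contractIds, Optional<String> offset) {
            this.contractIds = contractIds;
            this.offset = offset;
        }
    }

    /** Returns the offset at which the ACS snapshot ends: the last non-empty offset seen. */
    static Optional<String> snapshotOffset(List<AcsPage> pages) {
        Optional<String> last = Optional.empty();
        for (AcsPage page : pages) {
            if (page.offset.isPresent()) {
                last = page.offset;
            }
        }
        return last;
    }

    /** Starts the transaction subscription at the snapshot offset instead of at ledger begin. */
    static <T> T subscribeAfterSnapshot(List<AcsPage> pages, Function<String, T> subscribeFrom) {
        String begin = snapshotOffset(pages)
                .orElseThrow(() -> new IllegalStateException("ACS read returned no offset"));
        return subscribeFrom.apply(begin);
    }
}
```

With the real bindings, `subscribeFrom` would presumably wrap a `getTransactions` call that starts from an absolute offset built from that string rather than from `LedgerOffset.LedgerBegin`; please check the exact class and method names against the bindings version you are using.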
