There are three angles to this I see from the question and the comments so far:
1. Can you verify that you have a complete set of data for a party Alice?
Short answer: No. Imagine a topology where Alice is hosted on two nodes P1 and P2, and your ledger is running on an infrastructure in which a fork is possible. If you have a network partition separating P1 and P2 and there is a fork, P1 and P2 have no mutual knowledge of what’s going on on the other side and the term “complete set of data for a party Alice” loses meaning.
Less drastically, the Partitioned Ledger Topology allows for partitioned ledgers without forking. In that scenario it’s possible to have two participants P1 and P2 both hosting Alice, but part of different sets of partitions. A single participant does not have enough information to ascertain whether its data for Alice is complete. It can only ensure it’s complete for the partitions it’s a member of.
So: The best you can ask for is that you can verify completeness of Alice’s data on the partition/fork that a given participant is a member of.
2. Can I verify that a Participant is giving me a complete set of data for the partition/fork it belongs to?
The best you can hope for is that you can verify completeness up to a point, since delays or latencies may mean you have not caught up to the latest state yet. But even “a point” is imprecise. DAML Ledgers are not linearly ordered, they only enforce causality ordering. There’s a PR open documenting this in great detail.
Furthermore, a single Party (and thus a single Participant) only has partial knowledge of the Ledger. Eg if Alice and Bob alternately create Foo contracts which only the creator is a stakeholder on, Alice doesn’t have any knowledge of Bob’s activity and vice versa.
In the same vein, if Alice, Bob and Charlie keep moving an IOU issued by Doris around in a circle, each single party only sees disconnected inward and outward transfers. They can’t correlate the events. If you add the whole partitioning topic, you can imagine this sequence:
- Alice on P1 sends to Bob on P2
- Bob on P2 sends to Charlie on P3
- Charlie on P3 sends to Alice on P4 (on a different partition)
- Alice on P4 sends to Bob on P2
- Bob on P2 sends to Alice on P1
What does Alice on P1 see? Just an outward transfer to Bob and an inward transfer from Bob. P1 has no information about the fact that any IOU passed through Alice on P4.
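To make the sequence above concrete, here is a toy Python sketch of the per-participant view. It is not the DAML visibility model: it simply assumes that a hosting like "Alice@P1" sees exactly the transfers it sends or receives, and the names `Transfer` and `visible_to` are made up for illustration.

```python
from typing import NamedTuple


class Transfer(NamedTuple):
    sender: str    # party and hosting participant, written as "Alice@P1"
    receiver: str


# The five transfers from the list above, in order.
TRANSFERS = [
    Transfer("Alice@P1", "Bob@P2"),
    Transfer("Bob@P2", "Charlie@P3"),
    Transfer("Charlie@P3", "Alice@P4"),
    Transfer("Alice@P4", "Bob@P2"),
    Transfer("Bob@P2", "Alice@P1"),
]


def visible_to(host: str, transfers: list) -> list:
    """Assumed rule: a hosting only sees transfers it sends or receives."""
    view = []
    for t in transfers:
        if t.sender == host:
            view.append("outward to " + t.receiver.split("@")[0])
        elif t.receiver == host:
            view.append("inward from " + t.sender.split("@")[0])
    return view


p1_view = visible_to("Alice@P1", TRANSFERS)
p4_view = visible_to("Alice@P4", TRANSFERS)
print(p1_view)  # ['outward to Bob', 'inward from Bob']
print(p4_view)  # ['inward from Charlie', 'outward to Bob']
```

Neither view mentions the other hosting of Alice, which is the point: from P1 alone there is nothing to tell you the P4 events exist.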
What the DAML Ledger model does allow is this: given an event/action, you can verify that the subgraph that led up to that event/action is valid, which includes a degree of completeness. Ie Alice on P1 in the above can verify that, because both Doris and Bob gave appropriate authority, the outward and inward transfers are valid and there is no information missing that Alice is entitled to.
This kind of verification is enabled by the TransactionService in tree mode.
3. Can my client application check that it didn’t miss a transaction that the Participant does know?
The scenario here would be that a client application subscribes to the transaction service, and crashes at offset 7. It restarts, re-subscribes, gets transactions from offset 17 and misses information in between.
Rather than going for verification, the idea of the DAML Ledger API is that the client keeps track of the last offset it has seen. Ie after completing the processing of the transaction at offset 7, it should write 7 to its own persistence. If it now crashes, it knows to resubscribe from 7.
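A minimal Python sketch of that pattern, under some loud assumptions: offsets are plain integers as in the example (real offsets may well be opaque), `stream_after` is an in-memory stand-in for the transaction subscription rather than a real Ledger API call, and `OffsetStore` is a hypothetical file-based persistence.

```python
import json
import os
import tempfile


class OffsetStore:
    """Hypothetical client-side persistence for the last processed offset."""

    def __init__(self, path):
        self.path = path

    def load(self):
        if not os.path.exists(self.path):
            return None  # nothing processed yet
        with open(self.path) as f:
            return json.load(f)["offset"]

    def save(self, offset):
        # Write to a temp file and rename, so a crash mid-write
        # cannot leave a half-written offset behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"offset": offset}, f)
        os.replace(tmp, self.path)


def stream_after(ledger, offset):
    """Stand-in for a subscription: yields entries strictly after `offset`."""
    for off, tx in ledger:
        if offset is None or off > offset:
            yield off, tx


def run_client(ledger, store, handle, crash_after=None):
    for off, tx in stream_after(ledger, store.load()):
        handle(tx)
        store.save(off)  # persist only after processing has completed
        if off == crash_after:
            return  # simulate the crash


ledger = [(5, "a"), (7, "b"), (11, "c"), (17, "d")]
store = OffsetStore(os.path.join(tempfile.mkdtemp(), "offset.json"))
seen = []
run_client(ledger, store, seen.append, crash_after=7)  # crashes at 7
offset_after_crash = store.load()
run_client(ledger, store, seen.append)                 # resumes after 7
print(seen)  # ['a', 'b', 'c', 'd']
```

Note that in this sketch processing and saving are two separate steps, so a crash between them means the transaction is handled again on restart. The handler should tolerate that, or the offset should be written in the same database transaction as the processing result.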
Clients without persistence need to be able to restart from current state. Ie they need to subscribe to the Active Contract Service first, which gives the offset at which that contract set was valid. Then the client subscribes from that offset.
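The same idea for a client without persistence, again as a toy Python sketch. `ToyParticipant` is an in-memory stand-in I made up; its methods only mimic the shape described above (a snapshot of active contracts plus the offset at which it is valid) and are not the real Ledger API.

```python
class ToyParticipant:
    """In-memory stand-in for a participant, not the real Ledger API."""

    def __init__(self):
        self.log = []  # entries of (offset, kind, contract_id)

    def submit(self, kind, contract_id):
        self.log.append((len(self.log) + 1, kind, contract_id))

    def active_contracts(self):
        """Return the active contract set and the offset it is valid at."""
        active = set()
        for _, kind, cid in self.log:
            if kind == "create":
                active.add(cid)
            else:
                active.discard(cid)
        offset = self.log[-1][0] if self.log else 0
        return active, offset

    def transactions_after(self, offset):
        return [entry for entry in self.log if entry[0] > offset]


def catch_up(participant, active, offset):
    """Apply everything after `offset` to the locally held contract set."""
    for off, kind, cid in participant.transactions_after(offset):
        if kind == "create":
            active.add(cid)
        else:
            active.discard(cid)
        offset = off
    return active, offset


p = ToyParticipant()
p.submit("create", "c1")
p.submit("create", "c2")
p.submit("archive", "c1")

# Fresh start: take the snapshot first, remember its offset.
active, offset = p.active_contracts()
snapshot_offset = offset

# Activity that happens after the snapshot was taken.
p.submit("create", "c3")

# Subscribe from the snapshot offset, so nothing in between is missed.
active, offset = catch_up(p, active, offset)
print(sorted(active), offset)  # ['c2', 'c3'] 4
```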
The Ledger API gives the guarantee that the data returned in those usages is complete. As @stefanobaghino-da said, that’s something you need to trust.
Technically we could emit some hashes, to make it look fancy, but it wouldn’t add any guarantees. Eg imagine we kept a table which for each Party and offset kept a hash #(Party, Offset) in such a way that if for Alice offset 11 follows offset 7, the #(Alice, 11) is the hash of [#(Alice, 7), 11]. The client app could now “verify” that the hashes all match nicely, but since they are Participant generated, you are still just trusting the participant to do its job.
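To illustrate why such a chain adds nothing, here is a small Python sketch. The use of SHA-256, the seeding with the party name and the helper names are my own choices for illustration. Nothing like this is part of the Ledger API.

```python
import hashlib


def chain_hash(prev_hash, offset):
    """#(Party, offset) as the hash of [previous hash, offset]."""
    return hashlib.sha256(f"{prev_hash}|{offset}".encode()).hexdigest()


def build_chain(party, offsets):
    """What a participant would emit for the offsets it chooses to reveal."""
    h = hashlib.sha256(party.encode()).hexdigest()  # assumed seed value
    chain = []
    for off in offsets:
        h = chain_hash(h, off)
        chain.append((off, h))
    return chain


def verify_chain(party, chain):
    """What a client could check: every hash follows from the previous one."""
    h = hashlib.sha256(party.encode()).hexdigest()
    for off, claimed in chain:
        h = chain_hash(h, off)
        if h != claimed:
            return False
    return True


honest = build_chain("Alice", [7, 11, 17])
# A participant that drops offset 11 just builds the chain without it.
omitting = build_chain("Alice", [7, 17])

honest_ok = verify_chain("Alice", honest)
omitting_ok = verify_chain("Alice", omitting)
print(honest_ok, omitting_ok)  # True True
```

Both chains verify. The client has no independent source for which offsets should be there, so the check only confirms that the participant is consistent with itself.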