Verifying that the transaction stream is sending all transactions a Party is entitled to see?

Is there any means for a Ledger API client application to verify that the Participant hasn't, perhaps accidentally, missed sending a transaction for the subscribed Party?

I’m often asked how a client application or participant node operator can verify that they have all the transactions that a party is entitled to.


You should never test your upstream components. The question is akin to “how do I test that the database really sent me all records that match my where clause”.

If you think you spotted a bug, you should try to create a minimal repro and open a PR (or at least raise the issue on GitHub). :slightly_smiling_face:


I am thinking about @Piyush_Bedi’s question. I think a service similar to the TransactionService, but returning a checksum of the current ledger state instead of a (stream GetTransactionsResponse) or (stream GetTransactionTreesResponse), would be useful for clients.

E.g. if I have a long-running TransactionTree stream on the client, I would want some guarantees or checks in place that assure me I did not accidentally drop a transaction because of a runtime exception or some other bug on my side.

If there were:

  1. a documented deterministic algorithm to calculate a transaction stream checksum and
  2. a gRPC endpoint to request this checksum from the ledger

this would provide a way to validate the current state on the client side without re-subscribing from the ledger begin.
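To make the idea concrete, here is a minimal sketch of what such a deterministic algorithm could look like (purely hypothetical, not an existing Ledger API feature): fold a hash chain over the transaction IDs of the stream, in stream order, so that the client and the ledger can compute the value independently and compare.

```python
import hashlib

def stream_checksum(transaction_ids):
    """Hypothetical checksum: a SHA-256 hash chain folded over the
    transaction IDs of a party's stream, in stream order."""
    checksum = b"\x00" * 32  # agreed-upon initial value
    for tx_id in transaction_ids:
        checksum = hashlib.sha256(checksum + tx_id.encode("utf-8")).digest()
    return checksum.hex()

# Dropping or reordering a transaction changes the result:
assert stream_checksum(["tx-1", "tx-2"]) != stream_checksum(["tx-2", "tx-1"])
```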


There are three angles to this that I see from the question and the comments so far:

1. Can you verify that you have a complete set of data for a party Alice?

Short answer: No. Imagine a topology where Alice is hosted on two nodes P1 and P2, and your ledger is running on an infrastructure in which a fork is possible. If you have a network partition separating P1 and P2 and there is a fork, P1 and P2 have no mutual knowledge of what’s going on on the other side and the term “complete set of data for a party Alice” loses meaning.

Less drastically, the Partitioned Ledger Topology allows for partitioned ledgers without forking. In that scenario it’s possible to have two participants P1 and P2 both hosting Alice, but belonging to different sets of partitions. A single participant does not have enough information to ascertain whether its data for Alice is complete. It can only ensure it’s complete for the partitions it’s a member of.

So: The best you can ask for is that you can verify completeness of Alice’s data on the partition/fork that a given participant is a member of.

2. Can I verify that a Participant is giving me a complete set of data for the partition/fork it belongs to?

The best you can hope for is that you can verify completeness up to a point, as due to whatever delays or latencies you may not have caught up to the latest state yet. But even “a point” is imprecise: DAML Ledgers are not linearly ordered, they only enforce causal ordering. There’s a PR open documenting this in great detail.

Furthermore, a single Party (and thus a single Participant) only has partial knowledge of the Ledger. E.g. if Alice and Bob alternately create Foo contracts on which only the creator is a stakeholder, Alice doesn’t have any knowledge of Bob’s activity and vice versa.
In the same vein, if Alice, Bob and Charlie keep moving an IOU issued by Doris around in a circle, each single party only sees disconnected inward and outward transfers. They can’t correlate the events. If you add partitioning to the picture, you can imagine this sequence:

  1. Alice on P1 sends to Bob on P2
  2. Bob on P2 sends to Charlie on P3
  3. Charlie on P3 sends to Alice on P4 (on a different partition)
  4. Alice on P4 sends to Bob on P2
  5. Bob on P2 sends to Alice on P1

What does Alice on P1 see? Just an outward transfer to Bob and an inward transfer from Bob. P1 has no information about the fact that any IOU passed through Alice on P4.

What the DAML Ledger model does allow you to verify is that, given an event/action, the subgraph that led up to that event/action is valid, which includes a degree of completeness. I.e. Alice on P1 in the above can verify that, because both Doris and Bob gave appropriate authority, the outward and inward transfers are valid and there is no information missing that Alice is entitled to.

This kind of verification is enabled by the TransactionService in tree mode.

3. Can my client application check that it didn’t miss a transaction that the Participant does know?

The scenario here would be that a client application subscribes to the transaction service and crashes at offset 7. It restarts, re-subscribes, gets transactions from offset 17 onward, and misses the information in between.

Rather than going for verification, the idea of the DAML Ledger API is that the client keeps track of the last offset it has seen. I.e. after completing the processing of the transaction at offset 7, it should write 7 to its own persistence. If it now crashes, it knows to re-subscribe from 7.

Clients without persistence need to be able to restart from the current state. I.e. they need to query the Active Contracts Service first, which gives the offset at which that contract set was valid. Then the client subscribes to the transaction stream from that offset.
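As a sketch of that pattern (the wrappers fetch_active_contracts() and subscribe_transactions() are hypothetical stand-ins for the actual Ledger API calls; only the control flow is the point):

```python
def run_client(store):
    """Resume from the client's own persisted offset, or bootstrap from
    the Active Contracts Service if nothing was persisted yet."""
    offset = store.load_last_offset()                 # hypothetical persistence layer
    if offset is None:
        contracts, offset = fetch_active_contracts()  # hypothetical ACS wrapper
        store.initialize(contracts, offset)
    for tx in subscribe_transactions(begin_offset=offset):  # hypothetical stream wrapper
        store.apply(tx)                    # process the transaction first...
        store.save_last_offset(tx.offset)  # ...then record the progress
```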

The Ledger API gives the guarantee that the data returned in those usages is complete. As @stefanobaghino-da said, that’s something you need to trust.

Technically we could emit some hashes to make it look fancy, but it wouldn’t add any guarantees. E.g. imagine we kept a table which, for each Party and offset, stores a hash #(Party, Offset) in such a way that if for Alice offset 11 follows offset 7, then #(Alice, 11) is the hash of [#(Alice, 7), 11]. The client app could now “verify” that the hashes all match nicely, but since they are Participant-generated, you are still just trusting the Participant to do its job.
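Spelled out, that scheme is just a per-party hash chain over offsets. A minimal sketch (hypothetical, as said; the Participant computes it, so the client is still just trusting the Participant):

```python
import hashlib

def party_offset_hash(previous_hash: bytes, offset: str) -> bytes:
    """#(Party, offset_n) = SHA-256(#(Party, offset_{n-1}) || offset_n)."""
    return hashlib.sha256(previous_hash + offset.encode("utf-8")).digest()
```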


Regarding point 3:

After the crash and restart, how would you know for sure that your application managed to “persist” the last processed offset? What if offset persistence and transaction processing are not one atomic operation? You might end up processing all events up to offset 17, but crash while persisting offset 17 and end up with offset 7 persisted as the last seen offset.

I would want to be able to run a client-side integrity check after every client application crash, unless my client application is designed to guarantee atomic operations.


Yes, but is that something the Ledger API could actually help with in any way? I don’t see how.

The Ledger API could provide a service to simplify integrity checks on the client side, so the client does not have to consume all transactions from the ledger begin when it is not 100% sure that its state is consistent.

We should also document Ledger API best practices, explaining why it is important that ledger transaction processing and ledger offset update/persistence are handled as one atomic operation.

I think this would be down to the user. In an ACID model, such as most SQL databases, one could typically expect the user to trust that their database won’t COMMIT half a DB transaction in which they processed a DAML transaction and then updated their offset.
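For instance, with any ACID SQL database the pattern is to put the state change and the offset update in the same database transaction. A sketch using Python’s sqlite3 as a stand-in (the table layout and the shape of tx are made up):

```python
import sqlite3

conn = sqlite3.connect("client_state.db")
conn.execute("CREATE TABLE IF NOT EXISTS contracts (contract_id TEXT PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE IF NOT EXISTS progress (id INTEGER PRIMARY KEY CHECK (id = 1), last_offset TEXT)")

def apply_atomically(tx):
    """Process a transaction and record its offset in one DB transaction:
    either both writes are committed or neither is."""
    with conn:  # sqlite3 commits on success and rolls back on exception
        for event in tx.events:  # assumed shape of a decoded transaction
            conn.execute("INSERT OR REPLACE INTO contracts VALUES (?, ?)",
                         (event.contract_id, event.payload))
        conn.execute("INSERT OR REPLACE INTO progress VALUES (1, ?)", (tx.offset,))
```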

In a looser model, you still tend to have certain kinds of atomicity guarantees. For example, in many document databases (CouchDB, Elasticsearch, MongoDB, etc.), you can expect writes to a single document to be atomic. This means you can write the latest offset into each document that is changed, as well as writing to a “latest offset” document. If you do this, you can then check the offset in each document that would be manipulated and discard any transactions with an older offset.
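A sketch of that discard-if-older check with MongoDB-style updates (pymongo’s update_one and the $lt/$max operators are real; the collection layout is made up, and you’d need offsets that compare correctly in stream order):

```python
from pymongo import MongoClient

db = MongoClient()["app"]

def apply_mutation(doc_id, new_fields, offset):
    """Update a document only if this transaction's offset is newer than
    the one already recorded on it; older (replayed) updates no-op."""
    db.contracts.update_one(
        {"_id": doc_id,
         "$or": [{"offset": {"$lt": offset}}, {"offset": {"$exists": False}}]},
        {"$set": {**new_fields, "offset": offset}},
    )
    # Also track overall progress in a dedicated "latest offset" document.
    db.progress.update_one({"_id": "latest"}, {"$max": {"offset": offset}}, upsert=True)
```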

Of course, this only works for mutations, not creations or deletions in the general case. You might be able to mitigate this by using the transaction ID as the document ID, meaning that duplicate creations will fail and duplicate deletions will no-op, but this really depends on what you’re storing and for which purpose.

At the end of the day, if you’re trusting a non-atomic data store with eventual consistency, you’re always going to have issues with inconsistent data at some point. These issues usually need to be addressed based on your domain model; there’s no right answer.

Well, maybe the right answer is “use PostgreSQL”, but I don’t want to dictate that. :wink: