Clarification on "party on two nodes"

Regarding "party on two nodes", please clarify the following:

  • What is the correct terminology? Is it party migration? Based on the docs it is a bit confusing.
  • My understanding of migrating ACS is that it’s basically dump from one participant and load in another. Is that correct?
  • What is the recommended approach?
    1. Stop all participants (i.e. P1)
    2. Migrate ACS
    3. Bring up participants (i.e. P1 and P2)

Party migration is the term I’ve usually seen used for the whole process of delegation + ACS copy. I sometimes use ACS migration when I’m specifically only talking about the ACS part of that.

Your understanding of ACS migration is correct.

I don’t think you need to stop any participants for this. If you follow the docs they explain the important steps:

  1. Disconnect the target participant from all domains after you have requested the delegation.
  2. Select the right offset at which you export the ACS on the source participant.
  3. Import the ACS on the target participant.
  4. Reconnect the target participant to domains.

My idea behind stopping P1 was that, to my understanding, there should be no activity on it while the migration is in progress. Stopping it seemed like a good way to prevent that.

I don’t think that is a requirement. You pick the offset when you export the ACS. Concurrent activity is fine because that will have happened after the offset.
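To illustrate why concurrent activity does not affect the export, here is a minimal sketch in Python. It is a toy model, not Canton's API or data format: the event log, the `acs_at` helper, and the integer offsets are all assumptions made for illustration. The ACS at a given offset depends only on events up to that offset, so anything that happens later leaves the export unchanged.

```python
from typing import List, Set, Tuple

# Toy event log for one party on P1: (offset, kind, contract_id).
# Illustrative model only, not Canton's data format.
Event = Tuple[int, str, str]


def acs_at(log: List[Event], offset: int) -> Set[str]:
    """Contracts active after applying all events up to and including `offset`."""
    active: Set[str] = set()
    for off, kind, cid in sorted(log):
        if off > offset:
            break
        if kind == "create":
            active.add(cid)
        elif kind == "archive":
            active.discard(cid)
    return active


log: List[Event] = [(1, "create", "A"), (2, "create", "B"), (3, "create", "C")]
export_offset = 3
before = acs_at(log, export_offset)

# Concurrent activity on P1 while the export is being downloaded.
log.append((4, "create", "D"))
log.append((5, "archive", "A"))
after = acs_at(log, export_offset)

print(sorted(before))  # ['A', 'B', 'C']
print(sorted(after))   # ['A', 'B', 'C']
```

The later events at offsets 4 and 5 are not lost. They are simply not part of the snapshot taken at offset 3.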

If you actually stop it, I don’t think you can download the snapshot at all. You could disconnect it from the domain instead.

Something is missing here for me.

  • Suppose I have the ACS of [A, B, C] on P1.
  • I start a download, and in the meantime a contract D is created. The download will contain the original ACS, i.e. [A, B, C].
  • I upload this to P2.
  • I connect P2 to the domain. I think the party will not see D on P2.

At least essentially this was the original problem that we needed to solve using this whole migration.

P2 resumes reading from the domain at the offset where you did the party delegation. So it will see D.

If that is the case, can’t I somehow specify a “ledger begin” offset and effectively forget this whole ACS migration stuff?

No, P2 resumes reading from the point where the party has been delegated. That’s why it’s important that you get the export at exactly that offset.
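The [A, B, C] plus D example from earlier in the thread can be played through in a small Python sketch. This is a toy model under stated assumptions: a single shared integer offset, and hypothetical helpers `apply` and `p2_view` that are not part of Canton. P2's state is modelled as the imported snapshot plus everything it reads after the delegation offset.

```python
from typing import List, Set, Tuple

Event = Tuple[int, str, str]  # (offset, kind, contract_id), a toy model


def apply(active: Set[str], events: List[Event]) -> Set[str]:
    """Apply create/archive events in offset order to a copy of `active`."""
    result = set(active)
    for _, kind, cid in sorted(events):
        if kind == "create":
            result.add(cid)
        elif kind == "archive":
            result.discard(cid)
    return result


def p2_view(log: List[Event], export_offset: int, delegation_offset: int) -> Set[str]:
    """P2's state: snapshot exported at `export_offset`, then replay of
    everything that happened after `delegation_offset`."""
    snapshot = apply(set(), [e for e in log if e[0] <= export_offset])
    replay = [e for e in log if e[0] > delegation_offset]
    return apply(snapshot, replay)


# A, B, C exist when the party is delegated at offset 3; D is created afterwards.
log: List[Event] = [
    (1, "create", "A"),
    (2, "create", "B"),
    (3, "create", "C"),
    (4, "create", "D"),
]
p1_view = apply(set(), log)

matched = p2_view(log, export_offset=3, delegation_offset=3)
too_early = p2_view(log, export_offset=2, delegation_offset=3)

print(sorted(matched))    # ['A', 'B', 'C', 'D']
print(sorted(too_early))  # ['A', 'B', 'D']
```

With matching offsets, P2 ends up with the same contracts as P1, including D. With an export taken too early, C falls into the gap between the snapshot and the replay and is never seen by P2.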

If, from some point in time, both P1 and P2 can see the same contracts (using the migrated party), that means there is some synchronisation mechanism between the participants.

If this is the case I can imagine an admin call like p2.synchroniseFrom(offset).

Remember, the contracts are not stored on the domain. They are stored entirely with the participants. We do indeed need a synchronisation mechanism for party migration, and we need to be careful not to break a participant. If Alice is currently on p1 and is additionally added to p2, but you don’t tell p2 about the contracts that Alice has (e.g. p1 crashes while p2 downloads), then p2 will receive confirmation requests which it can’t process because it doesn’t know about the contracts. It will deem them malicious, even though they are valid for p1. In short, you won’t be able to use p2 until the export process has completed, which means we have to work out additional synchronisation to make this work without the risk of breaking nodes.
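The failure mode described here can be sketched as follows. `ToyParticipant` and its methods are hypothetical names for illustration only. Real participants do far more validation than a set lookup, and this is not how Canton implements confirmation.

```python
from typing import Iterable, Set


class ToyParticipant:
    """Hypothetical stand-in for a participant's contract store; not a Canton API."""

    def __init__(self, name: str, known_contracts: Iterable[str] = ()) -> None:
        self.name = name
        self.known: Set[str] = set(known_contracts)

    def import_acs(self, contracts: Iterable[str]) -> None:
        """Load an exported ACS into the local store."""
        self.known.update(contracts)

    def handle_confirmation_request(self, input_contracts: Iterable[str]) -> str:
        """Approve only if every input contract is known locally."""
        unknown = set(input_contracts) - self.known
        if unknown:
            return f"{self.name}: reject, unknown contracts {sorted(unknown)}"
        return f"{self.name}: approve"


p1 = ToyParticipant("p1", ["A", "B", "C"])  # Alice's original participant
p2 = ToyParticipant("p2")                   # Alice was added, ACS not yet imported

request = ["A"]  # a transaction that uses Alice's contract A

early = p2.handle_confirmation_request(request)  # p2 online before the import
p2.import_acs(p1.known)
late = p2.handle_confirmation_request(request)   # p2 online after the import

print(p1.handle_confirmation_request(request))  # p1: approve
print(early)                                     # p2: reject, unknown contracts ['A']
print(late)                                      # p2: approve
```

The same request is valid for p1 but looks invalid to p2 until the import has completed, which is the race that keeping p2 disconnected avoids.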

As an MVP, we established this feature with a file-based import / export that requires the target participant to be offline until the ACS has been imported to avoid that race condition. This enables the principal feature for now. Of course, eventually, we’ll make it a convenient automated process with high security, but that is not just a trivial change, so it will take some time.

This feature is closely connected with Distributed recovery of participant data (High-Level Requirements — Daml SDK 2.5.3 documentation), so will be tackled together.
