Missed transactions on the Transaction Stream due to disconnection from the Participant Node

  1. How does a Java application listening to a participant node Transaction Stream determine if the connection to the node is lost? Is there a specific exception to look for that is thrown immediately upon disconnection?
  2. Given that no exception seems to be thrown when this connection is lost intermittently, and that transaction offsets are not incremental, how can we ensure that no transactions are missed when the connection drops intermittently?

You can use offsets for this: persist the last offset you received and pass it as the begin offset in a follow-up call. Because the begin offset is exclusive, the resumed stream starts with the first item after the last one you saw. This lets the client resume a stream transparently, whether the cause was an intermittent connection or a problem on either side of it. See, for example, the documentation of the begin field of GetTransactions:

Beginning of the requested ledger section. This offset is exclusive: the response will only contain transactions whose offset is strictly greater than this. Required
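
As a rough illustration of the resume-from-offset idea, here is a minimal sketch in plain Java. It does not use the actual Java bindings: `Tx`, `TxStream`, `FlakyLedger` and `ResumingReader` are hypothetical names made up for this example, the stream is finite so the loop can terminate, and the last offset is kept in memory where a real application would persist it durably.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a ledger transaction; not a type from the Java bindings.
final class Tx {
    final String offset;   // opaque, not assumed to be incremental
    final String payload;
    Tx(String offset, String payload) { this.offset = offset; this.payload = payload; }
}

// Hypothetical abstraction over "open a stream that starts after this offset".
interface TxStream {
    // Delivers every transaction strictly after beginExclusive (null means "from the start").
    // Throws a RuntimeException if the connection fails part-way through.
    void read(String beginExclusive, Consumer<Tx> onNext);
}

// In-memory fake that drops the connection a fixed number of times, for demonstration only.
final class FlakyLedger implements TxStream {
    private final List<Tx> log;
    private int failuresLeft;
    FlakyLedger(List<Tx> log, int failures) { this.log = log; this.failuresLeft = failures; }

    @Override
    public void read(String beginExclusive, Consumer<Tx> onNext) {
        int start = 0;
        if (beginExclusive != null) {
            for (int i = 0; i < log.size(); i++) {
                if (log.get(i).offset.equals(beginExclusive)) start = i + 1; // exclusive begin
            }
        }
        for (int i = start; i < log.size(); i++) {
            if (failuresLeft > 0 && i > start) { // fail after delivering one item
                failuresLeft--;
                throw new RuntimeException("connection lost");
            }
            onNext.accept(log.get(i));
        }
    }
}

final class ResumingReader {
    private String lastOffset;                         // persist this durably in a real application
    private final List<String> processed = new ArrayList<>();

    String lastOffset() { return lastOffset; }
    List<String> processed() { return processed; }

    void consume(TxStream stream, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                stream.read(lastOffset, tx -> {
                    processed.add(tx.payload);         // handle the transaction first...
                    lastOffset = tx.offset;            // ...then record how far we got
                });
                return;                                // stream finished without error
            } catch (RuntimeException e) {
                // Connection failed: the next iteration resubscribes from lastOffset.
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }
}
```

With a four-transaction log and a `FlakyLedger` that fails twice, `consume` ends up processing each transaction exactly once, because every resubscription starts strictly after the last recorded offset.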


But this approach relies on resubscribing to the event stream once the Java app knows the connection has dropped, or when the Java app has crashed and is trying to recover. In those cases it can provide the last processed offset to resume reading. But how can the Java app detect that the connection has dropped intermittently (say for 2-3 seconds, causing some transactions to be lost on the read stream)?

The way I would deal with it is probably by applying a retry strategy that makes sense for the scenarios you expect, while giving enough visibility when unexpected conditions occur. An exponential back-off strategy should be relatively easy to implement, but you may want to evaluate whether your use case calls for something more nuanced and feature-rich, such as the components in libraries like Hystrix and resilience4j (both of which should have components that work natively with RxJava, which is what our Java bindings are based on).
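
To give an idea of how small a hand-rolled exponential back-off can be, here is a sketch in plain Java. The `Backoff` class and its method names are invented for this example and are not part of the Java bindings, Hystrix or resilience4j; the `sleeper` parameter is an assumption added so the waiting can be swapped out (for example in tests).

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

// Hypothetical helper, written for illustration only.
final class Backoff {
    // Delay before retry number `attempt` (0-based): base, 2*base, 4*base, ... capped at maxMillis.
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    // Runs `action` until it succeeds or maxAttempts is reached, waiting between attempts.
    // `sleeper` receives the delay in milliseconds; pass a real sleep in production code.
    static <T> T retry(Supplier<T> action, int maxAttempts,
                       long baseMillis, long maxMillis, LongConsumer sleeper) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // log here to keep visibility into unexpected failures
                if (attempt < maxAttempts - 1) {
                    sleeper.accept(delayMillis(attempt, baseMillis, maxMillis));
                }
            }
        }
        throw last; // every attempt failed: surface the last error to the caller
    }
}
```

In a real consumer, the `action` would be the (re)subscription starting from the last persisted offset, and `sleeper` would wrap `Thread.sleep`. Adding random jitter to the delay is a common refinement that this sketch leaves out.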

To be clear: you should have guarantees at the transport level that packets will not be randomly dropped over a faulty connection (gRPC works over HTTP/2, which runs on TCP). So handling offsets across reconnections and using a suitable retry strategy should be all you need to ensure you consume a stream in its entirety.
