Hello, we have a Java application that listens using “getTransactionTrees” for specific transactions to process externally. When running against the sandbox the Flowable seems to stay open indefinitely (I have tested it with up to 30 minutes between transactions), but when connected to Daml Hub it only seems to stay open for about 5 minutes between transactions, after which new transactions no longer fire. If I restart the application it of course sees the new transactions and processes them. I have tried changing both the idleTimeout and keepAliveTime settings on a NettyChannelBuilder, and neither changed the 5-minute behavior. Is there some other setting I’m missing, or is this just a limitation of Daml Hub that we need to work around? Thanks.
For performance reasons, we can’t keep connections open indefinitely when no traffic is flowing. Depending on what sits between your client and Daml Hub (or any ledger implementation), it’s also not uncommon for intermediate proxies and/or firewalls to close connections with no activity.
I definitely recommend planning for disconnects anyway, because those could happen for any reason. You may also want to consider uploading a trigger if the specific thing you’re trying to do can be handled purely by looking at the state of the ledger alone.
Yeah, we handle disconnects, so that’s not a problem. This doesn’t seem to throw anything when it stops listening, though. Is there something I need to do so that it throws something when it stops? The code looks like it’s still running just fine; it’s just that when a new event happens after the 5 minutes, nothing kicks off. I would hate to have to base it on a timer that checks whether there are any new events.
That problem is essentially why we disconnect after a few minutes of inactivity; in the absence of any traffic, neither side can determine whether a connection is actually open.
These are some design proposals on the gRPC libraries to try to balance the competing concerns of keeping clients connected while not overwhelming servers:
- proposal/A8-client-side-keepalive.md at 6070a6b5cd1c014e0b5be54701ca07b8fb1c128c · grpc/proposal · GitHub
- proposal/A9-server-side-conn-mgt.md at 6070a6b5cd1c014e0b5be54701ca07b8fb1c128c · grpc/proposal · GitHub
An admittedly brute-force way of keeping the connection open would be to simply send traffic periodically; it could be something as simple as a periodic TransactionService.GetLedgerEndRequest (which is relatively cheap), just to convince both the client stack and the server (here, Daml Hub) that the connection is still alive and being used.
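As a rough sketch of what that could look like, assuming the rxjava Java bindings’ TransactionsClient (the transactionsClient reference, logger, and the 2-minute interval are placeholders; check the exact method against the bindings version you’re using):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Periodically request the ledger end just to generate traffic on the connection.
// The interval only needs to be shorter than whatever idle timeout is cutting the connection.
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(
    () -> {
      try {
        transactionsClient.getLedgerEnd().blockingGet(); // cheap request, result is discarded
      } catch (Exception e) {
        // A failure here is also a useful hint that the connection is gone and a reconnect is due.
        logger.warn("keep-alive ledger end request failed", e);
      }
    },
    2, 2, TimeUnit.MINUTES);
```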
Separately, we’ll take a look to see if there are things we can do from our side to improve this, including recommended settings on the client.
Thank you! These suggestions will definitely help get me down the right path.
Can you help me understand what the problem is here and how you detected it?
Does the following match what happened:
- you start to read transactions from a Daml Hub Ledger API server connection
- some transactions start to flow in
- no transactions flow in for (at least) 5 more minutes because there’s no actual relevant data to read
- some command that should cause a transaction to pop up in the transaction stream is sent
- the Flowable doesn’t receive anything, and the onError handler isn’t invoked either
@dtanabe If the scenario I described above is what happened, then I’m confused. I would assume that if the server cuts down the connection (for any reason, including some idle timeout) the client would at least see that the connection has been closed (and the onError handler would be triggered). What am I getting wrong?
100% correct. After the 5 minutes of inactivity the code is still running, and when the new transaction comes through it does not pop up in the transaction stream; the “doOnError” method never fires at any point during that time frame.
Thanks for the feedback, @RyanMedlen, it’s really appreciated.
@dtanabe I’m confused. How do you cut the connection? As mentioned before, regardless of the way in which the connection is closed, the error handler should be invoked and the Flowable terminated. Can you help me understand under what circumstances this would not happen?
Another question, hopefully the last one: do you also have an onComplete handler? Does that also appear not to be triggered?
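For reference, all three handlers can be passed directly to subscribe; something along these lines, where transactionTrees is whatever your getTransactionTrees call returns and processTree and the logger are placeholders:

```java
// Wire up onNext, onError and onComplete explicitly, so that any termination of the
// stream (whether an error or a normal completion) is at least visible in the logs.
transactionTrees.subscribe(
    tree -> processTree(tree),                                  // onNext
    error -> logger.error("transaction stream failed", error),  // onError
    () -> logger.warn("transaction stream completed"));         // onComplete
```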
I do not. Let me implement that and get back to you.
Another question: to what value did you set keepAliveTime? I would assume that anything lower than 5 minutes (say, 1 minute) would send the keep-alive signal frequently enough to prevent the connection from being dropped on the Hub’s side. (gRPC doc)
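Concretely, something along these lines on the channel builder; host and port stand in for your actual endpoint, 1 minute is just an example value, and keepAliveTimeout is a separate, optional knob controlling how long to wait for the ping acknowledgement:

```java
import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;
import java.util.concurrent.TimeUnit;

// Send an HTTP/2 keep-alive ping after 1 minute without activity on the connection,
// and treat the connection as dead if the ping isn't acknowledged within 20 seconds.
ManagedChannel channel =
    NettyChannelBuilder.forAddress(host, port)
        .keepAliveTime(1, TimeUnit.MINUTES)
        .keepAliveTimeout(20, TimeUnit.SECONDS)
        .build();
```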
I misunderstood what that setting did. I thought it was how long it would keep the connection alive, not the frequency at which it sends a signal to keep it alive. I will try setting it to 1 minute and see if that works.
I implemented the “onComplete” handler and it never got triggered.
I will do some longer tests but I set the “keepAliveWithoutCalls” to true and “keepAliveTime” to 1 minute and it is working! Thank you for the support on this.
@stefanobaghino-da So I did run into an issue on the longer runs. I am handling onError, so I’m not sure why this is being thrown, but I will dig deeper into it. So it’s not quite as easy as just those settings, but it seems to be a step in the right direction.
io.reactivex.exceptions.OnErrorNotImplementedException: The exception was not handled due to missing onError handler in the subscribe() method call. Further reading: Error Handling · ReactiveX/RxJava Wiki · GitHub | io.grpc.StatusRuntimeException: INTERNAL: RST_STREAM closed stream. HTTP/2 error code: INTERNAL_ERROR
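Looking at the message, it points at the subscribe() call itself: in RxJava, a doOnError operator on its own does not count as handling the error, so if subscribe() is called without an onError consumer the error is rethrown wrapped in OnErrorNotImplementedException. A minimal sketch of the difference (transactionTrees, processTree and resubscribe are placeholders):

```java
// doOnError only adds a side effect; the error keeps propagating downstream:
transactionTrees
    .doOnError(e -> logger.warn("stream error (side effect only)", e))
    .subscribe(tree -> processTree(tree)); // no onError consumer -> OnErrorNotImplementedException

// Passing an onError consumer to subscribe() actually handles the error:
transactionTrees.subscribe(
    tree -> processTree(tree),
    error -> resubscribe()); // e.g. reconnect / resubscribe on RST_STREAM or UNAVAILABLE
```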