HTTP-JSON API load balancing recommendations

The HTTP-JSON API service is the recommended way to start accessing ledger state. Since it relies on the upstream transaction and active-contract-set services to get data and maintain a cache, one could horizontally scale this component.

  1. In that case, is a session-persistent load balancer the appropriate proxy?
  2. Are there other considerations, for example which servers send commands upstream to the ledger?

From the description of session persistence in the documentation you shared, the mechanism is based on the IP address from which the request originates. This could improve specific workloads, but it is not necessarily a good fit in the general case. It is tied to a very low-level concept, the IP address, which is not necessarily well correlated with the access pattern you might expect in certain scenarios. For example, if the bulk of requests comes from behind a proxy, the IP address might not be helpful and could end up skewing the load balancing.
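To make the proxy concern concrete, here is a minimal sketch of IP-based stickiness. It is not how any particular load balancer implements it; the hashing scheme, the backend names, and the traffic mix (90 requests through one proxy address, 10 from direct clients) are all assumptions made up for illustration.

```python
import hashlib
from collections import Counter


def pick_backend_by_ip(client_ip: str, backends: list[str]) -> str:
    """Map a source IP to a backend with a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def distribution(client_ips: list[str], backends: list[str]) -> Counter:
    """Count how many requests each backend would receive."""
    return Counter(pick_backend_by_ip(ip, backends) for ip in client_ips)


# Hypothetical instance names for three HTTP JSON API replicas.
BACKENDS = ["json-api-0", "json-api-1", "json-api-2"]

# Assumed traffic: 90 requests share one proxy address, 10 come from
# distinct direct clients.
request_ips = ["10.0.0.1"] * 90 + [f"192.168.1.{i}" for i in range(10)]

counts = distribution(request_ips, BACKENDS)
for backend in BACKENDS:
    print(backend, counts[backend])
```

Whichever instance the proxy address hashes to receives at least 90 of the 100 requests, regardless of which parties or templates those requests are actually about.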

The only suggestion I can make in the general case (and I would still recommend evaluating it for your specific scenario) relates to how the HTTP JSON API service retrieves data from the ledger: it does so lazily, upon receiving a query, and asks for the smallest possible amount of data needed to serve that specific query. This means that party and template, the only two dimensions by which you can effectively query the Ledger API, are two possible sharding keys that you can use to minimize the replication of data across the query stores that back each HTTP JSON API service instance in your system. It also means that you would need to load balance based on the request payload to implement this strategy effectively.
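As a rough sketch of what payload-based routing could look like, the snippet below derives a sharding key from the party and the template IDs and hashes it to pick an instance. It assumes a query body shaped like `{"templateIds": [...]}` and assumes that some earlier layer has already extracted the party (for example from the request's token); the function names and backend names are hypothetical, so check the body shape against the version of the service you run.

```python
import hashlib
import json


def shard_key(party: str, body: bytes) -> str:
    """Build a key from party and template IDs (assumed request shape)."""
    payload = json.loads(body)
    # Sort so that the same set of templates always yields the same key.
    template_ids = sorted(payload.get("templateIds", []))
    return party + "|" + ",".join(template_ids)


def route(party: str, body: bytes, backends: list[str]) -> str:
    """Pick the backend whose query store should serve this party/template key."""
    key = shard_key(party, body)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


# Hypothetical instance names for three HTTP JSON API replicas.
BACKENDS = ["json-api-0", "json-api-1", "json-api-2"]

body_a = json.dumps({"templateIds": ["Iou:Iou", "Account:Account"]}).encode()
body_b = json.dumps({"templateIds": ["Account:Account", "Iou:Iou"]}).encode()

# Same party and same templates land on the same instance, so only that
# instance's query store has to hold the corresponding contracts.
print(route("Alice", body_a, BACKENDS) == route("Alice", body_b, BACKENDS))
```

Simple modulo hashing reshuffles most keys when the number of instances changes; a consistent-hashing scheme would reduce that, at the cost of a little more machinery in the proxy.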

Other load balancing strategies might be valuable for your specific use case, but keep in mind the trade-offs each one makes. The only trade-off you can make consciously at a general level is the one described above.