Dazl Connection problem to port localhost 6865

I am experimenting with DAZL on a Windows VM and have some simple python code developed.

import asyncio
import datetime
import logging
import time
import uuid
import dazl

dazl.setup_default_logger(logging.INFO)
EPOCH = datetime.datetime.utcfromtimestamp(0)
def main():
    with dazl.simple_client('http://localhost:6865', 'Operator') as client:
                 contract = { 'operator' : 'Operator', 'party' : 'Bank',}
                 client.ready()
                 client.submit_create('Main.Party', contract)

  if __name__ == '__main__':
main()

When I run the Python script I get the following error log

λ poetry run python ./pythontest/test.py
[WARNING] 2020-10-08 09:50:11,290 | root    | setup_default_logger being called more than once!
[   INFO] 2020-10-08 09:50:11,290 | dazl    | Starting a simple_client with to http://localhost:6865 with party 'Operator'...
[   INFO] 2020-10-08 09:50:11,295 | dazl    | Starting an event loop on a background thread 'dazl:main-000002589566e6c0'...
[   INFO] 2020-10-08 09:50:11,301 | dazl    | No server_port configuration was specified, so metrics and other stats will not be served.
[   INFO] 2020-10-08 09:50:11,304 | dazl    | Network bots coroutine started.
[   INFO] 2020-10-08 09:50:11,304 | dazl    | Party Operator bots coroutine started.
[   INFO] 2020-10-08 09:50:11,305 | dazl    | Establishing a connection to http://localhost:6865 on party Operator...
Traceback (most recent call last):
  File "./pythontest/test.py", line 36, in <module>
    main()
  File "./pythontest/test.py", line 26, in main
    client.ready()
  File "C:\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\client\api.py", line 1567, in ready
    return self._impl.invoker.run_in_loop(lambda: self._impl.ready())
  File "C:\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\scheduler\_invoker.py", line 99, in run_in_loop
    return execute_in_loop(self.loop, func, timeout=timeout)
  File "C:\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\util\asyncio_util.py", line 211, in execute_in_loop
    result = q.get(timeout=timeout_in_seconds)
  File "C:\Python38\lib\queue.py", line 178, in get
    raise Empty
_queue.Empty

I have run netstat on this VM

netstat -na | findstr "6865"
  TCP    127.0.0.1:6865         0.0.0.0:0              LISTENING
  TCP    127.0.0.1:6865         127.0.0.1:53148        ESTABLISHED
  TCP    127.0.0.1:6865         127.0.0.1:53150        ESTABLISHED
  TCP    127.0.0.1:53148        127.0.0.1:6865         ESTABLISHED
  TCP    127.0.0.1:53150        127.0.0.1:6865         ESTABLISHED

Any suggestions on troubleshooting this issue ?

2 Likes

Can you first try running the navigator and seeing if you can create this contract through the UI?

2 Likes

Are you running your Python code in the same VM as the Sandbox? By default the Sandbox listens on 127.0.0.1, so if the are running in separate VMs you’ll have to add -a 0.0.0.0 to the parameters you start the Sandbox with to allow it to listen on all addresses.

2 Likes

Contract creation works fine in DAML Navigator.
I added the -a 0.0.0.0 to the sandbox and it was correctly reflected in Sandbox.log

10:07:40.331 [sandbox-akka.actor.default-dispatcher-5] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started

10:07:53.403 [pool-1-thread-3] INFO c.d.p.apiserver.LedgerApiServer - Listening on 0.0.0.0:6865 over plain text.

10:07:53.421 [pool-1-thread-2] INFO com.daml.platform.sandboxnext.Runner - Initialized sandbox version 1.5.0 with ledger-id = xxx, port = 6865, dar file = List(..daml\dist\xxx-0.1.0.dar), time mode = wall-clock time, ledger = in-memory, auth-service = AuthServiceWildcard$, contract ids seeding = strong

This may be related or not, but the
Init-script in daml.yaml is not executing.

The script runs fine when I execute it as

daml script

Is there a log file that I can check why it is not running upon initialization of the Ledger ?
Are there any server logs where I can check if the Python script is being received ? This would indicate some network specific issue.

I refactored the code to check on port number and if async provides more insights

import asyncio
import datetime
import logging
import time
import uuid
import dazl
import socket

dazl.setup_default_logger(logging.INFO)

EPOCH = datetime.datetime.utcfromtimestamp(0)

def main():
    a_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    location = ("localhost", 6865)
    result_of_check = a_socket.connect_ex(location)
    if result_of_check == 0:
        print("Port is open")
    else:
        print("Port is not open")
    a_socket.close()
    # url = os.getenv('DAML_LEDGER_URL')
    url = 'http://localhost:6865'
    # party = os.getenv('DAML_LEDGER_PARTY')
    party = 'Operator'

    network = dazl.Network()
    network.set_config(url=url)
    client = network.aio_party(party)

    @client.ledger_ready()
    async def bot_ready(event):
        logging.info(f'user_bot for party {client} is ready')
        contract = { 'operator' : 'Operator', 'party' : 'bank', 'roleType' : 'Lender' }
        return client.submit_create('Main.PartyInvitation', contract)

    network.run_forever()
if __name__ == '__main__':

    main()

This triggers new error messages which the following seems very relevant

File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\protocols\v1\grpc.py", line 159, in grpc_detect_ledger_id
    raise ConnectionTimeoutError(
dazl.model.core.ConnectionTimeoutError: connection timeout exceeded: 120.0 seconds

Full log below

λ poetry run python ./pythontest/test.py -v
Port is open
[   INFO] 2020-10-09 11:36:20,321 | dazl    | No server_port configuration was specified, so metrics and other stats will not be served.
[   INFO] 2020-10-09 11:36:20,325 | dazl    | Network bots coroutine started.
[   INFO] 2020-10-09 11:36:20,325 | dazl    | Party Operator bots coroutine started.
[   INFO] 2020-10-09 11:36:20,326 | dazl    | Establishing a connection to http://localhost:6865 on party Operator...
Traceback (most recent call last):
  File "./pythontest/test.py", line 58, in <module>
    main()
  File "./pythontest/test.py", line 55, in main
    network.run_forever()
  File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\client\api.py", line 318, in run_forever
    self._impl.invoker.loop.run_until_complete(self.aio_run(*coroutines))
  File "C:\Python38\lib\asyncio\base_events.py", line 616, in run_until_complete
    return future.result()
  File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\client\api.py", line 334, in aio_run
    await self._impl.aio_run(*coroutines)
  File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\client\_network_client_impl.py", line 149, in aio_run
    await runner.run()
  File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\client\_network_client_impl.py", line 412, in run
    await self._init([party_impls[party] for party in uninitialized_parties], prev_offset)
  File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\client\_network_client_impl.py", line 479, in _init
    await gather(*(party_impl.connect_in(self.pool) for party_impl in party_impls))
  File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\protocols\autodetect.py", line 59, in connect
    ledger = await self.ledger()
  File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\protocols\autodetect.py", line 156, in _main
    for i, metadata in enumerate(_monitor_ledger_network(conn)):
  File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\protocols\autodetect.py", line 193, in _monitor_ledger_network
    ledger_id = grpc_detect_ledger_id(connection)
  File "C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\protocols\v1\grpc.py", line 159, in grpc_detect_ledger_id
    raise ConnectionTimeoutError(
dazl.model.core.ConnectionTimeoutError: connection timeout exceeded: 120.0 seconds
[  ERROR] 2020-10-09 11:38:21,458 | asyncio | Task was destroyed but it is pending!
task: <Task pending name='Task-3' coro=<BotCollection._main() running at C:\Users\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\client\bots.py:378> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x00000202D8842A60>()]>>
[  ERROR] 2020-10-09 11:38:21,461 | asyncio | Task was destroyed but it is pending!
task: <Task pending name='Task-2' coro=<BotCollection._main() running at C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\client\bots.py:378> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x00000202D88429A0>()]>>
[  ERROR] 2020-10-09 11:38:21,463 | asyncio | Task was destroyed but it is pending!
task: <Task pending name='Task-5' coro=<Bot._main() running at C:\AppData\Local\pypoetry\Cache\virtualenvs\pythontest-PlasE2q--py3.8\lib\site-packages\dazl\client\bots.py:130> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x00000202D8C4CCA0>()]> cb=[LongRunningAwaitable._future_finished()]>

This is very perplexing…are all of these processes running in the same VM (Sandbox, Navigator, Python script)?

The error messages point very strongly to a simple connectivity problem, but the test you added to see if you can connect to the socket seems to disprove that theory—that would have definitely pointed to an obvious network connectivity problem. Separately the rest of your Python code looks perfectly fine. What version of dazl do you happen to be running, and what version of the Sandbox/DAML SDK?

Thanks @dtanabe for looking into this

All processes run on the same VM (Win10)
Not sure if it would have anything to do with this problem, but the environment has a proxy defined (HTTP-PROXY and HTTPS-PROXY) in the environment variables as otherwise DAML would run extremely slow

This is SDK 1.15 and Dazl version 7.2

I don’t believe the HTTP proxy variables would have an effect on dazl; there is an outside chance that Google’s gRPC Python library might use them for something (which dazl depends on) but probably unlikely. Could you try unsetting them in just the shell that you’re launching dazl from to see if that actually does make a difference?

I’m unfortunately running out of ideas; I’ll try to think of places where improved logging might help better point out where the problem potentially is. I’ll also get to work on spinning up a Windows VM… :slight_smile:

@dtanabe The gRPC C++ library which forms the base for the python bindings does implement support for http_proxy, see grpc/http_proxy.cc at 3201d7f7bfca6e6d3883da710c105d3e8167e0a3 · grpc/grpc · GitHub

@cocreature ooh interesting; then perhaps that IS what is to blame…

I have tested the code by removing the HTTP_PROXY and HTTPS_PROXY environment variables but noticed no difference in the outcome.

Is there anything I could add to the code to further troubleshoot the Proxy theory ?

After a new fresh install on a Win10 laptop (not a VDI) I was able to confirm that the issue was related to the HTTP_PROXY and HTTPS_PROXY settings.
Removing the proxies solved the issue but is only a temporary work around for me.
Would you be able to code a more permanent solution for this issue ?

Assuming you meant removing the proxy settings completely from the machine is not an option, would you be able to only disable it for the relevant Python process?

Replacing

python script.py --args

with

https_proxy= python script.py --args

should work: looking at the code @cocreature linked, it will stop looking after finding the https_proxy variable set (i.e. it will not even look at the http_proxy one) and then it will interpret the empty string as “don’t use a proxy”.

Of course, any other way to set the variable just for that process would also work.

Alternatively, the same code seems to honour the no_grpc_proxy and no_proxy variables. In particular, using the no_grpc_proxy one would allow you to disable the proxy for the gRPC calls only, i.e. the same Python script could still make http_proxy-respecting calls otherwise. The format of the no_proxy variable (and I assume the no_grpc_proxy one follows suit) is a comma-delimited list of hostnames.

1 Like

Adding an argument that would allow to bypass the proxy would be a good solution

1 Like

I’ve just opened a PR that exposes this argument a little more directly (though the links that @cocreature and @Gary_Verhaegen have linked to will work with any version of dazl, and indeed, any non-Java implementation of the gRPC libraries as they all use the same underlying Google library).

I’ll update this thread when this PR is merged in and a new release is cut (probably some time tomorrow). Thank you for confirming the proxy was indeed the problem!

I don’t have a great way of testing whether or not this flag has any effect unfortunately, so apologies in advance if this ends up taking a few more rounds to figure out!

1 Like

Released as 7.3.1 on PyPi. Thanks again for raising!

1 Like