"No access to: class..." error when attempting to start a domain in Canton

Hi.

I’m working on a Canton deployment using canton-enterprise-0.27.0 inside a Docker container, and I’m running into an issue when attempting to start a domain.

Canton starts as expected, but when running domain start the following error comes up and Canton crashes:

ERROR c.d.c.e.EnterpriseEnvironment - A fatal error has occurred in canton-env-execution-context. Terminating immediately.
java.lang.BootstrapMethodError: java.lang.IllegalAccessError: Class 'scala.util.Random' no access to: class 'scala.collection.immutable.LazyList$State'
        at scala.util.Random.alphanumeric(Random.scala:252)
        at com.digitalasset.canton.crypto.PseudoRandom$.randomAlphaNumericString(Random.scala:55)
        at com.digitalasset.canton.identity.IdentityElementId$.generate(IdentityTransaction.scala:354)
        at com.digitalasset.canton.identity.IdentityTransaction$.createAdd(IdentityTransaction.scala:966)
        at com.digitalasset.canton.environment.CantonNodeBootstrapBase.auth(CantonNodeBootstrap.scala:257)
        at com.digitalasset.canton.domain.DomainBootstrap.$anonfun$autoInitializeIdentity$10(Domain.scala:141)
        at com.digitalasset.canton.domain.DomainBootstrap$$Lambda$2468/00000000E834BCB0.apply(Unknown Source)
        at cats.data.EitherT.$anonfun$flatMap$1(EitherT.scala:391)
        at cats.data.EitherT$$Lambda$2333/000000007FF8F0F0.apply(Unknown Source)
        at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:434)
        at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1413)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:300)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1067)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1703)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:168)
Caused by: java.lang.IllegalAccessError: Class 'scala.util.Random' no access to: class 'scala.collection.immutable.LazyList$State'
        at java.lang.invoke.MethodHandle.sendResolveMethodHandle(MethodHandle.java:1084)
        at java.lang.invoke.MethodHandle.getCPMethodHandleAt(Native Method)
        at java.lang.invoke.MethodHandle.getAdditionalBsmArg(MethodHandle.java:867)
        at java.lang.invoke.MethodHandle.resolveInvokeDynamic(MethodHandle.java:946)
        ... 15 common frames omitted
Caused by: java.lang.IllegalAccessException: Class 'scala.util.Random' no access to: class 'scala.collection.immutable.LazyList$State'
        at java.lang.invoke.MethodHandles$Lookup.checkClassAccess(MethodHandles.java:449)
        at java.lang.invoke.MethodHandles$Lookup.accessCheckArgRetTypes(MethodHandles.java:686)
        at java.lang.invoke.MethodHandle.sendResolveMethodHandle(MethodHandle.java:1058)
        ... 18 common frames omitted

Apparently it is not able to reach one of the classes but I wonder why that might be. Is there a dependency missing?

The image contains the JAR and a config file. Is anything else needed?

Thanks!
Matheus

Hi @Matheus

Can you help us reproduce the error? I tried it on my (osx) laptop, but it seems to work fine:

ratkoveprekcmg8wl:~ ravep$ docker run -it  digitalasset-canton-enterprise-docker.jfrog.io/digitalasset/canton-enterprise:0.27.0
Compiling /canton/(console)
   _____            _
  / ____|          | |
 | |     __ _ _ __ | |_ ___  _ __
 | |    / _` | '_ \| __/ _ \| '_ \
 | |___| (_| | | | | || (_) | | | |
  \_____\__,_|_| |_|\__\___/|_| |_|

  Welcome to Canton!
  Type `help` to get started. `exit` to leave.


@ mydomain.start()

@ mydomain.health.status
res1: com.digitalasset.canton.health.admin.data.NodeStatus[com.digitalasset.canton.health.admin.data.DomainStatus] = Domain id: mydomain::1220678b58f0c61e5adff797abd83553d4755bccbaaf83b2a8fe8b9f27e6b3599ec1
Uptime: 13.11351s
Ports:
        public: 5018
        admin: 5019
Connected Participants: None
Sequencer: Some(SequencerHealthStatus(isActive = true))

Thanks!
Ratko

Hi @Ratko_Veprek
We are not using the Docker image available in DA’s JFrog repository; we are building our own from the artifacts in the Canton Enterprise zip.
Apparently I was using the wrong Java version (8). After switching to Java 11, this is all I get when running bin/canton -c participant.conf --debug:

INFO  c.d.c.CantonEnterpriseApp$ - Starting Canton version 0.27.0
INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
INFO  c.d.c.e.EnterpriseEnvironment - Creating ForkJoinPool with parallelism = 2 (instead of 1) to avoid starvation.
INFO  a.e.s.Slf4jLogger - Slf4jLogger started
INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
INFO  c.d.c.e.EnterpriseEnvironment tid:0ac02f867df67e90f90b81c34b390402 - Manual start requested.

Then the process seems to finish without any errors. Is this expected? How can I check whether it is running in this variant?

Thanks again.
Matheus

Hi @Matheus

Please note that we test against Java 11 and require Java 11 or higher (see the compatibility matrix at https://github.com/digital-asset/canton/releases/tag/v0.27.0). This is due to a bug in the JRE libraries before Java 11 that leads to premature thread termination in the ForkJoinPool.
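
The Java requirement above can be checked before launching. The following is a minimal shell sketch, not something shipped with Canton: the helper names `java_major` and `java_ok_for_canton` are made up for illustration, and it assumes `java -version` prints the usual quoted version string (for example "1.8.0_292" on Java 8 or "11.0.12" on Java 11).

```shell
#!/bin/sh
# Sketch: check the Java major version before launching Canton.

# Map a version string to its major version:
#   "1.8.0_292" -> 8 (legacy 1.x scheme), "11.0.12" -> 11, "17-ea" -> 17
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
  esac
}

# Succeed only for Java 11 or newer, as required for Canton 0.27.0.
java_ok_for_canton() {
  major=$(java_major "$1")
  [ -n "$major" ] && [ "$major" -ge 11 ] 2>/dev/null
}

# `java -version` prints to stderr; pull out the quoted version string.
if command -v java >/dev/null 2>&1; then
  version=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  if java_ok_for_canton "$version"; then
    echo "Java $version: OK for Canton 0.27.0"
  else
    echo "Java $version: too old, Canton 0.27.0 needs Java 11 or newer"
  fi
fi
```

Running a check like this in the image build would have flagged the Java 8 runtime before the domain was started.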

To your question:

  1. If you start ./bin/canton -c participant.conf, it should start in interactive mode and you should get to the console prompt.
  2. Are you running ./bin/canton -c participant.conf within Docker? That starts the interactive mode, which only works in Docker if you run the container with docker run -it ....
  3. Alternatively, for Docker, you should invoke ./bin/canton daemon -c participant.conf. You can then use a remote console to connect to your node.
  4. If you are not running it in Docker, then please check the log file log/canton.log to see what the last sign of life was.
  5. When running in Docker, you might want to use --log-profile=container to turn on logging to STDOUT as JSON. The --log-... command line arguments allow you to tune the logging output.
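
The choice between the interactive console and daemon mode described in the list can be sketched as a small wrapper. This is only an illustration: `canton_args` is a hypothetical helper, and combining `daemon` with `--log-profile=container` is an assumption based on the list above. The sketch only prints the command it would run.

```shell
#!/bin/sh
# Sketch: pick Canton launch arguments depending on whether a terminal exists.

# $1 = config file, $2 = "yes" if stdin is a terminal, anything else otherwise.
canton_args() {
  if [ "$2" = "yes" ]; then
    # Interactive console: needs a TTY (e.g. docker run -it ...).
    echo "-c $1"
  else
    # No TTY: run as a daemon and log to STDOUT as JSON.
    echo "daemon -c $1 --log-profile=container"
  fi
}

has_tty=no
if [ -t 0 ]; then has_tty=yes; fi
echo "would run: ./bin/canton $(canton_args participant.conf "$has_tty")"
```

The `[ -t 0 ]` test is true only when standard input is a terminal, which is what `docker run -it` provides.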

Please let me know what you find.

Best,
Ratko

Hi @Ratko_Veprek

Thanks for the reply. Starting with the command mentioned in 1., this is the result after a few seconds:

sh-4.4$ ./bin/canton -c examples/01-simple-topology/simple-topology.conf 
sh-4.4$

  2. We’re running this in a Kubernetes pod that runs Docker images. Unfortunately, Docker itself is not present in the image, so we cannot run it from within the pod.

When providing the canton-enterprise:0.27.0 image from Digital Asset’s repository “directly” to be run in the pod this error pops up in the terminal:

standard_init_linux.go:219: exec user process caused: exec format error

  4. This is the content of log/canton.log for the variant mentioned in 1.:
2021-09-08 10:41:34,313 [main] INFO  c.d.canton.CantonEnterpriseApp$ - Starting Canton version 0.27.0
2021-09-08 10:41:35,098 [main] INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
2021-09-08 10:41:35,101 [main] INFO  c.d.c.e.EnterpriseEnvironment - Creating ForkJoinPool with parallelism = 2 (instead of 1) to avoid starvation.
2021-09-08 10:41:35,369 [canton-env-execution-context-19] INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2021-09-08 10:41:35,510 [main] INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
2021-09-08 10:41:35,699 [main] INFO  c.d.c.e.EnterpriseEnvironment tid:a778f4e96ce78237234707d1ee46ef79 - Manual start requested.

  5. Running it with --log-profile=container results in:
sh-4.4$ bin/canton -c examples/01-simple-topology/simple-topology.conf --log-profile=container
INFO  c.d.c.CantonEnterpriseApp$ - Starting Canton version 0.27.0
INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
INFO  c.d.c.e.EnterpriseEnvironment - Creating ForkJoinPool with parallelism = 2 (instead of 1) to avoid starvation.
INFO  a.e.s.Slf4jLogger - Slf4jLogger started
INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
INFO  c.d.c.e.EnterpriseEnvironment tid:30bda102eacae04df462e42593ba4bfa - Manual start requested.
sh-4.4$

I believe it’s also worth mentioning that we are running on the s390x architecture, and we are using an OpenJDK 11 image to build “our” canton-enterprise image.

Cheers,
Matheus

By any chance, does starting Canton with

bin/canton -v --no-tty -c examples/01-simple-topology/simple-topology.conf

make any difference?

Based on the output I’d say it fails to initialise the “interactive console”.

For production operations, I would recommend running the console separately from the node process anyway. In that case you would start with

./bin/canton daemon -v -c examples/01-simple-topology/simple-topology.conf --bootstrap=examples/01-simple-topology/simple-ping.canton

If the latter works, then it’s the interactive console for sure.

The first command didn’t change the behaviour.

The second, however, produced an exception:

ERROR c.d.c.ServerRunner - An internal error occurred while executing script.
java.lang.ExceptionInInitializerError: null
        at ammonite.Main$.apply$default$5(Main.scala:68)
        at com.digitalasset.canton.console.HeadlessConsole$.createDefaultAmmoniteOptions(HeadlessConsole.scala:59)
        at com.digitalasset.canton.console.HeadlessConsole$.$anonfun$apply$1(HeadlessConsole.scala:46)
        at better.files.Dispose.apply(Dispose.scala:81)
        at com.digitalasset.canton.console.HeadlessConsole$.apply(HeadlessConsole.scala:45)
        at com.digitalasset.canton.ConsoleScriptRunner$.run(Runner.scala:169)
        at com.digitalasset.canton.ServerRunner.startWithBootstrap$1(Runner.scala:51)
        at com.digitalasset.canton.ServerRunner.$anonfun$run$3(Runner.scala:53)
        at com.digitalasset.canton.ServerRunner.$anonfun$run$3$adapted(Runner.scala:53)
        at scala.Option.fold(Option.scala:263)
        at com.digitalasset.canton.ServerRunner.run(Runner.scala:53)
        at com.digitalasset.canton.CantonAppDriver.delayedEndpoint$com$digitalasset$canton$CantonAppDriver$1(CantonAppDriver.scala:145)
        at com.digitalasset.canton.CantonAppDriver$delayedInit$body.apply(CantonAppDriver.scala:28)
        at scala.Function0.apply$mcV$sp(Function0.scala:39)
        at scala.Function0.apply$mcV$sp$(Function0.scala:39)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
        at scala.App.$anonfun$main$1(App.scala:76)
        at scala.App.$anonfun$main$1$adapted(App.scala:76)
        at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
        at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:919)
        at scala.App.main(App.scala:76)
        at scala.App.main$(App.scala:74)
        at com.digitalasset.canton.CantonAppDriver.main(CantonAppDriver.scala:28)
        at com.digitalasset.canton.CantonEnterpriseApp.main(CantonEnterpriseApp.scala)
Caused by: java.lang.IllegalArgumentException: requirement failed: ? is not an absolute path
        at scala.Predef$.require(Predef.scala:338)
        at os.Path.<init>(Path.scala:430)
        at os.Path$.apply(Path.scala:388)
        at os.package$.<clinit>(package.scala:19)
        ... 25 common frames omitted
ERROR c.d.c.ServerRunner - Unexpected error while running server: null

Oh dear. I suspect that you may have hit a bug with our script support on this architecture which may require a patch to fix.

I assume running the above without the --bootstrap option is successful? (But with the problem that we can’t connect participants to domains.)

I’ll check in with the team in the meantime.

So, I’ve tried running Canton on S390 using IBM’s LinuxONE Community Cloud:

[linux1@cantontest canton-enterprise-0.27.0]$ cat /proc/cpuinfo  | grep IBM/S390
vendor_id       : IBM/S390
[linux1@cantontest canton-enterprise-0.27.0]$ ./bin/canton -c examples/01-simple-topology/simple-topology.conf
   _____            _
  / ____|          | |
 | |     __ _ _ __ | |_ ___  _ __
 | |    / _` | '_ \| __/ _ \| '_ \
 | |___| (_| | | | | || (_) | | | |
  \_____\__,_|_| |_|\__\___/|_| |_|

  Welcome to Canton!
  Type `help` to get started. `exit` to leave.


@ nodes.local.start()


@ participants.all.domains.connect_local(mydomain)


@ participant1.health.ping(participant2)
res2: concurrent.duration.Duration = 2757 milliseconds

@ participant1.health.ping(participant2)
res3: concurrent.duration.Duration = 431 milliseconds

It looks healthy. At first glance, it seems to work and I don’t observe any abnormalities.

Now, going back to the exceptions you’ve reported. The exception

at ammonite.Main$.apply$default$5(Main.scala:68)

says that this line of code here is throwing:

Which brings me to the question: do you run this in a way so that the process does not have any access to the disk?

The ammonite console we are embedding in Canton will cache a few files on your file system and it seems that a trivial operation such as determining the absolute path of the current directory fails with an exception.

Can you confirm that the file system is writable and discoverable to the JVM process?

So, I can now more or less reproduce your issue by running Canton in a read-only Docker setup:

docker run --read-only --rm -it -v $PWD/log:/canton/log digitalasset/canton-enterprise:dbg -c log/nope.conf,examples/01-simple-topology/simple-topology.conf --log-level-stdout=TRACE --log-level-root=TRACE

Ammonite requires the ability to create temporary files in order to compile the scripts that we pass as bootstrap or the commands that we type into the console. This is very deep down in the library (it uses java.nio.file.Files.createTempFile) and we cannot work around it. Even if I turn off caching of the compiled artefacts (which is the first error you have been hitting), the issue will persist.
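
A quick way to check the point above from inside the container is to try creating a file, rather than trusting permission bits. The sketch below is a shell probe, not anything shipped with Canton: `can_create_file_in` and the `canton-probe` file name are made up for illustration. It checks the working directory, /tmp (the JVM's default java.io.tmpdir on Linux), and HOME. HOME is included only on the assumption that the console caches files somewhere under the user's home directory.

```shell
#!/bin/sh
# Sketch: probe whether files can really be created in a directory.

# Succeed if "$1" is a directory and a temporary file can be created in it.
can_create_file_in() {
  [ -d "$1" ] || return 1
  probe=$(mktemp "$1/canton-probe.XXXXXX" 2>/dev/null) || return 1
  rm -f "$probe"
}

# Report on the locations a console process is likely to write to.
for dir in "$PWD" "/tmp" "${HOME:-}"; do
  if can_create_file_in "$dir"; then
    echo "writable:     $dir"
  else
    echo "NOT writable: $dir"
  fi
done
```

On a container started with a read-only root file system, a probe like this should report every location as not writable unless a volume or tmpfs is mounted there.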

However, @davidpadbury’s workaround from above is actually fine. So I guess you have the following options:

  1. Start Canton in daemon mode: ./bin/canton daemon -c <config> but do not pass a --bootstrap=... file and do not start it in interactive mode.
  2. Make the system less locked down and give Canton a home where it can create temporary files.

If you omit the --bootstrap option, you can still use a remote console that has access to the process admin-api in order to run scripts against the node in daemon mode. Have a look at examples/03-advanced-configuration/remote for how to configure a remote participant.

Cheers,
Ratko

I removed all access control on the file system, but the behavior didn’t change, so I went with option 1.

Canton then seems to launch in daemon mode, but how do I actually run the script from the participant node if it runs on an identical setup and also has to run in daemon mode?

Thanks,
Matheus

Hey @Matheus,

In short, I think the Canton console needs a writable file system to run. The process running the console can be remote from the servers themselves, but it needs a writable file system somewhere for as long as it’s running commands.

Could you share a little more about your setup? I’m optimistic that maybe we could work around this by using a ram disk of some form as this doesn’t actually need to write much, so perhaps is achievable with tmpfs via docker or a plain mount.

David.

Hi @davidpadbury
We’re running it in a Kubernetes cluster; the image we build runs in a container within a pod.

It is possible to mount persistent volumes that the application can write to. Do you know which path it attempts to write to? I’d be happy to give it a try.

Thanks in advance,
Matheus

Hey @Matheus,

Okay - thanks for the info.

So I’d suggest creating a writable volume and mounting it into your pod and container. It looks like you could also easily create a tmpfs volume in Kubernetes, which should be sufficient for these purposes.

The tricky bit is that the chunk of code that is failing uses a Java API to create temporary files and directories. I think we can control where those are created by setting the Java system property java.io.tmpdir. We can do this when launching Canton, either directly or with the Docker container, by setting the environment variable:

JAVA_OPTS="-Djava.io.tmpdir=/path/to/your/writeable/mount"
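
As a sketch of how this could be wired into a launch script, the snippet below appends the property to any existing JAVA_OPTS value and exports the result. `with_tmpdir` is a hypothetical helper, the path is the placeholder from the line above, and the whole approach assumes that bin/canton passes JAVA_OPTS on to the JVM.

```shell
#!/bin/sh
# Sketch: point the JVM temp directory at a writable mount via JAVA_OPTS.

# $1 = existing JAVA_OPTS value (may be empty), $2 = temp directory.
with_tmpdir() {
  if [ -n "$1" ]; then
    echo "$1 -Djava.io.tmpdir=$2"
  else
    echo "-Djava.io.tmpdir=$2"
  fi
}

# /path/to/your/writeable/mount is a placeholder; replace it with the real mount.
JAVA_OPTS=$(with_tmpdir "${JAVA_OPTS:-}" "/path/to/your/writeable/mount")
export JAVA_OPTS   # exported so that child processes such as bin/canton see it
echo "JAVA_OPTS=$JAVA_OPTS"
```

The java launcher itself does not read JAVA_OPTS, so to confirm the value the JVM would end up with, pass the options explicitly: `java $JAVA_OPTS -XshowSettings:properties -version 2>&1 | grep java.io.tmpdir`.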

Let me know how that works for you. Good luck!

David.

Hi @davidpadbury ,
Unfortunately, not much difference. In the logs below, the mounted volume is at /home/app/tmp (Canton resides in /home/app). Canton didn’t launch in interactive mode, but from the terminal I was able to create files under /home/app/tmp. The JAVA_OPTS variable is also set:

sh-4.4$ pwd
/home/app
sh-4.4$ ls
LICENSE.txt  bin   daml  demo        di-execution-integration.dar  di-execution-test.dar  lib  start-demo-win.cmd  start.sh                   tmp
README.md    conf  dars  deployment  di-execution-main.dar         examples               log  start-demo.command  third-party-licenses.html
sh-4.4$ echo $JAVA_OPTS
-Djava.io.tmpdir=/home/app/tmp
sh-4.4$ bin/canton -c examples/01-simple-topology/simple-topology.conf 
sh-4.4$ mkdir tmp/folder
sh-4.4$ echo "testfile" > tmp/folder/test.txt
sh-4.4$ cat tmp/folder/test.txt 
testfile
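
One thing worth ruling out in a session like the one above: `echo $JAVA_OPTS` prints the variable even when it is only set in the current shell, while a child process such as bin/canton sees it only if it has been exported. The sketch below shows a way to tell the two cases apart. `is_exported` is a made-up helper, and nothing in this thread establishes that a missing export is the cause here.

```shell
#!/bin/sh
# Sketch: tell a shell-local variable apart from an exported one.

# Succeed if the variable named "$1" is visible to child processes.
# `env` is an external command, so it only lists exported variables.
is_exported() {
  env | grep "^$1=" >/dev/null
}

if is_exported JAVA_OPTS; then
  echo "JAVA_OPTS is exported"
else
  echo "JAVA_OPTS is not exported; child processes will not see it"
fi
```

If the variable is set through the pod or container definition it is already part of the environment, in which case this check simply confirms that.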

Is there anything I missed?

Thanks once again,
Matheus

Oh dear. What output did you get when canton launched and could you share the log file?

Actually on that thought… When launching canton that way it will try writing a log file to log/canton.log which is maybe problematic given that location won’t be writable. Can you try adding --log-file-name=$PWD/tmp/canton.log just to make sure that ends up somewhere writable too?

That is the weird part to me: there is no output. It executes and “finishes” after a second or two.

I ran it again with --log-file-name=$PWD/tmp/canton.log :

sh-4.4$  bin/canton -c examples/01-simple-topology/simple-topology.conf --log-file-name=$PWD/tmp/canton.log
sh-4.4$ cat $PWD/tmp/canton.log
2021-09-10 17:47:22,411 [main] INFO  c.d.canton.CantonEnterpriseApp$ - Starting Canton version 0.27.0
2021-09-10 17:47:23,141 [main] INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
2021-09-10 17:47:23,143 [main] INFO  c.d.c.e.EnterpriseEnvironment - Creating ForkJoinPool with parallelism = 2 (instead of 1) to avoid starvation.
2021-09-10 17:47:23,343 [canton-env-execution-context-19] INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2021-09-10 17:47:23,392 [main] INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
2021-09-10 17:47:23,485 [main] INFO  c.d.c.e.EnterpriseEnvironment tid:cdc2b1bbc3fa55e17a0aec41db02fa97 - Manual start requested.

Great… I assume the process is exiting unsuccessfully? Could you echo $? immediately after canton finishes?
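
Since `$?` is overwritten by every subsequent command, it can help to capture it in the same step as the run. A minimal sketch follows; `run_and_report` is a hypothetical wrapper, and the `sh -c 'exit 1'` call is only a stand-in for the real Canton command.

```shell
#!/bin/sh
# Sketch: run a command and capture its exit status right away.

# Run "$@", print its exit status, and return that same status.
run_and_report() {
  status=0
  "$@" || status=$?
  echo "exit code: $status"
  return "$status"
}

# Stand-in command for demonstration; prints "exit code: 1".
run_and_report sh -c 'exit 1' || true
```

With the wrapper defined, the real invocation would look like `run_and_report bin/canton -c examples/01-simple-topology/simple-topology.conf --debug`.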

Could you also add --debug to see if we can capture anything useful in the log?

Sorry about this :grimacing:

No need to apologise, I am really grateful for the support :slight_smile:

The output when running with --debug, followed by echo $?:

sh-4.4$  bin/canton -c examples/01-simple-topology/simple-topology.conf --log-file-name=$PWD/tmp/canton.log --debug
INFO  c.d.c.CantonEnterpriseApp$ - Starting Canton version 0.27.0
INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
INFO  c.d.c.e.EnterpriseEnvironment - Creating ForkJoinPool with parallelism = 2 (instead of 1) to avoid starvation.
INFO  a.e.s.Slf4jLogger - Slf4jLogger started
INFO  c.d.c.e.EnterpriseEnvironment - Deriving 1 as number of threads from 'sys.runtime.availableProcessors()'. Please use '-Dscala.concurrent.context.numThreads' to override.
INFO  c.d.c.e.EnterpriseEnvironment tid:2e24416c2789880eb605da153f34cc21 - Manual start requested.
sh-4.4$ echo $?
1

And nothing more in the logs?