DAML image for multiple platforms

Hello,
I am working on a DAML project and recently switched to an M1-based Mac.

Running a docker compose configuration with the daml sandbox and multiple other services, I noticed that the daml sandbox command in the entry point of my container throws a segmentation fault.

daml sandbox --dar daml-*.dar -c sandbox.conf 
>  8 Segmentation fault      daml sandbox --dar daml-*.dar -c sandbox.conf 

Is there any plan for a multi-platform or ARM-based image?
Is there any simple way of running the daml image on ARM-based machines?

Thanks in advance!


The details of the setup:

My Dockerfile for the daml sandbox container:

ARG SDK_VERSION=2.0.0
FROM digitalasset/daml-sdk:${SDK_VERSION}

USER root
RUN apt-get update
RUN apt-get install netcat -y

My docker compose file:

version: '3.8'

services:

  daml-sandbox:
    container_name: daml-sandbox
    platform: linux/arm64    # <--- I tried with and without platform specification
    build:
      context: ../../.
      dockerfile: Dockerfile
    image: daml
    command: ./run-sandbox.sh
    ports:
      - "6865:6865"
      - "10012:10012"
      - "10019:10019"

My run-sandbox.sh entry point:

daml sandbox --dar daml-*.dar -c sandbox.conf  

My sandbox.conf:

canton {
  participants {
    sandbox {
      ledger-api {
        port = 6865
        address = 0.0.0.0
      }
      admin-api {
        port = 10012
        address = 0.0.0.0
      }
    }
  }
  monitoring.health {
    server {
      address = 0.0.0.0
      port = 10019
    }

    check {
      type = ping
      participant = sandbox
      interval = 30s
    }
  }
}

Hi @francescobenintende,

We do not currently provide M1-ready Docker images. Your best bet at this point is to build them yourself; as a starting point, the Dockerfile for these images is here.

I am facing the same segmentation fault. I tried to build a Docker image myself. I updated the Dockerfile, basically changing the first line:

FROM eclipse-temurin:11 -> FROM arm64v8/eclipse-temurin:11

I get the same issue with both variants of the eclipse-temurin base image.

I also tried installing the SDK interactively.

Attempt 1, with arm64v8/eclipse-temurin:11:

ashokraj@ashokrajwhx5j3 ~ % docker run -it --entrypoint sh arm64v8/eclipse-temurin:11
# 
# curl -sSL https://get.daml.com/ | sh
Determining latest SDK version...
Latest SDK version is 2.2.0
Downloading SDK 2.2.0. This may take a while.
######################################################################## 100.0%
Extracting SDK release tarball.
/tmp/tmp.SkEM23Mdu1/sdk/install.sh: line 5:    34 Segmentation fault      "$DIR/daml/daml" install "$DIR" $@
FAILED TO INSTALL!
# exit

Attempt 2, with eclipse-temurin:11:

ashokraj@ashokrajwhx5j3 ~ % docker run -it --entrypoint sh eclipse-temurin:11
Unable to find image 'eclipse-temurin:11' locally
11: Pulling from library/eclipse-temurin
Digest: sha256:e41c0ff25711398daa22288822866e51818707431edd6e515752951b7dd2dcad
Status: Downloaded newer image for eclipse-temurin:11
# curl -sSL https://get.daml.com/ | sh
Determining latest SDK version...
Latest SDK version is 2.2.0
Downloading SDK 2.2.0. This may take a while.
######################################################################## 100.0%
Extracting SDK release tarball.
/tmp/tmp.h21YjZqR4F/sdk/install.sh: line 5: 34 Segmentation fault "$DIR/daml/daml" install "$DIR" $@
FAILED TO INSTALL!

Am I doing something wrong here? What is the solution?

Note: The chip is an Apple M1 Pro, and Docker is the version for Macs with an Apple chip.

A couple of possibilities here. The Daml SDK is most likely failing because it contains amd64 binaries for the VS Code (IDE) integration. I am almost certain that Docker does not handle virtualization within a Docker image.

Try using the --platform linux/amd64 flag to docker run, as suggested here: Run x86 (Intel) and ARM based images on Apple Silicon (M1) Macs? - Docker Desktop for Mac - Docker Community Forums

Following @sormeter's advice, I added this environment variable to make sure the platform is always linux/amd64:
export DOCKER_DEFAULT_PLATFORM=linux/amd64
My docker compose has two services: one for Canton and another for the SDK.
The SDK one is still throwing the segmentation fault.
Here is my docker compose file:

version: '3'

services:
  connect.node:
    image: digitalasset/canton-open-source:latest
    ports:
      - "4011:10011"
      - "4012:10012"
    tty: true # map output to terminal
    environment:
      - CANTON_ALLOCATE_PARTIES=alice;bob
      - CANTON_CONNECT_DOMAINS=mydomain#http://localhost:10018
    command: ["daemon", 
      "-c" , "${CANTON_CONFIG:-config/participant.conf,config/domain.conf}",
      "--bootstrap=config/bootstrap.canton"]
    volumes:
      - ./config:/canton/config

  connect.navigator:
    # This image works ok
    image: digitalasset/daml-sdk:2.1.1
    # Navigator will not load the parties with this image
    # image: digitalasset/daml-sdk:2.1.0-snapshot.20220325.9626.0.4a483381
    # This command works with the snapshot but not the 2.0.0 release
    # command: sh -c "daml ledger navigator --host connect.node --port 10011 --port 4000"
    # We have to use this command with the 2.0.0 release but even then the navigator will not show the parties
    command: sh -c "daml navigator server connect.node 10011"
    ports:
      - "4000:4000"
    depends_on:
      - connect.node
    links:
      - connect.node

And here is the error I am getting:

 ⠿ Container simplestdeployment-connect.node-1       Recreated  0.1s
 ⠿ Container simplestdeployment-connect.navigator-1  Recreated  0.1s
Attaching to simplestdeployment-connect.navigator-1, simplestdeployment-connect.node-1
simplestdeployment-connect.navigator-1  | Segmentation fault
simplestdeployment-connect.navigator-1 exited with code 139
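
For readers puzzled by the exit code: by the usual shell convention, a status above 128 means the process was killed by signal number (status - 128), so 139 corresponds to signal 11. A minimal Python sketch of that decoding follows; the function name `describe_exit_code` is made up for illustration.

```python
import signal

def describe_exit_code(code: int) -> str:
    """Turn a shell/container exit status into a readable description."""
    if code > 128:
        signum = code - 128  # shells report "killed by signal N" as 128 + N
        try:
            name = signal.Signals(signum).name
        except ValueError:
            return f"killed by unknown signal {signum}"
        return f"killed by {name} (signal {signum})"
    return f"exited normally with status {code}"

print(describe_exit_code(139))  # killed by SIGSEGV (signal 11)
```

This only confirms that the container died from a segmentation fault; it says nothing about which binary or instruction caused it.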

I’m a bit confused by this last message. It seems to say things work with 2.1.1, but not prior versions. Is that the case? If so, what is preventing you from using 2.1.1 (or better yet, 2.2.0)?

I’m also on Apple M1 and running into the same error (FAILED TO INSTALL!) when trying to install the SDK. Were you able to get past this error?

Housekeeping note. The answer to the post by @stephenwsun is available on this thread.

I’ve been struggling with the same issue and have tried all of the similar approaches, but it keeps segfaulting. DAML runs fine natively on osx, so what could be the issue here? Whether I use a full ARM stack or a full x86 stack with Rosetta, it should work either way. What can we do to debug this further, see where the issue is, and try to get around it?

Thanks for any advice!

Hi @daquino,

This thread is starting to be a bit confusing. Would you mind opening a new thread that explains precisely what you have tried that you expected to work and how it failed instead? Please include all relevant version numbers if possible.

I think it would be better to keep this thread? It has plenty of the exact info needed.

The simplest explanation is that daml segfaults when running in containers on osx but runs fine natively on osx.

Examples using official containers from docker hub.

2.4.0:

$ docker run --platform linux/amd64 -it --rm -v "$PWD:/data" digitalasset/daml-sdk:2.4.0 sh -c 'cd /data && daml test'

latest image I see on dockerhub:

$ docker run --platform linux/amd64 -it --rm -v "$PWD:/data" digitalasset/daml-sdk:2.6.0-snapshot.20221226.11190.0.71548477 sh -c 'cd /data && daml test'
Segmentation fault

As stated above, many of us tried to build the container as arm64 as well, but that didn’t help at all.

Running natively on osx it works fine:

$ daml test

Test Summary

src/main/daml/Setup.daml:setup: ok, 0 active contracts, 0 transactions.
src/main/daml/Tests/IssuanceTest.daml:issuer_scenario: ok, 11 active contracts, 11 transactions.
src/main/daml/Tests/PlanSecurityIssuanceTest.daml:issuer_scenario: ok, 8 active contracts, 7 transactions.
test coverage: templates 75%, choices 35%

I tried with different

I believe part of this relates to the following:

  • The native daml-sdk runs as a native macOS binary. Rosetta handles the Intel-to-M1 translation that allows the daml binaries to run:
file  ~/.daml/bin/daml
/Users/edwardnewman/.daml/bin/daml: Mach-O 64-bit executable x86_64
  • Docker on macOS uses a (hidden) Linux VM to run the containers and therefore expects a linux/ image. The daml-sdk images are only linux/amd64, so my assumption is that some issue with Docker's translation from linux/amd64 to linux/arm64 causes the segfault. I think Docker uses QEMU, and depending on which emulation is installed, this may not be 100% compatible.

I have tried on Podman as well, so this seems like a fundamental issue with the QEMU emulator:

podman inspect digitalasset/daml-sdk:2.2.1 | jq '.[].Architecture'
"amd64"
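
The same check can be scripted. The sketch below assumes the inspect command prints a JSON array of objects with an "Architecture" field, as the jq expression above implies; the function name `needs_emulation` and the sample data are made up for illustration.

```python
import json
import platform
from typing import Optional

# Translate platform.machine() spellings into Docker/OCI architecture names.
HOST_TO_OCI = {"x86_64": "amd64", "AMD64": "amd64", "aarch64": "arm64", "arm64": "arm64"}

def needs_emulation(inspect_output: str, host_machine: Optional[str] = None) -> bool:
    """Return True if any inspected image targets a different CPU than the host."""
    machine = host_machine or platform.machine()
    host_arch = HOST_TO_OCI.get(machine, machine)
    images = json.loads(inspect_output)
    return any(image["Architecture"] != host_arch for image in images)

# Sample shaped like the inspect output queried above (other fields omitted).
sample = '[{"Architecture": "amd64", "Os": "linux"}]'
print(needs_emulation(sample, host_machine="arm64"))   # True
print(needs_emulation(sample, host_machine="x86_64"))  # False
```

A True result means the container can only run through some emulation layer on that host, which is where the segfault is suspected to originate.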

I suspect this requires retooling to build arm64 images of Daml in addition to amd64.

Is the use of Docker for running the developer tools (Daml SDK) solely a concern around local installation or something else?

Yes, OSX on ARM has driven a lot of folks crazy. Rosetta is supposed to automatically kick in and just work.

A few of us above also built ARM images and the same thing still happens. In that case Rosetta shouldn’t be needed at all.

Using the same Dockerfile used to build the official images with simply the base image changed to be ARM based:

Changed first line:

FROM arm64v8/eclipse-temurin:11
#FROM eclipse-temurin:11

Build forcing the right --platform setting:

docker build --platform linux/arm64/v8 -f Dockerfile.arm -t test .
.....
 > [4/5] RUN curl https://get.daml.com | sh -s $VERSION     && printf "auto-install: false\nupdate-check: never\n" >> /home/daml/.daml/daml-config.yaml:
#7 0.521 Determining latest SDK version...
#7 0.723 Latest SDK version is 2.5.0
#7 0.727 Downloading SDK 2.5.0. This may take a while.
######################################################################## 100.0%
#7 62.77 Extracting SDK release tarball.
...
...
...
#7 70.46 /tmp/tmp.Nq3Itsxri6/sdk/install.sh: line 5:    33 Segmentation fault      "$DIR/daml/daml" install "$DIR" $@
#7 70.46 FAILED TO INSTALL!
------
executor failed running [/bin/sh -c curl https://get.daml.com | sh -s $VERSION     && printf "auto-install: false\nupdate-check: never\n" >> /home/daml/.daml/daml-config.yaml]: exit code: 139

You can see that it fails with a segfault when trying to run daml install.

So the issue is likely that, whilst you are creating a linux/arm64 base image, the binaries pulled down via get.daml.com are really linux/amd64. You might confirm this by running your image in a bash shell and then running file on the binaries.
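
If the file utility is not installed in the image, the target architecture can also be read straight from the ELF header. The following is a minimal sketch, not part of the SDK; the function names are made up, and only a handful of machine codes are mapped.

```python
import struct

# A few e_machine values from the ELF specification.
ELF_MACHINES = {3: "x86", 40: "arm", 62: "x86-64", 183: "aarch64"}

def elf_machine(header: bytes) -> str:
    """Return the target architecture named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown ({machine})")

def elf_machine_of(path: str) -> str:
    """Read just enough of the file at `path` to identify its architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Calling `elf_machine_of` on the daml executable inside the container would be expected to report x86-64 if the installer pulled the Intel build, even though the base image is arm64.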

That could be a good idea. I guess most of us were under the impression that it would detect the architecture and pull down the right versions since it worked natively on osx. I’ll give it a shot.

So I looked at the install script at get.daml.com, and it appears that it only checks the version and OS to install for (windows, linux, mac) and then pulls that from GitHub releases (Releases · digital-asset/daml · GitHub).

Downloading the releases manually to check, I see:
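
To make the consequence concrete, here is a small Python sketch of a selection keyed on version and OS only. It is an illustration of the behaviour described, not the actual install script; the OS labels and the function name `select_release` are hypothetical.

```python
# Hypothetical OS labels; the names used by the real script may differ.
OS_LABELS = {"Linux": "linux", "Darwin": "macos", "Windows": "windows"}

def select_release(version: str, system: str, machine: str) -> str:
    """Choose a release from version and OS only; `machine` is deliberately ignored."""
    return f"{version}-{OS_LABELS[system]}"

# An arm64 Linux container and an amd64 Linux host resolve to the same release.
same = select_release("2.5.0", "Linux", "aarch64") == select_release("2.5.0", "Linux", "x86_64")
print(same)  # True
```

With no architecture in the key, changing the base image to arm64 cannot change which binaries get downloaded.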

# sdk-2.6.0-snapshot.20221226.11190.0.71548477
21:31 $ file daml/daml
daml/daml: Mach-O 64-bit executable x86_64

Based on how the linux version is run, I checked both of these files:

$ file daml/lib/ld-linux-x86-64.so.2
daml/lib/ld-linux-x86-64.so.2: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), static-pie linked, BuildID[sha1]=db50353a26600bb848b9a5541b1506e0a24cb34b, not stripped

$ file daml/lib/daml
daml/lib/daml: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter ld-undefined.so, for GNU/Linux 2.6.32, stripped

So it seems like there is no explicit ARM support even in OSX? I was expecting to see a Universal binary for osx but I guess even on native osx it must be using Rosetta…?

We currently only publish Intel versions of the SDK (amd64/x86_64).

On M1 Macs, the SDK runs through Rosetta. You can easily see that in daily use from the delay when a new version of the SDK is run for the first time (that’s the time it takes Rosetta to “translate” the binary). Or, of course, by using file on the binaries under ~/.daml.
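
For anyone who wants to script that check, the distinction between a single-architecture (thin) Mach-O binary and a Universal one is visible in the first eight bytes of the file. This is a rough sketch with a made-up function name; it only recognises little-endian 64-bit Mach-O and big-endian fat headers, and the fat magic is shared with Java class files, so it should only be pointed at files known to be executables.

```python
import struct

# CPU type constants from the Mach-O headers (CPU_ARCH_ABI64 | base type).
MACHO_CPU = {0x01000007: "x86_64", 0x0100000C: "arm64"}

def macho_kind(header: bytes) -> str:
    """Classify a file from its first 8 bytes as thin or universal Mach-O."""
    if len(header) < 8:
        return "too short to classify"
    magic = header[:4]
    if magic == b"\xcf\xfa\xed\xfe":  # MH_MAGIC_64 as stored on little-endian Macs
        (cputype,) = struct.unpack_from("<I", header, 4)
        return "thin " + MACHO_CPU.get(cputype, hex(cputype))
    if magic == b"\xca\xfe\xba\xbe":  # FAT_MAGIC, always big-endian
        (count,) = struct.unpack_from(">I", header, 4)
        return f"universal ({count} architectures)"
    return "not a recognised Mach-O file"

print(macho_kind(b"\xcf\xfa\xed\xfe" + struct.pack("<I", 0x01000007)))  # thin x86_64
```

A "thin x86_64" result on an M1 Mac means the binary can only be running through Rosetta.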

We do not currently publish any arm64 binaries for any platform. So far our experience has been that the Intel SDK works well under Rosetta, and the use-case for an SDK Docker image remains elusive. Demand for arm64 support on Linux is also very low outside of the Docker use-case.

Ok that makes sense.

Although I would say the use case for docker in general is very strong.

We have existing docker-compose configurations that we’re using for all of our workflows for our other products which really unifies and streamlines our development. We use the same workflow to deploy on our shared instances, ci pipelines, as we do locally via compose. It automates and encapsulates our workflow instead of relying on local laptop state.

I think the next question to figure out is how we debug why it’s segfaulting inside of docker but not natively on osx. A lot of software has gone through a similar issue with osx’s move to ARM, so it shouldn’t be too hard to figure out. Rosetta is kicking in and working for the rest of the container and all binaries inside of it, but something is failing when it comes to Daml.

Any idea how we could capture a stack trace or something along those lines to figure out exactly which instructions it’s failing on?

Thanks for the attention on this so far! I was worried we’d never get any traction on this, so I’m happy to see some dialogue here.

My understanding is that Docker runs in a stealthy Linux VM, and therefore Rosetta has nothing to do with it. I assume it is segfaulting because it is a Linux amd64 binary running on a Linux arm64 (virtual) machine.

It’s a Haskell binary and the Haskell runtime is not big on stack traces as far as I’m aware. I’m not a proficient Haskell developer myself so there may be techniques I’m not aware of. But I don’t think this is the right direction of investigation; I think what we’d want here is to build the SDK for Linux arm64. We don’t currently have that setup (we do not have Linux arm64 machines as part of our CI fleet and Haskell (GHC) does not, as far as I know, do cross compilation), so I can’t comment on how hard or easy it would be. Maybe it’s just a matter of running the normal build on an arm64 Linux machine; maybe that won’t work at all and there are lots of changes to make.

Can you elaborate on that? Specifically, why do you need the SDK as part of that workflow?