Should we maintain a daml 'base' or 'prelude-next' library?

Luciano · November 4, 2020, 10:15am

In the Prelude we have a number of modules under Next.

I was wondering whether it would be worth having a separate pure-DAML git repository to test out possible extensions that would be candidates for inclusion in Prelude.Next in the future.

The purpose of this would be to foster discussion between the DA language team and community, so there is some guidance on what would/wouldn’t be accepted, taking into account for instance performance and other subtleties that DAML users may not be aware of. And of course, also to reduce code duplication.

I ask this because I find myself re-writing (mainly Haskell) standard library modules for use in my projects. It would be nice if we could package things like monad transformers and other category-typeclasses e.g. Alternative into a common package.

Perhaps there already exists a process for this internally that I’m not aware of? In which case, I’d be curious to know more.

bernhard · November 4, 2020, 12:49pm

I personally think the Next packages were an error and I don’t think we should do this again in the main Standard Library. With the introduction of Generic Maps, we’ll go back to DA.Map and DA.Set.

I’m not at all opposed to anyone starting open source extension libraries. You should think about stability and versioning of such a project, though. Do you do SemVer on the whole extension library, meaning whenever a contributor makes any breaking change you bump the major version of the whole thing? How do you distribute this stuff? For something standard-library-like, you probably want to use dependencies, not data-dependencies, but then you need to match SDK versions so if you distribute compiled packages, you’d need to do a release per SDK-release.

I don’t have the answers to these questions in the absence of a more fully-featured DAML dependency management solution. Maybe the best way to start would just be an open source repo with a single DAML project containing source files you can copy into your own project?

Luciano · November 4, 2020, 12:56pm

Glad to hear this. I wasn’t looking forward to DA.Next.Next.Map

I’m glad you brought this up, as it hadn’t even crossed my mind.

I think it’s definitely worth thinking more about the points you brought up before doing this. Otherwise I fear it will dissolve into a mess…

What’s your opinion on (eventual) inclusion into the SDK? Or would it be better to keep this altogether separate?

Martin_Huschenbett · November 4, 2020, 1:18pm

I’d like to add one more point to think about: You probably don’t want to have any data types that are intended to end up on a ledger in such a library. If you change your library or have to recompile it with a different version of the SDK, you will end up with a different package ID and hence the types are considered different.

For daml-stdlib, we jump through quite some hoops to get stable names for types like Either. More precisely, we put each such type in its own package without anything else around it and never recompile it. (In fact, we never compile it at all but rather write the DAML-LF without having any DAML for it.)

Luciano · November 4, 2020, 1:24pm

I’m somewhat confused.

Does this mean that any contracts (on-ledger) using this library would need to be ‘upgraded’ if we release a new version of the library, even if we haven’t changed the data types we’re using?

I thought this was something that was addressed just before the 1.0 release, but maybe I’m confusing SDK cross-versioning with library cross-versioning - I thought they were the same thing?

But perhaps I’m completely off the mark. Could you explain a bit more?

Gary_Verhaegen · November 5, 2020, 11:16am

The full type for a piece of DAML data on the ledger is packageid:module:typename. So if you change a function in a package, that also changes the full name of every single data type defined in that package. This in turn means that new code (compiled against the new version of the library) will not be compatible with existing code (compiled against the old version of the library and deployed on the ledger), as in the types will not match.

So if you have a data type

data User = User { name: Text }

(to keep things simple) in your library, and a template

template IOU
  with
    owner: Party
    owner_name: User
  [...]

and then you want to create a new IOU_v2, and you want to create an upgrade template from v1 to v2, you’d try to write something like:

template IOU_v2
  with
    owner: Party
    owner_name: User
  [...]

template IOU_v1_v2_upgrade
  [...]
  preconsuming choice Upgrade: ContractId IOU_v2
    with old: ContractId IOU, actor: Party
    controller actor
    do
      c1 <- fetch @IOU old
      archive old
      create IOU_v2 with
        owner = c1.owner
        owner_name = c1.owner_name -- <- this does not typecheck
        [...]
  [...]

Even if the definition of User has not changed between the two versions of the library, if IOU had been compiled with one version and the package containing IOU_v2 and IOU_v1_v2_upgrade has been compiled with a different version, the two User types will be considered different, because though the type is structurally the same, its full type includes the package id which will be different.
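A toy model of the point above, as a sketch only. It is written in Python rather than DAML so it can be run anywhere. The `package_id` helper, the `TypeId` class and the two library sources are hypothetical; hashing the source text merely stands in for however the real package ID is derived from the compiled package.

```python
import hashlib
from dataclasses import dataclass


def package_id(source: str) -> str:
    """Toy stand-in for a package ID: a hash over the whole package's contents."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class TypeId:
    """Fully qualified type name in the packageid:module:typename form."""
    package: str
    module: str
    name: str

    def __str__(self) -> str:
        return f"{self.package}:{self.module}:{self.name}"


# Two releases of a hypothetical library. The User definition is identical;
# only the helper function differs.
LIB_V1 = """
data User = User { name: Text }
greet u = "Hello " <> u.name
"""
LIB_V2 = """
data User = User { name: Text }
greet u = "Hi " <> u.name
"""

user_v1 = TypeId(package_id(LIB_V1), "Lib", "User")
user_v2 = TypeId(package_id(LIB_V2), "Lib", "User")

print(user_v1)
print(user_v2)
print(user_v1 == user_v2)  # False: same structure, different package ID
```

Module and type name agree in both identifiers; the package component alone makes them different types, which is why the assignment in the upgrade choice above fails to typecheck.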

Leonid_Rozenberg · November 5, 2020, 2:22pm

Hopefully a quick question: does compatibility have to be determined this way? Is there a deeper reason that prevents a more content-addressable system?

Martin_Huschenbett · November 5, 2020, 6:47pm

@Luciano I assume you’re asking whether your existing active contracts need to be upgraded when you want them to use the new version of your library. Please correct me if I misunderstood your question.

The answer is an emphatic “Yes, that’s exactly what it means” and I would argue that’s exactly what you want in a DAML-like smart contract setting. If it wasn’t like this, you could change the potential consequences of a contract after its signatories have agreed to it by virtue of changing a function in your library, which then changes the behavior of the underlying template. However, that is at odds with one of the fundamental principles of DAML: If you sign a contract, you authorize all its potential consequences, which might make you a signatory of further contracts without your explicit signature being required.

Does that make sense and does it answer your question?

Martin_Huschenbett · November 5, 2020, 8:11pm

@Leonid_Rozenberg Well, the question is quick indeed but I’m afraid a full answer wouldn’t be. I’ll do my best to keep it brief nevertheless.

There is no deep reason why the content addressing scheme couldn’t be more fine-grained, in a fashion similar to the Unison language. In fact, we had a very fine-grained scheme in the past with DAML 0.x. This was before the days of our code generation tools. Without such tools, the content addressing was not particularly pleasant to use since every template pretty much had its own hash and you constantly had to update these hashes in your client applications during development. IIRC, this was one of the main reasons why we changed to the very coarse-grained scheme we have now.

However, a fine-grained content addressing scheme is not valuable in its own right. It is only valuable if it gives you a certain hash stability guarantee: recompiling a package after a modification would leave the hashes of all entities that are not (transitively) impacted by the modification unchanged. Ideally, this would even be the case if you changed your compiler version. Besides the circumstance that this would be hard to achieve with our current GHC-based setup, there are implications of such an approach that might be undesirable when it comes to performance.
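To make the hash stability guarantee concrete, here is a minimal sketch in Python. It assumes a Merkle-style scheme in which every entity is hashed over its own body plus the hashes of its dependencies. The `entity_hashes` function and the example definitions are hypothetical and not how DAML or Unison actually compute hashes.

```python
import hashlib


def entity_hashes(defs: dict[str, tuple[str, list[str]]]) -> dict[str, str]:
    """Hash every definition over its body plus the hashes of its dependencies.

    `defs` maps a name to (body, names it depends on); the graph must be acyclic.
    """
    cache: dict[str, str] = {}

    def go(name: str) -> str:
        if name not in cache:
            body, deps = defs[name]
            h = hashlib.sha256(body.encode("utf-8"))
            for dep in sorted(deps):
                h.update(go(dep).encode("utf-8"))
            cache[name] = h.hexdigest()[:12]
        return cache[name]

    return {name: go(name) for name in defs}


before = {
    "User": ("data User = User { name: Text }", []),
    "greet": ('greet u = "Hello " <> u.name', ["User"]),
    "IOU": ("template IOU with owner: Party; owner_name: User", ["User"]),
    "welcome": ('welcome u = greet u <> "!"', ["greet"]),
}
after = dict(before)
after["greet"] = ('greet u = "Hi " <> u.name', ["User"])  # the only edit

old, new = entity_hashes(before), entity_hashes(after)
changed = sorted(name for name in before if old[name] != new[name])
print(changed)  # ['greet', 'welcome']
```

Only `greet` and its transitive user `welcome` get new hashes; `User` and `IOU` keep theirs. The guarantee holds here only because the hashed bodies are the unmodified source text, which is exactly what compiler-side optimizations would break.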

Obviously, DAML is a language that allows for very sophisticated abstractions that don’t translate well into how processors execute programs. Thus, achieving a decent execution performance for DAML, both in terms of runtime and memory consumption, requires a fair amount of code optimizations. Such optimizations can happen in two places: in the compiler or in the runtime.

If these optimizations were performed in the compiler, then improving an existing optimization or adding a new one is very likely to change the hashes of all (value-level) entities in a certain package. This pretty much defeats the purpose of a fine-grained content addressing scheme.

Performing the optimizations in the runtime is a very risky endeavor. As described in my answer above, changing the semantics of an existing active contract is a big no-go for DAML-like smart contracts. In other words, two different versions of the runtime must give all code they both understand the exact same semantics. In fact, this is also a prerequisite for being able to validate transactions submitted by other network participants or transactions that were recorded in the past. Such strong backward-compatibility requirements make a runtime that is as “dumb” as possible very desirable: every optimization you perform in the runtime carries the risk of accidentally changing the semantics of existing contracts. IMO, minimizing this risk is crucial for positioning DAML as a secure smart contract language.

Obviously, implementing optimizations in the compiler carries exactly the same risk of accidentally introducing semantic bugs. However, the impact of such bugs is significantly smaller. First of all, we consider DAML-LF the ultimate source of truth regarding the meaning of a contract since DAML-LF is what is deployed to the ledger and executed by the runtime. Second, if the compiler starts producing different DAML-LF for the same DAML than in the past, the content addresses necessarily change as well. This means that the semantics of existing contracts on a ledger remain completely unchanged even if we accidentally introduce compiler bugs after the deployment of their underlying packages.

I would summarize everything said above as:

Stable fine-grained content addressing, sophisticated abstraction capabilities, decent execution performance, stable semantics of active contracts - pick three!

By making DAML a Haskell-like language, we’ve clearly picked the second. IMHO, not having the first one is “only” unpleasant, not having (or at least being able to achieve) the latter two would be unacceptable. Well, maybe that qualifies as a deeper reason why we don’t have a very fine-grained content addressing scheme in DAML.

Luciano · November 6, 2020, 7:25am

Thanks @Martin_Huschenbett for that comprehensive answer - you’ve made it pretty clear that achieving cross-version compatibility is a very challenging task. I now understand your original concern much better.

Could you then share your team’s long-term view for cross-version compatibility of DAML?

To elaborate, and going back to the OP, how would you envision DAML libraries interacting across organizations? One of the big goals for DLT is to write secure programs that cross organizational boundaries. Having every participant agree on library versions that they are using is a big impediment: in both a bilateral setting, and even more so in a multilateral setting.

[edit: Just reading about data-dependency and seems like this is DAML’s approach for the time being]

Luciano · November 6, 2020, 8:00am

For anybody reading this, I thought to mention that there is an in-depth explanation on this topic in the docs.

It appears to give a partial solution to depending across different libraries. Sorry @Martin_Huschenbett and @bernhard I wasn’t aware of this at all!

bernhard · November 6, 2020, 8:13am

I believe that the challenges discussed here will not be solved by better content addressing (for the reasons @Martin_Huschenbett laid out), but by improving DAML(-LF)'s type system. DAML-LF is

Nominally typed, meaning that the “name” of the type matters, not its structure. I.e., you can’t use data Foo = Foo interchangeably with () even though both are the unit type.
Statically typed, meaning all types can be inferred at compile time.
Strongly typed, meaning you can’t sidestep the type system by doing casts or similar.
Monomorphic, meaning in the compiled DAML-LF, there are no type parameters. It’s impossible to do anything generically in DAML-LF. All the generic types you write in DAML are compiled down to monomorphic types in DAML-LF.

The compound effect of this is that when you write a template or a choice, you have to specify the exact identifier of the type of the arguments. The link between a type and its dependencies is completely rigid and if you want to switch out a type, you have to switch out all other types that depend on it.

It’s this lack of genericism on the ledger, and the resulting rigidity of the dependency graph that we have to address, not the content addressing scheme.
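The rigidity can be pictured as a graph walk. The following Python sketch is an illustration under assumptions: the `must_be_replaced` function and the type names are hypothetical, and the dependency graph is a plain dictionary rather than anything read from a real package.

```python
from collections import deque


def must_be_replaced(depends_on: dict[str, set[str]], swapped: str) -> set[str]:
    """Every type that refers to `swapped`, directly or through other types."""
    # Invert the graph: for each type, who uses it?
    users: dict[str, set[str]] = {}
    for ty, deps in depends_on.items():
        for dep in deps:
            users.setdefault(dep, set()).add(ty)

    affected: set[str] = set()
    queue = deque([swapped])
    while queue:
        for user in users.get(queue.popleft(), set()):
            if user not in affected:
                affected.add(user)
                queue.append(user)
    return affected


depends_on = {
    "User": set(),
    "Address": set(),
    "IOU": {"User"},
    "Invoice": {"Address"},
    "IOU_Transfer": {"IOU"},
}
print(sorted(must_be_replaced(depends_on, "User")))  # ['IOU', 'IOU_Transfer']
```

Swapping `User` forces new versions of `IOU` and of everything built on `IOU`, while `Address` and `Invoice` are untouched.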

The most likely route is to allow some sort of polymorphism on the ledger. Using typeclasses as interfaces would be one way to do this.

class IsParty a where
  getParty : a -> Party

exists a . (IsParty a) => template T
  with
    args : a
  where
    signatory (getParty args)

    controller (getParty args) can
       exists b . (IsParty b) => Invite : ContractId T_Invite
         with
           other : b
         do
            create T_Invite with ..

You can see here how the parameters a and b are left as “holes”. There is no implementation of the class IsParty anywhere. Both the template as well as the choice are existential types.

This would completely remove the rigidity. A supplier of an implementation of IsParty and a consumer like T merely have to point to the same interface definition IsParty. Furthermore, if you upgrade your interface to IsPartyV2, you just have to create instances for IsPartyV2 for old data types.

So why haven’t we done this a long time ago? I think there are two reasons.

1. The first one is something @Martin_Huschenbett and @Remy could probably elaborate on. My understanding is that keeping type parameters on the ledger, which can of course be deeply nested, could significantly increase storage costs and impact performance.
2. It’s quite difficult to build secure smart contracts with such features. All the re-entrancy exploits on public blockchains relied exactly on the fact that important contracts had abstracted their inputs through interfaces. Once you’ve done that, anyone can implement the interface and pass their own thing in. Imagine we had an Asset interface. It would sound like a good idea for me to write a contract saying “anyone that wants can give me assets”. Assets are good by definition, right? Maybe so, but our usual smart contract definition of an asset as a fungible transferable token doesn’t capture the assettyness. Liabilities behave the same way. So if someone implemented the Asset interface on a smart contract representing “a barrel of toxic waste”, they could now transfer that to me.

I’ve got some vague ideas how to solve 2.: Essentially we have to make the interface permissioned. Rather than using just a typeclass, we’d use a kind of “signed typeclass”. I.e., all signatories of T in the above would have to have signed that the types that get passed into a and b are valid instances of IsParty. And that’s pretty much where I am in my thinking at the moment. The mechanism of signing typeclass instances, and how these signatures get checked, still needs to be thought through.
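As a rough illustration of the difference between an open and a permissioned interface, here is a Python sketch. Everything in it is hypothetical (`Token`, `ToxicWaste`, `SignedInstances`, `accept_open`, `accept_permissioned`), and the registry is just one possible reading of “signed typeclass”, not a worked-out design for DAML.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """A well-behaved implementation of the hypothetical Asset interface."""
    issuer: str

    def transfer_to(self, owner: str) -> str:
        return f"{owner} now holds a token issued by {self.issuer}"


@dataclass
class ToxicWaste:
    """A liability that happens to offer the same transfer method."""

    def transfer_to(self, owner: str) -> str:
        return f"{owner} now owns a barrel of toxic waste"


def accept_open(asset, me: str) -> str:
    """Open interface: anything with transfer_to is accepted."""
    return asset.transfer_to(me)


@dataclass
class SignedInstances:
    """Toy registry: the implementations each party has approved ("signed")."""
    approvals: dict[type, set[str]] = field(default_factory=dict)

    def sign(self, party: str, impl: type) -> None:
        self.approvals.setdefault(impl, set()).add(party)

    def approved_by_all(self, impl: type, signatories: set[str]) -> bool:
        return signatories <= self.approvals.get(impl, set())


def accept_permissioned(asset, me: str, registry: SignedInstances) -> str:
    """Permissioned interface: reject types that `me` has not signed off on."""
    if not registry.approved_by_all(type(asset), {me}):
        raise PermissionError(f"{type(asset).__name__} is not an approved Asset")
    return asset.transfer_to(me)


registry = SignedInstances()
registry.sign("Alice", Token)

print(accept_open(ToxicWaste(), "Alice"))  # the open version accepts it
print(accept_permissioned(Token("Bank"), "Alice", registry))
try:
    accept_permissioned(ToxicWaste(), "Alice", registry)
except PermissionError as err:
    print("rejected:", err)
```

The open variant lets anyone hand Alice the barrel; the permissioned variant only accepts implementations she has approved. How such approvals would be recorded and checked on a ledger is the open question.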
