Unintuitive behaviour of `group` and `groupOn`

I’ve noticed that group and groupOn behave rather unintuitively:

  let list = [ "A", "B", "A", "A", "C" ]
  let grouped = group list
  debug grouped

returns:

[["A"],["B"],["A","A"],["C"]]

The docs say the following:

The ‘group’ function groups equal elements into sublists such that the concatenation of the result is equal to the argument.

So the docs do say - albeit in a bit of an odd way - that only successive equal elements will be grouped. Intuitively, I would’ve expected for any equal elements in the list to be grouped (but then the statement about concatenation returning the original argument doesn’t hold anymore). Similarly, I would expect the groupOn function to group all elements that return the same key, regardless of where they occur in the list. Furthermore, I’d expect it to return tuples that contain the keys as well as the elements, because usually I’d want to interate over the keys to process the groups.

I’m wondering what the rationale behind the current implementation is, I can’t think of a use case from the top of my head where I’d use it the way it’s implemented right now. Reason I’m asking this is that I’ve just spent far too long debugging an issue that came down to that confusion, and I’m guessing I might not be the only one running into it.

2 Likes

I’ve had a similar experience and internalized a need to sort before calling those functions.

But once you realize that you have to "O(nlog n) " for your group functionality, I eschew the list processing and just build a Map k [a] instead.

1 Like

@georg
This

groupIntoMap : Ord k => (a -> k) -> (a -> v) -> [a] -> Map k [v]
groupIntoMap k v = foldl update M.empty
  where update m a =
          let k' = k a
              v' = v a
              c = O.fromOptional [] (M.lookup k' m)
          in
          M.insert k' (v'::c) m

is the latest version of grouping function I have started using.

1 Like

While I’m not entirely familiar with the history here, I suspect the somewhat unsatisfying answer is that we inherited this function from Haskell.

There are some arguments for that implementation:

Currently we have the type group : Eq a => [a] -> [[a]]. Given that there is no Ord constraint you cannot sort so if you want to implement something that groups across the whole list you end up with a quadratic implementation which is no good.

Now you might argue that everything has an Ord constraint anyway and we should have just added one. However, until LF 1.14 this was no true for contract ids which are clearly very common in Daml

Open to suggestions here. We could either add a groupSort or some version of the groupMap.

1 Like