--- lastmod: "2021-10-23T18:57:58.0000000+01:00" author: patrick categories: - programming comments: true date: "2021-10-19T00:00:00Z" title: Crates (existentials in F#) summary: "An introduction to the crate pattern for representing existential quantification in F#." --- The usual blog post introducing crates in F# is [G-Research's](https://www.gresearch.co.uk/article/squeezing-more-out-of-the-f-type-system-introducing-crates/). However, I believe that post approaches things from a much more abstract perspective than is useful. Instead, I will approach them by starting from the concrete. (The crate pattern is also called [skolemization](https://en.wikipedia.org/wiki/Skolem_normal_form), by the way.) # The problem crates solve Say we want to represent lists in some kind of [defunctionalised]({{< ref "2020-03-04-defunctionalisation" >}}) way. ```fsharp type ListBuilder<'a> = | OfList of 'a list | Cons of 'a * 'a ListBuilder | Concat of 'a ListBuilder list module ListBuilder = let toList (lb : 'a ListBuilder) : 'a list = failwith "implement" ``` Because we don't need to reify each of the individual input lists inside a `ListBuilder`, in principle the following could be a much more efficient way to construct the specified list (assuming we define appropriate module methods to deal with currying): ```fsharp [ ListBuilder.ofList [ 1 ; 2 ; 3 ] ListBuilder.cons 3 [ 3 ; 6 ; 8 ] ListBuilder.concat [ ListBuilder.ofList [ 1 ; 3 ] ; ListBuilder.ofList [ 10 ; 12 ] ] ] |> ListBuilder.concat |> ListBuilder.toList ``` For example, if we did this purely in `List`-space, we'd have to reify the concatenated `[ 1 ; 3 ; 10 ; 12 ]`, whereas in principle we could avoid doing that here: we stand a chance of directly emitting the single output list using the compiler's built-in magical mutable-tail cons lists which are very fast. This is all very well, but what about the following extension to the API? ```fsharp type ListBuilder<'a> = | OfList of 'a list | Cons of 'a * 'a ListBuilder | Concat of 'a ListBuilder list | Map of ('b -> 'a) * 'b ListBuilder ``` This won't compile, because we've got this type parameter `'b` which we're not generic on. (The structure I've written down is essentially trying to be a [generalised algebraic datatype](https://en.wikipedia.org/wiki/Generalized_algebraic_data_type).) # How do we approach this kind of problem? Let's drop down a level of abstraction. To represent sum types in F#, it's very easy: ```fsharp type MySumType = | Foo of int | Bar of string let toString (input : MySumType) : string = match input with // hooray, the language lets us do our desired operation on the object | Foo i -> sprintf "%i" i | Bar s -> s ``` But C# doesn't let you do this! (Well, in ten years C# will have grown to subsume all other programming languages. But as of this writing, it can't.) How do you approach exactly the same problem in C#? The [canonical trick](https://blog.ploeh.dk/2018/06/25/visitor-as-a-sum-type/) when your language can't represent the decomposition of an object is to *seal up the object, but teach it how to accept operations from the outside*. We can't take the object to the operation, so we have to take the operation to the object. (This is the "[visitor pattern](https://en.wikipedia.org/wiki/Visitor_pattern)".) I'll present the solution in F# restricted to concepts that are present in C#. ```fsharp type IMySumType = abstract Match<'ret> : (int -> 'ret) -> (string -> 'ret) -> 'ret type FooCase (i : int) = interface IMySumType with member _.Match<'ret> (f : int -> 'ret) (_ : string -> 'ret) = f i type BarCase (s : string) = interface IMySumType with member _.Match<'ret> (_ : int -> 'ret) (g : string -> 'ret) = g s let toString (input : IMySumType) : string = input.Match (sprintf "%i") id ``` Notice what we've done: we have in some sense "taught the discriminated union how to perform operations on itself", to compensate for the fact that the language can't express the notion "open up the object to see what's inside". If you wanted to be *really* object-oriented, you could make an object that represents the operation, and rejig so that `Match` now takes an Operation instead: ```fsharp type Operation<'ret> = abstract OperateOnFoo : int -> 'ret abstract OperateOnBar : string -> 'ret type IMySumType = abstract Match<'ret> : Operation<'ret> -> 'ret type FooCase (i : int) = interface IMySumType with member _.Match<'ret> (operation : Operation<'ret>) : 'ret = operation.OperateOnFoo i type BarCase (s : string) = interface IMySumType with member _.Match<'ret> (operation : Operation<'ret>) : 'ret = operation.OperateOnBar s let toString = { new Operation with member _.OperateOnFoo (i : int) = sprintf "%i" i member _.OperateOnBar (s : string) = s } ``` # A little more general OK, how about the following, where we're now generic? ```fsharp type 'a MySumType = | Foo of int | Bar of 'a let toInt<'a> (input : MySumType<'a>) : int = match input with | Foo i -> i | Bar a -> // what can go here? failwith "implement" ``` Well, the only way this code can make sense is if `toInt` also takes an `'a -> int` function: ```fsharp type 'a MySumType = | Foo of int | Bar of 'a let toInt<'a> (barCase : 'a -> int) (input : MySumType<'a>) : int = match input with | Foo i -> i | Bar a -> barCase a ``` How might we do this in C#? Here's exactly the same transformation as before: ```fsharp type IMySumType<'a> = abstract Match<'ret> : (int -> 'ret) -> ('a -> 'ret) -> 'ret type FooCase<'a> (i : int) = interface IMySumType<'a> with member _.Match<'ret> (f : int -> 'ret) (_ : 'a -> 'ret) = f i type BarCase<'a> (a : 'a) = interface IMySumType<'a> with member _.Match<'ret> (_ : int -> 'ret) (g : 'a -> 'ret) = g a let toInt<'a> (barCase : 'a -> int) (input : IMySumType<'a>) : int = input.Match id barCase ``` Again, for the really object-oriented inclined: ```fsharp type Operation<'a, 'ret> = abstract OperateOnFoo : int -> 'ret abstract OperateOnBar : 'a -> 'ret type IMySumType<'a> = abstract Match<'ret> : Operation<'a, 'ret> -> 'ret type FooCase<'a> (i : int) = interface IMySumType<'a> with member _.Match operation = operation.OperateOnFoo i type BarCase<'a> (a : 'a) = interface IMySumType<'a> with member _.Match operation = operation.OperateOnBar a let toInt<'a> (barCase : 'a -> int) = { new Operation<'a, int> with member _.OperateOnFoo i = i member _.OperateOnBar a = barCase a } ``` # How this applies to the Map case Recall the problematic code: ```fsharp type ListBuilder<'a> = | OfList of 'a list | Cons of 'a * 'a ListBuilder | Concat of 'a ListBuilder list // This is unrepresentable | Map of ('b -> 'a) * 'b ListBuilder ``` Well, the language doesn't give us the tools to "open up" the contents of the Map case. So let's pull the same visiting trick, starting from the top: ```fsharp type IListBuilder<'a> = // This doesn't compile either abstract Match<'ret> : ('a list -> 'ret) -> ('a -> 'a IListBuilder -> 'ret) -> ('a IListBuilder list -> 'ret) -> (('b -> 'a) -> 'b IListBuilder -> 'ret) -> 'ret ``` Well, this looks ghastly and also it doesn't compile (because again there's that `'b` we can't place anywhere), but happily if we try and make it look nicer by moving to the "really object-oriented" world, *both* those problems go away! ```fsharp type Operation<'a, 'ret> = abstract OperateOnOfList : 'a list -> 'ret abstract OperateOnCons : 'a -> 'a IListBuilder -> 'ret abstract OperateOnConcat : 'a IListBuilder list -> 'ret // Ta-da! abstract OperateOnMap<'b> : ('b -> 'a) -> 'b IListBuilder -> 'ret and IListBuilder<'a> = abstract Match<'ret> : Operation<'a, 'ret> -> 'ret type OfListCase<'a> (xs : 'a list) = interface IListBuilder<'a> with member _.Match op = op.OperateOnOfList xs type OfConsCase<'a> (x : 'a, xs : 'a IListBuilder) = interface IListBuilder<'a> with member _.Match op = op.OperateOnCons x xs type OfConcat<'a> (xs : 'a IListBuilder list) = interface IListBuilder<'a> with member _.Match op = op.OperateOnConcat xs type OfMap<'a, 'b> (f : 'b -> 'a, xs : 'b IListBuilder) = interface IListBuilder<'a> with member _.Match op = op.OperateOnMap f xs ``` Now, it's a little upsetting that the user can't operate on a nice `OfList` case without making one of our nonsense interfaces. It would be handy if we could retain the terse F# pattern match in all the cases where the language can represent it. In the following, I just deleted three cases from `Operation`, and three implementors of `IListBuilder`, and created a discriminated union `ListBuilder` which is just "an ordinary ListBuilder, but with an extra case where we have to go through the indirection nonsense". ```fsharp type Operation<'a, 'ret> = abstract OperateOnMap<'b> : ('b -> 'a) -> 'b ListBuilder -> 'ret and IListBuilder<'a> = abstract Match<'ret> : Operation<'a, 'ret> -> 'ret and ListBuilder<'a> = | OfList of 'a list | OfCons of 'a * 'a ListBuilder | OfConcat of 'a ListBuilder list | OfMap of 'a IListBuilder ``` I'll just rename this a bit: ```fsharp type MapCaseEvaluator<'a, 'ret> = abstract Eval<'b> : ('b -> 'a) -> 'b ListBuilder -> 'ret and MapCaseCrate<'a> = abstract Apply<'ret> : MapCaseEvaluator<'a, 'ret> -> 'ret and ListBuilder<'a> = | OfList of 'a list | OfCons of 'a * 'a list | OfConcat of 'a ListBuilder list | OfMap of 'a MapCaseCrate ``` And that is a crate being used as part of the implementation of a GADT.