Files
static-site-pipeline/hugo/content/posts/2021-10-19-crates.md
2023-09-30 18:34:20 +01:00

299 lines
9.7 KiB
Markdown

---
lastmod: "2021-10-23T18:57:58.0000000+01:00"
author: patrick
categories:
- programming
comments: true
date: "2021-10-19T00:00:00Z"
title: Crates (existentials in F#)
summary: "An introduction to the crate pattern for representing existential quantification in F#."
---
The usual blog post introducing crates in F# is [G-Research's](https://www.gresearch.co.uk/article/squeezing-more-out-of-the-f-type-system-introducing-crates/).
However, I believe that post approaches things from a much more abstract perspective than is useful.
Instead, I will approach them by starting from the concrete.
(The crate pattern is also called [skolemization](https://en.wikipedia.org/wiki/Skolem_normal_form), by the way.)
# The problem crates solve
Say we want to represent lists in some kind of [defunctionalised]({{< ref "2020-03-04-defunctionalisation" >}}) way.
```fsharp
type ListBuilder<'a> =
| OfList of 'a list
| Cons of 'a * 'a ListBuilder
| Concat of 'a ListBuilder list
module ListBuilder =
let toList (lb : 'a ListBuilder) : 'a list =
failwith "implement"
```
Because we don't need to reify each of the individual input lists inside a `ListBuilder`, in principle the following could be a much more efficient way to construct the specified list (assuming we define appropriate module methods to deal with currying):
```fsharp
[
ListBuilder.ofList [ 1 ; 2 ; 3 ]
ListBuilder.cons 3 [ 3 ; 6 ; 8 ]
ListBuilder.concat [ ListBuilder.ofList [ 1 ; 3 ] ; ListBuilder.ofList [ 10 ; 12 ] ]
]
|> ListBuilder.concat
|> ListBuilder.toList
```
For example, if we did this purely in `List`-space, we'd have to reify the concatenated `[ 1 ; 3 ; 10 ; 12 ]`, whereas in principle we could avoid doing that here: we stand a chance of directly emitting the single output list using the compiler's built-in magical mutable-tail cons lists which are very fast.
This is all very well, but what about the following extension to the API?
```fsharp
type ListBuilder<'a> =
| OfList of 'a list
| Cons of 'a * 'a ListBuilder
| Concat of 'a ListBuilder list
| Map of ('b -> 'a) * 'b ListBuilder
```
This won't compile, because we've got this type parameter `'b` which we're not generic on.
(The structure I've written down is essentially trying to be a [generalised algebraic datatype](https://en.wikipedia.org/wiki/Generalized_algebraic_data_type).)
# How do we approach this kind of problem?
Let's drop down a level of abstraction.
To represent sum types in F#, it's very easy:
```fsharp
type MySumType =
| Foo of int
| Bar of string
let toString (input : MySumType) : string =
match input with
// hooray, the language lets us do our desired operation on the object
| Foo i -> sprintf "%i" i
| Bar s -> s
```
But C# doesn't let you do this! (Well, in ten years C# will have grown to subsume all other programming languages. But as of this writing, it can't.)
How do you approach exactly the same problem in C#?
The [canonical trick](https://blog.ploeh.dk/2018/06/25/visitor-as-a-sum-type/) when your language can't represent the decomposition of an object is to *seal up the object, but teach it how to accept operations from the outside*.
We can't take the object to the operation, so we have to take the operation to the object.
(This is the "[visitor pattern](https://en.wikipedia.org/wiki/Visitor_pattern)".)
I'll present the solution in F# restricted to concepts that are present in C#.
```fsharp
type IMySumType =
abstract Match<'ret> : (int -> 'ret) -> (string -> 'ret) -> 'ret
type FooCase (i : int) =
interface IMySumType with
member _.Match<'ret> (f : int -> 'ret) (_ : string -> 'ret) =
f i
type BarCase (s : string) =
interface IMySumType with
member _.Match<'ret> (_ : int -> 'ret) (g : string -> 'ret) =
g s
let toString (input : IMySumType) : string =
input.Match (sprintf "%i") id
```
Notice what we've done: we have in some sense "taught the discriminated union how to perform operations on itself", to compensate for the fact that the language can't express the notion "open up the object to see what's inside".
If you wanted to be *really* object-oriented, you could make an object that represents the operation, and rejig so that `Match` now takes an Operation instead:
```fsharp
type Operation<'ret> =
abstract OperateOnFoo : int -> 'ret
abstract OperateOnBar : string -> 'ret
type IMySumType =
abstract Match<'ret> : Operation<'ret> -> 'ret
type FooCase (i : int) =
interface IMySumType with
member _.Match<'ret> (operation : Operation<'ret>) : 'ret =
operation.OperateOnFoo i
type BarCase (s : string) =
interface IMySumType with
member _.Match<'ret> (operation : Operation<'ret>) : 'ret =
operation.OperateOnBar s
let toString =
{ new Operation<string> with
member _.OperateOnFoo (i : int) = sprintf "%i" i
member _.OperateOnBar (s : string) = s
}
```
# A little more general
OK, how about the following, where we're now generic?
```fsharp
type 'a MySumType =
| Foo of int
| Bar of 'a
let toInt<'a> (input : MySumType<'a>) : int =
match input with
| Foo i -> i
| Bar a ->
// what can go here?
failwith "implement"
```
Well, the only way this code can make sense is if `toInt` also takes an `'a -> int` function:
```fsharp
type 'a MySumType =
| Foo of int
| Bar of 'a
let toInt<'a> (barCase : 'a -> int) (input : MySumType<'a>) : int =
match input with
| Foo i -> i
| Bar a -> barCase a
```
How might we do this in C#? Here's exactly the same transformation as before:
```fsharp
type IMySumType<'a> =
abstract Match<'ret> : (int -> 'ret) -> ('a -> 'ret) -> 'ret
type FooCase<'a> (i : int) =
interface IMySumType<'a> with
member _.Match<'ret> (f : int -> 'ret) (_ : 'a -> 'ret) =
f i
type BarCase<'a> (a : 'a) =
interface IMySumType<'a> with
member _.Match<'ret> (_ : int -> 'ret) (g : 'a -> 'ret) =
g a
let toInt<'a> (barCase : 'a -> int) (input : IMySumType<'a>) : int =
input.Match id barCase
```
Again, for the really object-oriented inclined:
```fsharp
type Operation<'a, 'ret> =
abstract OperateOnFoo : int -> 'ret
abstract OperateOnBar : 'a -> 'ret
type IMySumType<'a> =
abstract Match<'ret> : Operation<'a, 'ret> -> 'ret
type FooCase<'a> (i : int) =
interface IMySumType<'a> with
member _.Match operation =
operation.OperateOnFoo i
type BarCase<'a> (a : 'a) =
interface IMySumType<'a> with
member _.Match operation =
operation.OperateOnBar a
let toInt<'a> (barCase : 'a -> int) =
{ new Operation<'a, int> with
member _.OperateOnFoo i = i
member _.OperateOnBar a = barCase a
}
```
# How this applies to the Map case
Recall the problematic code:
```fsharp
type ListBuilder<'a> =
| OfList of 'a list
| Cons of 'a * 'a ListBuilder
| Concat of 'a ListBuilder list
// This is unrepresentable
| Map of ('b -> 'a) * 'b ListBuilder
```
Well, the language doesn't give us the tools to "open up" the contents of the Map case.
So let's pull the same visiting trick, starting from the top:
```fsharp
type IListBuilder<'a> =
// This doesn't compile either
abstract Match<'ret> : ('a list -> 'ret) -> ('a -> 'a IListBuilder -> 'ret) -> ('a IListBuilder list -> 'ret) -> (('b -> 'a) -> 'b IListBuilder -> 'ret) -> 'ret
```
Well, this looks ghastly and also it doesn't compile (because again there's that `'b` we can't place anywhere), but happily if we try and make it look nicer by moving to the "really object-oriented" world, *both* those problems go away!
```fsharp
type Operation<'a, 'ret> =
abstract OperateOnOfList : 'a list -> 'ret
abstract OperateOnCons : 'a -> 'a IListBuilder -> 'ret
abstract OperateOnConcat : 'a IListBuilder list -> 'ret
// Ta-da!
abstract OperateOnMap<'b> : ('b -> 'a) -> 'b IListBuilder -> 'ret
and IListBuilder<'a> =
abstract Match<'ret> : Operation<'a, 'ret> -> 'ret
type OfListCase<'a> (xs : 'a list) =
interface IListBuilder<'a> with
member _.Match op = op.OperateOnOfList xs
type OfConsCase<'a> (x : 'a, xs : 'a IListBuilder) =
interface IListBuilder<'a> with
member _.Match op = op.OperateOnCons x xs
type OfConcat<'a> (xs : 'a IListBuilder list) =
interface IListBuilder<'a> with
member _.Match op = op.OperateOnConcat xs
type OfMap<'a, 'b> (f : 'b -> 'a, xs : 'b IListBuilder) =
interface IListBuilder<'a> with
member _.Match op = op.OperateOnMap f xs
```
Now, it's a little upsetting that the user can't operate on a nice `OfList` case without making one of our nonsense interfaces.
It would be handy if we could retain the terse F# pattern match in all the cases where the language can represent it.
In the following, I just deleted three cases from `Operation`, and three implementors of `IListBuilder`, and created a discriminated union `ListBuilder` which is just "an ordinary ListBuilder, but with an extra case where we have to go through the indirection nonsense".
```fsharp
type Operation<'a, 'ret> =
abstract OperateOnMap<'b> : ('b -> 'a) -> 'b ListBuilder -> 'ret
and IListBuilder<'a> =
abstract Match<'ret> : Operation<'a, 'ret> -> 'ret
and ListBuilder<'a> =
| OfList of 'a list
| OfCons of 'a * 'a ListBuilder
| OfConcat of 'a ListBuilder list
| OfMap of 'a IListBuilder
```
I'll just rename this a bit:
```fsharp
type MapCaseEvaluator<'a, 'ret> =
abstract Eval<'b> : ('b -> 'a) -> 'b ListBuilder -> 'ret
and MapCaseCrate<'a> =
abstract Apply<'ret> : MapCaseEvaluator<'a, 'ret> -> 'ret
and ListBuilder<'a> =
| OfList of 'a list
| OfCons of 'a * 'a list
| OfConcat of 'a ListBuilder list
| OfMap of 'a MapCaseCrate
```
And that is a crate being used as part of the implementation of a GADT.