static-site-pipeline/hugo/content/posts/2018-09-10-don-syme.md

---
lastmod: "2021-09-12T22:47:44.0000000+01:00"
author: patrick
categories:
- programming
comments: true
date: "2018-09-10T00:00:00Z"
title: Chatting with Don Syme about the F# compiler
summary: "Notes from a chat with the creator of F#, about how to contribute to the compiler."
---

In 2018, my team at G-Research had a fruitful meeting with Don Syme about how to contribute to the F# compiler.
I made rough notes at the time.
They may be out of date by now.

# Organisational structure

* *Suggestions* are for debate; Don goes through every three months or so, and approves or rejects or leaves open.
* *RFCs* are for things which are approved in principle - but there's no guarantee that anyone will commit time to them.
* Once they're approved, they kick around the RFC folder until they're implemented, at which point they get bucketed into the appropriate historical record of F# version.

RFCs act as primary documentation, since the technical specification of F# is pretty hard to read.
It should therefore record drawbacks and alternatives that were considered.

The process for getting a feature to "approved in principle" is simply to ask Don.

Features must be complete when they're pushed - no partial features that will be extended later.
(This is so that we resolve feature interactions up front.)

# Why does F# not have functors?

F# doesn't have a functor system because the process of taking functors *out* of code once they're in is very hard.
Don wants to add features in such a way that it's not too hard to unwind a decision to start using them.

# Working on the compiler

Sidetrack: he said that when working on the compiler, all the tools basically needed to be the same versions; is that some kind of design decision?
Is it going to become less true?
I didn't ask this.

You can always push stuff up to a branch on the FSharp repo to run the CI and verify that existing F# code won't be broken by your addition to the language.

Basically everything happens in the typechecker.
"Pretty much every time we try to do things earlier, we regret it."
This is because of the language machinery, which is wedded heavily to the actual code rather than to any other representation.

# Specific notes on adding syntactic sugar for `fun x -> x.Prop`

During typechecking we would expand the `_.Prop` syntax.
Would want to expand to `fun (obj : ^T) -> (^T : (member Prop : 'U) obj)`.
This would add a constraint during typechecking, that would be solved automatically.
But this approach doesn't extend to `.Method` instead of `.Property` - since generics can't be placed on that member.

A naive find-replace of "replace `.Prop` syntactic sugar with `fun x -> x.Prop`" fails because of the example of `.Method(2)` - this can't easily be converted to `fun x -> x.Method(2)`.

There is no ability to have "or"-based constraints.

Implementation strategy for the "replace `.x` with `fun i -> i.x`" feature idea:
1. Find how `_.S` is parsed currently (hint: it's not).
2. Alter parser so that `_.S` is actually parsed to an alternative.
3. Look in the typechecker where that AST gets resolved.
4. Expand the typechecker and try again.
(Keep track of where our operator's syntax may legitimately be used, by checking the `pars.fsy` parse file.)

Pitfall to remember: the artificial name `x` in `fun x -> x.S` needs to not be reported in debug symbols etc.
Specifically during typechecking, Don hasn't made the decision whether it would be typecheck-then-replace, or replace-then-typecheck.

# The F# parser

Useful flags in tooling exist.

* `fsc --tokenize`, which tells us that `_.Length` e.g. doesn't have an expression form: the tokeniser just sees `UNDERSCORE`.
* `fsc --parseonly` to run only the parsing stage.

Specifically to see what happens with our proposed underscore syntax, go to `pars.fsy` to see where the underscore parse token appears, and `lex.fsl` to see what it lexes to.
We discover that `_` is a keyword in the language (see `lexhelp.fs`), and it's declared in `pars.fsy`.
Find usages of it in `pars.fsy`.
Seems reasonable to parse `_` as an identifier (alternative: create a whole new syntax node).

The parser defines a grammar with various types of expression. Atomic (e.g. `1`, `true`, `(blah)`, `1.0`); other expressions (`1+1`, etc).
Atomic expressions have known-in-advance delimitation in the code - e.g. bracketed expressions.
Things which will be parsed along similar code paths to our proposed underscore syntax: `base.blah`, `global.System.Blah`.

# How the typechecker works

The typechecker will first typecheck the thing on the left, and then has a chain of qualifications afterwards (i.e. the type gets refined as we shove more on the right).
E.g. `_.Name.Name2.Name3` is typechecked as "`_`" and then refined with "`.Name`", and so on.
Type-checking and name resolution are interleaved.

At the end, after typecheck, `FindUnsolved.fs` infers the rest of the types.

# How to go about implementing applicative computation expression builders
This is [issue #579](https://github.com/fsharp/fslang-suggestions/issues/579).
*Editor's note: this is now implemented as of F# 5.*

Tomas Petricek had a wide-ranging change ("joinads") which would have encompassed this feature plus many more, but it was rejected, because it tried to change the semantics of pattern matching.

Question of whether a builder should have a flag to indicate "I follow the applicative laws" - e.g. for optimising when you could use "map" rather than "bind then return".
Answer: yes, this would be good - for backwards compat, even if for no other reason.
This would probably be a new attribute on the map.

# An annoyance: not being able to have a crate returning `unit`

We could go ahead and fix this by using the `void` return type on `type C () = member x.M() : unit`, and using the `unit` return type on everything else.
(Our current workaround is to define a type `FakeUnit` which we return instead of `unit`.)