Bozhidar Batsov: fsharp-ts-mode: A Modern Emacs Mode for F#


I’m pretty much done with the focused development push on neocaml – it’s reached a point where I’m genuinely happy using it daily and the remaining work is mostly incremental polish. So naturally, instead of taking a break I decided it was time to start another project that’s been living in the back of my head for a while: a proper Tree-sitter-based F# mode for Emacs.

Meet fsharp-ts-mode.

Why F#?

I’ve written before about my fondness for the ML family of languages, and while OCaml gets most of my attention, last year I developed a soft spot for F#. In some ways I like it even a bit more than OCaml – the tooling is excellent, the .NET ecosystem is massive, and computation expressions are one of the most elegant abstractions I’ve seen in any language. F# manages to feel both practical and beautiful, which is a rare combination.

The problem is that Emacs has never been particularly popular with F# programmers – or .NET programmers in general. The existing fsharp-mode works, but it’s showing its age: regex-based highlighting, SMIE indentation with quirks, and some legacy code dating back to the caml-mode days. I needed a good F# mode for Emacs, and that’s enough of a reason to build one in my book.

The Name

I’ll be honest – I spent quite a bit of time trying to come up with a clever name.[1] Some candidates that didn’t make the cut:

- fsharpe-mode (fsharp(evolved/enhanced)-mode)
- Fa Dièse (French for F sharp – because after spending time with OCaml you start thinking in French, apparently)
- fluoride (a play on Ionide, the popular F# IDE extension)

In the end none of my fun ideas stuck, so I went with the boring-but-obvious fsharp-ts-mode. Sometimes the straightforward choice is the right one. At least nobody will have trouble finding it.[2]

Built on neocaml’s Foundation

I modeled fsharp-ts-mode directly after neocaml, and the two packages share a lot of structural similarities – which shouldn’t be surprising given how much OCaml and F# have in common.
The same architecture (base mode + language-specific derived modes), the same approach to font-locking (shared + grammar-specific rules), the same REPL integration pattern (comint with tree-sitter input highlighting), the same build system interaction pattern (minor mode wrapping CLI commands).

This also meant I could get the basics in place really quickly. Having already solved problems like trailing comment indentation, forward-sexp hybrid navigation, and imenu with qualified names in neocaml, porting those solutions to F# was mostly mechanical.

What’s in 0.1.0

The initial release covers all the essentials:

- Syntax highlighting via Tree-sitter with 4 customizable levels, supporting .fs, .fsx, and .fsi files
- Indentation via Tree-sitter indent rules
- Imenu with fully-qualified names (e.g., MyModule.myFunc)
- Navigation – beginning-of-defun, end-of-defun, forward-sexp
- F# Interactive (REPL) integration with tree-sitter highlighting for input
- dotnet CLI integration – build, test, run, clean, format, restore, with watch mode support
- .NET API documentation lookup at point (C-c C-d)
- Eglot integration for FsAutoComplete
- Compilation error parsing for dotnet build output
- Shift region left/right, auto-detect indent offset, prettify symbols, outline mode, and more

Migrating from fsharp-mode

If you’re currently using fsharp-mode, switching is straightforward:

    (use-package fsharp-ts-mode
      :vc (:url "https://github.com/bbatsov/fsharp-ts-mode" :rev :newest))

The main thing fsharp-ts-mode doesn’t have yet is automatic LSP server installation (the eglot-fsharp package does this for fsharp-mode).
You’ll need to install FsAutoComplete yourself:

    $ dotnet tool install -g fsautocomplete

After that, (add-hook 'fsharp-ts-mode-hook #'eglot-ensure) is all you need.

See the migration guide in the README for a detailed comparison.

Lessons Learned

Working with the ionide/tree-sitter-fsharp grammar surfaced some interesting challenges compared to the OCaml grammar:

F#’s indentation-sensitive syntax is tricky

Unlike OCaml, where indentation is purely cosmetic, F# uses significant whitespace (the “offside rule”). The tree-sitter grammar needs correct indentation to parse correctly, which creates a chicken-and-egg problem: you need a correct parse tree to indent, but you need correct indentation to parse. For example, if you paste this unindented block:

    let f x =
    if x > 0 then
    x + 1
    else
    0

The parser can’t tell that if is the body of f or that x + 1 belongs to the then branch – it produces ERROR nodes everywhere, and indent-region has nothing useful to work with. But if you’re typing the code line by line, the parser always has enough context from preceding lines to indent the current line correctly. This is a fundamental limitation of any indentation-sensitive grammar.

The two grammars are more different than you’d expect

OCaml’s tree-sitter-ocaml-interface grammar inherits from the base grammar, so you can share queries freely. F#’s fsharp and fsharp_signature grammars are independent with different node types and field names for equivalent concepts. For instance, a let binding is function_or_value_defn in the .fs grammar but value_definition in the .fsi grammar. Type names use a type_name: field in one grammar but not the other. Even some keyword tokens (of, open, type) that work fine as query matches in fsharp fail at runtime in fsharp_signature.

This forced me to split font-lock rules into shared and grammar-specific sets – more code, more testing, more edge cases.

Script files are weird

F# script (.fsx) files without a module declaration can mix let bindings with bare expressions like printfn.
The grammar doesn’t expect a declaration after a bare expression at the top level, so it chains everything into nested application_expression nodes:

    let x = 1
    printfn "%d" x  // bare expression
    let y = 2       // grammar nests this under the printfn node

Each subsequent let ends up one level deeper, causing progressive indentation. I worked around this with a heuristic that detects declarations whose ancestor chain leads back to file through these misparented nodes and forces them to column 0. Shebangs (#!/usr/bin/env dotnet fsi) required a different trick – excluding the first line from the parser’s range entirely via treesit-parser-set-included-ranges.

I’ve filed issues upstream for the grammar pain points – hopefully they’ll improve over time.

Current Status

Let me be upfront: this is a 0.1.0 release and it’s probably quite buggy. I’ve tested it against a reasonable set of F# code, but there are certainly indentation edge cases, font-lock gaps, and interactions I haven’t encountered yet. If you try it and something looks wrong, please open an issue – M-x fsharp-ts-mode-bug-report-info will collect the environment details for you.

The package can currently be installed only from GitHub (via package-vc-install or manually). I’ve filed a PR with MELPA and I hope it will get merged soon.

Wrapping Up

I really need to take a break from building Tree-sitter major modes at this point. Between clojure-ts-mode, neocaml, asciidoc-mode, and now fsharp-ts-mode, I’ve spent a lot of time staring at tree-sitter node types and indent rules.[3] It’s been fun, but I think I’ve earned a vacation from treesit-font-lock-rules.

I really wanted to do something nice for the (admittedly small) F#-on-Emacs community, and a modern major mode seemed like the most meaningful contribution I could make. I hope some of you find it useful!

That’s all from me, folks! Keep hacking!

[1] Way more time than I needed to actually implement the mode.
[2] Many people pointed out they thought neocaml was some package for neovim. Go figure why!
[3] I’ve also been helping a bit with erlang-ts-mode recently.