

A good friend of mine, Eddie Berman, showed me a neat paper, "On Machine Learning and Programming Languages." In short, the paper tries to sell the Julia programming language and the Flux machine learning library by arguing that the complexity of machine learning requires a dedicated programming language (implying, of course, that Julia is this language). The authors spend the (short) paper describing the complexities of machine learning and how it can benefit from programming-language concepts.

In my opinion, this is the wrong way to sell a language like Julia. Note that I lean toward the formal methods/programming languages side of things, so I'm biased. Still, I think a better angle is to talk about Julia in terms of abstractions.


As is well known from the concept of Turing completeness, any reasonable modern programming language can simulate the code of any other. The most complex program you can imagine can be written and run in a language as esoteric as Brainfuck. So why do we even care about writing programming languages other than, say, Brainfuck? The answer, I argue, comes down to the ergonomics of reasoning about abstractions.
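To make the simulation point concrete, here is a minimal sketch of one language hosting another: a Brainfuck interpreter written in Python. The function name `run_brainfuck` is my own, and the 30,000-cell tape with wrapping 8-bit cells is just the usual convention, not something the paper specifies.

```python
def run_brainfuck(program: str, input_data: str = "") -> str:
    """Interpret a Brainfuck program and return everything it prints."""
    # Precompute matching bracket positions so loops can jump directly.
    jumps = {}
    stack = []
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            jumps[start] = pos
            jumps[pos] = start

    tape = [0] * 30000  # assumed tape size (the conventional one)
    ptr = 0             # data pointer
    pc = 0              # program counter
    inp = iter(input_data)
    out = []

    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == ",":
            # Read one character; store 0 once the input is exhausted.
            tape[ptr] = ord(next(inp, "\0")) % 256
        elif ch == "[" and tape[ptr] == 0:
            pc = jumps[pc]  # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jumps[pc]  # go back to the start of the loop
        pc += 1  # any other character is a comment and is skipped

    return "".join(out)


# 8 * 8 + 1 = 65, the ASCII code for "A".
print(run_brainfuck("++++++++[>++++++++<-]>+."))
```

The interpreter fits on one screen, but notice how much work the eight-instruction program has to do just to print a single letter. That gap is the ergonomics argument in miniature.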

Brainfuck reasons directly about Turing machine operations. Assembly reasons about CPU operations, which are effectively synonymous with Turing machine operations (abstracting away the CPU architecture nonsense).

C, by comparison, is a language written to reason more effectively about the operations of a computer. It is operationally equivalent to the assembly it compiles down to, but the language itself contains abstractions (for example, loops, structs, and standard library functions) that make reasoning about the computer easier. As many programmers will tell you, it is simply more ergonomic to work with computer memory in C than directly in assembly {1}.

From here, it follows why certain languages exist. Rust, for example, makes reasoning about memory even easier than C via borrow checking. Sure, you could implement borrow checking for C, but Rust has it built into the language, along with additional optimizations via the LLVM compiler toolchain.

The paper speaks about the successes of domain-specific languages in formal methods and verification. What the paper does not acknowledge, however, is that more specialized verification languages are generally written to reason more closely about the mathematical objects used in verification procedures. For example, Promela, TLA+, and NuSMV use temporal logics and automata for verification, and it is generally easy to see that the languages themselves are built on top of these formalisms. Similarly, theorem proving languages such as Lean and Coq are built on top of the notion of checking type-equivalence between assumptions and propositions. Thus, these languages feature "tactics" to better manipulate the types of objects.
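As a small illustration of the theorem-proving case, here is a sketch in Lean 4 that proves the same statement twice: once by writing the proof term directly, and once with tactics. The theorem names `swap_term` and `swap_tactic` are mine, chosen only for this example.

```lean
-- A proposition is a type, and a proof is a term of that type.
-- Term-style proof: build the term by hand.
theorem swap_term (p q : Prop) (h : p ∧ q) : q ∧ p :=
  ⟨h.2, h.1⟩

-- Tactic-style proof: tactics assemble the same kind of term step by step.
theorem swap_tactic (p q : Prop) (h : p ∧ q) : q ∧ p := by
  constructor   -- split the goal `q ∧ p` into two goals, `q` and `p`
  · exact h.2   -- the right component of `h` proves `q`
  · exact h.1   -- the left component of `h` proves `p`
```

Both versions type-check against the same proposition. The tactic version is just a more ergonomic way to manipulate the goal's type, which is exactly the kind of abstraction the language was designed around.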

The paper makes the argument that machine learning models require languages of their own. I think it's a natural question to ask: what abstractions do those in machine learning need to reason about? Tensors? Transformers? The paper makes broad allusions to very general abstractions such as recursion and recurrence (no, I'm not kidding lmao).

Some Thoughts

In my opinion, different languages exist in the first place due to abstraction-specific optimization. Julia, for example, has lots of subtle optimizations for working with differential equations and doing numerical analysis. Those behind Julia know a lot about these objects, so of course Julia features lots of domain-specific optimizations. Is this not the selling point of Julia, or am I just missing something? At least to me, a formal methods person, it's abundantly clear that the domain-specific machinery can be implemented in a domain-specific language. What's the purpose of a paper like this?


{1} This notion of abstraction is most easily seen in functional languages that essentially build everything on top of lambda calculus.