One thing that is great about Erlang’s pattern matching is that it makes writing lexers, parsers, and compilers extremely approachable: https://github.com/elixir-dbvisor/sql And with Elixir macros and sigils you can embed other languages, like SQL and Zig, to name a few!
Does Erlang/Elixir have any edge over OCaml or Haskell in that niche? They also have pattern matching, of course, and strong static types tend to work nicely for compilers too.
Of course, the big superpower they have is the BEAM and the robust multiprocessing support there, but that’s not especially helpful for compilers…or is it?
As someone who has used SML, Haskell, Rust, and Elixir professionally: no, not really.
Access to the BEAM is nice, but unless you're targeting the BEAM in your compiler I don't see any benefit. Even if you're targeting the BEAM, you might decide to use another language, cf. Gleam: https://github.com/gleam-lang/gleam/
Edit: Actually, one thing I will mention is the superior support in Elixir/Erlang for pattern matching bitstrings[0]. Not usually helpful in compilers, but an evolution of pattern matching that other languages should take up, in my opinion.
0: https://hexdocs.pm/elixir/Kernel.SpecialForms.html#%3C%3C%3E...
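To make that concrete, here's a minimal sketch (the header layout is fabricated for illustration): sub-byte fields are destructured directly in the pattern, with field sizes given in bits.

    # Fabricated header: 4-bit version, 4-bit IHL, 8-bit TOS, 16-bit length.
    packet = <<4::4, 5::4, 0::8, 40::16>>
    <<version::4, ihl::4, _tos::8, total_len::16>> = packet
    {version, ihl, total_len}
    # => {4, 5, 40}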
Erlang's bitstring/binary handling is one of those things that once you use, you'll wonder why it's not in every language (alongside, for me, Rust's enum/sum types and Python's badly-named but wonderfully useful while-else).
OCaml also has a binary string pattern matching feature which sounds pretty similar: https://practicalocaml.com/parsing-with-binary-string-patter...
The Elixir compiler is written in Erlang, and Erlang can produce very efficient code: the new json library can beat C libraries at decoding/encoding. And you get this with a strongly typed dynamic language that is also a distributed language. It’s really hard to beat the BEAM; if only we had better number crunching, but in those cases you can always write a NIF.
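For reference, here's roughly what that looks like (a sketch assuming Elixir 1.18+, whose built-in JSON module sits on top of OTP's native :json):

    JSON.decode!(~S({"ok": 1}))
    # => %{"ok" => 1}
    JSON.encode!(%{ok: 1})
    # => "{\"ok\":1}"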
"Strongly typed" is stretching it. Type checking is bolted on and not part of `erlc`. Typing is quite unergonomic in Erlang/Elixir (similar to Typescript bolted onto JS).
The type system is one of the weakest parts of the BEAM ecosystem.
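For anyone who hasn't seen the bolted-on flavor, it looks like this: specs are annotations checked by Dialyzer after the fact, not enforced by the compiler (module name made up for illustration).

    defmodule MathDemo do
      @spec add(integer(), integer()) :: integer()
      def add(a, b), do: a + b
    end

    # Compiles without complaint despite violating the spec; only Dialyzer
    # flags it, and at runtime it fails with an ArithmeticError.
    MathDemo.add(1, "foo")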
Erlang/Elixir are certainly strongly typed[0] but they are not statically typed[1].
0: https://en.wikipedia.org/wiki/Strong_and_weak_typing
1: https://en.wikipedia.org/wiki/Type_system#Static_type_checki...
You can't really use the word "certainly" when speaking about "strongly typed" because the entire concept is fuzzy and subjective. From the article you linked:
> However, there is no precise technical definition of what the terms mean and different authors disagree about the implied meaning of the terms and the relative rankings of the "strength" of the type systems of mainstream programming languages. For this reason, writers who wish to write unambiguously about type systems often eschew the terms "strong typing" and "weak typing" in favor of specific expressions such as "type safety".
I personally think the whole concept of "strongly typed", which is usually used as a prop to make dynamic languages count as part of the cool kids typed-languages club, should be ditched as a point of argument. The supposed "weakly typed" languages people are usually comparing to (like C) aren't actually framed as viable alternatives for the problems dynamic languages are suited for, so they're something of a straw man. I'd like to see advocates for dynamically typed languages ditch the obsession with having types like the cool kids and instead focus on showing why dynamism is valuable.
There are plenty of great cases to make for dynamism without having to argue on rhetorical ground that static languages defined and dominate.
I agree that the terminology is not ideal, but I think there's a huge difference between JS's "weak types", i.e. abundant implicit conversions, and e.g. Elixir's "strong types", where `1 + "foo"` is a runtime error. I don't care if we call the latter something else, though. Any good suggestions?
That said, I prefer having both strong and static typing, but that's another argument.
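To make the contrast concrete:

    # JS: 1 + "foo" quietly coerces and yields the string "1foo".
    # Elixir refuses to coerce and fails loudly instead:
    1 + "foo"
    # ** (ArithmeticError) bad argument in arithmetic expression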
I'd suggest "high-cast" and "low-cast". They draw attention to the thing that people usually mean when they talk about strong (not static) typing: whether operations in a language bias towards automatically coercing types so that a non-type-error result can be produced or not. High-cast languages tend towards requiring explicit type conversion; low-cast languages tend towards both implicit conversion and more complex behaviors when more than one type is supplied to a given operation. Also, the terms pun nicely with "high-cost" and "low-cost".
That said, it's still a spectrum and there's a lot of subjectiveness here. Everyone agrees that `1 + "foo"` is meaningless, but what about string multiplication? If a language documents that an integer multiplied by a string repeats the string, is that weakly typed/low-cast, or is it just documented multiplication operator behavior? If string multiplication is a whole separate operator, is that more strongly typed (and if so, are we all gonna be able to sleep at night since that means Perl 5 is more strongly typed than Python)?
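Elixir, for what it's worth, takes the "separate, explicit operation" route here:

    # No overloaded * for strings; repetition is an explicit function call.
    String.duplicate("ab", 3)
    # => "ababab"
    3 * "ab"
    # ** (ArithmeticError) bad argument in arithmetic expression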
That subjectiveness extends into the domain of hidden runtime costs, as well. Theoretically, any iterable of hashable items can be passed to a language's implementation of "HashSet::union(items)". But the implementation/performance of "union()" might differ based on the type of the iterable: should we be allowed to pass a lazy iterator which produces values after arbitrary custom computations? Many languages say "yes" here, but some consider collecting/each-ing the iterator something that must be explicit so the cost/exhaustion/side-effectfulness of the iteration is made clear. How about unioning a set with a vector, versus another set? Very different algorithmic behavior happens inside the union if another hash set is supplied instead of, say, a static array or linked list; while the complexity for nonlazy unions is always O(N), the average complexity/wallclock performance may be very different. Rust's stdlib, for example, discourages this kind of heterogenous union (not, I suspect, out of a desire for high-cast-explicitness, but because it wants to encourage use of its lazy O(1) union system instead). Are the answers to that question part of the high-cast/low-cast (or strong/weak type system) spectrum, or are they just specific choices made by each language's collections library? Ask 10 programmers, and I suspect you'll get a lot of different answers.
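Elixir's MapSet is an example of the stricter end of that spectrum: union insists that both arguments already be sets, and folding an arbitrary enumerable in is a separate, visibly different call.

    MapSet.union(MapSet.new([1, 2]), MapSet.new([2, 3]))
    # => MapSet.new([1, 2, 3])

    # Any enumerable (including a lazy Stream) can be collected in,
    # but only via an explicit conversion step:
    Enum.into([2, 3], MapSet.new([1, 2]))
    # => MapSet.new([1, 2, 3])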
Elixir team is slowly bringing in type checking into the language: https://elixir-lang.org/blog/2022/10/05/my-future-with-elixi... and https://hexdocs.pm/elixir/gradual-set-theoretic-types.html
Dialyzer might be considered "bolted on", but the BEAM itself is strongly and dynamically typed. In Elixir the compiler is getting static typing as well.
https://learnyousomeerlang.com/types-or-lack-thereof
These languages have other properties that can play the role that types are sometimes relied upon to do. It's uncommon that I think in types at all when building things in Elixir; thinking about shapes usually gets me all the way.
In my experience string processing libraries are the weakest part, due to some of them having abysmal performance for whatever reason. Last I had this problem I wanted to do ETL on mbox files but gave up and did it with someone's PHP one-class weekend project instead.
You probably don't, Numerical Elixir/Nx has been out for years and did the NIF:ing for you.
It's one part of why it's quite convenient to juggle ML and LLM tasks on the BEAM, and easy enough that I can manage it.
https://github.com/elixir-nx
The Strand programming book states that an early version of the Erlang runtime was implemented in Strand (see "13.1: History" http://www.call-with-current-continuation.org/files/strand-b...), which is an interesting tidbit that I haven't seen come up when the history of Erlang is discussed, like in the featured article.
Studying the BEAM is definitely on my to-do list. Its task parallelism sounds exemplary, and I really want to understand the architectural ramifications of choosing fine-grained task parallelism vs. a data-parallel-friendly approach.
I wish articles like this had more meat on why BEAM is good.
You have to say why it's good. E.g. https://news.ycombinator.com/item?id=28015852
Love the BEAM
I just wish elixir had static typing built in :)
Give Elixir a try anyway, you might be surprised:
https://arrowsmithlabs.com/blog/you-might-not-need-gradual-t...
Then you'll love Gleam -- it's a BEAM language with static typing!
https://gleam.run/
If only there was a typed language that didn't hand wave serialization
I don't think we'll ever do better than 'IO is made out of bytes'.
Like Java?
It’s fascinating how long the BEAM has lasted. And even more fascinating how relevant its concurrency model still is in today’s async-heavy world. Built different.