codebje 4 days ago

On running code at compile time, what this reads as to me is evaluating constant expressions at compile time. We're thoroughly used to compilers optimising constant expressions away for us: "if 4 < 2 then a else b" will usually result in the same compiler output as just "b": "4" is constant, "2" is constant, "<" is constant, and the application of "<" to "4" and "2" is constant.

What about "factorial 8" ? Assuming the compiler knows that "factorial" is a pure function, it also knows that "factorial 8" is a constant, and in theory it could evaluate that function and substitute in the result. In practice this won't generally happen automatically, because compile times would blow out something fierce.

But in, say, C++, you can do this:

    constexpr unsigned factorial(unsigned n) {
        return n <= 1 ? 1 : n * factorial(n - 1);  // n <= 1 so factorial(0) terminates too
    }

    constexpr unsigned fac8 = factorial(8);

And fac8 will compile down to the static value 40320. This looks very much like the author's proposal; reading the various rules around constant expressions and constant initialization will probably help expose the pitfalls.

More interesting for me is partial evaluation. Constant folding evaluates fully static parts of a program to produce a static result, shifting that execution cost from run time to compile time. Partial evaluation also allows some of the program's dynamic inputs to be fixed to a static value, slaps a virtual "constexpr" on everything that now only has static inputs (and is a pure function), and reduces everything it can.
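
To make that concrete, here's a rough sketch in the same C++ vein (my own toy example, nothing from the article):

    // power has two dynamic inputs.
    constexpr unsigned power(unsigned x, unsigned n) {
        unsigned r = 1;
        while (n--) r *= x;
        return r;
    }

    // What a partial evaluator would produce for power(_, 3): the loop over
    // the now-static n has been run at specialisation time, and only the
    // multiplications involving the still-dynamic x remain.
    constexpr unsigned cube(unsigned x) {
        return x * x * x;
    }

    static_assert(power(5, 3) == cube(5), "same answer, less work at run time");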

Partial evaluation gets fun when you start applying it to certain types of program: partial evaluation of an interpreter with a given program's source as an input acts just like a compiler. Partial evaluation of a partial evaluator with an interpreter as fixed input produces a compiler. Partial evaluation of a partial evaluator with a partial evaluator as fixed input produces a compiler generator: feed it an interpreter for any language, get a compiler for that language.
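
You can squint and see the first of those projections with nothing more than constexpr and an optimiser, though a real partial evaluator does the specialisation for you. A toy sketch (mine, heavily simplified):

    #include <cstddef>
    #include <cstdio>

    // A "language": programs are a fixed sequence of operations on an accumulator.
    enum Op { Add1, Double };

    // An interpreter: program and input are both parameters.
    template <std::size_t N>
    constexpr unsigned interpret(const Op (&prog)[N], unsigned input) {
        unsigned acc = input;
        for (Op op : prog) {
            switch (op) {
                case Add1:   acc += 1; break;
                case Double: acc *= 2; break;
            }
        }
        return acc;
    }

    // Fix the program (static) but leave the input dynamic: the optimiser can
    // unroll the interpreter over this one program, leaving straight-line code
    // that depends only on input: the first Futamura projection in miniature.
    constexpr Op program[] = { Add1, Double, Double };

    unsigned compiled(unsigned input) {
        return interpret(program, input);   // typically folds to (input + 1) * 4
    }

    int main() { printf("%u\n", compiled(5)); }   // prints 24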

aatd86 4 days ago

Is it perhaps an ELI5 way of speaking about higher-kinded types and type-checking decidability?

Is it about types not being available in the frontend of the language as first class values?

Or is it about types not being defined as set of values but something else?

It got me a little confused so I am asking.

  • codebje 4 days ago

    It’s a confusing (and confused) article. The parts on type systems are expressing IMO a fair observation that dependent types are weird and hard, but without any sense of understanding what they are or what they can do that might justify their cost.

    The parts on compile time execution are the better parts of the article, IMO. There’s food for thought here. The author might enjoy reading up on partial evaluation.

    Then, fulfilling Greenspun’s 10th, the article reinvents Lisp macros.

  • codethief 4 days ago

    > It got me a little confused

    Not only you! The post is all over the place…

burakemir 4 days ago

One angle (no static types) is Racket, whose macro system has explicit multiple stages (phases). It's maybe the most developed, actually-working "new tradition".

Researchers have also looked into multistage programming, which is enabled by representing code at runtime, including how to represent that code in type systems/logic.

For Scala, there was a realization that both macros and multistage programming need a representation of programs. I am falling asleep so can't dig out references now, but it is exciting stuff and I think the last word has not been written on all this.

codebje 4 days ago

On types, I think there's a philosophical argument about whether types can be values, and a related one about whether types should be values. But I think I disagree with the author about what a value is: as far as I understand it, a value is a piece of data manipulated by a program during execution.

Languages with type erasure obviously never have types as values. Languages without type erasure have _something_ as a value, but is that something a type, or a representation of a type, and is that a valid distinction? I don't feel well qualified to make a judgement call, largely because I favour type erasure. Types are specification, not implementation.

What the author calls values, I tend to call terms: bits of program that can be judged to be of a particular type or not. "1" is a term: you could judge "1" to be of type "int", or of type "unsigned". "factorial 8" is a term. A value is just a term that's represented at runtime. But is there a type of a type? Commonly, no: types are in a different universe, specified with a different syntax, and understood under a different semantics. The type of "factorial", eg, is a function from int to int. In a language with generics, you might have "List T" that takes some type T and produces the type of lists of Ts.

There's no particular reason why you can't say that the type of List is a function from Type to Type, why you can't use the same syntax to specify that function, or why the semantics of what that function means should be any different to the semantics of "factorial". Consider:

    def factorial : Nat → Nat :=
        fun n => if n = 0 then 1 else n * factorial (n - 1)

    def natty : Type := Nat → Nat

This doesn't imply any fancy-pants type system features. The factorial above is expressible in the simply typed lambda calculus, and natty is just a type alias written with the same syntax. If you add only outermost universal quantification, i.e. parametric polymorphism, you get Hindley-Milner. If you allow arbitrary quantification, you get System F; add type-level functions and you get System Fω.

Universal quantification, HM style, gives you functions from types to values (terms that can exist at runtime). If you have, say, "length : List a -> Int" you have a function that's universally quantified over "a": you tell "length" what type "a" is and you get back a function from one concrete type to another, something you can have at runtime. (All the functions producible here are identical, thanks to type erasure, so while this type-to-value thing actually happens in order to perform type checking the compiler will only ever produce one implementation.)
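
In the same Lean-ish notation as above (my own toy definition, just for illustration), that instantiation step looks like this:

    -- universally quantified over the element type α
    def length {α : Type} : List α → Nat
      | []      => 0
      | _ :: xs => 1 + length xs

    -- supplying the type argument gives back a function between concrete
    -- types, i.e. something that can exist at runtime
    #check @length Nat           -- List Nat → Nat
    #eval length [10, 20, 30]    -- 3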

The last option, railed against in the article, is to have a function from a value to a type. This is, to agree with the article, generally pretty weird and hard to get a handle on. Where the article deliberately picks pointless examples, though, I'd like to point out two much more commonly encountered examples. One is "Vector a n", the type of vectors with elements of type a and length n: the element argument is a type, but the length argument is a value. The other comes from C: tagged unions. You have a union that might hold one of a handful of types, and a value that distinguishes which one it is. In C, there's no type system support and you just have to get it right every single time you use it. If you have functions from values to types, though, you can have the type checker get involved and tell you when you've made a mistake.
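
Both are easy to write down in Lean-ish notation (again my own sketches, and I've called it Vec to avoid clashing with the standard library's Vector):

    -- a type indexed by a value: the length is part of the vector's type
    inductive Vec (α : Type) : Nat → Type where
      | nil  : Vec α 0
      | cons : α → Vec α n → Vec α (n + 1)

    -- "head of an empty vector" is now a type error, not a runtime error
    def Vec.head {α : Type} {n : Nat} : Vec α (n + 1) → α
      | .cons x _ => x

    -- a checked tagged union: Payload is a function from a value (the tag)
    -- to the type of payload that tag is allowed to carry
    def Payload : Bool → Type
      | true  => Nat
      | false => String

    def payloadSize : (tag : Bool) → Payload tag → Nat
      | true,  n => n                  -- here the payload is known to be a Nat
      | false, s => String.length s    -- here it is known to be a String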

The whole point of static types is to specify your program's behaviour in a way that the compiler can reject more incorrect implementations. We know from way back in Gödel's day that no type system can be complete (accepting all correct programs) and consistent (rejecting all incorrect programs). Most type systems we can practically work with are inconsistent (you can get any program to be accepted by the compiler with the right type casts), because a consistent system's incompleteness would hinder expressivity too much. I believe it's possible to have a consistent type system with sufficient expressivity to allow all _useful_ programs to be written, and those type systems will roughly correspond to the Calculus of Constructions, ie, they will be dependently typed. I am not yet sure I believe that the cognitive load of working in such a language will make it worth doing - which as far as I can tell is the point the author of the article is making about type systems.