This is really interesting - I wonder how it compares to WebKit’s own B3.
I’ve had some thoughts for a while now on an IR that doesn’t share LLVM’s problems (as I understand them), and Cretonne matches my expectations almost entirely when it comes to the type system. Aggregates only serve to allow sub-optimal assumptions and are otherwise a pointless price to pay, so this makes me very happy indeed. (LLVM devs thought I was mad, ha!)
The one thing that having no aggregate types implies is no aggregate constant expressions (at least for LLVM). What I came up with as a replacement for LLVM constants (not that LLVM devs would even consider it) was described in my design for an interpretable abstract Rust machine, which miri (developed by @scott and oli_obk) already implements.
The gist of it, especially for a backend as opposed to an interpreter, is to have globals be byte buffers with relocations (just like you’d emit in the binary you give a linker) that point to other globals, as opposed to having nested aggregates with `getelementptr` constant expressions embedded in them.
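To make that concrete, here is a rough sketch of the shape I have in mind. It is purely illustrative: `GlobalId`, `Reloc`, `Global`, `resolve` and `example_globals` are names I made up for this post (nothing here comes from Cretonne or miri), and it assumes 64-bit little-endian pointers.

```rust
/// Hypothetical handle to a global: just an index, not a pointer.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct GlobalId(usize);

/// "The 8 bytes at `offset` hold the address of `target`, plus `addend`."
struct Reloc {
    offset: usize,
    target: GlobalId,
    addend: u64,
}

/// A global is a flat byte buffer plus relocations, with no nested aggregates.
struct Global {
    bytes: Vec<u8>,
    relocs: Vec<Reloc>,
}

/// Produce the final image of one global once every global has an address.
/// Assumption: 64-bit little-endian pointers.
fn resolve(globals: &[Global], addrs: &[u64], id: GlobalId) -> Vec<u8> {
    let global = &globals[id.0];
    let mut out = global.bytes.clone();
    for r in &global.relocs {
        let value = addrs[r.target.0].wrapping_add(r.addend);
        out[r.offset..r.offset + 8].copy_from_slice(&value.to_le_bytes());
    }
    out
}

/// Global 0 is the string "hi\0"; global 1 is a 16-byte header made of a
/// `u32` length, 4 bytes of padding, and a pointer to global 0.
fn example_globals() -> Vec<Global> {
    let text = Global { bytes: b"hi\0".to_vec(), relocs: Vec::new() };
    let mut bytes = vec![0u8; 16];
    bytes[0..4].copy_from_slice(&2u32.to_le_bytes());
    let header = Global {
        bytes,
        relocs: vec![Reloc { offset: 8, target: GlobalId(0), addend: 0 }],
    };
    vec![text, header]
}
```

A backend would normally hand the relocations to the object writer instead of resolving them itself; `resolve` is only there to show that the byte buffer plus the relocation list carries everything the nested constant expression did.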
I can’t find anything related to globals or constants in Cretonne’s source so maybe I’m off the mark.
I did get to see the integer indexing (as opposed to pointers with complex ownership rules), which is another thing that makes me quite happy.
If bounds checking ever gets too costly you could consider using the `indexing` crate - it would work remarkably well for any kind of analysis that doesn’t need to modify anything, since you only need to bound-check everything once and then provide the sound compiler-checked bound-less indexing.
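For anyone who hasn’t seen the trick, here is a minimal hand-rolled sketch of the idea. It is not the `indexing` crate’s actual API: `Brand`, `Container`, `Index`, `scope` and `sum_at` are names I’m inventing for illustration, and the soundness argument relies on the fields staying private to the module that defines them.

```rust
use std::marker::PhantomData;

/// Invariant lifetime used as a unique, compile-time-only tag.
#[derive(Clone, Copy)]
struct Brand<'id>(PhantomData<fn(&'id ()) -> &'id ()>);

/// Read-only view of a slice, tagged with a brand.
struct Container<'id, 'a, T> {
    data: &'a [T],
    _brand: Brand<'id>,
}

/// An index already proven in bounds for the container with the same brand.
#[derive(Clone, Copy)]
struct Index<'id> {
    raw: usize,
    _brand: Brand<'id>,
}

/// Each call gets a fresh `'id`, so indices cannot leak between containers.
fn scope<'a, T, R>(
    data: &'a [T],
    f: impl for<'id> FnOnce(Container<'id, 'a, T>) -> R,
) -> R {
    f(Container { data, _brand: Brand(PhantomData) })
}

impl<'id, 'a, T> Container<'id, 'a, T> {
    /// The one and only bounds check.
    fn check(&self, raw: usize) -> Option<Index<'id>> {
        if raw < self.data.len() {
            Some(Index { raw, _brand: self._brand })
        } else {
            None
        }
    }

    /// No bounds check here.
    fn get(&self, i: Index<'id>) -> &'a T {
        // SAFETY: an `Index<'id>` is only created by `check` on the container
        // carrying the same brand, and the slice is immutable, so its length
        // cannot have changed since the check.
        unsafe { self.data.get_unchecked(i.raw) }
    }
}

/// Validate every index once, then read without further checks.
fn sum_at(data: &[u32], raw: &[usize]) -> Option<u32> {
    scope(data, |c| {
        let checked = raw
            .iter()
            .map(|&r| c.check(r))
            .collect::<Option<Vec<_>>>()?;
        Some(checked.iter().map(|&i| *c.get(i)).sum::<u32>())
    })
}
```

The read-only restriction is what makes this cheap: since nothing can resize the slice inside `scope`, a checked index stays valid for the whole analysis.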
EDIT: Ah, one thing I forgot: the design using various side-tables (`EntityMap`) is nicely extendable and one place this really matters is debuginfo: wasting exactly 0 bytes per instruction when it’s not in use is really important.
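Roughly what I mean by a side-table, as a sketch. This is not Cretonne’s `EntityMap` API: `Inst`, `SideTable` and its methods are hypothetical names chosen for this post.

```rust
/// Hypothetical instruction handle: a plain integer index.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Inst(u32);

/// Per-instruction data stored next to the IR instead of inside each
/// instruction. It grows only when something is actually written.
struct SideTable<V: Clone> {
    values: Vec<V>,
    default: V,
}

impl<V: Clone> SideTable<V> {
    /// An empty table: no heap allocation happens here.
    fn new(default: V) -> Self {
        SideTable { values: Vec::new(), default }
    }

    /// Instructions that were never written read back as the default.
    fn get(&self, inst: Inst) -> &V {
        self.values.get(inst.0 as usize).unwrap_or(&self.default)
    }

    /// Grow on demand, filling the gap with the default value.
    fn set(&mut self, inst: Inst, value: V) {
        let i = inst.0 as usize;
        if i >= self.values.len() {
            self.values.resize(i + 1, self.default.clone());
        }
        self.values[i] = value;
    }

    /// Number of slots currently allocated on the heap.
    fn allocated(&self) -> usize {
        self.values.capacity()
    }
}
```

With something like `SideTable<Option<u32>>` holding source line numbers, a build without debuginfo never calls `set`, so the table stays at zero allocated slots and the instructions themselves carry no extra field.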
On top of that, one complaint I hear about removing complex types from LLVM is that you lose information needed to debug LLVM IR - the answer there is debuginfo!
If you already have code to emit debuginfo in your front-end, why not make use of it in the back-end for presenting a richer IR?
LLVM’s is terrible: all you see is `!dbg !123` and you have to look up `!123`, then repeat this recursively because it’s a tree of metadata nodes.
The opportunity here is to annotate the textual IL form with all sorts of details that can be lazily reconstructed from debuginfo. For example, `(*ptr).z.0.foo[5]` might be `GEP ptr, 0, 2, 0, 1, 5` in LLVM and `offset 41` in Cretonne - but you can store the compact offset and use debuginfo to reproduce `(*ptr).z.0.foo[5]` in an IL comment - nicer than LLVM’s and more efficient underneath!
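A sketch of how that reconstruction could work, assuming the debuginfo can be walked as a tree of types with sizes and field offsets. Everything here is hypothetical: `Ty`, `describe`, `annotate` and `example_layout` are invented names, and the layout is one I picked so that the path lands on byte 41.

```rust
/// Simplified stand-in for the type information debuginfo would carry.
enum Ty {
    Scalar { size: u64 },
    /// Fields are (name, byte offset within the struct, type).
    Struct { size: u64, fields: Vec<(&'static str, u64, Ty)> },
    Array { elem: Box<Ty>, len: u64 },
}

impl Ty {
    fn size(&self) -> u64 {
        match self {
            Ty::Scalar { size } | Ty::Struct { size, .. } => *size,
            Ty::Array { elem, len } => elem.size() * len,
        }
    }
}

/// Append the access path for `offset` to `path`. Returns false when the
/// offset does not land exactly on the start of a scalar.
fn describe(ty: &Ty, offset: u64, path: &mut String) -> bool {
    match ty {
        Ty::Scalar { .. } => offset == 0,
        Ty::Struct { fields, .. } => {
            for (name, start, field_ty) in fields {
                if offset >= *start && offset - *start < field_ty.size() {
                    path.push('.');
                    path.push_str(name);
                    return describe(field_ty, offset - *start, path);
                }
            }
            false
        }
        Ty::Array { elem, len } => {
            let stride = elem.size();
            if stride == 0 || offset / stride >= *len {
                return false;
            }
            path.push_str(&format!("[{}]", offset / stride));
            describe(elem, offset % stride, path)
        }
    }
}

/// Turn a flat byte offset from `ptr` into a readable path for an IL comment.
fn annotate(pointee: &Ty, offset: u64) -> Option<String> {
    let mut path = String::from("(*ptr)");
    if describe(pointee, offset, &mut path) { Some(path) } else { None }
}

/// Made-up layout: { x: u64, y: u64, z: ({ bar: [u32; 5], foo: [u8; 8] },) }
fn example_layout() -> Ty {
    let inner = Ty::Struct {
        size: 28,
        fields: vec![
            ("bar", 0, Ty::Array { elem: Box::new(Ty::Scalar { size: 4 }), len: 5 }),
            ("foo", 20, Ty::Array { elem: Box::new(Ty::Scalar { size: 1 }), len: 8 }),
        ],
    };
    let tuple = Ty::Struct { size: 28, fields: vec![("0", 0, inner)] };
    Ty::Struct {
        size: 44,
        fields: vec![
            ("x", 0, Ty::Scalar { size: 8 }),
            ("y", 8, Ty::Scalar { size: 8 }),
            ("z", 16, tuple),
        ],
    }
}
```

With this made-up layout, `annotate(&example_layout(), 41)` walks 16 (`z`) + 0 (`.0`) + 20 (`foo`) + 5 and yields `(*ptr).z.0.foo[5]`, which is all the printer needs for the comment; the IR itself only ever stores the 41.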
IOW, Cretonne can have its cake and eat it too - did I mention how happy I am this is happening?