[MIR] constant evaluation

I’ve heard rumors that constant evaluation should be done in the future by translating the constants’ expressions to MIR and evaluating the MIR. Now there’s quite a bit of design space, but we most definitely don’t want to end up with 4 constant evaluators.

So here’s my take:

A cache for constants

Necessary to continue to allow things like const I32PTR: &'static i32 = &1;

  • we need to make sure that we put certain constants into static memory (in this case the 1i32).
  • store all constants (and “temporary” constants like the 1 in &1) in a cache
  • have a new ConstVal (ConstVal::Ref) that points into the cache (or an Lvalue::Static, see the bottom of this post)
  • translate the cache in trans so that all the ConstVal::Refs can be resolved
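A minimal sketch of what the cache and ConstVal::Ref could look like, with made-up names (the real ConstVal in rustc is different):

    struct ConstCache {
        entries: Vec<ConstVal>,
    }

    #[derive(Clone)]
    enum ConstVal {
        Int(i64),
        Ref(usize), // index into the ConstCache, resolved by trans
        // ... other literal kinds
    }

    impl ConstCache {
        // Interning a value (e.g. the 1 in &1) yields a Ref pointing at it;
        // trans later places the referenced entries in static memory.
        fn intern(&mut self, val: ConstVal) -> ConstVal {
            self.entries.push(val);
            ConstVal::Ref(self.entries.len() - 1)
        }
    }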

Constant evaluator

During MIR creation, whenever we hit a path, a pattern, or an array length that refers to a constant or an associated constant, we need to generate MIR for it and propagate all constants until there’s a Statement::Assign(Lvalue::ReturnPointer, Rvalue::Use(Operand::Constant(..))), and then extract that constant.

I wrote a small constant propagator on top of the typestrong const ints branch. Simple functions can be completely evaluated if all branches can be decided. At some point there’s a return_pointer = $something statement.
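To illustrate the extraction step, a hedged sketch with toy stand-ins for the MIR types (not rustc’s real definitions):

    #[derive(Clone)]
    struct Constant(i64);
    enum Operand { Constant(Constant) /* Consume(Lvalue), ... */ }
    enum Rvalue { Use(Operand) /* BinaryOp(..), ... */ }
    enum Lvalue { ReturnPointer /* Var(usize), ... */ }
    enum Statement { Assign(Lvalue, Rvalue) }
    struct Mir { statements: Vec<Statement> }

    // After propagation has decided all branches, the result is the operand
    // of the final `return_pointer = ...` assignment.
    fn extract_const(mir: &Mir) -> Option<Constant> {
        for stmt in &mir.statements {
            if let Statement::Assign(Lvalue::ReturnPointer,
                                     Rvalue::Use(Operand::Constant(ref c))) = *stmt {
                return Some(c.clone());
            }
        }
        None
    }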

Aggregate constants

If we have an Rvalue::Aggregate, we need to be able to turn it into a Constant iff all its elements are Operand::Constants.

We can’t do this currently, because there are no aggregate constants (the current aggregates just store an expression id, not the actual inner values).

To prevent doing everything twice (in ConstVal and Rvalue), we could use Rvalues directly.

Advantages:

  • no more constants in MIR, just literals and aggregates of literals
  • no need to touch rustc::middle::const_eval (ever again?), we can just implement MIR-const-eval in parallel
    • we could even skip the typestrong const int PR and just extract the const ints for MIR-const-eval

Disadvantages:

  • need to do a check to figure out whether an Rvalue is a constant (sketched below)
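A hedged sketch of that check, again with simplified stand-ins for the MIR types:

    #[derive(Clone)]
    enum Constant { Int(i64), Aggregate(Vec<Constant>) }
    enum Operand { Constant(Constant), Consume /* moves from an Lvalue */ }
    enum Rvalue { Use(Operand), Aggregate(Vec<Operand>) }

    // An Rvalue counts as constant iff it is a literal, or an aggregate
    // whose operands are all constants.
    fn rvalue_as_const(rv: &Rvalue) -> Option<Constant> {
        match *rv {
            Rvalue::Use(Operand::Constant(ref c)) => Some(c.clone()),
            Rvalue::Aggregate(ref ops) => {
                let mut elems = Vec::with_capacity(ops.len());
                for op in ops {
                    match *op {
                        Operand::Constant(ref c) => elems.push(c.clone()),
                        _ => return None, // a non-constant element
                    }
                }
                Some(Constant::Aggregate(elems))
            }
            _ => None,
        }
    }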

Misc

We can eliminate ConstVal entirely and merge constants and statics in MIR. Then there’s just Lvalue::Static to reference either a static or a constant. This requires some thought about things like taking constants by value. Basically anywhere we have a Constant right now in MIR, we’d have an Lvalue::Static.
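A hedged sketch of that merged shape, with simplified stand-ins for the real MIR types:

    struct DefId(u32); // identifies an item, as in rustc

    // Constants and statics collapse into one case: anywhere MIR has a
    // Constant today, it would instead read through an Lvalue::Static.
    enum Lvalue {
        Var(usize),
        Static(DefId), // a static *or* a cached constant
        ReturnPointer,
    }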

Have I missed anything?


A bit far away (and maybe never done…): would it be difficult for the MIR constant evaluator to execute “arbitrary” functions (i.e., using String or Vec)?

Not being able to handle strings has been a long-standing issue in C++, usually solved with horrible hacks (template<char... Cs> struct String {};), so it might be cool (in a distant future) to be able to close the gap.

Beyond being used at type-level, I am also thinking about using complex structures in const globals/variables. For example, the well-known issue with Regex::new: even though regex! used to produce faster code because it encoded the structure at compile-time (which was then optimized by the compiler), the new “dynamic” encoding is not amenable to constant evaluation and therefore cannot be optimized the same way (it is faster than regex! now, but could be even faster if constant-evaluated).

The situation is similar with global const: you need to rely on lazy_static! for “complex” initialization, which means run-time initialization and prevents the compiler from optimizing based on the result. Yet having a constant BTreeMap or HashMap could be useful in many scenarios, especially if the compiler, in Release, could look up constants at compile-time (-O4 :wink: ?)

This is a good question. IMO Rust shouldn't go down this path.

I don't know the specific implementation details in Rust, but generally speaking the continuum here runs from a regular optimization pass in the compiler that can fold expressions [e.g. replace "2+3" with "5"] to a full-blown interpreter as part of the compiler. The former is very limited (it can't do memory allocation, I/O, etc.) and the latter duplicates a huge chunk of the compiler and adds a lot of complexity. Having a "const fn" feature then usually results in some sort of compromise that adds complexity on both fronts: the compiler gains duplicated logic (compile the code vs. "run" the code) and the user still has some limitations on what can be const-evaluated.
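The "regular optimization pass" end of that continuum is easy to picture; here is a toy folder over a made-up expression tree (not rustc's IR) that rewrites 2+3 into 5 and nothing more:

    #[derive(Debug)]
    enum Expr {
        Lit(i64),
        Add(Box<Expr>, Box<Expr>),
    }

    // Replaces Add(Lit(2), Lit(3)) with Lit(5); anything it cannot decide
    // is rebuilt unchanged. Memory allocation, I/O, etc. are out of reach.
    fn fold(e: Expr) -> Expr {
        match e {
            Expr::Add(a, b) => match (fold(*a), fold(*b)) {
                (Expr::Lit(x), Expr::Lit(y)) => Expr::Lit(x + y),
                (a, b) => Expr::Add(Box::new(a), Box::new(b)),
            },
            lit => lit,
        }
    }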

I think that supporting this via "macros" is a much better and more orthogonal design. For example, the lazy_static! macro could be replaced by a procedural macro that generates a compile-time value that the compiler assigns to a static variable. This would require language support for defining a collection value at compile time (I think there's a new feature of that sort in C++11/14).

To elaborate a bit more here... The big change in Regex::new was the addition of more sophisticated literal prefix matching, and specifically, that it embeds a DFA for matching multiple patterns very quickly. All of that can in principle be encoded statically. (More accurately: I don't know of any specific blocker yet.)

My vague plan for the future of regex! is to become something more like Ragel. That is, it is its own specialized engine that inlines a DFA to Rust code. It's doable, but a DFA usually needs to be augmented by other stuff to make it sensible to use (e.g., captures), which usually means invoking an additional matching engine (e.g., full NFA simulation) to achieve (like RE2 does).

Interesting alternative. Scrub constant evaluation entirely and go for full blown MIR-interpretation. So basically we’d end up being able to run Rust code in a VM that is written in Rust.


I was indeed thinking more about the latter, but not a full interpreter.

There’s not much that can be done without memory allocation: even the most basic data structures require it! On the other hand, I/O, multi-threading, … bring significantly diminished returns.

So for me the demarcation line is quite clear:

  • yes to memory allocation
  • no to I/O, multi-threading, syscalls, … except Rust built-in logging (for debugging)

That is: “just” having malloc/realloc/calloc/free supported in constant-evaluation.
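A minimal sketch of what that could mean inside an evaluator, assuming allocations never touch the real heap: “pointers” are just indices into interpreter-owned buffers.

    // Illustrative only: a const-eval "heap" that models malloc/free
    // without ever touching the host allocator or making syscalls.
    struct ConstHeap {
        blocks: Vec<Option<Vec<u8>>>, // None = freed
    }

    impl ConstHeap {
        fn malloc(&mut self, size: usize) -> usize {
            self.blocks.push(Some(vec![0u8; size]));
            self.blocks.len() - 1 // the "pointer" is just an index
        }
        fn free(&mut self, ptr: usize) {
            // a double free or dangling pointer is detectable here, so
            // const evaluation can reject it with a clean error
            assert!(self.blocks[ptr].take().is_some(), "double free");
        }
    }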


I do agree that I do not look forward to the duplicated logic; my initial conclusion on building a constant evaluator was that it would be great to leverage trans:

  • send a chunk of MIR (some for reference + the computation to perform)
  • have trans lower it to LLVM IR
  • interpret the LLVM IR (virtual machine with just the memory allocation)
  • have trans read out the “result” and convert it back to MIR
  • return the evaluated chunk of MIR

For sweetness, add in a capability to convert panic! invocations to a stack-trace and to display the logs generated as part of the compiler diagnostic in case of failure; for debugging purposes (still, the programs should be deterministic, so debugging is normally easier).

However, it’s not clear to me how much logic we are talking exactly, and if the cost of involving trans would not be much higher. The current constant evaluator has function evaluations (const fn), integer operations, pattern matching, … how much would be required to additionally support String? Vec<T>? HashMap<K, V, H>?

And that’s exactly the purpose of my question!


We of course need a costs/benefits analysis before deciding whether to implement support for memory allocation in the constant evaluator. For now all I have is:

Benefits:

  • drastic increase in percentage of Rust code that can be constant-evaluated
  • drastic increase in percentage of Rust data structures that can be stored in read-only memory

Derived Benefits:

  • drastic reduction in the number of necessary macros (goodbye concat!, goodbye regex!, …)
  • drastic reduction in the usage of some of the remaining macros (lazy_static!, …)
  • drastic reduction in the number of “having to trust” plugins executed during compilation (security wise, arbitrary code being executed by the compiler is annoying)
  • much lower bar for benefiting from constant data (much more approachable for new users)
  • an uncertain performance gain brought by better constant-propagation/inlining (without even having rustc optimizing by itself)

Drawbacks:

  • larger/more complex MIR, how much larger?
  • larger/more complex constant-evaluator, how much larger?
  • is there potential “clash” with other features?

A Rust VM is unfortunately a much higher bar than just allowing String or Vec I am afraid.

Specifically, I am looking at cross-compilation, where the VM would invoke syscalls or OS-specific calls that would have to somehow be emulated. I had thought about the ability to use plugins that would be called by the VM and expected to directly manipulate the VM state to emulate the effects of the call… it seems complicated?

Also, the need for I/O or multi-threading during compilation seems much smaller (to me).

Coming from C++, I have lost count of the number of times I was frustrated by the lack of support for std::string in constant evaluation: you can make do with just integers/chars thanks to variadics, much like you can encode a brainfuck interpreter with them, but let’s face it: it’s really awkward.

On the other hand, I’ve never missed I/O or multi-threading, so I would say it may just not be worth attempting to implement them and defer this 0.1% need to compiler plugins/build.rs instead. Of course that’s just my experience.

The MIR interpreter could simply abort on any operation we deem “not-const”. It would be like another target, but instead of going to LLVM, we just dump the MIR and have the interpreter run it. That target would be implemented like any other target; just instead of implementing I/O, we panic. The allocations could be an asm! call, i.e. simply some statement that MIR understands.

But to be able to do things like size_of, we need to be able to invoke trans :frowning: or find a way to figure out the size of a type without trans.
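A hedged sketch of the abort-on-not-const idea, with toy stand-ins rather than real MIR operations:

    enum Op { Assign /* ... */, InlineAsm, IoCall }

    // Anything the evaluator deems "not-const" aborts evaluation with a
    // panic instead of being emulated.
    fn eval(op: &Op) {
        match *op {
            Op::Assign => { /* interpret normally */ }
            Op::InlineAsm => panic!("not const-evaluable: inline asm"),
            Op::IoCall => panic!("not const-evaluable: I/O"),
        }
    }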

The problem I see (and tried to convey) in this approach is that it adds considerable complexity to the system, and the trade-off only affects where that complexity goes.

Full interpreter / Rust VM:

Downsides:

Duplicates logic. Short term it affects rustc, but it is also important to keep in mind that it greatly affects the visible surface: if someone wants to implement their own compiler for Rust, they’ll need to also implement a VM/interpreter in order to be fully compliant. That’s quite a burden for a language built around an open ecosystem which would want to encourage alternative implementations.

Benefits:

No limitations for the end-user: just stick a “const” in front of your arbitrary function and it’ll work at compile time too.

Partial solution as you suggested (i.e. only allocation):

Downsides:

Still complicates logic as before, but now also adds complexity for the end-user, who needs to be aware of an additional computation model. This also requires documentation and constantly answering questions from new users who will be surprised by this.

BTW, regarding your specific demarcation line, there are valid use cases for I/O, multi-threading, sys-calls, etc. at compile time, and these can already be used in the context of “procedural macros” (although in an unstable and not ergonomic enough way).

For example, we can have a macro to embed values from a DSL (SQL queries, XML snippets, etc). Now what if I want to have this validated at compile-time? That’ll require my sql!() macro to connect to a DB engine. Similarly, my xml!() macro might need to call an external XML validator, perhaps read a schema file, etc… It’s possible, of course, to come up with other useful use cases.

Now, my ideal solution is complete separation of concerns. An ideal regex! macro for example, should be a quick one-liner - just build the regex by calling the regular regex construction function, only inside a macro definition.

Benefits:

  • No duplicate logic in the compiler. The macro still needs to be compiled to executable form; the only difference is that it’ll be run by the compiler.
  • Unified model for the end-user: any code that you can write in a function (i.e. at run-time) can also be put inside a macro definition and be run at compile-time.
  • Optionally this can be made very ergonomic for the user by automating the process of compiling and loading the macro. E.g. the compiler can spawn a sub-instance to compile the macro definition and then just load the resulting binary plugin.
  • It is very easy to generalize to a multi-level compilation model, i.e. a macro definition uses another macro.

Downsides:

The only downside of sorts is that we need to allow (and trust) the compiler to run arbitrary code. Now, I’m not sure this is entirely a downside, because there are different levels of trust here: the user’s own macros are less of a risk compared to a completely foreign macro from the internet.

Edit: I forgot to mention (it’s probably obvious from the rest of my reply) that I don’t think the reduction of macro usage is a benefit. I prefer to have more integration of macros into the regular language and to make them much more ergonomic. In other words, outside the realm of syntax manipulation, a macro should be the same as “const fn”: regular Rust, only run at compile-time.

If a const fn is only ever allowed to call other const fn then simply marking memory allocations as const fn (and treating them as built-ins) is sufficient for the type-checker to be able to report to the user “sorry, cannot call this non-const function in a const fn”. This means you do not even attempt interpretation.

There is indeed a small issue with size_of, which needs to be a const fn for many things to work. To solve it cleanly I would recommend simply factoring the struct layout algorithm out of trans (a sketch follows below)

… however, the cascade of effects is slightly more interesting: some computations may only work for certain values of size_of, in which case the compilation would work on some targets and not others.

I personally do not consider it an issue; it’s somehow already the case with the #[cfg(...)] attributes that some items are not always available.
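A hedged sketch of what such a factored-out layout computation could look like, using a toy type language and C-style layout rules parameterized by the target (not rustc's actual algorithm):

    use std::cmp::max;

    struct Target { ptr_size: u64, ptr_align: u64 } // e.g. 4/4 or 8/8

    enum Ty { I8, I32, Ptr, Struct(Vec<Ty>) }

    // Returns (size, alignment) without involving trans or LLVM.
    fn size_align(ty: &Ty, t: &Target) -> (u64, u64) {
        match *ty {
            Ty::I8 => (1, 1),
            Ty::I32 => (4, 4),
            Ty::Ptr => (t.ptr_size, t.ptr_align),
            Ty::Struct(ref fields) => {
                let (mut size, mut align) = (0, 1);
                for f in fields {
                    let (fs, fa) = size_align(f, t);
                    size = (size + fa - 1) / fa * fa; // pad to field alignment
                    size += fs;
                    align = max(align, fa);
                }
                ((size + align - 1) / align * align, align) // tail padding
            }
        }
    }

The result is target-dependent by construction, which is exactly where the cascade of effects above comes from.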

TL;DR: Only supporting memory allocation is the ultimate trade-off.


First of all, thanks for a very complete answer. I am afraid that you are drastically underestimating the difficulty of implementing a full VM and at the same time overestimating the effort of user education; hopefully by the end of this answer I'll have explained what makes me think so, and why I think that memory allocation alone is the sweet spot to aim for. If anything is unclear, please call me out on it.

I will start by answering a completely different point first, though:

BTW, regarding your specific demarcation line, there are valid use cases for I/O, multi-threading, sys-calls, etc. at compile time, and these can already be used in the context of "procedural macros" (although in an unstable and not ergonomic enough way).

Of course there are use cases for those; I never said there were not. Any demarcation line is, by nature, a trade-off, and I argue that implementing memory allocation gives a huge return on investment, while everything beyond it (I/O, multi-threading, C-calls, sys-calls, inline assembly, ...) has seriously diminishing returns and is therefore probably not worth the complexity.

I think you fail to appreciate what memory allocation unlocks.

You can have today (and in the future) a plugin which validates your queries against the database schema or an xml schema, however said plugin can only emit the resulting object as a "const" item if the type supports a "const" constructor:

  • today, the number of "const fn" is very limited, because they only support simple structs and integers
  • with memory allocation allowed, it's unlimited¹!

Thus, by simply allowing memory allocation, not only do we cater to maybe 99% or even 99.9% of usecases, but we also lift all limits on the emitted const items a plugin might generate for the non-covered cases. That is very empowering.

¹ key point: "no life before main", we'll come to it later.

Now, my ideal solution is complete separation of concerns. An ideal regex! macro for example, should be a quick one-liner - just build the regex by calling the regular regex construction function, only inside a macro definition.

Note: there is no need for a macro then; it's just a const fn as we already have.

Unfortunately, building a complete VM for a systems programming language for which you want cross-compiling is Madness:

  • you need to emulate calls into C; note that you will NOT be able to load the C library you are calling for the target you cross-compile to, because your current computer is incompatible, and there is no reason to think that the equivalent C library you have for your own CPU/OS will in any way produce the same result (sizeof differs, for example)
  • you need to emulate the underlying OS calls, for all OS to which you need to cross-compile
  • you need to emulate the CPU behavior for all inline assembly for all CPUs to which you need to cross-compile

Note: I do not even consider NOT supporting cross-compiling. It's just necessary for small embedded devices on which rustc cannot run, and forbidding those devices from benefiting from const-evaluation would seriously hamper the penetration of Rust in a field it suits so well.

I can only think of two ways to cover those requirements, at the moment:

  1. DRY: use a full VM, which emulates both the target CPU and OS. It should be integrated into the compiler so that the compiler will (a) feed it the emitted assembly, (b) set up the memory to represent the arguments and (c) be able after the computation to read the memory to get the result out (which it reinterprets in terms of a Rust value)
  2. Not DRY: implement a semi-VM. All Rust functions containing inline assembly or C calls cannot be called directly; instead, the semi-VM comes with one "plugin" per such Rust function which re-implements it in a VM-way (that is, it changes the VM internals appropriately to emulate the effect of the call, taking into account the arguments' values and the current target CPU/OS). A mechanism is provided for developers to implement such plugins for each and every Rust function they write which either calls C or uses inline assembly, and to deliver said plugins with their libraries.

(2) here is my best attempt at solving the issue, and it does not seem that great:

  • if you force developers to implement the plugins, all those doing FFI will hate you; it'll be a pain!
  • if a developer can leave off the plugin for some of the functions, then you are back to only having a subset of the library usable in const-evaluated contexts, and by transitivity it might very well be a huge missing subset
  • for each plugin, there's a chance its behavior differs from that of the function it emulates; this is a MASSIVE amount of duplication in user land

And yet it's my best attempt, because honestly if you ever think (or someone thinks) that you can pull off (1) then I would really, really, like to know what strategy you are thinking about.

So, I'll be fully honest with you:

  1. A full VM is such a titanic effort, as far as I can see, that I consider it impossible to pull off
  2. A semi-VM is, much like a partial interpreter, ... partial, which needs to be dealt with²

Of course, if you have a much better idea to implement a full or semi VM solution, I am all ears.

² key point: API stability, we'll come to it soon.

Partial solution as you suggested (i.e only allocation):

Still complicates logic as before,

First of all "complicate the logic" is not an all or nothing, it's gradual. "Just allocation" is probably so much simpler than a full VM than it does not make sense to compare their implementation costs. I dread to check the code base size of VirtualBox or VMWare.

but now also adds complexity on the end-user that needs to be aware of an additional computation model.

Does it?

Let's put aside for a second the concept of the full VM, since it's so uncertain whether it's manageable. Any other solution, no matter where its limits lie, will only be able to interpret/execute a subset of the code at compile-time.

For API stability reasons, it would be unreasonable to let the compiler automatically decide whether a function should be callable at compile-time: any change in implementation might be a breaking change in the API! This requires that the developer be able to annotate the subset of functions that she is willing to support for compile-time computation. The good news is that we already have the solution: it's called const fn.

Thus, any provably doable solution today requires the already existing const fn: the solution chosen only affect how many functions can be const.

Thus, any provably doable solution today imposes the very same burden on the end-user.

And I would argue that the rules are simple enough:

  • you cannot initialize a const item from a non-const function
  • you cannot call a non-const function within a const function
  • you cannot refer to a non-const global within a const function

Those rules are shared with C++'s constexpr. The only difference is the subset of functions that can be called, but then the API is already different anyway.

Therefore, I simply reject the argument that it pushes more complexity on the user.
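To make those rules concrete, here is a hedged illustration; the commented-out lines show what would be rejected (the error messages are made up, not actual rustc diagnostics, and const fn itself was still feature-gated at the time):

    static SEED: u32 = 42;                    // a non-const (runtime) global
    fn read_env() -> u32 { 42 }               // not a const fn
    const fn double(x: u32) -> u32 { x * 2 }  // a const fn

    const A: u32 = double(21);                // OK: const fn in a const item
    // const B: u32 = read_env();          // error: non-const fn in a const item
    // const fn f() -> u32 { read_env() }  // error: non-const fn called in a const fn
    // const fn g() -> u32 { SEED }        // error: non-const global in a const fn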


Let us go back now to my claim that with ONLY memory allocation/deallocation allowed (which actually involves the full Rust syntax evaluation) the current procedural macros become unlimited.

The key point, as hinted, is "no life before main".

In light of the Static Initialization Order Fiasco and Static Destruction Order Fiasco known in C++, it was decided that Rust would support no life before main: no user code is executed before main starts or after it ends. Having suffered from both issues, I fully support this position and decision.

It is very interesting, however, to note the consequences of this decision on const items (which are created before main starts): their initialization should require no code to run before main, or equivalently const items can be stored in ROM or .text sections.

Thus, const items cannot contain, transitively, as far as I know:

  • a Process ID
  • a Thread ID
  • a File Handle
  • ...

Note: while the type could contain an Option<File>, it would have to be None in a const item.

Thus, a const item is fully expressible in a compiler that only supports memory allocation in compile-time evaluation.

And as a result, only supporting memory allocation is sufficient to express the initialization of 100% of the const items; it's unlimited!

Expressed differently:

  • any less than memory allocation restricts which items a user can store in const
  • any more than memory allocation does not lift any restriction on which items a user can store in const

Coupled with the fact that from my personal experience, memory allocation covers 99% of the use cases, I consider only supporting memory allocation to be the ultimate trade-off in terms of implementation cost/user empowerment.

Thanks to those who read everything there, please point out any issue/inconsistency/flaw in my arguments.


TL;DR: I’m NOT advocating a VM. Quite the contrary: I’m advocating a multi-level compilation strategy.


Let’s first discuss the general idea above:

The idea is that compile-time code execution is just regular run-time code execution done by the compiler by loading a macro/plugin and running it during compilation. LISP has a similar model, Java/C# have similar capabilities, and the best example is probably Nemerle, which is built around this concept. C++ is f**ked up in this regard because people are stuck in the binary division between compile-time and run-time instead of the generalized concept of a series of transformations. This is a key property of the design, and it is why C++ constexpr is stupid and wrong.

Now, this concept removes the duplication I’ve talked about. The compiler just needs to provide the hooks to make this usage ergonomic; everything else falls out naturally. There is no need to have a separate compilation model or special support for each language feature. Just compile the macro definition into a shared object which will be loaded and used by the compiler. For cross-compilation scenarios, only the final compilation stage should be a cross-compilation stage; all the previous ones need to be compiled for the host because they are run by the compiler itself.

This removes the overlap between macros and const fn, and removes any documentation effort to explain what code can be written at “compile-time”, because there is no longer such a thing.

I actually agree with you that allocation is very important. I’m just trying to convey that with the scheme above there is no need to limit ourselves to a compile-time only rust subset.

Manually supporting this scheme requires two main things:

  1. Restrict mixing macro defs and “runtime” code. This is already the coding convention in Rust because macros are put in a separate crate.
  2. Requires multiple invocations of the compiler, once per stage. First we compile the crate that defines the macro(s), and then we compile our code in the “runtime” crate with the macro crate loaded by the compiler. Rinse & repeat: a macro contains regular code, so it can obviously use macros compiled in a previous stage; thus the number of stages depends on the nesting of macros.

Supporting this automatically means adding infrastructure to the compiler so that it can spawn sub-instances to implement this staging, as I alluded to in previous posts. In other words, make the compiler automate this process for you, instead of embedding staging logic in a build script.

This of course requires improvements in the ergonomics of macros, and maybe also some minor additions to the language/std in order to integrate it. What you call a const fn is just a macro in disguise, so what I’m pushing for is just to be frank about it.


Some relevant discourse on IRC

Also, when discussing this, consider that having a constant propagator is desirable regardless, because one wants to (ab)use it to enable various non-trivial transformations during dataflow analysis that might not always be possible without the propagator.

I’d personally consider a propagator and a VM/interpreter/evaluator to be two distinct features. The propagator is purely an optimisation detail (which also happens to be a form of eager constant evaluation), while the VM/interpreter would be more of a full feature distinct from MIR (though it might be implemented on top of it) – it could perhaps even be implemented as a compiler plugin!


I totally agree with the above :slight_smile:

The const propagator is indeed a useful optimization pass and an internal implementation detail of the compiler.

The VM/interpreter/evaluator, however, is what I’m arguing against, as I see it as something that should already be covered by a more general macro feature; const fn simply provides functionality overlapping with a subset of the capabilities of macros.

For people who want memory allocation in constants, how would that actually work? Would they have a chunk of static memory in the executable allocated and by virtue of being immutable constants that never get dropped, there’s no worry about trying to free that “allocation”?

This is indeed exactly what I was thinking about.

A constant can already refer to other constants, so I was thinking that each pointer/reference would lead to another (internal/unnamed) symbol properly initialized.

And indeed since there is “no life before/after main” it would mean that neither constructors nor destructors are executed for those const items.
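A hedged sketch of that lowering (the symbol name is made up): the heap-shaped parts of the constant are spilled into unnamed statics, and the outer value just points at them.

    // Source (not expressible today):
    // const WORDS: Vec<&'static str> = vec!["hello", "world"];

    // Conceptual lowering: the Vec's buffer becomes an internal symbol...
    static __CONST_WORDS_BUF: [&'static str; 2] = ["hello", "world"];

    // ...and WORDS itself is emitted as the triple
    // (ptr = &__CONST_WORDS_BUF, len = 2, cap = 2), living in read-only
    // memory, never constructed at startup and never dropped.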

I don't have much experience with Java/C# and no experience with LISP/Nemerle so I may be missing something obvious... but I still don't understand how cross-compilation works.

I propose that we stop talking in the abstract, and instead use this simple argument:

const Known: Vec<libc::tm> = read_from_file(".");

fn read_from_file(file_name: Path) -> Vec<libc::tm>;

Now, we want to compile this on host x86_64 Linux and target x86 Windows. I can certainly compile read_from_file for the host, but:

  • I get a Vec representation suitable for 64 bits even though the target is 32 bits
  • libc::tm is platform specific: there are some common members, but each implementation is allowed to add platform specific members (for example to handle timezones)

So I do not see how you go about reconciling the 64-bits Linux representation with the 32-bits Windows representation. Scaling down 64-bits pointers/integers to 32-bits and eliminating data members that are Linux-specific is feasible; finding out which value the Windows-specific members should have, however...

Could you explain how your multi-stage idea would solve the issue here?

OK, let’s talk about your concrete example then. First we need to know what the file actually contains and what we are trying to move to a previous stage. I’d say that using a Vec<libc::tm> is bad design anyway because, as you said, it is platform-dependent.

So our stages here are:

  1. read the file and de-serialize its contents from some known format.
  2. create a const expression that contains that contents.

macro.rs (in pseudo-code):

macro read_from_file(file_name: Path) -> Expr {
    let lines: Vec<String> = read_file(file_name); // hypothetical helper
    let mut expr = ArrayExpression::new();
    for line in &lines {
        expr.add(line.to_expr());
    }
    expr
}

my_module.rs:

const Known = read_from_file!(".");

What I’m aiming for here is that the macro generates the initialization expression, so that the above will be transformed to something like:

const Known = [ libc::tm {year: 1999, month: 1, day: 1, ...}, ... ];

This is portable and works with cross-compilation. There are a few points of note here:

  1. What is the type of Known? I’d argue that a Vec is the wrong type since it allocates on the heap. The macro should generate a [T; n]

  2. The macro seems complex, right? Well, there are two parts to it: one part reuses regular “run-time” code (to read the file, for example), while the other part, which I wrote with some made-up API, generates the expression. In Nemerle there is built-in splicing syntax that is very ergonomic for this latter part:

    macro (...) { <[ print ("first"); ]> print ("second"); }

Everything inside <[ ]> is output tokens (the return type of the macro, an Expr in our example) and everything outside is executed regularly. Using such quasi-quoting syntax makes the above macro much shorter and much simpler to understand. 

  3. What if I want a different data structure, say a map? Rust would need something like the new C++ initialization syntax, which allows one to write:
 
    std::vector<int> v1{1, 2, 3};
    std::map<int, int> v2{{1, 2}, {2, 3}};

Or in Rust-ish:

let v: std::Vec<Foo> = [ {..}, ..]; // notice there is no vec! macro call here
let v: map<K, V> = {{k1, v1}, {k2, v2}, ..};
  4. The macro itself can execute any code; in our example it does I/O, memory allocation, etc., but there is a strict separation of concerns: it cannot allocate the memory to be used by the “run-time” code, which runs in a separate stage. Instead, it returns an expression so that the next stage has a static value it can manipulate. It generates new source code.

Most of the above exists in some form already in Rust. We just need to fill some holes to make it ergonomic and fun.

I am not sure how much we want to execute code that can access the host OS API during compilation - this kind of thing gives integration people nightmares. I would prefer for all host OS API access to be constrained to compiler plugins (and build-scripts) - if we assume that integrators can skip lints and that compiler plugins are rare, this makes the problem much easier.

OTOH, I am not opposed to memory allocation in constexprs - of course, this should call an intrinsic that eventually allocates memory in .data rather than calling jemalloc or whatever.