Thoughts on Compile-Time Function Evaluation and Type Systems


#1

Since the miri merge, there have been various discussions about which operations should be allowed during CTFE, which checks the compiler should do, how this all relates to promotion and which kinds of guarantees we should be able to expect around CTFE. This post is my take on those topics, and it should not be surprising that I am going to take a very type-system centric view. Expect something like a structured brain dump, so there are some unanswered questions towards the end as well.

https://www.ralfj.de/blog/2018/07/19/const.html


#2

Lucky enough, Rust already has an answer to the need to side-step the type system in controlled ways: unsafe blocks. … So, we should be able to write the following:

const fn ptr_eq<T>(x: &T, y: &T) -> bool {
   unsafe { x as *const _ == y as *const _ }
}

Does this mean something like this is allowed?

// NOT marked as const fn
#[inline]
fn is_ascii(a: char) -> bool { a < '\u{80}' }

const FALSE: bool = unsafe { is_ascii('😐') };
//                  ^~~~~~~~                ^

#3

Hm, good question. There is no inherent reason we should not allow this, I think. So I guess my answer is it could be allowed. My only worry is using this accidentally because unsafe opts into so many things at once.


#4

Agreed. Perhaps require unsafe obligations:

const FALSE: bool = unsafe(const::call_non_const_fn) {
    is_ascii('😉')
};
const fn ptr_eq<T>(x: &T, y: &T) -> bool {
   unsafe(const::compare_ptrs) { x as *const _ == y as *const _ }
}

#5

I don’t think we should reuse unsafe as-is for “things that may lead to CTFE errors”. The analogy with the usual type system, soundness, and safety concepts is in some ways illuminating. However, in other contexts we’ve tried very hard to keep the scope of unsafe as small as possible and not entangle UB with other concepts such as memory leaks, “easy to misuse” APIs, or operations that are most useful in unsafe code but don’t have UB themselves (e.g., many operations involving raw pointers). Besides that pedagogical value and “messaging”, this also means that you use unsafe less and thereby are unlikely to accidentially do something UB-ridden in an unsafe block that you created for a different reason (in this case, for example, to compare raw pointers). I think we should stick to unsafe meaning only runtime UB and use a different syntax (really bad strawman: const unsafe) for the CTFE related concept.


#6

Just in case people get confused about the “non-determinism is unsound in array lengths” statement: @RalfJung is oversimplifying - array lengths are just one way to inject values into the typesystem (soon, const generics will be another) - (“const-dependent types”, if you will).

The real trouble is when those expressions are universally quanitified, i.e. they depend on generic parameters, because then each instantiation with the same choice of generics must give the same result.
But if you cache everything and make it work cross-crate, you still break coherence, see this for more details: [MIR] constant evaluation

EDIT: please refer to https://www.reddit.com/r/rust/comments/907a6d/thoughts_on_compiletime_function_evaluation_and/e2pdqnt/ instead


#7

@rkruppe Yeah, I agree that this part of the design could definitely be improved. To add an even worse strawman: unconst? :wink:

@eddyb I really did not expect to have to talk much about the issues with non-determinism here, and wanted to focus on other aspects. I guess that’s what I get.^^ I added a link to your post, thanks.


[pre-RFC] lazy-static move to std
#8

@rkruppe So if we have const unsafe { .. } is that solely for the determinism parts that you have to prove yourself, and then unsafe { .. } inside const fn is for the usual memory safety stuff?

If so, then, ostensibly, you’d have to have const unsafe fn and unsafe const fn where the former is allowed to do non-deterministic stuff, and the former is allowed to do memory unsafe stuff.


#9

Yes, that is my proposal, hopefully with better syntax than const unsafe.

… and unsafe const unsafe or const unsafe unsafe for a function that does both. Yet another reason why const unsafe is an awful name.


#10

If we allow calling non-const fn in unconst/const unsafe context, we don’t even need a designation for "const fn that is unsafe to call in const context", do we?


#11

If we allow that, yes. I don’t think we would want to, though, and the line of reasoning in your post doesn’t automatically lead to it either. As with UB-freedom, correctly using the opt-out requires knowing the proof obligation for why opting out is OK in this particular case (either to prove locally or to push on your own callers). So if we wanted to allow users to assert a non-const fn can be used in a constant context, they would need a way to know why their particular usage won’t trigger CTFE errors.

For API stability, these obligations would have to be described by the author of the non-const fn, not by the programmer who wants to call them. When documenting that, the author can just as well add a compiler-understood marker for “there are some obligations that have to be fulfilled for this function to not trigger CTFE errors”. Therefore, API-boundary-respecting code doesn’t really lose any flexibility by requiring this extra annotation on the non-const functions that it wants to call in a const context.


#12

I see “call this foreign non-const fn in const context” as similar to “use transmute to access internal implementation details of a foreign data structure” – unsafe code can do it but they really shouldn’t and there certainly are not stability guarantees.


#13

Interesting analogy.

However, I consider the ability of unsafe code to break API barriers a necessary evil, not a feature. There’s not really any feasible way for the language to distinguish legitimate type punning from illegitimate type punning and prevent the latter, especially once you broaden the scope from transmute specifically to the myriad ways to break API barriers with other unsafe operations.

Here, on the other hand, we just have to not allow calling arbitrary functions and introduce an extra marker on function definitions to opt into allowing this. This is easier than preventing illegitimate type punning not just because there’s no backwards compatibility concerns yet, but also because the interface between client code and foreign code is much smaller here: you can either call a function or you can’t, there’s nothing comparably invasive and flexible as accessing a foreign data structure’s memory through raw pointers.


#14

I actually like unconst. At least, I think it’s a better strawman since unconst { unsafe { ... } } doesn’t introduce a silly "precedence of unsafe" issue that we certainly do not need to have in the final syntax.


More important question: do we need unconst at all? It’s obvious why we need unsafe in a systems programming language, but what are the compelling use cases for unconst? (i.e., why would I ever want to check that two references are equal in a const context?)


#15

I don’t disagree. We can, however, still take the function address as a fn(...) and then unsafely cast that to const fn(...) and then call that (which “looks and feels” more like the run-time equivalent I compared with above).

I guess an argument can be made though that just because something will always be possible we shouldn’t make it easy.


#16

Wouldn’t it be awesome to be able to use HashMap in const context? We can (miri already has everything implemented), but not without something like unconst.


#17

Okay, that does sound pretty compelling.

What sort of unconst code does HashMap use?


#18

It uses tons of raw pointer manipulation to manage memory (allocate one block instead of three for keys, hashes, values), and uses bit fiddling on pointers to store extra data in the aligned bits.

Notice that you will not be able to have pointers as keys in the HashMap. I don’t know how to support that deterministically, and AFAIK that’s actually still an open question in academia.


#19

I’m suspicious of calling non-const functions in const functions using unconst or whatever else, as in kennytm’s example.

You’re asserting that a non const fn is safe to call at compile time. Two things are possible:

  1. That function is local to your crate. In that case, you should be able to mark the function const.
  2. That function is not local to your crate. In that case, you are relying on behavior the upstream author has not guaranteed (by not marking their function const).

#20

I agree. However, we cannot really prevent people from casting function pointers around.

Well, I guess we could put a check into miri that rejects a function call at CTFE evaluation time if the function is not const. That would make calling a non-const function a CTFE error no matter how it happened. I would be fine with that.