About const expressions in types or why we don't need `const_evaluatable_checked`

This post is intended as a continuation of this and earlier discussions between me and @oli-obk. I will try to summarize points, so you don't have to read them.

Let's start with a simple piece of generic code:

fn foo<N: Unsigned>(val: Val) -> Bar<N> { .. }

Now let's take a look at its const counterpart:

fn foo<const N: usize>(val: Val) -> [u8; bar(N)] { .. }

We can see a very clear symmetry here. Every const value can be viewed as a type "bounded" by a usize "trait". Bar<N> and [u8; bar(N)] both represent a compile-time computation which returns a type. Both computations are Turing-complete and can result in types which are too big to be representable on a target architecture.
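To make the symmetry concrete, here is a toy sketch of both forms on stable Rust (Unsigned, U8x, Bar, and bar are all made-up stand-ins for the typenum-style machinery):

```rust
use core::marker::PhantomData;

// Toy type-level unsigned, standing in for typenum's `Unsigned`.
trait Unsigned { const VALUE: usize; }
struct U8x; // a made-up type-level constant "8"
impl Unsigned for U8x { const VALUE: usize = 8; }

// Type-level form: the "computation" is performed by the type system
// when it constructs `Bar<N>`.
struct Bar<N: Unsigned>(PhantomData<N>);

// Const counterpart: an ordinary compile-time function of a value.
const fn bar(n: usize) -> usize { n * 2 }

fn main() {
    let _typed: Bar<U8x> = Bar(PhantomData);
    assert_eq!(bar(U8x::VALUE), 16);
}
```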

For practical reasons Rust does not have mandatory bounds preventing undecidability or "too big" errors in type-level computations. The same applies to CTFE: in both cases the compiler will simply abort compilation in pathological cases.

But there is one big difference between type-level computations and CTFE: an ability to panic. Imagine bar defined like this:

const fn bar(n: usize) -> usize {
    n - 2 // illustrative body: panics on underflow when `n < 2`
}

We could imagine a rough type-level analog of such computation:

struct Bar<T: Cake>(T);

fn foo<N: Unsigned>(val: Val) -> Bar<N> where N: Cake { .. }

Here instead of the "runtime" check we use a declarative bound. Currently we don't have a way to specify that bar(N) will work for any N that satisfies cake(N), so we need to bubble up bar(N) using a where [u8; bar(N)]: Sized declaration.

Note that panics cover overflow and underflow errors, which are often used for constructing motivating examples. Right now we cannot use explicit panics in const fns, but AFAIK this restriction will be lifted eventually.

To summarize: panics are the only difference between type-level computations and const expressions in types. Thus they are the only motivation for having the const_evaluatable_checked feature. For example, n.wrapping_sub(2) and n/8 are completely valid definitions of bar for which the bubbled bound would be redundant (but note that n/8 usually implies that n is a multiple of 8, so reliable code should assert n % 8 == 0).
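For instance, with a concrete argument such a panic-free bar already works in an array length on stable Rust, with no bubbled bound required (a minimal sketch):

```rust
// Sketch: a const fn that can never panic, used as an array length.
// With a concrete argument no `where [u8; bar(N)]: Sized` bound is needed.
const fn bar(n: usize) -> usize {
    n.wrapping_sub(2) // wraps instead of panicking on underflow
}

fn main() {
    let buf = [0u8; bar(10)]; // evaluates to [u8; 8]
    assert_eq!(buf.len(), 8);
    // underflow wraps rather than aborting CTFE:
    assert_eq!(bar(1), usize::MAX);
}
```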

Now one may argue that const_evaluatable_checked is a good thing because it makes explicit that a const fn may fail and thus improves code reliability. But in my opinion it's not consistent with how const fns are handled in other places. For example:

// no need for a bound on `bar(N)`
trait Foo<const N: usize> {
    const FOO: usize = bar(N);
}

// the bound is mandatory
trait Foo<const N: usize>
    where [u8; bar(N)]: Sized
{
    fn foo() -> [u8; bar(N)];
}

Both versions will result in the same compilation error if we pass an "invalid" N causing bar to panic (there are reasons to track such bounds for diagnostics, but the compiler can do that implicitly). So to be consistent we should either require mandatory bounds in both versions or in neither. IIUC the former option is problematic due to the backward-compatibility guarantees.

While I fully support efforts to make code more reliable, in this case I think the const_evaluatable_checked approach is inconsistent with the rest of the language, and in practice it will cause far more friction than the improvement in code reliability justifies.

How can we improve this situation? As you can see, the main problem lies in panicking const fns, so the ideal solution would be to introduce a nopanic property (I am not sure it's correct to call it an "effect"), somewhat similar to the C++ noexcept specifier:

// OK: `wrapping_mul` can never panic
const nopanic fn bar(n: usize) -> usize { n.wrapping_mul(10) }
// compilation error: `10 * n` may overflow, so `bar` may panic
const nopanic fn bar(n: usize) -> usize { 10 * n }
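The distinction is observable with today's std arithmetic helpers: wrapping_mul is total, while 10 * n has an overflow case that the panic machinery must cover. A small demonstration:

```rust
fn main() {
    // `wrapping_mul` is total: it wraps on overflow instead of panicking
    assert_eq!(usize::MAX.wrapping_mul(10), usize::MAX - 9);
    // `10 * n` is not: `checked_mul` exposes the case where the plain
    // multiplication would panic in debug builds
    assert_eq!(usize::MAX.checked_mul(10), None);
    assert_eq!(4usize.checked_mul(10), Some(40));
}
```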

This property could be quite useful for non-const code as well, e.g. in high-reliability code. Currently we have to use quite restricted hacks like the no-panic crate or inspect LLVM IR to find places which emit panics (we had to do this while working on a cryptographic library). Of course, there are some difficulties to consider: e.g. some panics may only be eliminated in optimized builds, which in turn means that a nopanic function may become panicking in future compiler versions due to degraded optimization capabilities.

Also, ideally such a feature would be accompanied by the ability to restrict input arguments (strawman syntax):

// `cake` returns `bool` and must be `nopanic`
const nopanic fn bar(n: usize) where cake(n) { .. }
const nopanic fn baz(n: usize, m: usize) where n + m != 42 { .. }
// when used in types the bound would have to be bubbled
fn foo<const N: usize>() -> [u8; bar(N)] where cake(N) { .. }
// non-`nopanic` expressions in types can be either forbidden
// or covered by a special `nopanic` bound:
fn foo2<const N: usize>() -> [u8; f(N)] where f(N): nopanic { .. }
// outside of `const` context the bound will be converted to an assert,
// for example in runtime code:
baz(n, m);
// will be implicitly converted to:
assert!(n + m != 42);
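Spelled out by hand in today's Rust, the implicit conversion described above would look roughly like this (baz's body and the call_site helper are made up for illustration):

```rust
// Hypothetical desugaring of `fn baz(n, m) where n + m != 42`:
// the requirement moves to an assert at the call site.
fn baz(n: usize, m: usize) -> usize {
    // the body may rely on `n + m != 42` without re-checking it
    n + m
}

fn call_site(n: usize, m: usize) -> usize {
    assert!(n + m != 42); // the implicitly inserted requirement check
    baz(n, m)
}

fn main() {
    assert_eq!(call_site(1, 2), 3);
}
```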

The nopanic bound looks quite similar to the Sized bound, but it only covers the affected expression, not the whole type. Also I think it will be less misleading, since it clearly indicates the guaranteed property.

Of course, ideally const expressions outside of types should be treated the same way as expressions in types:

// will not compile without the bound
trait Foo<const N: usize> where cake(N) {
    const FOO: usize = bar(N);
}

// bound is not needed because we pass a static value
const N: usize = 42;
const FOO: usize = bar(N);

I find this part problematic, and somewhat controversial.

Contracts were scheduled for C++20, and one of the controversial aspects that led to the proposals being struck down was specifically that mixing compile-time and run-time validation in a language geared for performance is problematic.

If I am writing performance sensitive code, I do not want the compiler sneakily inserting an assert! behind my back.

A very important principle for readable/understandable code is Locality of Reasoning -- the ability to reason about the code in-situ, without having to refer to other code. In Rust, it's often referred to as being explicit.

There's no code more implicit and non-local than invisible code inserted by the compiler.


How exactly are such bounds less explicit than asserts inside a function body? Functionally they are completely equivalent; the only difference is that in const context we have to explicitly bubble those conditions if necessary. If anything, having such bounds in the function signature improves visibility and thus explicitness. Plus you can bubble those conditions (e.g. if you want to keep the nopanic property), thus avoiding the "implicit" asserts. I think most of your arguments also apply to the implicit panics which are inserted "behind your back" during indexing and similar operations.

The alternative is to use a bound on the whole const function/expression instead of a more specific condition, which I think is less transparent and requires looking into the code to understand failure conditions (e.g. instead of where is_even(N) we would have where do_smth(N): nopanic or where [u8; do_smth(N)]: Sized).

I find that the discussion is starting to turn into a workaround for try/except-style exceptions in other languages. Can we maybe model it in the type system directly, similar to how the Option or Result enums work?

So the nopanic model can simply prevent you from using anything that could panic in the function and then you can specify a possible "invalid" state in the return type instead.

const nopanic fn bar(n: usize) -> Option<usize> { ... }

// The compiler will give an error if bar(N) does not return Some(usize)
fn foo<const N: usize>(val: Val) -> [u8; bar(N)] { ... }

This moves the problem from a hard-to-define exception (panic) into an easily analyzable type for the compiler, and there's no implicit assertion with this method. Though I think using Option here is not the best option, since it makes it look like the compiler is implicitly converting bar(N) into usize. (Maybe a simpler, limited subset of pattern-matching syntax?)
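The runtime half of this model is already expressible; only the implicit unwrapping in [u8; bar(N)] would be new. A sketch with a made-up divisibility check:

```rust
// Sketch of the Option-returning model: the "invalid" state is an
// ordinary value, so failure is visible in the signature, not a panic.
const fn bar(n: usize) -> Option<usize> {
    if n % 8 == 0 { Some(n / 8) } else { None }
}

fn main() {
    assert_eq!(bar(64), Some(8));
    assert_eq!(bar(7), None); // would be a compile error as `[u8; bar(7)]`
}
```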


I think your approach is quite surprising, in a bad sense, from a user's point of view: it's really weird to get implicit unwrapping in [u8; bar(N)]. We don't have anything similar in the rest of the language. And note that nopanic could be a quite useful property even outside of this discussion: in some areas it would be very useful to have a compiler-verified guarantee that a function will not panic for any possible combination of inputs (another useful property is provable termination, i.e. such a function would have to be written in a total subset of Rust, but that's off-topic for this discussion). I would love to see it added eventually, even if it's not used for ensuring evaluation of const expressions in types.

Also, your proposal does not bring anything new to the table compared to panicking const fns without mandatory bounds (the approach I advocated before proposing nopanic). Today the compiler will also return a compilation error if CTFE results in a panic. See this comment and the further discussion for arguments why const expressions in types are considered special.

I have many gripes about const bounds like: exposing implementation details, terrible ergonomics, etc, but the lang team seems committed to a

With that being said, I'm not too sure what exactly this proposal is proposing. Are you saying that const bounds should be restricted to total functions, or are you wanting an implied-bounds approach? You present a problem with const bounds as they currently are, but I'm not sure how this addresses anything.

I agree that implicit asserts are no good, but where clauses that depend on runtime parameters would be a neat feature if they were explicit – i.e. if any asserts had to be added manually, and not doing so resulted in a compile error.

This would be a form of refinement types, which are a lighter-weight cousin of dependent types.

On the other hand, I think this is pretty far afield from whether or not const expressions should be statically checked.

In my opinion, const_evaluatable_checked is just the tip of the iceberg. Rust has a much broader problem with not being able to do type-level metaprogramming without encoding all the computation you ever want to do into trait bounds, resulting in a lack of generality and power, as well as very ugly signatures propagated up the stack.

Ironically, this has the exact same problem as checked exceptions in other languages. It is essentially an effect (with the wrong default), and now all the const code that does not panic (which, I suspect, is the majority of const code) has to be annotated with nopanic.

And then even if you provide the right default, there's still the question of higher-order functions, which now also need something to propagate nopanic effects instead of assuming or not assuming it by default, at which point it becomes exactly as painful as rethrows annotations, because they will be needed on almost every const function for usability reasons, but then they become redundant.

(Note that I'm explicitly not trying to say that I agree with the OP, I'm just very wary of this particular alternative design you proposed here.)


How exactly are such bounds less explicit than asserts inside a function body?

As I mentioned, the main issue is that they are out of the way, which breaks locality of reasoning.

I'd prefer, instead, a mechanism by which the user must guarantee that the value matches the bounds prior to invoking the function, and get a compiler error if they do not. This would leave the user with multiple choices:

  1. Introduce the bound on their own function, passing the buck up a level.
  2. Introduce control-flow to handle the case where the bound is not satisfied.

They could choose, as per (2), to simply put an assert!, but they would have the guarantee that if they do not, and accidentally forget to prove a bound, then they'll get a compile-time error.

A compile-time error, rather than silently inserting an assert!, would be a safeguard in terms of performance and logic, assisting users during refactorings and upgrades alike.

This would require refinement types to implement correctly, whereas just putting an assert doesn't (and the assert could be elided by LLVM, so that shouldn't be a problem in practice). Adding refinement types to Rust would be a large undertaking, similar to const generics.


I am not sure I understand why this:

fn foo(a: usize) -> usize
    where is_prime(a)
{ .. }

is out of the way compared to:

fn foo(a: usize) -> usize {
    assert!(is_prime(a));
    ..
}

I also would love for the Rust compiler to be powerful enough to allow (2), but I fear it would be too complex to implement, hence the implicit asserts.

Maybe it can change with editions? For example, before, say, the 2030 edition the bounds would result in implicit asserts, and after that, with prepared compiler infrastructure, users would have to provide explicit asserts if the necessary properties cannot be proven from available information. It should be relatively easy for rustfix to insert explicit asserts during edition migration.


I was not talking about the definition of foo, but the usage (call-site) of foo.

Circling back, my first concern is to not use the bounds syntax (where) for 2 things:

  1. Compile-time verification of requirements, as for trait bounds.
  2. Run-time verification of requirements, which you are proposing here, which I feel is better left to a require clause -- as per the design-by-contract terminology used in Prusti.

Whilst one can argue that it's a prerequisite regardless of whether it's checked at compile-time or at run-time, I feel that a statically typed language is better served by making it clear what is checked when, and therefore (potentially) run-time checks deserve their own syntax.

My second concern is centered around correctness/performance/safety.

From a correctness/safety point of view, panics have to be handled carefully. Due to their somewhat invisible nature, it is very easy to accidentally leave a value in a half-modified (inconsistent) state, leading to correctness issues. When these correctness issues are encountered in unsafe code, they can quickly escalate to safety issues.

From a performance point of view, checks (and panics), can easily take you off the happy path. We have countless examples of index-based access where a change in the surrounding code would suddenly lead the optimizer to not elide the check any longer, resulting in much worse performance for an otherwise functionally equivalent code.

This leads me to care for 2 things:

  1. Checks/Panics should always be visible, and unmistakable, in the code.
  2. Ideally, checks/panics should be immediate/local, that is present at the call site, rather than inside the callee, as it makes auditing code easier.

The idea of the compiler automatically injecting assert! from where bounds in the generated code violates (1):

  • Ideally, I'd prefer no compiler injection of code. Explicit > Implicit. The user can simply use a panic! (or assert!) within the definition of foo, at least it's visible, though not in-situ.
  • If injection is somehow mandated, I'd prefer a different source notation than a where clause, such as a require clause. It would cleanly separate the only compile-time from the potentially run-time.

However, I'd much prefer a method where the user has to:

  • Either propagate the bound.
  • Or insert the run-time check themselves.

This means that as code changes, the user is made aware of missing checks, and can reevaluate whether correctness, safety, and performance goals are still met.

Would it?

It seems to me that we could aim for a simpler (literal) implementation:

  • If fn foo(n: usize) has a require n > N clause,
  • And it is called with i as its argument.
  • Then there should be a literal assert!(i > N); in the calling function, prior to the call.

Rust already uses control-flow analysis for initialization tracking and borrow checking, so it does not seem too complex to me if we keep the checks literal.

And we could even look forward to const_assert!(i > N);, which would require the compiler to prove at compile-time that the assertion is met, or raise a compilation error.
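A building block in this direction is expressible on stable Rust today via the underflow trick used by the static_assertions crate; a minimal sketch of a const_assert!-style macro:

```rust
// Compile-time assertion: a false condition makes the array length
// underflow during constant evaluation, producing a compile error.
macro_rules! const_assert {
    ($cond:expr) => {
        const _: [(); 0 - !($cond) as usize] = [(); 0];
    };
}

const N: usize = 10;
const_assert!(N > 5); // compiles; `const_assert!(N > 50)` would not

fn main() {
    assert!(N > 5); // the same condition, checked at run time for contrast
}
```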


Just slapping on an assert doesn't require refinement types, but lifting those asserts to compiler errors does because then it has to interact with the rest of the trait system and provide proofs, which is non-trivial. (But this is what I already said before).

edit: checking whether user-defined predicates actually satisfy where/require clauses is non-trivial, and this is what requires refinement types.

I do not insist on specifically using where clauses; as mentioned in the post, it's nothing more than strawman syntax. Introducing a separate clause may be a good idea either way, so as not to break the Type: Trait simplicity of where clauses.

Every function called in your code can violate this principle: you can't tell whether a called function will panic without at least reading its documentation, or ideally closely inspecting its code. Think of those asserts as being inserted into the function body, not at the call site.

One possible middle-ground solution could be to introduce a special language-backed intrinsic, macro or maybe even a keyword which would check function requirements, something like assert_req!(foo(n)).

I just want to remind everyone that const-evaluatable-checked bounds are neither something the lang team has signed off on nor even discussed in a meeting yet (this is literally just the const generics working group), nor are they meant to be the final word in how to do const bounds.

The whole point of const-evaluatable-checked is to at least be on par with the typenum crate, which means we do not have to think about new concepts in the language itself, but can just reuse existing concepts. This is forward compatible with any proposal that has better or more automatic bounds or whatever someone thinks up in the future.

Now, back to the topic.

The inconsistency between const generics and associated constants is known, and we've been trying to figure out how to get associated constants cleaned up forever. It is almost impossible to generate good diagnostics for post-monomorphization errors, and once we have a system preventing them, clippy (and later rustc) can start suggesting adding appropriate bounds. In my personal opinion, the inconsistency should not mean we throw away an approach that makes new code more reliable, but that we should try to fix the inconsistency.

I am absolutely on board with a better and compiler-builtin scheme for preventing a function from transitively or directly performing a panic. This is something definitely desired by many users (safety critical systems, unsafe code authors, ...), not just for const eval purposes. I just don't think it should be a prerequisite for const-evaluatable-checked, as the two features are entirely orthogonal, and const-evaluatable-checked gives us typenum parity with much nicer syntax (const-bound syntax is definitely something we should bikeshed, the current array bound hackery is ... exactly that, a hack so we didn't have to come up with a syntax :smiley:)


Thank you for the clarification!

My concern is that by allowing panicking const expressions we will make it really hard to introduce the nopanic restriction later. Thus the end state will be either the inconsistency or a system which allows post-monomorphization errors in both types and constants. But as stated in the min_const_generics stabilization PR, unfortunately the genie is already out of the bottle.

Do you think it will be possible to restrict const generics later without breaking the backward compatibility guarantees?

No, which is why I really want to keep it basic now and relax it only in small steps. We really screwed up going for the full fancy and "convenient" feature with promotion (same post-monomorphization problems), I really would not like to see it happen here, too.

I'm not sure I understand? Do you mean because there would be less pressure on introducing the feature? Like without const generics there was too little desire for it so we never introduced it? Or do you see a fundamental problem that makes it harder to introduce nopanic later?

I mean that I am not the only one who felt that bounds on types are inconsistent. And there is strong pressure to finalize const generics as soon as possible, which does not leave much space for development of the nopanic and require-clause features. So the template-like solution will be supported by both consistency and convenience arguments.

Regarding possible ways of salvaging this situation: AFAIK currently the only way to trigger a panic in a const fn is using arithmetic operations. Let's say that nopanic will be introduced in a future edition and other panic sources will not be allowed in const fns until then. Would it be possible to mark all const fns defined in crates which use earlier editions as nopanic? It would mean that arithmetic operations in earlier editions would be wrapping and overflow errors would be "legal" (i.e. we can say that they will be compiled using the release profile), while in later editions they would have to be covered by appropriate bounds. In theory this should allow introduction of the mandatory nopanic restriction without breaking the backward-compatibility promise. Could something like this work in your opinion?


You can also panic by using array indexing, and you can fail CTFE by doing something that isn't supported such as ptrtoint casts.
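The indexing case is easy to reproduce (a small sketch on stable Rust; first is a made-up helper):

```rust
// Indexing is allowed in const fns; an out-of-bounds index panics
// during CTFE, which aborts compilation.
const fn first(bytes: &[u8]) -> u8 {
    bytes[0]
}

const HEAD: u8 = first(&[42, 1, 2]);

fn main() {
    assert_eq!(HEAD, 42);
    // `const BAD: u8 = first(&[]);` would fail to compile
}
```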


I just want to remind everyone that const-evaluatabe-checked bounds are neither something the lang team has signed off on or even discussed in a meeting yet

Ah, my bad. It sadly doesn't help that you are on the lang team, though, and we always make those subconscious conclusions. I do think it's a legitimate gripe that wherever these bounds are discussed, it always seems to get shot down :frowning: