Idea: make assert!() a keyword - REJECTED

This is proposing that code may not compile with later versions of the Rust compiler. That seems opposed to the core principle of forward compatibility. Even when declaring violations as unsoundness, and thus technically permitting the compiler to refuse to compile them, it seems rather odd.

With regard to runtime checks, I don't see the difference from a standard assert. Equivalence and impossibility proofs are already performed locally as part of most optimizations done by LLVM, and adding the (hopefully pure) checks on every function can be done as a user extension.

I don't think I understand what you are looking for.

Why not just use assert! then? There is also bool::then to turn a bool into an Option.

What I am missing currently is a type that automatically checks its values against some constraint or predicate function, so that I can have guarantees for the data of that type which are enforced for me. Currently, we have to manually enforce invariants for the data of a type, by carefully constructing it and managing write access. This is just like using a u32 to store boolean values. So basically what I am looking for is an extension of the type system that allows me to constrain types through predicate functions and have the compiler verify these properties. However, as @Tom-Phinney mentioned, this is probably more complex than I can currently comprehend.

In your example you are showing how Decoded is constructed. Depending on what is stored in Decoded, you may or may not be able to use the current type system to guarantee correct values. For example, if you were decoding just arbitrary numbers, you could use u64 and store numbers as large as you could fit. But there is currently no type that can hold only the values 40 to 50, or a type that can only be an even integer, or only solutions of 0 = 2x - y, or a string starting with "aaa".

The difficult part is validating these invariants at compile time. How can you know whether a value constrained to the range 40..=50 is 49 right now, so that you can increment it, or whether it is already 50 and incrementing would break the invariant? You have to check this at runtime, each time you increment the value.

But I would use a type like this too, one that can return an error whenever I change its value. Something similar probably already exists; for example, there is the RangeInclusive type, even with syntax sugar: 40..=50.
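
To make that concrete, here is a minimal sketch of such a hand-rolled type, assuming a newtype with a checked constructor and a fallible increment (Bounded40To50 and its methods are hypothetical names invented for this sketch, not an existing API):

struct Bounded40To50(u32);

impl Bounded40To50 {
    /// Construct only if the invariant 40 <= value <= 50 holds.
    fn new(value: u32) -> Option<Self> {
        (40..=50).contains(&value).then(|| Self(value))
    }

    /// Fallible increment: fails if the result would leave the range.
    fn increment(&mut self) -> Result<(), &'static str> {
        if self.0 >= 50 {
            return Err("increment would violate the 40..=50 invariant");
        }
        self.0 += 1;
        Ok(())
    }

    fn get(&self) -> u32 {
        self.0
    }
}

Every mutation goes through a method that re-checks the invariant at runtime, which is exactly the manual bookkeeping the proposal wants the compiler to take over.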

I think my fundamental problem is that the border between types and values is very thin and washed out. In a way, they are the same, because they are both stored as data. You could store a constant value of type Range:

use std::ops::Range;
const RANGE_40_TO_50: Range<u32> = 40..51;

or you could build a type

struct Range40To50(u32);
impl Range40To50 {
    const START: u32 = 40;
    const END: u32 = 51;
}

It seems like types are actually just constant values with a lot of syntax sugar. Instead of defining the byte offset for each field (a constant), the compiler infers it for you when you write the struct definition. I would really like to extend that idea and make metaprogramming more expressive. But maybe that is a project for a new language altogether, I don't know.

Yes, you're right, I am proposing that code that compiles now may fail to compile in the future. When the programmer uses assert_constraint!() or try_constraint!(), they are telling the compiler that something is true. Whether we catch a false assertion at compile time or at runtime doesn't change the fact that it is false. If an assertion with a false statement compiles today, it's not because it's correct; it's because the compiler has limitations that prevent it from detecting and reporting that the assertion is false. As those limitations are removed, the compiler should report false statements as hard errors. That is what makes Rust a sound language: the fact that in safe Rust you can't express certain unsound statements without the compiler refusing to compile the code.

Put another way, it is more ergonomic to catch the error at compile time than it is to ship it and have to figure out what's going wrong on your customer's machine at 2 AM, all the while people are screaming at you that it needs to be fixed right now! :wink:

There's no strong argument for assert_constraint!() over assert!(), but there are several weak arguments for it:

  • As mentioned above in my reply to @HeroicKatora, code that compiles today may not compile in the future. The documentation for assert!() states that it is a runtime check, and is silent about compile time checks. AFAIK, there is no documentation anywhere that says assert!() may break your build in the future, and changing that may break implied forward compatibility guarantees. assert_constraint!() doesn't yet exist, but if it did we could document the fact that as rustc and other tools improve, code that used to compile may stop compiling if the assertion can be proven false at compile time.
  • It has a nice symmetry with try_constraint!().
  • Tools that are completely separate from rustc can parse Rust code looking for assert_constraint!() and try_constraint!(), treating them as full keywords. So if you want to put together an SMT solver for Rust, it may become easier.
  • I have no idea what the complete interface for assert_constraint!() and try_constraint!() should look like. Should they look like the current assert!() family of macros, or should there be additional parameters? Are their uses precisely equivalent to assert!()?

I'm in 100% agreement with all that you said here and following. What I think would happen is that a new layer would be added to the compiler that converts these constraints into new types that enforce them. The new types would be substituted into the right places, generating newly expanded code (HIR? MIR? Something else?) that gets passed to the rest of the compiler. The type system would then handle the verification.

The tricky part is, of course, how do you do this? My thought is that we can figure that out over time. Once the user-facing constraint mechanism is in place, we can build on it in the future.

I started to write a post analogizing these type+constraints to Ada subtypes, which are base types with added constraints. They differ from newtypes in that the subtype has precisely the same set of traits and methods as the base type.

However, as I pursued this line of thought I ran into two problems. The first, and more difficult, is that it seems impossible to infer where during expression evaluation the constraints should be checked. The only reasonable solution is that the constraints should be checked:

  1. when values are being stored into an object of the subtype, and
  2. on method return when the method's output is declared to be of the subtype.

The latter implies that constraints are not applied to outputs of methods shared with the base type, but only to new methods declared on the subtype.


The second problem is that this interpretation requires all method signatures of the base type to remain unchanged, so it does not permit operations on the subtype to return a Result unless operations on the base type returned the same Result. That implies to me that any detected constraint violation within a method of the base type necessarily induces a run-time panic.

My conclusion is that a derived type that returns a Result needs to be a newtype rather than a subtype, with method signatures adjusted everywhere the newtype is stored or returned. Similarly, calling code needs to be modified to check or propagate the result, often by adding a postfix ? try operator. This seems like something that can be explored with an attribute proc macro.
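
As a hedged illustration of that conclusion, here is a minimal newtype whose fallible operations surface as Results that callers propagate with ? (Percent, ConstraintError, and bump are hypothetical names invented for this sketch):

#[derive(Debug)]
struct ConstraintError;

/// Newtype with the invariant 0 <= value <= 100.
struct Percent(u8);

impl Percent {
    fn new(v: u8) -> Result<Self, ConstraintError> {
        if v <= 100 { Ok(Self(v)) } else { Err(ConstraintError) }
    }

    /// Unlike the base type's `+`, this returns a Result, so the
    /// signature differs from u8's and callers must handle failure.
    fn add(self, delta: u8) -> Result<Self, ConstraintError> {
        Self::new(self.0.checked_add(delta).ok_or(ConstraintError)?)
    }
}

fn bump(p: Percent) -> Result<Percent, ConstraintError> {
    // Each step that could violate the invariant is checked and
    // propagated with `?` instead of panicking.
    p.add(5)?.add(10)
}

This is exactly the signature change the paragraph describes: the newtype's methods return Result, so every call site grows a ? or explicit handling.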

I don't know whether the above considerations help any of those contemplating this issue, but I hope they're of benefit to someone. The analysis clarified for me that the try variant needs to be handled differently than the panic variant.

I just realized that there is an additional requirement on assert_constraint!() and try_constraint!(): there are no guarantees as to where or when the tests are performed, or whether this changes with different versions of the compiler. That is, version A.B.0 of the compiler might do the same thing as assert!(), version A.B.1 might validate everything at compile time, A.B.2 might realize that A.B.1 had a bug and only some of the constraints could be evaluated at compile time, A.B.3 might discover that it is faster/more power-efficient/better to do the entire check at runtime, and A.B.4 might figure out that for some backends it's better to use different techniques, etc. None of these changes are considered to be breaking changes, even at the patch level.

In addition, the set of values that are validated at compile time and the set that are validated at runtime might have a non-empty intersection. This is entirely up to the compiler to decide, and can change at any time for any reason (e.g., speed, power efficiency, etc.).

Finally, the one breaking change that I can think of is reserving assert_constraint!() and try_constraint!(). The main advantage of turning them into keywords is that tools other than rustc will be guaranteed that user code didn't create its own macros with the same names but different semantics, and this can be validated while parsing.

Anyone interested in this might also want to look at the calculated typestate system that was part of the Rust language in version 0.3 and earlier. It was a system of run-time assertions, combined with static analysis to track which values were known to pass which assertions at all points in the program's control flow, and an optimization pass to remove redundant run-time checks.

This Stack Overflow thread has some more info, including some of the reasons that typestate was removed from Rust.

Other discussion aside, were this assert keyword to exist, I think it would be less confusing and more Rust-like if it only worked statically, and otherwise told you to use assert!().

The problem there is that it constrains compiler improvements. If we permit it to be evaluated at either compile time or run time, then as analysis improves we can shift what used to be run-time analysis to compile-time analysis.

Thank you for the link. Looking at the stackoverflow thread, I noticed the following:

This means that predicates for a type are useless in themselves; the utility comes from annotating functions. Therefore, introducing a new predicate in an existing codebase is a bore, as the existing functions need to be reviewed and tweaked to explain whether or not they need/preserve the invariant.

I think what we're trying to invent here solves this issue somewhat. The programmer will be able to annotate a block of code to the degree that they feel like annotating it; if they don't feel like annotating a chunk of code, it just means that the compiler isn't able to verify as much, and therefore can't optimize as hard as it might otherwise be able to.
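
As a purely illustrative sketch of that partial annotation, using the hypothetical assert_constraint!() under discussion (plain assert! stands in for it so the snippet compiles today):

fn midpoint(lo: u32, hi: u32) -> u32 {
    // Hypothetically: assert_constraint!(lo <= hi);
    // The compiler may prove this at compile time or check it at runtime.
    assert!(lo <= hi);

    // With the constraint established, the subtraction below can never
    // underflow, so an optimizer is free to drop the overflow check.
    lo + (hi - lo) / 2
}

An unannotated function would still compile; the compiler would simply know less about it and optimize less aggressively.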

And this may lead to duplicating functions at an exponential rate when new predicates pop up: predicates are not, unfortunately, composable. The very design issue they were meant to address (proliferation of types, and thus functions) does not seem to be addressed.

I think what we're developing still has this issue, even if the compiler will be the one burdened with generating the types. In the worst case, every possible bit pattern would need its own type. I don't think that there is an easy way around this.

That said, I also think that this is an unlikely occurrence in most cases. Unfortunately, I think that the only way to evaluate how often this becomes a problem is by implementing the system, and then testing it out.

Sorry for coming late to the discussion. Making assert! a keyword would break the ability to replace the assert* family with custom implementations like pretty_assertions. So keeping it as a macro also has benefits.

Well, that appears to be another good reason to define assert_constraint!() instead of redefining assert!()!

OK, I think we're at the point where we can start discussing the signature for these new macros/keywords. Here is my proposed API:

assert_constraint!() has an API that is identical to assert!(). This will have the least friction from an end-user's point of view, which will be good for those who never read the instructions.

try_constraint!() is slightly different. The first argument is the constraint, which is identical to whatever assert_constraint!() is able to accept. The second argument is the Result<(), Box<dyn std::error::Error>> to return if the constraint fails. Like assert!() and assert_constraint!(), the final argument is optional, and is the message string that assert!() or assert_constraint!() are able to accept. I chose this ordering as it is the closest to the ordering of Rust functions that I could think of (foo(constraint) -> Result).

Alternatively, we could require the return signature to be Result<(), ConstraintError>, which has the advantage that the arguments to try_constraint!() won't need a Result type specified, and will therefore make it easy to replace all uses of assert_constraint!() with try_constraint!(), and vice versa. As a note, see @mankinskin's comments here.
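
For concreteness, here is a minimal macro_rules! stand-in for that second alternative, where the expansion always evaluates to Result<(), ConstraintError> (both the macro body and ConstraintError are hypothetical sketches, not a proposed implementation):

#[derive(Debug)]
pub struct ConstraintError(pub String);

macro_rules! try_constraint {
    ($cond:expr) => {
        if $cond {
            Ok::<(), ConstraintError>(())
        } else {
            Err(ConstraintError(format!(
                "constraint failed: {}",
                stringify!($cond)
            )))
        }
    };
}

fn set_percent(v: u8) -> Result<(), ConstraintError> {
    try_constraint!(v <= 100)?;
    // ... store the value ...
    Ok(())
}

Because the error type is fixed, callers never have to annotate the Result, which is what makes mechanical swaps between assert_constraint!() and try_constraint!() feasible.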

@mankinskin, do you have a preference for a particular API? I prefer the first API as it gives the user slightly more control over what is returned, which can be valuable, but I'm open to anything anyone wants.

  • As discussed above, there are no guarantees as to where the constraint is evaluated; it could be at compile time, or at run time, and this can be changed arbitrarily between compiler releases (even between individual commits to the compiler source, let alone an actual release).
  • Both assert_constraint!() and try_constraint!() will be reserved as if they were keywords (this could be done in the 2021 edition, if everyone so chose). This gives the compiler, and any other tooling, significantly more power, as they don't have to worry about macros that 'look like' these macros.

If these made it into the 2021 edition, my expectation is that they would just be macros. assert_constraint!() would actually expand to assert!() internally. try_constraint!() would require a little more effort because we can't just wrap an assert in std::panic::catch_unwind() and expect things to work right (the user may have chosen to abort on panic, preventing the catch from occurring). That said, I don't expect either of these to be a significant amount of work code-wise.
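
As a rough sketch of that initial form, assuming a plain forwarding macro (not the eventual compiler-reserved version):

macro_rules! assert_constraint {
    // Forward everything to assert!; later compiler versions could
    // intercept this name and prove the condition at compile time.
    ($($arg:tt)*) => {
        assert!($($arg)*)
    };
}

fn main() {
    assert_constraint!(1 + 1 == 2, "arithmetic is broken");
}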

The real effort is going to be in writing up good documentation: making it very clear what these macros are for, and making it very clear that these are reserved names that just look like macros (that is, if you write your own version of the macro, the compiler will reject it in the same way it would reject a redefined keyword).

So, thoughts, comments? Is it time to start proposing an RFC?

Honestly, I can't really understand what the benefit of macros like this would be over the existing assert! macros. What I would find useful are arbitrary predicates applied to a type, with the checks for each object being implemented by the compiler, or checked at compile time if possible. But this would require very careful design if you want better ergonomics than just having a method guarding access to the object.

However, I don't really see what you are going for with the assert macros. Do you have an example of what they solve?

Why not make assert work with/like warnings? You can write deny(some_warning) and turn warnings into compile-time errors. This is opt-in, and by doing that you opt out of the guarantee that code will also compile in future Rust versions. Rust could emit warnings for assertions that have been proven wrong, and you could opt in to turn those into compile-time errors, giving up certain compatibility guarantees. However, trying to prove assertions is an optimization issue, i.e. it might only emit warnings/errors with more aggressive optimization options.

Also, I'm not sure why it has to be specific to the assert macro. All it does is panic, so why not just emit warnings for all proven-to-be-called panics, which can then be denied/forbidden?
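
For reference, this is the existing opt-in mechanism the post refers to, shown with a real warn-by-default lint promoted to a hard error (the particular lint here is just an example):

// Crate-level attribute: the unused_must_use warning now fails the build.
#![deny(unused_must_use)]

fn main() {
    // Warned by default; a hard error under the deny above.
    "hello".to_uppercase();
}

The idea is that a proven-false assertion would get its own lint in this family: warn by default, deniable on opt-in.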

Doing deny(some_warning) to error on failing assertions would be bad IMO, as changes to rustc may cause it to detect more or fewer cases of guaranteed-failing assertions, thus breaking the build.

That's the whole point, tho? You can break your build with any deny/forbid by changing the Rust version. This would simply be a new warning that can be turned into a compile-time error, if needed.

There are two benefits to the design I'm thinking about:

  • As @dns2utf8 explained, there are already uses of the current assert!() family of macros that depend on them remaining as true macros. Changing their behavior would be a breaking change in Rust, and so should be avoided if at all possible. That means that if we want to change the signature, we can't do it.
  • The proposal is a macro-like object that is initially a true macro, with no promises about when it changes from a macro to a keyword or back. The difference is that while the assert!() family is provided by libcore, it is still theoretically possible to write your own macro with the same name. As a result, the compiler may be more limited in what it is able to do. By reserving assert_constraint!() and try_constraint!() as if they were new keywords, the compiler and surrounding tools have strong guarantees about what the compiler is looking at, from the parsing stage forwards, which may give the compiler more freedom to optimize.

I see the idea behind this, but I suspect that you're still seeing this as a macro only. I expect that the compiler could be extended in the future to generate dependent types that can be further optimized, using these constraints to guide it. If you are able to turn off the constraints at will, then that can make things more difficult.

You're right. A way forward would be to have another stage in the compiler that expands attributes to assert_constraint!() statements at every access of the variable. So the pipeline would be: parse -> expand attributes (as assert_constraint!() statements) at all accesses -> evaluate as much as possible while compiling & report errors -> emit code where required. But this pipeline has an inherent assumption: assert_constraint!() is not rewritten. If it were, the pipeline could be broken. Reserving assert_constraint!() means that won't happen.
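
A before/after sketch of that expansion stage, with everything hypothetical (no such attribute exists today, and plain assert! stands in for assert_constraint! so the expanded form compiles):

// What the programmer might write (hypothetical attribute syntax):
//
//     #[constraint(x in 40..=50)]
//     fn f(mut x: u32) -> u32 { x += 1; x }
//
// What the expansion stage would produce:
fn f_expanded(mut x: u32) -> u32 {
    assert!((40..=50).contains(&x)); // injected at entry
    x += 1;
    assert!((40..=50).contains(&x)); // injected after every write
    x
}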

LLVM can already use assert! to guide optimization. For example:

pub fn test_1(v: Vec<u8>) {
    v[0];
    v[1];
    v[2];
}

pub fn test_2(v: Vec<u8>) {
    assert!(v.len() > 3);

    v[0];
    v[1];
    v[2];
}

The first function will bounds-check v three times to see if the index operations are in bounds, but the second elides the bounds checks entirely because of the assert. godbolt

I am not sure if using assert_constraint macros is the right approach to implement something like dependent types. It may result in types that trigger assertions way too often. Ideal would be a type-safe interface that can only be used in ways that don't break the type's definition at runtime.

I think a lot of this will be solved by const generics, so maybe we should focus on getting involved there?
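
For context, here is a sketch of the const generics approach, assuming integer const parameters: the bounds become part of the type, though the value checks themselves still happen at runtime (the Bounded name is hypothetical):

/// A u32 whose inclusive bounds are part of the type.
struct Bounded<const MIN: u32, const MAX: u32>(u32);

impl<const MIN: u32, const MAX: u32> Bounded<MIN, MAX> {
    fn new(v: u32) -> Option<Self> {
        (MIN..=MAX).contains(&v).then(|| Self(v))
    }
}

// Distinct bounds produce distinct, non-interchangeable types.
type Range40To50 = Bounded<40, 50>;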

I have two concerns with that proposal. First, its scope is limited; it can only be used for values that are known at compile time.

Second is friction. At this point, I never specify lifetimes for variables because Rust is so good at figuring them out. That has greatly improved my life as a programmer, as I'm not fighting the borrow checker all the time. Const generics look like something that will introduce friction once again, as end users are suddenly faced with (possibly cryptic) errors from the compiler regarding values that aren't constant at compile time. This will be especially problematic when the user 'knows' that the values are all constant (but for some reason they aren't, and the user doesn't understand why).

This proposal allows the end user to just write what they mean, and let the compiler figure out the best place to evaluate it. And as things improve within the compiler, more and more of that work can be moved into compile-time evaluation rather than run-time evaluation.

Alternatively, since the time of evaluation is deliberately unspecified, we could have debug builds that just emit assert code, which results in very fast builds, and release modes that attempt to evaluate the constraints. We could even add level-of-effort switches so that programmers can choose how much time is spent evaluating the constraints before giving up and just emitting code to evaluate them at run time. This could be useful for overnight/weekend builds where the server is given many hours to compile, and can therefore explore a very deep graph of constraints to prove (or disprove) what is possible, and really optimize the code.
