Unsafe Blocks / Async Blocks : should they be parsed differently?

The way I think about unsafe and async blocks is that they are syntactically the same and semantically similar. Decorating a block with unsafe lets you perform operations that are otherwise unavailable, such as dereferencing raw pointers and calling unsafe functions. Decorating a block with async lets you perform a different operation, namely await. This system is (in theory) extensible: the language can later introduce other kinds of blocks, like the try blocks that already have an RFC, and they would also be syntactically the same and semantically similar. Unsafe blocks signify computation in an unsafe context, async blocks signify computation in an asynchronous context, and try blocks would signify computation in a fallible context. Hell, maybe someday we can have user-defined custom blocks, like F# has with computation expressions. (Not that I advocate for such a thing, and yes, it is all monads under the hood in F#, and I do not advocate for their inclusion in Rust either. The only thing I advocate is that there are good reasons to treat these blocks similarly.)

When I was browsing the Reference and came upon the production rules for expressions, I saw that unsafe blocks are produced via the non-terminal ExpressionWithBlock, and async blocks via ExpressionWithoutBlock. I suspected that this might be a bug, so I opened an issue in the Reference's repo. @ehuss helpfully pointed out that initially there was no such discrepancy, but it was later discovered that rustc parses the two differently. The evidence was that an async block used as an expression statement requires a trailing semicolon, whereas for an ExpressionWithBlock that semicolon is optional. So the Reference was edited to mirror rustc's behavior.
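To make the semicolon difference concrete, here is a minimal sketch of what I understand the current behavior to be. The function name and the Cell used to observe effects are my own, purely for illustration.

```rust
use std::cell::Cell;

// Hypothetical illustration function (not from the Reference).
// The lint is silenced because the async block below is deliberately discarded.
#[allow(unused_must_use)]
fn statement_forms() -> i32 {
    let x = 1;
    let p = &x as *const i32;
    let seen = Cell::new(0);

    // ExpressionWithBlock used as a statement: when the block's type is `()`,
    // no trailing `;` is needed before the next statement.
    unsafe { seen.set(*p) }
    if seen.get() == 1 { seen.set(2) }

    // Async block used as a statement: the trailing `;` is required.
    // Dropping it gives a parse error (expected `;`), not a type error.
    async { seen.set(3) };

    seen.get()
}
```

Calling statement_forms() gives 2: the unsafe and if blocks run in place, while the async block only builds a future that is dropped without ever being polled.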

I have read the RFC that introduced async/await expressions and async blocks, and I did not find anything suggesting that the original intention was to parse these blocks differently. I would like to know what the general consensus on the issue is. Was it even openly discussed and decided one way or the other? Is it a bug or is it a feature? What does the keyword generics initiative think about the existing discrepancy or future generalizations?

I think it makes sense that async blocks are parsed like closures, given that the two are very similar: both always return a value representing the block. At the same time, I think it makes sense that unsafe blocks are parsed the same way as control flow blocks: both return the value produced by the block's contents rather than a value representing the block itself, and in many cases that value is the unit type.
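A rough sketch of that distinction, in case it helps. The helper is_future_of_i32 and the function block_values are names I made up for illustration.

```rust
use std::future::Future;

// Hypothetical helper: this only type-checks if `F` is a future yielding i32.
fn is_future_of_i32<F: Future<Output = i32>>(_fut: &F) -> bool {
    true
}

fn block_values() -> (i32, i32, bool) {
    let x = 4;
    let p = &x as *const i32;

    // An unsafe block evaluates to its tail expression, like `if` or `loop`.
    let from_unsafe: i32 = unsafe { *p };

    // A closure expression evaluates to a value representing the block;
    // the body runs only when the closure is called.
    let closure = || 4;
    let from_closure: i32 = closure();

    // An async block likewise evaluates to an anonymous future;
    // the body would run only when that future is polled.
    let fut = async { 4 };
    let is_future = is_future_of_i32(&fut);

    (from_unsafe, from_closure, is_future)
}
```

Here block_values() returns (4, 4, true). The unsafe block hands back the i32 directly, whereas the closure and the async block both have to be driven by something else before a 4 comes out.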

Lang discussed this recently in relation to the pull request "Stabilise inline_const" by nbdd0121 (rust-lang/rust #104087), to decide which one const { ... } blocks should follow.

Importantly, async { ... } is very different semantically from unsafe { ... } (and const { ... }) in that it's a thunk. It's never useful to have an async { foo() }; statement, as that does nothing. (Similarly, || { foo() }; is also useless.) Whereas unsafe { foo(); } as a whole statement is useful.
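A small sketch of the thunk point, using a counter to observe which bodies actually run. The names bump, bump_unsafe and run_statements are hypothetical, chosen just for this example.

```rust
use std::cell::Cell;

// Hypothetical side-effecting function standing in for `foo()`.
fn bump(counter: &Cell<u32>) {
    counter.set(counter.get() + 1);
}

// Same effect, but marked unsafe so that calling it needs an unsafe block.
unsafe fn bump_unsafe(counter: &Cell<u32>) {
    bump(counter)
}

// The lint is silenced because the future and closure are deliberately discarded.
#[allow(unused_must_use)]
fn run_statements() -> u32 {
    let counter = Cell::new(0);

    // Runs immediately: the unsafe block is just a block.
    unsafe { bump_unsafe(&counter); }

    // Builds a future and drops it; the body is never executed.
    async { bump(&counter) };

    // Builds a closure and drops it; the body is never executed.
    || { bump(&counter) };

    counter.get()
}
```

run_statements() returns 1: only the unsafe block had any effect. Without the allow attribute, rustc also warns that the future and the closure are unused.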

Is it good that there's a difference here? That's not as obvious to me. Maybe async should have worked the same way as unsafe for consistency. But it might be too late to fix that now.

Is that really true? The only reason I can think of why it would be hard to change is backwards compatibility, in case some users rely on this behavior. But as you point out, an async block as an expression statement is useless, so I doubt that changing the parsing rule would break anyone's build. The only way to know is to follow the usual procedure: tentatively make the change in a new rustc version and compile the crates on crates.io to see if anyone's build breaks.

Can you think of any other reason why this would be hard to change?

From the GitHub conversation you linked, it is obvious that even the most experienced Rustaceans are surprised by the issue, which is a reason to introduce more uniformity. And it seems that now is the time to do so, before const blocks are stabilized and add their weight to the backwards compatibility issues.

Stability without stagnation is an important value to uphold, and this is low-hanging fruit: it eliminates one papercut, one language inconsistency.

Nit: there may be implicit coercions and such, especially around lifetimes, that wouldn't occur if you simply pass in foo() instead of async { foo() } (or especially async { foo().await }) or foo instead of || foo().

It may not be something which people would write by hand, but it's still valid syntax, and it can happen e.g. as a result of macro expansion, where it occurs as a degenerate case of some construction.

Now sure, we could run Crater and check whether this actually happens in the wild, but why bother? It's a very minor kink that most users who don't write Rust parsers aren't even aware of, it has legitimate arguments for being the way it is now, and there can always be some private code which uses this pattern but can't be tested by Crater. The benefits of that change just wouldn't be worth the risk.

Note that I specifically said

async { foo() }; statement

It's absolutely true that if you're passing the async block or closure to something then it can matter. But if you're just using it as a statement -- discarding it with a ; -- then it's always dead code.

(It might still affect type checking, depending what exactly you do with it, but it's definitely dead regardless.)

Wouldn't this change simply turn

fn foo() -> i32 {
    async {
        4
    }
    3
}

from a syntax error (current)

error: expected `;`, found `3`
 --> src/lib.rs:5:6
  |
5 |     }
  |      ^ help: add `;` here
6 |     3
  |     - unexpected token

into a type error? (like with non-() typed if and loop)

error[E0308]: mismatched types
 --> src/lib.rs:3:5
  |
3 | /     async {
4 | |         4
5 | |     }
  | |_____^ expected `()`, found opaque type
  |       ^ help: try adding `;` here
  |
  = note: expected unit type `()`
           found opaque type `impl Future<Output = {integer}>`

Granted, I do think it makes the most sense that this be parsed similarly to closures.

Nice point. Indeed, with unsafe {} we can write the block as

unsafe {
    4;
}
4

and there are no type errors, because the unsafe block has the type of its last expression, which here is (). But the analogous async block has the type of an anonymous future, regardless of its inner statements.
