Grammatical ambiguity around `catch` blocks

So @cramertj has been working on implementing catch blocks, and they ran into an interesting issue with the grammar. I think we believed that catch { <expr> } would be legal to add without ambiguity because the only thing it could be was a struct initializer, and those have the form Struct { (field: expr)* }. But with the new field-init-shorthands, catch { foo } could be either a catch block or a struct initializer.
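
To make the collision concrete, here is a minimal sketch that compiles on a compiler where catch is still an ordinary identifier and field-init shorthand is available. The struct and the helper make_catch are made up for illustration; they only show that catch { foo } is a valid shorthand struct initializer.

```rust
// Hypothetical struct that happens to be named `catch`.
#[allow(non_camel_case_types)]
#[derive(Debug, PartialEq)]
struct catch {
    foo: i32,
}

// `make_catch` is an illustrative helper, not something from this thread.
fn make_catch(foo: i32) -> catch {
    // Field-init shorthand: equivalent to `catch { foo: foo }`.
    // With a `catch { <expr> }` block in the grammar, the same tokens could
    // also be read as a catch block whose body is the expression `foo`.
    catch { foo }
}
```

Calling make_catch(7) builds the same value as catch { foo: 7 }, which is exactly the reading a catch block would compete with.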

Thoughts on how we should handle this? I suspect that very little code actually uses catch as the name of a struct or enum variant, but of course it’s possible.

Note that local variables named catch should be ok, since the “keyword” would only take effect when followed by { (and, e.g., while catch {..} would therefore parse as while (catch) { ..}, given the restrictions on expressions that appear in a while).
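
A small sketch of the local-variable case, assuming nothing beyond today's rule that struct literals are not allowed in a while condition. The function drain and its variable names are invented for the example.

```rust
// Illustrative function (not from this thread): `catch` is a plain local here.
fn drain(mut remaining: u32) -> u32 {
    let mut iterations = 0;
    let mut catch = remaining > 0;
    // A struct literal cannot appear in a `while` condition, so this reads as
    // `while (catch) { .. }`: the braces open the loop body.
    while catch {
        remaining -= 1;
        iterations += 1;
        catch = remaining > 0;
    }
    iterations
}
```

The same restriction is what would keep this loop from being read as a catch block if the "keyword" only kicks in before {.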

Options I see:

  • add some system to let us grow new keywords (that’s a bigger topic…)
  • breaking change for structs named catch (we’d want to measure, do a warning period)
  • resolve based on whether a struct named catch is in scope (much as we do for pattern bindings; not my preferred plan, but probably an option)
  • use do, which is reserved :)
  • Use do as a sort of keyword escape?
    • do catch { <expr> } // if you don't mind
    • (sort of joking, but maybe…)

@cramertj points out that my reasoning is false-ish, in that type ascription means that catch { foo: bar } is also ambiguous. But it doesn’t really change the fundamental options, though I guess it makes the idea of a “context-dependent parse” too complex to really imagine (not that I favored that anyhow).

Frankly, I’d be amazed if someone named any struct catch. We should probably try to do a search across all crates.io and maybe github repos.

EDIT: e.g. GitHub search for "struct catch" in .rs files. Oh and enum Foo { catch {...} } use self::Foo::*; is entirely valid too, but harder to find.
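
For reference, a minimal sketch of the enum-variant case. The field name code and the helpers wrap and unwrap_code are assumptions added for illustration; only the enum Foo with a variant named catch and the glob import come from the text above.

```rust
// Hypothetical enum with a struct-like variant named `catch`.
#[allow(non_camel_case_types)]
#[derive(Debug, PartialEq)]
enum Foo {
    catch { code: i32 },
}

// The glob import brings the variant `catch` into scope unqualified.
use self::Foo::*;

// Illustrative helper: builds the variant with field-init shorthand.
fn wrap(code: i32) -> Foo {
    catch { code }
}

// Illustrative helper: the same name also shows up in pattern position.
fn unwrap_code(value: Foo) -> i32 {
    match value {
        catch { code } => code,
    }
}
```

A textual search for "struct catch" would not find this, since the name only appears as a variant.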

Heh.

I'm for the simplest solution - always treat catch { in expression positions as the start of a catch block. Nobody calls their structures catch anyway.

Heh, I wondered if this came up on the thread before, but of course I didn’t actually search. Should have known it’d be you who brought it up (since you seem to be quite adept at uncovering such problems).

A variant of this is to "refuse the temptation to guess" (in Python parlance) when there's a struct called catch in scope. That is, emit an error if we encounter a catch { ... } that could either refer to a struct called catch or be a catch expression. This is still a breaking change, but only for code using field init shorthand[*], which has recently been stabilized but hasn't hit stable yet. So, all existing code using catch { field: expr } as a struct initializer would continue to work, but any of the below would get an error:

  • writing a now-newly-stable shorthand initializer like catch { field }, or
  • writing a catch expression while a struct named catch is in scope, or
  • having a catch expression somewhere and introducing a struct called catch

[*] This doesn't address type ascription, though. (But I recall chatter doubting the use of type ascription, so it may not come to fruition after all?)
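
Here is how I read those three bullets, written as a toy decision table. Everything in it (the Body and Parse enums, the resolve function) is hypothetical and ignores type ascription; a real implementation would need name-resolution information at parse time, which this sketch just takes as a boolean.

```rust
// Simplified shape of what sits between the braces after `catch`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Body {
    FieldColonExpr, // e.g. `catch { field: expr }`
    BareIdent,      // e.g. `catch { field }`
    OtherBlock,     // e.g. `catch { foo()?; bar() }`
}

#[derive(Debug, PartialEq)]
enum Parse {
    StructLiteral,
    CatchBlock,
    AmbiguityError,
}

// Hypothetical rule: `struct_in_scope` says whether a struct (or variant)
// named `catch` is visible at this point.
fn resolve(body: Body, struct_in_scope: bool) -> Parse {
    match (body, struct_in_scope) {
        // Existing explicit initializers keep their meaning.
        (Body::FieldColonExpr, _) => Parse::StructLiteral,
        // No competing struct: nothing to guess about.
        (_, false) => Parse::CatchBlock,
        // A struct named `catch` is visible: refuse to guess.
        (Body::BareIdent, true) | (Body::OtherBlock, true) => Parse::AmbiguityError,
    }
}
```

Note the last arm is what makes introducing a struct called catch a breaking change for code that already contains a catch expression.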

This might be a horrible idea, but what about catch! { <expr> }? Though that would collide with macros named catch, which might be even likelier than a struct named catch.

This should be done sooner rather than later, otherwise Rust will start to accumulate hacks and context keywords and will become difficult to parse. Let's not become another C++ ;)

My suggestion is to have a gradual process of keyword reservation:

  1. first add a warning that an identifier will become reserved in the future and suggest renaming.
  2. after a period of time (one release cycle?) make it a hard error.
  3. after another period of time, reserve the keyword.

In addition, there should be tooling somewhere to apply auto rename for to-be-reserved identifiers in order to help automate the transition.
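
A rough sketch of the three steps as a lookup, just to pin down what each stage would do to an identifier. The Stage and Outcome types, the check_ident function, and the message wording are all invented here; this is not how any existing lint works.

```rust
// The three steps of the proposed transition.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stage {
    Warn,      // step 1: warn and suggest renaming
    HardError, // step 2: reject the identifier
    Reserved,  // step 3: the word is a keyword
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Accepted,
    Warning(String),
    Error(String),
    Keyword,
}

// Hypothetical check: `pending` is the word being reserved (e.g. "catch").
fn check_ident(ident: &str, pending: &str, stage: Stage) -> Outcome {
    if ident != pending {
        return Outcome::Accepted;
    }
    match stage {
        Stage::Warn => Outcome::Warning(format!(
            "`{}` will become a reserved word; consider renaming it",
            ident
        )),
        Stage::HardError => Outcome::Error(format!(
            "`{}` can no longer be used as an identifier",
            ident
        )),
        Stage::Reserved => Outcome::Keyword,
    }
}
```

An auto-rename tool would hook in at the Warn stage, while the old spelling still compiles.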

There's a growing list of other syntax we would ultimately like to repurpose in this way, but we generally avoid breaking the ability to run code that worked on 1.x on 1.y where y > x except in cases where the code was exploiting a compiler bug or was just a terrifically rare construction. Our stability guarantees demand we wait until 2.0 for most of these.

I assume the system @nikomatsakis alludes to for introducing new keywords would involve some kind of opt-in to make it not a breaking change. (@nikomatsakis correct me if I'm wrong.)

Maybe something involving ?, since that’s sparsely used and related to what this does anyway? For example: catch? { foo()?.bar()?.baz()? }
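
For comparison, the effect can be approximated without any new syntax by wrapping the chain in an immediately invoked closure, so that each ? returns from the closure instead of the enclosing function. The types Foo and Bar, the functions foo, bar and baz, and the wrapper chained are stand-ins I made up so the chain from the example has something to call.

```rust
// Stand-in types so the chain has something to call.
struct Bar(Option<i32>);
struct Foo(Option<Bar>);

impl Bar {
    fn baz(&self) -> Option<i32> {
        self.0
    }
}

impl Foo {
    fn bar(&self) -> Option<&Bar> {
        self.0.as_ref()
    }
}

fn foo(inner: Option<i32>) -> Option<Foo> {
    Some(Foo(Some(Bar(inner))))
}

fn chained(inner: Option<i32>) -> Option<i32> {
    // Every `?` exits the closure, not `chained`, which is roughly the
    // scoping a catch block would provide.
    let caught = (|| -> Option<i32> { Some(foo(inner)?.bar()?.baz()?) })();
    // Control reaches this line even when the chain bailed out early.
    caught.map(|value| value + 1)
}
```

The closure works but is noisy, which is part of why dedicated block syntax is attractive.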

Alternatively, whole-function-in-a-catch-block would help the Ok(()) issue, so it would be cool if the catch syntax could be something that would work at the block-for-the-function level. Non-serious proposal for illustrative purposes: fn four<E>() -> Result<i32, E> ¿{ 4 }, or the above would be ¿{ foo()?.bar()?.baz()? }
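
To spell out the Ok(()) issue, assuming it refers to the trailing Ok(()) that a function returning Result<(), E> needs today: here is a small sketch with an invented function check_all.

```rust
use std::num::ParseIntError;

// Illustrative function: checks that every input parses as an integer.
fn check_all(inputs: &[&str]) -> Result<(), ParseIntError> {
    for input in inputs {
        // `?` propagates the first parse failure to the caller.
        input.parse::<i32>()?;
    }
    // Nothing useful to return, but the success value still has to be
    // wrapped by hand.
    Ok(())
}
```

A function-level catch block would presumably do that final wrapping for you, the same way { 4 } stands in for Ok(4) in the non-serious proposal.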

Also, is there a way that throw can be reserved inside a catch block, so that an attempt to add that sugar later won’t hit similar grammar problems?

Perhaps we should reconsider some of the keywords that are not ambiguous, e.g.:

do {
    foo()?.bar()
}

So what does the Language Team think about having crates declare the language version they are targeting?

If a language version (target) is introduced, I believe it should be crate-specific, i.e. mixing crates that use different language versions should be fine. I would even consider making it module-specific (or having some major.minor scheme where all modules in the same crate must share the same major language version but may have different minor versions).

I think such a language version should be reserved for new features that introduce syntax-only breaking changes. It could also be extended to some degree to API changes, as long as this doesn't lead to incompatibilities with existing libraries using an older language version. Note that I mainly mean incompatibility when using older libraries from libraries or programs that target a newer language version. I think the other way around is fine as long as the library can still be used, though maybe not some of its parts (e.g. parametrized modules or some other strange, but probably useful, things).

This can be used to introduce keywords that only change crate/module/function-internal aspects, like catch. It could also be used to "phase out" other parts that would otherwise generate a warning at most, e.g. maybe the (in the future) old macro system (<- not completely sure about this myself).

Though I think it should not be called language version, since we already have a language version, i.e. the rust(c) version. Maybe something like syntax version would be a better name (but then it might actually be a bit more than syntax...).

Oh, and add syntax_version = <newest> to new crates, and additionally make sure "nice" error messages are produced when accidentally colliding with "old" syntax versions.

Alternatively, a #[syntax_features=...] crate/module(/function?) level annotation can work too (including stable).

Lastly, do catch { ... } is not "that" bad :)

Probably worth linking to [pre-rfc] Stable features for breaking changes at this point since a good chunk of that thread has been discussing the idea of “epochs” as a way of introducing minor breaking changes such as making catch a keyword, and the details of that idea seem to fit the spirit of your post.

Was a decision ever made on catch syntax? Did it just land as do catch for now?

For now. This is not meant to be a permanent solution.

How about reserving a big bunch of keywords once and for all (e.g. catch, await, class, etc.)? Ah, I see that you already had the idea: https://github.com/rust-lang/rust/issues/10293 - well, how about reopening that?

Well, it’s a bit late now. =) I’d prefer to solve this permanently with the concept of epochs, which is exactly the sort of thing we were anticipating when we closed #10293. I think the reason I am not keen on reserving “a bunch of keywords” is that then we wind up picking keywords not based on whether they are the Right Keyword, but based on whether we happen to have reserved it. OTOH, the whole Epoch discussion hasn’t really reached consensus yet, and in particular it’s not clear how often we might want to declare a fresh epoch – if the idea is to do this very rarely, then perhaps it makes sense that, in the next epoch, we reverse course on #10293 and do indeed reserve a bunch of common keywords.