Should r#<digit> be a valid raw ident?

Given that we have r# raw identifier syntax, I think it makes sense for it to allow any “Ident_Continue” character after the #, rather than requiring it to start with an “Ident_Start” character like regular identifiers.

I’m currently thinking about it since I’m writing a compile-to-Rust tool that allows mixing named and unnamed fields, so mapping the unnamed fields to r#<digit> makes some amount of sense. (They’re not exposed names, so for now I’m just using random UUIDs.)

If you wrote struct Foo { r#0: i32 }, could you access foo.0 (without r#)?

It might be interesting if struct Foo(i32, i32) became equivalent to struct Foo { r#0: i32, r#1: i32 }. (but also weird.)

The intent is for r#0 to be distinct from 0, the same way r#fn is separate from fn. It would, however, be interesting if struct Foo(i32, i32) became sugar for struct Foo { 0: i32, 1: i32 } together with fn Foo(r#0: i32, r#1: i32) -> Foo { Foo { r#0, r#1 } }.

r#0i32 and r#0foo would also be valid but obviously foo.0i32 and foo.0foo remain invalid (would have to be accessed as foo.r#0i32/foo.r#0foo).
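For reference, this is roughly how raw identifiers behave today, as a small sketch (the names Token, r#match and double are made up for illustration). A keyword becomes usable as a name through r#, and for a non-keyword the r# form names the same identifier as the plain form. The r#0 form discussed here is what is currently rejected.

```rust
// A keyword used as a field name and as a function name via r#.
struct Token {
    r#type: u8,
}

fn r#match(t: &Token) -> u8 {
    t.r#type
}

// For a non-keyword, r#double and double are the same identifier.
fn r#double(x: i32) -> i32 {
    x * 2
}

fn main() {
    let t = Token { r#type: 7 };
    assert_eq!(r#match(&t), 7);
    assert_eq!(double(21), 42);
    // let r#0 = 1; // not accepted today: the part after r# must itself be an identifier
}
```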

I would, personally, prefer to keep the r# syntax solely for r#keyword, and not allow identifiers that start with digits. r# exists for interoperability/compatibility related to keywords.

(I can imagine uses for digit-only and other unusual symbol names in binary formats, but mechanisms for specifying symbol names typically accept strings.)

What problem is this actually solving? Machine-generated fields are trivial: you know the complete set of human-generated fields, so you can just create __generated_foo_<n> or such for increasing n, and skip any names that the user picked.
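A minimal sketch of that approach, assuming the tool already has the set of user-chosen names in hand (generated_names is a hypothetical helper, and the __generated_foo_ prefix is just the placeholder from the sentence above):

```rust
use std::collections::HashSet;

/// Hypothetical helper: returns `count` generated field names,
/// skipping any candidate that collides with a user-chosen name.
fn generated_names(user_names: &HashSet<String>, count: usize) -> Vec<String> {
    let mut out = Vec::with_capacity(count);
    let mut n = 0usize;
    while out.len() < count {
        let candidate = format!("__generated_foo_{}", n);
        if !user_names.contains(&candidate) {
            out.push(candidate);
        }
        n += 1;
    }
    out
}

fn main() {
    // The user happened to pick one of the names we would have generated.
    let user: HashSet<String> = ["x", "__generated_foo_1"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    println!("{:?}", generated_names(&user, 3));
}
```

With the input above, the colliding name __generated_foo_1 is skipped and the counter simply moves on.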

I’ll admit there isn’t really a direct problem this solves, other than maybe slightly more predictable generated names, though even those could just be _<digit> (so long as you turn off the clippy lint). It’s just something my intuition said would make sense, I guess.

When the difference between the id_start and the id_continue set gets larger (i.e. XID_Start and XID_Continue), then there might be more utility in allowing a raw identifier to start with a continue character.

The proposed new Rust name mangling scheme can’t support identifiers starting with a digit.

Raw identifiers exist to ease machine translation from one edition of Rust to another; I don’t think it’s necessarily a good idea to expand their use cases. I’d prefer a solution to the problem identified based on the actual proc macro APIs, like an API to create a new, hygienically unique identifier without specifying what it is called.

You can already do struct Foo(i32, i32); Foo { 0: 10, 1: 11 }, so I don't follow the proposal here. Especially if...

...because then the field shorthand wouldn't work in the case shown.
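To make the existing feature concrete, here is a small sketch (the sum helper is only for illustration). A tuple struct already accepts numeric field names in struct expressions and in struct patterns:

```rust
struct Foo(i32, i32);

fn sum(foo: &Foo) -> i32 {
    // Struct-pattern syntax with numeric field names works too.
    let Foo { 0: a, 1: b } = foo;
    a + b
}

fn main() {
    // Braced initialization of a tuple struct by field index.
    let foo = Foo { 0: 10, 1: 11 };
    assert_eq!(foo.0, 10);
    assert_eq!(foo.1, 11);
    assert_eq!(sum(&foo), 21);
}
```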

I actually wasn’t aware of that feature.

Would it make sense to be able to write struct Foo(i32, i32); as the following surface syntax, then?

struct Foo {
    0: i32,
    1: i32,
}
fn Foo(_0: i32, _1: i32) -> Foo {
    Foo { 0: _0, 1: _1 }
}

I honestly don’t know; I’m just exploring the design space at this point.
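For comparison, a sketch of what the tuple struct declaration already provides today (build_all is a made-up helper): the name Foo can be used as a plain function value, which is the role the fn Foo in the desugaring above would play, and the braced form with numeric fields builds the same value.

```rust
#[derive(Debug, PartialEq)]
struct Foo(i32, i32);

fn build_all(pairs: &[(i32, i32)]) -> Vec<Foo> {
    // `Foo` is used as an ordinary function value here.
    let ctor: fn(i32, i32) -> Foo = Foo;
    pairs.iter().map(|&(a, b)| ctor(a, b)).collect()
}

fn main() {
    let built = build_all(&[(1, 2), (3, 4)]);
    // Both construction forms produce equal values.
    assert_eq!(built, vec![Foo(1, 2), Foo { 0: 3, 1: 4 }]);
}
```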

See conversations about this over in https://github.com/rust-lang/rust/issues/35626#issuecomment-272250730

The thing that we’re nowhere close to allowing in surface syntax yet is making Foo(_, _) work in patterns.
