Pre-RFC: Raw identifiers


#21

I like the backslash. Probably the at-sign second.


#22

Nothing in particular, but I don’t see a reason for artificially adding exceptions.

If there’s a consensus about disallowed characters for linkers (good point), there would be a valid reason for a restriction (or escape rules).


#23

Seeing the choices in actual code I think I prefer the look of \catch, all else being equal.


Backslash and at-sign do have the downside of not being delimited syntaxes, so you need some rules to detect the end of the identifier. I assume that if we go for a syntax like that, we’d limit it to being like a normal identifier, which would make this only useful for escaping keywords and not much else.

Potentially it could be extended a bit... but allowing a character like `:` in an escaped identifier would be really annoying for code like `foo::\catch ::Bar { \catch : 5 }` requiring forced whitespace (assuming whitespace is not allowed, but some people probably do want whitespace in their identifiers).
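
To make the end-of-identifier rule concrete, here is a minimal sketch, assuming the escaped form is limited to ordinary identifier characters. The name `lex_escaped_ident` is hypothetical and not part of any proposal, and the check is simplified (a leading digit is not rejected).

```rust
/// Hypothetical helper: split a leading `\ident` off `src`, returning the
/// identifier text and the remaining input.
/// Simplification: a leading digit is not rejected.
fn lex_escaped_ident(src: &str) -> Option<(&str, &str)> {
    let body = src.strip_prefix('\\')?;
    // The identifier ends at the first character that is not alphanumeric or `_`.
    let end = body
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .unwrap_or(body.len());
    if end == 0 {
        return None; // a lone backslash escapes nothing
    }
    Some(body.split_at(end))
}
```

Under this rule `\catch::Bar` splits into `catch` and `::Bar` with no whitespace needed. Adding `:` to the accepted set would make the same input swallow `::Bar`, which is the forced-whitespace problem described above.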

As @matklad mentioned, using a delimited syntax like `catch` allows for easily extending the characters allowed in the identifier, which may be useful for things like FFI to languages that have a wider valid identifier character set. It does make it a more heavyweight syntax, though: I don’t think I would use `struct` instead of strukt in proc-macros, whereas I have used @class heavily in reflective C# code.


I think deciding whether these raw identifiers should support just reserved, normal-syntax identifiers; slightly extended-syntax identifiers; or full, anything-goes, string-like identifiers would be a good first bikeshed before really arguing over the exact characters used to represent the rawness of the identifiers.


#24

If we go with string-like identifiers in backticks, do we also need a raw form of that?

enum Foo {
    `catch`,
    `that was a keyword!`,
    `we can escape like \`catch\` too`,
    `but if we want a \\ in our identifier`,
    r`maybe we'll want raw \ like this`,
    r#`and use raw ticks `\` like this`#,
    r##`even `\` and `#` together`##,
}

It’s a deep rabbit hole!

But it could also be progressively expanded. Start with just identifier characters to address reserved keywords, then open it up to any characters with \ escaping, and then maybe go all the way to really raw forms.
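
As a rough illustration of the middle step (any characters, with \ escaping), here is a minimal sketch of how a lexer might read a backtick-delimited identifier. The function `lex_backtick_ident` and its escape rule (a backslash makes the next character literal) are assumptions made for illustration, not something specified in this thread.

```rust
/// Hypothetical helper: read a backtick-delimited identifier from the start
/// of `src`. Assumed rule: `\` makes the following character literal, so
/// "\`" yields a backtick and "\\" yields a single backslash.
/// Returns the unescaped name and the rest of the input.
fn lex_backtick_ident(src: &str) -> Option<(String, &str)> {
    let mut chars = src.char_indices();
    match chars.next() {
        Some((_, '`')) => {}
        _ => return None, // not a backtick identifier
    }
    let mut name = String::new();
    while let Some((i, c)) = chars.next() {
        match c {
            '\\' => {
                // Take the escaped character verbatim; fail on a trailing `\`.
                let (_, escaped) = chars.next()?;
                name.push(escaped);
            }
            // A backtick is one byte long, so `i + 1` is a valid boundary.
            '`' => return Some((name, &src[i + 1..])),
            _ => name.push(c),
        }
    }
    None // ran out of input before the closing backtick
}
```

The raw forms (r`…`, r#`…`#) would need a separate path that skips the escape branch and counts `#` characters, which is where the rabbit hole gets deeper.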


#25

Yes, I think you’re right. The only use would be as a shorthand if you want to avoid some kind of mangling attribute on every foreign declaration.


#26

I personally don’t much like \catch since \ is almost universally a single-character escape elsewhere rather than a whole-word escape, so it looks odd to me (though I’m sure I would get used to it over time).

I’ll just throw \{catch} into the pot as an idea while I’m here.


#27

This looks great!

Personally, I was happy when I saw the r#catch syntax. It’s reasonably short, fits decently with other syntax, and doesn’t take up a new sigil that I’d rather keep for a more important use.
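
For a feel of how the r# prefix reads in declarations and at use sites, here is a small sketch. It assumes a toolchain that accepts `r#` raw identifiers (stable Rust has since 1.30), and it uses `type` and `match`, which are keywords in every edition, as stand-ins for catch. The names `Token` and `is_keyword_token` are made up for the example.

```rust
// `type` and `match` are keywords, so they must be written with the r# prefix.
struct Token {
    r#type: &'static str,
}

// A function whose name is a keyword.
fn r#match(token: &Token, expected: &str) -> bool {
    token.r#type == expected
}

fn is_keyword_token() -> bool {
    let token = Token { r#type: "keyword" };
    r#match(&token, "keyword")
}
```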

I think that having a short syntax is far more important in C# because of reflection and anonymous struct literals. Razor uses things like new { @class = "errorbox" } all over the place as an inefficient associative container, with the field names appearing in the output. I hope Rust doesn’t develop such a pattern.

A possible extension: this could allow both identifier#catch and keyword#catch (probably with shorter prefixes) as a way to also expose catch in Rust 1.0 (or multi-version) code.


#28

Anything except r#… and br#… will break Macros 1.0.

macro_rules! x {
    ($a:ident # $b:ident) => {};
    ($c:ident) => { should error };
}
x!(identifier#catch);
x!(keyword#catch);

#29

Hmm, that macro example also rules out @catch, but \catch and `catch` should still be OK because those characters aren’t yet found in legal tokens. (error: unknown start of token: \)
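
To illustrate the @catch half of this, here is a minimal sketch of a macro that already matches `@` followed by an identifier as two separate tokens. The macro `tagged` and the two wrapper functions are hypothetical, but the pattern mirrors the common `@internal` rule style in existing macro_rules! code, which would match differently if @catch became a single token.

```rust
macro_rules! tagged {
    // Today `@catch` arrives as two tokens: `@` and the identifier `catch`.
    (@ $name:ident) => {
        concat!("at-sign followed by ", stringify!($name))
    };
    ($name:ident) => {
        concat!("plain ", stringify!($name))
    };
}

fn at_form() -> &'static str {
    tagged!(@catch)
}

fn plain_form() -> &'static str {
    tagged!(catch)
}
```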


#30

Many C dialects support $ as a character in identifiers (probably for compatibility with VMS), so this might help with FFI if someone uses this particular C feature.


#31

So any new literal kind? That’s a shame, and blocks things like s"foo" for Strings too, I guess.

I often feel like every possible syntax addition is breaking for macros…

If possible, I’d like to keep ` unused so writing docs and posts here stays easy :smile:


#32

#[allow(non_camel_case_types)]
#[derive(Copy, Clone)]
struct s;

impl std::ops::Div<&'static str> for s {
    type Output = String;
    fn div(self, literal: &'static str) -> String {
        literal.to_owned()
    }
}

fn main() {
    let _strings: &[String] = &[s/"hello", s/"we", s/"are", s/"strings", s/"😛"];
}

But seriously, the syntax "foo"s (a suffix s) is reserved and can be used for this.


#33

True, but you can also rename it locally, like #[link_name = "foo$bar"] fn foo_bar().

Having struggled with it here a few times, I’m with you on that. :slight_smile:


#34

Lisp and Ruby use a leading colon for symbols (e.g. :catch), so maybe that would help make it easier to learn for people who have seen that pattern before?

Stealing @cuviper’s example:

use foo:::catch::bar;
let foo = :catch { bar: 42 };
let foo = Foo { :catch: 42 };
foo.:catch = 42;
foo.:catch(42);

Personally I think the `catch` variant isn’t as visually distracting as @catch or r#catch. Using a delimited string for idents does mean a function name could now contain spaces, though, which looks odd after all these years of allowing only alphanumerics and underscores in an identifier.


#35

Lisp uses them for keyword symbols, which are symbols in a special namespace. Symbol quoting uses |…| quoting.


#36

Question: how important is the catch problem in practice? How many crates out there indeed use catch as a name of a field or method?


#37

Unknown – maybe not at all. I think the only reason catch couldn’t be a contextual keyword was for the possibility of a struct catch (which already defies naming conventions) and thus catch { ... } constructing it. But that was enough of a blocker that we now have do catch { ... } for the expression, and the keyword breaking change was discussed in the epoch proposal.
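
A sketch of the ambiguity, assuming a compiler where catch is an ordinary identifier. The struct and the `build` function are made up for illustration.

```rust
// A lower-case struct name defies naming conventions but is legal.
#[allow(non_camel_case_types)]
struct catch {
    code: i32,
}

fn build() -> catch {
    let code = 1;
    // Struct literal using field shorthand. If `catch` were a contextual
    // keyword introducing a block, these same tokens could also be read as
    // a block whose value is `code`.
    catch { code }
}
```

Because the parser cannot tell the two readings apart from the tokens alone, code like this would change meaning or stop compiling, which is the blocker described above.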


#38

I’ve been reminded (by renewed activity) that there is a competing proposal, Infix notation for function call / invocation, that wants to use \foo for something else.


#39

Thanks, then we can also look at raw identifiers used as such infix operators:

  • a \r#catch b
  • a \`catch` b

(in the realm of “silly but syntactically allowable”)


#40

Among 10165 crates from crates.io, 28 use catch as an identifier in some definition, and at least some of those use it in their public API. The full report is here: https://gist.github.com/matklad/acebc3579fc7c70d02056cd8b3d2bf0a.