Pre-RFC: Raw identifiers

I like the backslash. Probably the at-sign second.

Nothing in particular, but I don't see a reason for artificially adding exceptions.

If there's a consensus about disallowed characters for linkers (good point), there would be a valid reason for a restriction (or escape rules).

Seeing the choices in actual code, I think I prefer the look of \catch, all else being equal.


Backslash and at-sign do have the downside of not being delimited syntaxes, so you need some rules to detect the end of the identifier. I assume that if we go for a syntax like that, we’d limit it to the shape of a normal identifier, which would make it useful only for escaping keywords and not much else.

Potentially it could be extended a bit... but allowing a character like `:` in an escaped identifier would be really annoying for code like `foo::\catch ::Bar { \catch : 5 }` requiring forced whitespace (assuming whitespace is not allowed, but some people probably do want whitespace in their identifiers).
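
To make the end-of-identifier problem concrete, here is a minimal sketch of how a lexer might scan an undelimited `\catch`-style escape. It assumes the escaped name follows plain ASCII identifier rules (letters, digits, underscore); the function name `scan_escaped_ident` is hypothetical and not part of any proposal in this thread.

```rust
/// Scans an undelimited `\name` escape at the start of `input`.
/// Returns the identifier and the unscanned remainder, or `None` if
/// `input` does not start with a backslash followed by an identifier.
/// Assumption: names use plain ASCII identifier rules.
fn scan_escaped_ident(input: &str) -> Option<(&str, &str)> {
    let body = input.strip_prefix('\\')?;
    let mut end = 0;
    for (i, c) in body.char_indices() {
        let allowed = if i == 0 {
            // First character: letter or underscore only.
            c.is_ascii_alphabetic() || c == '_'
        } else {
            c.is_ascii_alphanumeric() || c == '_'
        };
        if !allowed {
            // The first non-identifier character ends the name.
            break;
        }
        end = i + c.len_utf8();
    }
    if end == 0 {
        None
    } else {
        Some((&body[..end], &body[end..]))
    }
}
```

With these rules, scanning `\catch::Bar` stops at the first `:` and leaves `::Bar` for the rest of the lexer. If `:` were allowed inside the name, the same input would need forced whitespace to be read the same way.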

As @matklad mentioned, using a delimited syntax like `catch` allows for easily extending the characters allowed in the identifier, which may be useful for things like FFI to languages that have a wider valid identifier character set. It does make it a more heavyweight syntax, though: I don’t think I would use `struct` instead of strukt in proc-macros, whereas I have used @class heavily in reflective C# code.


I think deciding whether these raw identifiers should support just reserved, normal-syntax identifiers; slightly extended-syntax identifiers; or full, anything-goes, string-like identifiers would be a good first bikeshed before really arguing over the exact characters used to mark an identifier as raw.

If we go with string-like identifiers in backticks, do we also need a raw form of that?

enum Foo {
    `catch`,
    `that was a keyword!`,
    `we can escape like \`catch\` too`,
    `but if we want a \\ in our identifier`,
    r`maybe we'll want raw \ like this`,
    r#`and use raw ticks `\` like this`#,
    r##`even `\` and `#` together`##,
}

It’s a deep rabbit hole!

But it could also be progressively expanded. Start with just identifier characters to address reserved keywords, then open it up to any characters with \ escaping, and then maybe go all the way to really raw forms.
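
The middle step of that progression (any characters, with \ escaping) could be sketched roughly as follows. This is only an illustration under assumed rules: a backtick opens and closes the identifier, and a backslash makes the next character literal. The name `parse_backtick_ident` is hypothetical.

```rust
/// Parses a backtick-delimited identifier at the start of `input`,
/// treating `\` as an escape that makes the next character literal.
/// Returns the unescaped name and the remaining input, or `None` if the
/// input does not start with a backtick, is unterminated, or is empty.
fn parse_backtick_ident(input: &str) -> Option<(String, &str)> {
    let body = input.strip_prefix('`')?;
    let mut name = String::new();
    let mut chars = body.char_indices();
    while let Some((i, c)) = chars.next() {
        match c {
            '\\' => {
                // Take the escaped character as-is; a trailing `\` is an error.
                let (_, escaped) = chars.next()?;
                name.push(escaped);
            }
            '`' => {
                if name.is_empty() {
                    return None;
                }
                // The closing backtick is one byte wide.
                return Some((name, &body[i + 1..]));
            }
            _ => name.push(c),
        }
    }
    None // ran out of input before the closing backtick
}
```

Restricting the accepted characters inside the match would give the first step (keywords only), and adding an `r`-prefixed variant that skips the escape branch would give the last.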

Yes, I think you're right. The only use would be as a shorthand if you want to avoid some kind of mangling attribute on every foreign declaration.

I personally don’t much like \catch since \ is almost universally a single-character escape elsewhere rather than a whole-word escape, so it looks odd to me (though I’m sure I would get used to it over time).

I’ll just throw \{catch} into the pot as an idea while I’m here.

This looks great!

Personally, I was happy when I saw the r#catch syntax. It’s reasonably short, fits decently with other syntax, and doesn’t take up a new sigil that I’d rather keep for a more important use.
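
For reference, the r# form is the one Rust later stabilized (Rust 1.30, alongside the 2018 edition). A small sketch of how it reads in practice follows. Since catch did not end up as a reserved keyword, the example uses `match` instead; the `Rule` type and the function bodies are made up for illustration.

```rust
/// A struct whose field is named after the keyword `match`.
struct Rule {
    r#match: String,
}

/// A function named after the keyword `match`.
fn r#match(rule: &Rule, input: &str) -> bool {
    input.contains(rule.r#match.as_str())
}

/// `catch` is an ordinary identifier, so `r#catch` and `catch`
/// refer to the same function.
fn r#catch(x: i32) -> i32 {
    x + 1
}
```

Callers write `r#match(&rule, "text")` for the keyword-named function, while `catch(1)` and `r#catch(1)` are interchangeable.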

I think that having a short syntax is far more important in C# because of reflection and anonymous types. Razor uses things like new { @class = "errorbox" } all over the place as an inefficient associative container, with the property names appearing in the output. I hope Rust doesn’t develop such a pattern.

A possible extension: this could allow both identifier#catch and keyword#catch (probably with shorter prefixes) as a way to also expose catch in Rust 1.0 (or multi-version) code.

Anything except r#… and br#… will break Macros 1.0.

macro_rules! x {
    ($a:ident # $b:ident) => {};
    ($c:ident) => { should error };
}
x!(identifier#catch);
x!(keyword#catch);

Hmm, that macro example also rules out @catch, but \catch and `catch` should still be OK because those characters aren’t yet found in legal tokens. (error: unknown start of token: \)

Many C dialects support $ as a character in identifiers (probably for compatibility with VMS), so this might help with FFI if someone uses this particular C feature.

So any new literal kind? That's a shame, and blocks things like s"foo" for Strings too, I guess.

I often feel like every possible syntax addition is breaking for macros...

If possible, I'd like to keep ` unused so writing docs and posts here stays easy :smile:

#[allow(non_camel_case_types)]
#[derive(Copy, Clone)]
struct s;

impl std::ops::Div<&'static str> for s {
    type Output = String;
    fn div(self, literal: &'static str) -> String {
        literal.to_owned()
    }
}

fn main() {
    let _strings: &[String] = &[s/"hello", s/"we", s/"are", s/"strings", s/"😛"];
}

But seriously, the syntax "foo"s (a suffix s) is reserved and can be used for this.

True, but you can also rename it locally, like #[link_name = "foo$bar"] fn foo_bar().
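
A minimal sketch of that local renaming. So that it actually links, it binds the C standard library's `abs` under a different Rust name; the `foo$bar` symbol from above is hypothetical and is shown only in a comment. The `unsafe extern` spelling assumes Rust 1.82 or later, and the wrapper name `absolute` is made up.

```rust
use std::os::raw::c_int;

unsafe extern "C" {
    // The Rust-side name is `c_abs`; the linker looks up the symbol `abs`.
    #[link_name = "abs"]
    fn c_abs(x: c_int) -> c_int;

    // A symbol containing `$` would be declared the same way (hypothetical):
    // #[link_name = "foo$bar"]
    // fn foo_bar();
}

/// Safe wrapper around the renamed foreign function.
/// Note: C leaves `abs(INT_MIN)` undefined, so callers should avoid it.
fn absolute(x: c_int) -> c_int {
    unsafe { c_abs(x) }
}
```

The attribute has to be repeated on every such declaration, which is the per-declaration cost a wider raw-identifier syntax would save.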

Having struggled with it here a few times, I'm with you on that. :slight_smile:

Lisp and Ruby use a leading colon for symbols (e.g. :catch), so maybe that would help make it easier to learn for people who have seen that pattern before?

Stealing @cuviper’s example:

use foo:::catch::bar;
let foo = :catch { bar: 42 };
let foo = Foo { :catch: 42 };
foo.:catch = 42;
foo.:catch(42);

Personally I think the `catch` variant isn’t as visually distracting as @catch or r#catch. That said, using a delimited string for idents means a function name can now contain spaces, which looks odd after all these years of allowing only alphanumerics and underscores in an identifier.

Lisp uses the leading colon for keyword symbols, which are symbols in a special namespace. Symbols themselves are quoted with |…|.

Question: how important is the catch problem in practice? How many crates out there actually use catch as the name of a field or method?

Unknown -- maybe not at all. I think the only reason catch couldn't be a contextual keyword was for the possibility of a struct catch (which already defies naming conventions) and thus catch { ... } constructing it. But that was enough of a blocker that we now have do catch { ... } for the expression, and the keyword breaking change was discussed in the epoch proposal.

I’ve been reminded (by renewed activity) that there is a competing proposal, Infix notation for function call / invocation, that wants to use \foo for something else.

Thanks, then we can also look at raw identifiers used as such infix operators:

  • a \r#catch b
  • a \`catch` b

(in the realm of “silly but syntactically allowable”)

Among 10165 crates from crates.io, 28 use catch as an identifier in some definition. At least some of them use catch in their public API. The full report is here: https://gist.github.com/matklad/acebc3579fc7c70d02056cd8b3d2bf0a.
