The idea is that a loose "[" IDENT "]" would not be a valid type or expression path, but would still be a valid module path. We already maintain a grammar distinction between the 3 (type paths have parameters without a turbofish, expression paths have parameters with a turbofish, module paths can't take parameters).
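To make the three existing path kinds concrete, here is a small sketch in today's Rust (it does not use the proposed `[std]` syntax). The function name `empty_counts` is only an illustration.

```rust
// Module path: no generic arguments are allowed anywhere in it.
use std::collections::HashMap;

// Type path: generic arguments follow the identifier directly, no turbofish.
pub fn empty_counts() -> HashMap<String, u32> {
    // Expression path: generic arguments need the turbofish `::<...>`.
    HashMap::<String, u32>::new()
}
```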
But then one couldn't actually write the following, right?
fn foo<T: [std]::io::Write>(x: T) -> [std]::vec::Vec<T> { ... }
It's a minor thing and there are a number of similar situations already.
There are probably few bigger fans of simple grammar than me, and I'm still okay with this! The disambiguation costs maybe three extra lines in the parser.
EDIT: #39318, for example, was much worse and much less useful.
<TYPE>::a::b::c needs to be usable with any possible TYPE and type grammar is too diverse to make types usable as path segments in general (there are pointers, references, lots of other ambiguous stuff). So, only identifiers (+ optional generic arguments following them) are allowed in paths; everything else has to be put in angle brackets <...>. This was a conservative decision.
[IDENT] is a very limited grammar and it can be accepted as a path segment, unlike a general TYPE.
[std] is being used as a module path in each of those instances, denoting the root module of crate std. It always appears in the construction module-path "::" ident, which is unambiguous once the parser encounters the "::".
Ah, and as you said above, you can't use plain brackets to put a slice into a path; you need extra angle brackets. Good.
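For reference, this is how the angle-bracket rule looks in current Rust. It is a minimal sketch; the helper names `slice_len` and `str_ref_default` are made up for the example.

```rust
/// Calls the inherent slice method `len` through a qualified path.
pub fn slice_len(xs: &[u8]) -> usize {
    // `[u8]::len(xs)` is rejected by the parser; the slice type has to be
    // wrapped in angle brackets to be used as the first path segment.
    <[u8]>::len(xs)
}

/// The same rule applies to reference types.
pub fn str_ref_default() -> &'static str {
    <&str>::default()
}
```

Because slices already need `<[u8]>`, a bare `[std]::` prefix would not collide with them.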
So you're saying it would make more sense to treat this exclusively as a path prefix, and not allow use [std]; at all, the same way we don't allow e.g. use super;?
I think I agree with that. It's the one thing that is making me subtly uncomfortable about this syntax. I edited the original post to add this view.
Nit: that's not entirely true. You can have the expression-path [std]::vec::Vec::<T>::clone. The construction is more like:
expr-path <- expr-segment
expr-path <- expr-path-or-crate "::" expr-path
expr-path-or-crate <- expr-path
expr-path-or-crate <- crate-bracket
crate-bracket <- "[" IDENT "]"
// ^ means `[std]::<u32>` is *syntactically* illegal, which I think
// is ok.
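A rough sketch of how a parser might recognize the crate-bracket prefix, under some loud assumptions: the `Tok` and `Path` types and the `parse_path` function are hypothetical, generic arguments are ignored entirely, and a bare `[std]` without a following `::` is rejected (the "prefix only" reading). It is not taken from rustc.

```rust
#[derive(Debug, PartialEq)]
pub enum Tok {
    Ident(String),
    OpenBracket,
    CloseBracket,
    PathSep, // `::`
}

#[derive(Debug, PartialEq)]
pub struct Path {
    /// `Some("std")` for `[std]::...`, `None` for an ordinary path.
    pub krate: Option<String>,
    pub segments: Vec<String>,
}

/// Parses `"[" IDENT "]" "::" ident ("::" ident)*` or `ident ("::" ident)*`.
/// Returns `None` for anything else, including a bare `[std]`.
pub fn parse_path(toks: &[Tok]) -> Option<Path> {
    // The bracket is only treated as a crate prefix when `::` follows it.
    let (krate, mut rest) = match toks {
        [Tok::OpenBracket, Tok::Ident(name), Tok::CloseBracket, Tok::PathSep, rest @ ..] => {
            (Some(name.clone()), rest)
        }
        _ => (None, toks),
    };
    let mut segments = Vec::new();
    loop {
        match rest {
            // Last segment: the path is complete.
            [Tok::Ident(seg)] => {
                segments.push(seg.clone());
                return Some(Path { krate, segments });
            }
            // Segment followed by `::`: keep going.
            [Tok::Ident(seg), Tok::PathSep, tail @ ..] => {
                segments.push(seg.clone());
                rest = tail;
            }
            _ => return None,
        }
    }
}
```

The disambiguation itself is the single slice pattern in the first `match`; everything else is ordinary path parsing.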
Sure, my intent was just to distinguish between a path and a path prefix, so I fudged the details a bit. Maybe a bit more than strictly necessary.
But you can also have such ambiguities in ordinary expressions, and there you have precedence rules and can use parentheses for disambiguation. E.g. *value.member is not the same as (*value).member.
The same could also apply to paths.
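The expression-level precedence rule can be seen in a small sketch. The `Wrapper` type and both function names are invented for the illustration.

```rust
pub struct Wrapper<'a> {
    pub member: &'a i32,
}

/// `*value.member` parses as `*(value.member)`: field access binds tighter
/// than unary `*`, so this dereferences the field and yields an `i32`.
pub fn deref_member(value: &Wrapper<'_>) -> i32 {
    *value.member
}

/// `(*value).member` dereferences `value` first and then reads the field,
/// which here is still a reference.
pub fn member_of_deref<'a>(value: &Wrapper<'a>) -> &'a i32 {
    (*value).member
}
```

The two expressions have different types (`i32` versus `&i32`), which is what the parentheses disambiguate.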
Yes it's very conservative and I imagine that the rule could be relaxed to only require brackets if strictly necessary (like parentheses). I think this would even be backwards compatible.
This new syntax would probably prohibit such a generalization.
I think [cratename] is confusing if it’s going to be used inline, and not only in use statements. Originally I wanted the qualifier on the use itself:
use extern cratename
or
use extern cratename::something
I don’t think the possibility of code using impl [std]::fmt::Write for Smth is great for readability, especially if it’s mixed with arrays etc.
As much as I dislike Go, I think that having every external symbol prefixed by its module name/import name was an interesting idea. So in Rust it would mean that only modules (and not the symbols themselves) can be imported with use (with the possibility of a rename to avoid collisions), and then used with the prefix they were imported with, so always:
use extern std::thread;
fn ... {
    thread::spawn()
}
and use std::thread::spawn is impossible.
I’m not sure if it’s the best idea for Rust, but it sure helps with certain things.
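The module-prefixed style can already be followed by convention in current Rust, without the proposed `use extern`. A minimal sketch, with `run_in_thread` as a made-up name:

```rust
// Import the module, not the function, so every call site names its origin.
use std::thread;

pub fn run_in_thread() -> i32 {
    let handle = thread::spawn(|| 40 + 2);
    // `join` returns Err only if the spawned closure panicked.
    handle.join().expect("worker thread panicked")
}
```

The difference in the suggestion above is that the language would enforce this style rather than leave it to convention.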
I’ve looked through the token list and selected tokens that can be more or less usable for other-crate-relative paths without ambiguities (including “inline” non-use paths).
Most tokens were thrown away immediately because they can be used as binary/unary operators and they can be used freely with paths in expressions.
I tried not to introduce new tokens for this, but in principle ~ can be used regardless of syntax details because it’s unused at the moment, or the backslash token (\) can be added to the language (can’t say I’m a fan though).
There are roughly three groups of possible syntaxes:
a INFIX b::c
OPEN_DELIM a CLOSE_DELIM b::c
PREFIX a::b::c
INFIX variants look universally bad, IMO, mostly because they break the path into separate parts.
a _ b::c
a::_::b::c
a crate b::c
a:\b::c // Windows drive letters!
a#b::c
a#::b::c
// etc
OPEN_DELIM a CLOSE_DELIM variants look better.
[a]::b::c
(a)::b::c
{a}::b::c
// but not <a>::b::c, it's already taken
PREFIX a::b::c (1) keywords.
Of all the keywords, only crate, extern and in look somewhat relevant.
// Less noise
extern a::b::c
crate a::b::c
in a::b::c
// More noise
extern::a::b::c
crate::a::b::c
in::a::b::c
PREFIX a::b::c (2) sigils, look kinda random.
_ a::b::c
_::a::b::c
#a::b::c
#::a::b::c
~a::b::c
~::a::b::c
Aaaand… ta-da!
@a::b::c is not actually ambiguous!11
@ can currently be used only in patterns like this: ref? mut? IDENT @ PATTERN. IDENT PATH is never a valid token sequence, so ident @a::b::c is unambiguously a pattern (ident @ a::b::c == IDENT @ PATTERN) and not ident @a::b::c == IDENT PATH, because the latter is not valid syntax.
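For context, this is the existing use of `@` in current Rust patterns, including the IDENT @ PATH-PATTERN shape mentioned above. The function name `classify` is only for illustration.

```rust
/// Classifies a number using `@` bindings, the only place `@` is valid today.
pub fn classify(n: u32) -> String {
    match n {
        // IDENT @ PATTERN with a range pattern.
        small @ 0..=9 => format!("small {}", small),
        // IDENT @ PATTERN where the pattern is a path to a constant.
        max @ u32::MAX => format!("max {}", max),
        other => format!("other {}", other),
    }
}
```

In both arms `@` is preceded by an identifier, so a path that begins with `@` would not overlap with this form.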
Basically, the three most viable syntaxes were already discussed in various threads:
// The least noisy syntax, and well fitting because `@` has "location" semantics.
@a::b::c
// Slightly more noisy syntax, and memorable because `[ ### ]` is a *crate*.
[a]::b::c
// The "no sigils" variant, good if `use extern a;` will be used roughly like
// `extern crate a;` is used now, with "inline" uses being rare. Probably too
// wordy if used more often (e.g. in every `use` import).
extern a::b::c