Proposals appear with some regularity to introduce a _::Variant or .Variant shorthand for path::Enum::Variant, analogous to Swift's .variant syntax. I've also seen :Variant and ::Variant, along with similar ideas, e.g. allowing use path::Enum::* at the top of a match scope, or just implicitly glob-importing enum variants into patterns. The syntax looks more or less like this:
match foo {
    _::Bar => baz,
    _::Qux => quux,
}
Undoubtedly Swift set a good precedent with this feature: its community seems to embrace it, and many in the Rust community would certainly like to have something similar.
But all of these proposals end up abandoned. I think that's mostly because our language is quite different, and it's impossible to recreate in Rust as swift an experience as .variant gives in Swift. For example, the closest might have been .Variant, but even the smallest detail, like its uppercase letter, becomes a surprisingly big obstacle: the beginning of this construct ceases to look like the beginning of a pattern or expression. It's something new instead, and that may not fit into Rust's strangeness budget. Also, as far as I remember, the community was never able to reach consensus on whether this syntax should be available in patterns, in expressions, or in both, nor on whether it should be available for structs, for enums, or for both. It was determined that technically we could implement whatever combination we want, but it was always hard to motivate any particular one.
New idea
I propose to reuse the _Variant syntax.
Yes, I know that this is already a valid identifier, and that it differs from previous proposals by only one symbol, which may not look significant. But the premise here is that we already have this syntax in Rust, we already use the underscore to mean "elide" or "hide", we already have a brilliant precedent of introducing a new feature through existing syntax with .await, and we rarely use types starting with an underscore anyway.
So, under the current proposal it should work like this:
- The compiler continues to accept types, variables, constants, etc. with names starting with an underscore
- When _Something is written and a matching name is already in scope, that name is picked first
- Otherwise, if it's not in scope, the compiler tries to infer the enum type and pick its corresponding variant
- And if no enum or variant is found [assuming pattern context], an irrefutable binding is introduced as a fallback, and the compiler also issues a warning about the uppercase letter in the variable name
There's in fact only one user-facing addition: the compiler tries to pick an enum variant first, before creating an irrefutable binding or emitting a missing type/variable error. And it seems to be fully backward compatible except for one case, where an _X binding in old code resolves in new code to an enum that also contains an _X variant. Fortunately, that's an unlikely scenario, and it could be very easily mitigated at an edition boundary by simply renaming _X to something else. That said, a new edition would also be required, but most likely it would be just a formality.
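For reference, the two existing behaviors that these rules build on can be reproduced in today's Rust. This is only a minimal sketch: the Theme enum, the _Disabled constant and both functions are made up for illustration and are not part of the proposal.

```rust
// Hypothetical enum used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Theme {
    Enabled,
    Disabled,
}

// Today nothing named `_Enabled` is in scope, so the pattern is a plain
// identifier binding: it matches every value and only trips a naming lint.
#[allow(non_snake_case)]
fn describe_today(theme: Theme) -> &'static str {
    match theme {
        _Enabled => "bound by the catch-all",
    }
}

// A name that IS in scope wins: the pattern below compares against the
// constant instead of introducing a new binding.
#[allow(non_upper_case_globals)]
const _Disabled: Theme = Theme::Disabled;

fn describe_with_const(theme: Theme) -> &'static str {
    match theme {
        _Disabled => "disabled",
        _ => "something else",
    }
}
```

If I read the rules correctly, describe_with_const would keep its meaning under the proposal, while describe_today is the kind of code whose meaning could change, because the compiler would try to resolve the pattern to a variant before falling back to a binding.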
Examples in pattern context
pub fn is_enabled(&self) -> bool {
    matches!(self, Self::Enabled)
}

pub fn is_enabled(&self) -> bool {
    matches!(self, _Enabled)
}

pub fn is_enabled(&self) -> bool {
    matches!(self, .Enabled)
}

pub fn is_enabled(&self) -> bool {
    matches!(self, _::Enabled)
}
// from internals.rust-lang.org/t/bring-enum-variants-in-scope-for-patterns/12104
let is_timeout = match error {
    LibraryError::FailedRequest(RequestError::ConnectionFailed(ConnectionError::Timeout)) => true,
    _ => false,
};

let is_timeout = match error {
    _FailedRequest(_ConnectionFailed(_Timeout)) => true,
    _ => false,
};

let is_timeout = match error {
    .FailedRequest(.ConnectionFailed(.Timeout)) => true,
    _ => false,
};

let is_timeout = match error {
    _::FailedRequest(_::ConnectionFailed(_::Timeout)) => true,
    _ => false,
};
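For comparison, the closest thing available in current Rust is a function-scoped use of the individual variants. The sketch below assumes hypothetical definitions for the three error enums (only the variant names FailedRequest, ConnectionFailed and Timeout come from the example; the remaining variants are invented).

```rust
// Hypothetical error types; only the nesting mirrors the example above.
enum ConnectionError {
    Timeout,
    Refused,
}

enum RequestError {
    ConnectionFailed(ConnectionError),
    BadStatus(u16),
}

enum LibraryError {
    FailedRequest(RequestError),
    Other,
}

fn is_timeout(error: &LibraryError) -> bool {
    // Function-scoped imports keep the short names local to this body.
    use ConnectionError::Timeout;
    use LibraryError::FailedRequest;
    use RequestError::ConnectionFailed;

    matches!(error, FailedRequest(ConnectionFailed(Timeout)))
}
```

The pattern itself ends up as short as the proposed one, at the cost of three extra lines that have to be kept in sync with the variants actually used.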
To make it clear once and for all: this syntax is for enums only, not for structs or anything else. First, that would be a familiar mental model for Swift users, which IMO is important to preserve, since both languages seem to copy features from each other and are usually expected to behave similarly. Second, this way we can be certain that nobody gets lost in the long patterns that are common in Rust code when destructuring structs, like this:
let _ {                  // Some lookahead is required to understand
    app_id,              // what we're trying to destructure here
    window_properties,
    rect: _ {
        mut x,
        mut y,
        width,
        height,
        ..
    },
    ..
} = focused_window;

let swayipc::Node {      // This pattern instantly makes sense
    app_id,
    window_properties,
    rect: swayipc::Rect {
        mut x,
        mut y,
        width,
        height,
        ..
    },
    ..
} = focused_window;
And although patterns are where path inference is desired the most, I also propose making it available in expressions, exactly how the Swift developers implemented it. IMO the symmetry between patterns and expressions is the reason to have this syntax "activated" by an operator in the first place: since anything like _Variant or .Variant would parse fine in both contexts, why should we artificially restrict it to one of them? Moreover, since Some(x) already looks the same in patterns and expressions, path inference behaving differently could also be surprising or even frustrating for many people.
Examples in expression context
let piped = !atty::is(Stream::Stdin);
let piped = !atty::is(_Stdin);
let piped = !atty::is(.Stdin);
let piped = !atty::is(_::Stdin);

let state = if !finished {
    state::Progress::Continues
} else {
    state::Progress::Completed
};

let progress_state = if !finished {
    _Continues
} else {
    _Completed
};

let progress_state = if !finished {
    .Continues
} else {
    .Completed
};

let progress_state = if !finished {
    _::Continues
} else {
    _::Completed
};
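The pattern/expression symmetry argued for above already exists in current Rust once a variant name is in scope: the same bare name constructs a value in an expression and tests for it in a pattern. A small sketch, assuming a hypothetical state module shaped like the one in the example (the function names are invented):

```rust
// Hypothetical module mirroring the `state::Progress` path used above.
mod state {
    #[derive(Debug, PartialEq)]
    pub enum Progress {
        Continues,
        Completed,
    }
}

// Expression context: the bare names construct values.
fn progress_state(finished: bool) -> state::Progress {
    use state::Progress::{Completed, Continues};
    if !finished {
        Continues
    } else {
        Completed
    }
}

// Pattern context: the same bare name is matched against.
fn is_done(progress: &state::Progress) -> bool {
    use state::Progress::Completed;
    matches!(progress, Completed)
}
```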
Real world example comparison
I've extracted a snippet (not written by me) from rustfmt to demonstrate how the proposed syntax and some alternatives may look when there's more context around. It was put in an image because the same comparison looks terrible in markdown and even worse on Discourse, which puts scrollbars on long chunks of code.
So, this is where the advantage of _Variant may actually become visible:
- The current Rust syntax provides more useful information, but the problem is that there's a lot of it and it repeats a lot; moreover, it says nothing about how the visible code works, only how the code base containing it is organized. When reading code, in most cases I'm simply not interested in that, and for the rest an IDE feature like inlay hints, which brings the explicit paths back, seems to be sufficient. Also, I feel that this information has a "compressing" effect on how variables are named, e.g. we write match rx.recv() { path::Msg::Variant => ... } while instead it should be match what_messages.recv() { _Variant => ... }. That might be a bit obscure, but I hope the idea is clear. Overall, this is a good-looking syntax but hard to read without distractions.
- The .Variant alternative is rather okay, but there are already a lot of dots in Rust code, especially in :: and .., and that makes this syntax a bit hard to discern. Another problem is that a single dot usually means "field or method access", so with it match expressions look almost like method chains, and some effort is required to perceive them as control flow constructs (Swift doesn't fall into that trap because it puts case before each arm). IMO, disambiguating it from a method call might be too complicated a task even for the compiler. In contrast, underscores in Rust are semantically attached not to the preceding item but to the scope surrounding them, which is visible in e.g. match x { _ => .. }; and for enum path inference this makes a lot of sense, especially in expression context, e.g. in the above snippet self.visit_attrs(&item.attrs, _Outer) reads as a mini DSL where _Outer could be perceived as a continuation of what visit_attrs started to communicate. BTW, it's also interesting that the underscore in _Variant aligns beautifully with _ if and _ => in a match expression, while . doesn't.
- The _::Variant alternative is bad because of its noisiness, and also because the underscore here represents some "elided away" namespace common to everything. I also feel that the :: is the source of that noisiness, that it's completely unnecessary, and that it's also misleading because it looks like part of a path that wasn't completely elided. So, to the user this might suggest that things like _::path::Enum::Variant or _::Struct {} are supported as well, or that we plan to introduce them in the future (I'm skeptical that we would). Furthermore, such a complicated structure might be annoyingly hard to edit and navigate, e.g. it takes three Ctrl+Backspace presses to delete it while only one is required with _Variant.
Drawbacks
There are of course many, but I don't believe any of them is critical.
The most problematic might be C/C++ FFI, since a lot of names in these languages begin with an underscore and interoperability with them is a priority in Rust. Fortunately, from what I know, users of both languages shouldn't define their own types beginning with underscores: these are internal to the compiler/tooling implementation, and in Rust we would usually find them in very low-level code, e.g. in the core::arch module. Also, luckily for us, there don't seem to be any _UpperCamelCase names there, mostly _snake_case and _UPPER_CASE ones. That's subtle, of course, but nevertheless a distinction on which e.g. syntax highlighters could rely.
It's also possible to argue that Rust currently allows prefixing unused types with an underscore to stop the compiler complaining about them, a feature derived from the same one for variables. Indeed, there seems to be a conflict, but I think we can live with it, considering how uncommonly types are prefixed with underscores and that #[allow(dead_code)], together with an infinity of other naming options, achieves exactly the same result.
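The two ways of silencing the warning look like this side by side. Both type names are hypothetical, and the comment about the lint reflects my understanding of the current compiler behavior rather than a documented guarantee.

```rust
// Leading underscore: as far as I know, the dead_code lint skips items
// whose names start with `_`, so this unused type compiles quietly.
struct _Scratch;

// The alternative: an ordinary name plus an explicit allow.
#[allow(dead_code)]
struct Scratch;
```

Only the first form would visually collide with the proposed _Variant syntax; the second one is unaffected.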
Perhaps there remains some value in _Name being a valid type, e.g. when experimenting in the Rust playground, but I don't think this use case is important enough to force us to select a less optimal syntax for enum path inference or to abandon it completely. Anyway, that code will continue to compile; it might only become less idiomatic, and the effect of an underscore prefix in type names might become harder for users to discover.
What would be objectively bad is the visual similarity between ignored variables and inferred enum paths, e.g. when _binding is placed alongside _Variant in a match expression the two could very easily be confused. But it also seems that common sense, syntax highlighting, and compiler warnings (e.g. to prevent the aforementioned situation in match) would mitigate this issue completely, so in practice confusion would rarely occur.
What if we also append enum name suffix?
A constant source of criticism of enum path inference has always been that it would inevitably lead to "disorientation", because less specific variant names like _Delete, _Enabled, _Stop could belong to different enums coexisting in the same code base, e.g. {LocalFile|RemoteFile}::Delete, {DarkTheme|Cookies}::Enabled, {EventLoop|ServerConnection}::Stop, and in some circumstances it may be too complicated to determine whose enum variant we are reading.
Somebody has said that this may change the way we name enum variants, e.g. LocalFile::Delete may become LocalFile::DeleteLocalFile, Cookies::Enabled may become Storage::CookiesEnabled, EventLoop::Stop may become EventLoopMessage::MessageStop, and so on (unfortunately I've lost the link to that discussion). So, this is a very solid objection: not only might enum renaming alone be too disruptive, we also like the way enums are currently named and nobody would want to change it.
We can address this, although the way I propose is very surprising: enum path inference, when applied, could also force users to append the enum name to the variant as a suffix. For example, there would be _DeleteRemoteFile, _EnabledDarkTheme, _StopServerConnection and so on instead. I know, this looks weird, and perhaps everyone thinks at this point that I've completely lost my mind: what's the point of this syntax when in most cases it would basically just change the word order, resulting in the same number of symbols?
But this actually seems to solve every problem with path inference for enums:
- Module names of enums would be hidden, but not the enum names, which contain the most useful information
- The construct is monolithic: no parts, no special symbols, so it should be easier to edit and navigate
- More intuitive word order for humans, which also matches the value: Type pattern typical for Rust
- Encourages users to access enums uniformly and name them sensibly
- Almost no chance that this will require changes to someone's code in a new edition
- Enum patterns and instantiations are very easy to discern because this syntax is much "heavier"
- It's also self-evident that this will support enums only and there's no way to enable it on structs
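To get a feel for how suffixed names read, they can be emulated in current Rust with use ... as renames. This is only a sketch: the two enums follow the {LocalFile|RemoteFile}::Delete example, the other variants and the functions are invented, and under the proposal the two use lines would not be written by hand.

```rust
// Hypothetical enums that share a variant name.
#[derive(Debug, PartialEq)]
enum LocalFile {
    Delete,
    Rename,
}

#[derive(Debug, PartialEq)]
enum RemoteFile {
    Delete,
    Download,
}

// Manual stand-ins for what the compiler would infer under the proposal.
use LocalFile::Delete as _DeleteLocalFile;
use RemoteFile::Delete as _DeleteRemoteFile;

// Pattern context: the alias resolves to the variant, not to a binding.
fn local_action(op: &LocalFile) -> &'static str {
    match op {
        _DeleteLocalFile => "delete local file",
        LocalFile::Rename => "rename local file",
    }
}

// Expression context: the alias constructs the variant.
fn default_remote_op() -> RemoteFile {
    _DeleteRemoteFile
}
```

Even with both Delete variants in the same scope, each use site says which enum it belongs to without spelling out a module path.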
Below is the same image with this and the .VariantEnumName alternative added:
IMO, this effect is very interesting, but I also wonder whether it generalizes to practically every enum or whether utter nonsense would sometimes still be produced. As a preliminary check I've run some experiments on files in rustfmt and in some other crates, and so far found only that in some cases it may lead to repetitions like _FnFnKind in the above example, and that particular variants may become incredibly long, e.g. _ScrollContinuousPointerEvent. Anyway, readability of the code seems to be improved, and these corner cases don't even look that bad.
The real downsides are that the learnability of the language would perhaps become a bit worse, and that the overall strangeness budget may not welcome such a contribution. People also may not receive it well, because the advantages of writing code this way may not be instantly obvious. And of course it's extra work and an extra maintenance burden for language and tooling developers.
So, should I prepare an RFC for this feature, or is it too strange? Are there people willing to help me with that?