[Pre-RFC] Inferred Enum Type

Sure, I see that you could do it, but your example is only made shorter, not more abstract/expressive, by omitting the names of the types. I don't see an advantage in expressiveness of _ {..} over simply using the name of the struct. Unless, of course, the struct name is not visible in this context, which is a thing you can do, but doesn't seem like a useful technique or something to encourage people to do.

In other words, in what context does it make sense to destructure a struct or enum where you can't name it? Maybe for macros? :thinking: I think a more general typeof mechanism is still the superior solution though.

5 Likes

Boy, I don't know. I find all those _'s quite an eyesore. Some redundancy is good for readability.

Let's say I want to decipher sig's type. It's elided on the right, so I guess I'll look up to see what struct it's contained in. Oops, the struct name is elided as well. Okay, well to figure out what struct this is I need to know what Fn is, but the enum name is elided. Darn it again. I guess I have to go see what item is, then work my way back down the type hierarchy once I do? That's a lot of effort just to figure out sig.

Natural language has a fair amount of redundancy -- for good reason. If you miss a word you can almost always figure it out from context. We shouldn't go too far trying to eliminate redundancy from Rust.

13 Likes

Without even knowing the exact context of that code, I can make a well-educated guess at the three _ being Item, Fn and FnSig respectively (or very similar synonyms to these if I've forgotten the exact names). IMO there are cases like this where the type tree is so commonly used in certain types of projects that being able to elide this redundancy is worthwhile: the enhanced readability for developers experienced in the context outweighs the reduced readability for newer developers (and one of the most prominent features of IDE integrations is the ability to view elided/inferred types).

1 Like

FWIW, I have a lot of sympathy for this position.

While I do like the idea of having _ inference available in a few places, I would only want to add it if 1) it's perceived as a genuine improvement to both writing and reading code in the places it's available, not just an abbreviation people can live with, and 2) we take this kind of social pressure into account, and make sure that the places where it's used are benefits and the places where it isn't a benefit discourage it (either via lint or by making it not available in that context, depending on what makes sense).

I would not want to end up in a situation where a feature is simultaneously not a benefit and incurs social pressure to use it anyway.

10 Likes

To argue the opposite side from my previous post, this does help the "but I want to grep for HttpMethod" scenario a bit. It still doesn't let them find the actual uses, especially if it's in a crate-local prelude-style module, but even then they could delete it and get compiler errors at the uses.

I agree, but to me that's the argument for why parameters and return types need to have their type specified exactly. They're quite often redundant -- whole-program inference SML-style demonstrates that quite clearly.

Inference is "figure it out from context", by definition.

If it's not clear enough to the reader, then that says it should be split into more functions or otherwise add more type annotations. But that's already the case today. Especially if, as CAD97 mentioned, you just use field access. Since you can use all the fields and call all the methods you want without ever needing to annotate a type anywhere.

So if let a = foo().a; is fine -- which we have to assume it is because it's allowed today and people do it all the time without complaint -- why wouldn't let .{ a, .. } = foo(); also be fine?
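To make that comparison concrete, here is a minimal sketch of what stable Rust accepts today. `Foo`, `foo`, and the two helper functions are hypothetical names made up for illustration. The field-access version never names the type, while the destructuring version has to.

```rust
// Hypothetical struct and constructor, used only for illustration.
#[allow(dead_code)]
struct Foo {
    a: u32,
    b: String,
}

fn foo() -> Foo {
    Foo { a: 7, b: String::from("unused") }
}

// Field access: the type name `Foo` does not appear anywhere in the body.
fn via_field_access() -> u32 {
    let a = foo().a;
    a
}

// Destructuring: today the pattern has to spell out `Foo`.
fn via_destructuring() -> u32 {
    let Foo { a, .. } = foo();
    a
}
```

Both functions return the same value; the only difference a reader sees is whether `Foo` is written at the use site.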

(Aside: I'm arguing a little bit on both sides in this thread. My goal is to try to tease out whatever the differences are between things proposed here and things that are already accepted. I find that's the best way to make progress on something that has lots of gut reactions -- we can at least make progress on agreeing on the distinctions, even if we get to different conclusions from weighing those distinctions differently.)

Agreed! I'll add that it's often much easier to find things in rustdoc anyway -- especially if you're not in one of those IDEs that gives type hints.

I wrote some code the other day that was basically this, following rustfmt:

match &data.terminator().kind {
    TerminatorKind::SwitchInt {
        discr: Operand::Constant(constant),
        switch_ty,
        targets,
    } =>

I'd be quite happy to save the vertical space to get to

match &data.terminator().kind {
    .SwitchInt { discr: .Constant(constant), switch_ty, targets } =>

Because if I'm familiar with the area, terminator().kind is plenty for me to know that "yup, it's a TerminatorKind".

And if I'm not familiar, I can ask rustdoc and it takes me right there: https://doc.rust-lang.org/nightly/nightly-rustc/?search="SwitchInt"

At least, that's what I do to look up types in the other places that already rely on uses. For example, here was the definition of that method:

    fn reachable_blocks_in_mono_from(
        &self,
        tcx: TyCtxt<'tcx>,
        instance: Instance<'tcx>,
        set: &mut BitSet<BasicBlock>,
        bb: BasicBlock,
    ) {

Where do those come from? I dunno, I'll ask rustdoc. I'd probably want to go there to know what to do with the type anyway.

And I have no interest in looking through the uses. They look like this:

use crate::mir::coverage::{CodeRegion, CoverageKind};
use crate::mir::interpret::{Allocation, ConstValue, GlobalAlloc, Scalar};
use crate::mir::visit::MirVisitable;
use crate::ty::adjustment::PointerCast;
use crate::ty::codec::{TyDecoder, TyEncoder};
use crate::ty::fold::{TypeFoldable, TypeFolder, TypeVisitor};
use crate::ty::print::{FmtPrinter, Printer};
use crate::ty::subst::{Subst, SubstsRef};
use crate::ty::{self, List, Ty, TyCtxt};
use crate::ty::{AdtDef, Instance, InstanceDef, Region, ScalarInt, UserTypeAnnotationIndex};
use rustc_hir::def::{CtorKind, Namespace};
use rustc_hir::def_id::{DefId, CRATE_DEF_INDEX};
use rustc_hir::{self, GeneratorKind};
use rustc_hir::{self as hir, HirId};
use rustc_target::abi::{Size, VariantIdx};

use polonius_engine::Atom;
pub use rustc_ast::Mutability;
use rustc_data_structures::fx::FxHashSet;
use rustc_data_structures::graph::dominators::{dominators, Dominators};
use rustc_data_structures::graph::{self, GraphSuccessors};
use rustc_index::bit_set::{BitMatrix, BitSet};
use rustc_index::vec::{Idx, IndexVec};
use rustc_serialize::{Decodable, Encodable};
use rustc_span::symbol::Symbol;
use rustc_span::{Span, DUMMY_SP};
use rustc_target::asm::InlineAsmRegOrRegClass;
use std::borrow::Cow;
use std::convert::TryInto;
use std::fmt::{self, Debug, Display, Formatter, Write};
use std::ops::{ControlFlow, Index, IndexMut};
use std::slice;
use std::{iter, mem, option};

use self::graph_cyclic_cache::GraphIsCyclicCache;
use self::predecessors::{PredecessorCache, Predecessors};
pub use self::query::*;

Most of which I didn't add.

There's just so many that they're basically useless to me as a human.

And, oh look, one of them is a * anyway:

1 Like

If they are willing to edit files and launch a compiler instead of only using grep, maybe rust-analyzer’s “find all references” should be on the table as well, which I expect to find _:: if this happens.

6 Likes

I simply don't use rust-analyzer because even my best efforts find it challenged at addressing the code I write, which is often code pushing the leading edge of the compiler abilities, which do not have support in r-a, and I increasingly am reducing my usage time of the text editor it has the best support for (VS Code). It also often actually delivers overall worse diagnostics due to the fact that no text editor I am aware of fully supports the features that would be required to actually deliver good ones.

2 Likes

Ok, so direct comparison. What's different that you don't like in the proposed

match item {
    _::Fn(_ {
        sig: _ { ident, generics, inputs, output, .. },
        ..
    }) => todo!(),
    _ => bail!(),
}

where today I could write the following—with the exact same amount of type info (if not less)—and nobody would complain:

if let Some(item) = item.as_fn() {
    let ident = &item.sig.ident;
    let generics = &item.sig.generics;
    let inputs = &item.sig.inputs;
    let output = &item.sig.output;
    todo!()
} else {
    bail!()
}
2 Likes
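For reference, the comparison above can be written in stable Rust roughly as follows. This is a sketch under assumptions: `Item`, `ItemFn`, `FnSig`, and `as_fn` are simplified, hypothetical stand-ins, not the real definitions from any crate. The fully named pattern mentions three type names; the accessor version mentions none inside the function body.

```rust
// Hypothetical, simplified stand-ins for the types discussed in the thread.
#[allow(dead_code)]
struct FnSig {
    ident: String,
    inputs: Vec<String>,
}

struct ItemFn {
    sig: FnSig,
}

#[allow(dead_code)]
enum Item {
    Fn(ItemFn),
    Other,
}

impl Item {
    // Hand-written accessor, in the spirit of `Result::ok`.
    fn as_fn(&self) -> Option<&ItemFn> {
        match self {
            Item::Fn(f) => Some(f),
            _ => None,
        }
    }
}

// Fully named pattern: `Item`, `ItemFn`, and `FnSig` all appear.
fn describe_with_match(item: &Item) -> Option<String> {
    match item {
        Item::Fn(ItemFn { sig: FnSig { ident, inputs, .. }, .. }) => {
            Some(format!("{}/{}", ident, inputs.len()))
        }
        _ => None,
    }
}

// Accessor plus field access: no type name is written in the body.
fn describe_with_fields(item: &Item) -> Option<String> {
    let f = item.as_fn()?;
    let ident = &f.sig.ident;
    let inputs = &f.sig.inputs;
    Some(format!("{}/{}", ident, inputs.len()))
}
```

The two functions behave identically, which is the point of the question: the second form is already accepted even though it shows the reader less type information.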

I argue that we shouldn't give this argument too much weight. grep is already often inadequate for finding uses of items, especially types. There are many limitations:

  • Items can be renamed on imports
  • Grep may find a lot of false positives, which adds noise to the search results:
    • The same name may be used for different things
    • The name may appear in strings and comments
  • Because of type inference, the type of bindings is often not visible to grep

While grepping can work well for fields or enum variants, it is quite unreliable for types. This is not necessarily a problem though, because IDEs can offer more reliable ways to search for usages of a type.
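A small sketch of those limitations, using only the standard library. The alias `Map` and the names `HELP`, `build`, and `total` are made up for illustration. Grepping this file for `HashMap` finds the import, a comment, and a string, but none of the places where the map is actually used.

```rust
// Renamed on import: later uses are spelled `Map`, not `HashMap`.
use std::collections::HashMap as Map;

// False positives: this comment and the string below both say HashMap.
#[allow(dead_code)]
const HELP: &str = "values are stored in a HashMap";

fn build() -> Map<&'static str, u32> {
    let mut m = Map::new();
    m.insert("a", 1);
    m.insert("b", 2);
    m
}

fn total() -> u32 {
    // Inferred binding: the type of `m` is not written here at all.
    let m = build();
    m.values().sum()
}
```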

The bigger problem is that humans don't see the type. However, this doesn't seem to be much of a problem in practice, at least in Java, where you can omit the enum type in switch statements.

4 Likes

It's a fair question. It's foremost a gut-level, aesthetic reaction. I can try to justify it but to be clear I didn't reason my way into my opinion.

I think part of it is that every time I see _ it's like seeing a foreign word in a piece of text. It makes me tap the brakes. I have to stop and think, "What does that word mean?" Imagine the first snippet had four :question: question marks. It'd be pretty distracting. _ feels like that. It's supposed to be this unobtrusive nothing-symbol, but it actually draws attention to itself because it's not a normal alphanumeric identifier.

Another part is that it reminds me of Perl and its overuse of sigils. Rust is quite good about not being too symbol heavy. We have enough ::<>s and |_|s and @s in the language as it is.

If this feature were to be added, I actually prefer @scottmcm's .Variant syntax. .Item is easier on the eyes than _::Item.

9 Likes

I think I first mentioned this syntax in Auto infer namespaces on struct and enum instantiations - #6 by scottmcm, but all I did is shamelessly steal it from Swift https://docs.swift.org/swift-book/LanguageGuide/Enumerations.html#ID147.

1 Like

The most compelling argument I've seen against this feature is that it may make code less greppable, since the enum name will appear in fewer places. And I do think that's an important argument to balance.

I would object that the problem with the feature making the code "less greppable" is not that the EnumName becomes _. It's that we use the lexical grep search in the first place. When searching for every occurrence of EnumName, I would expect the best/recommended solution to be: use some project-level semantic search instead. Why not some rust-analyzer feature, to make both occurrences of EnumName and relevant _ show up :slight_smile:

Well, I understand why any automated code analyzer tool would be challenged by the code you are writing then. But the situation you describe seems the very kind of situation you would like to avoid writing _::Variant for the very purpose of remaining able to grep the enum name lexically. This is not an argument in favour of not making this option available to other "regular" types of code, right?

And yet my code, no matter how bleeding edge it is, has to interface with a large project, most of which is written in a more conventional Rust style: the Rust compiler. And I have to interface with what I would frankly say is a random sequence of modules in the compiler and standard library, each time. And if the tool fails on the combined set of my code and that, then the tool fails entirely, for my purposes. So I take my lack of tooling into a more conventional project, yet retain an inability to manage any lexical peculiarities that are justified primarily by some other tooling being used to prop it up.

I must admit, it is an eye-sore. How about allowing omitted type names only in nested structs in patterns:

match item {
    Item::Fn({
        sig: { ident, generics, inputs, output, .. },
        ..
    }) => todo!(),
    _ => bail!(),
}

Looking at it from the perspective of C/C++ syntax, it's similar to the aggregate initializer syntax.

2 Likes

The parser could unambiguously recognize { ident : as the start of an anonymous struct... if type ascription wasn't a thing on nightly.

1 Like

Sometimes I want to find usages of a struct or enum by a crate. I don't know if rustdoc has a feature for this, so what I've done is use the search in the git repository on GitHub. This is pretty helpful if the crate documentation is not clear on some details. This could be a reason for people wanting grep to work.

I consider matching an enum to be analogous to destructuring a struct, which also requires the type name.

let foo = Bar { x: 10 };
...
// Explicit version
let Bar { x } = foo;
// Inferred version
let _ { x } = foo;

The analogy is somewhat flawed, because enum matching is so much more common than struct destructuring, and it requires repeating the type name for every matched variant. A compromise might be a solution that requires the enum type name to be used only once.
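For what it's worth, stable Rust already has one way to name the enum only once per function: a local glob import of the variants. The sketch below assumes an `HttpMethod` enum (the name is borrowed from earlier in the thread; its variants and the `is_safe` function are made up for illustration).

```rust
// Hypothetical enum; only the name comes from the thread.
#[derive(Debug, Clone, Copy, PartialEq)]
enum HttpMethod {
    Get,
    Post,
    Delete,
}

fn is_safe(method: HttpMethod) -> bool {
    // The enum is named once here; the arms then use bare variant names.
    use HttpMethod::*;
    match method {
        Get => true,
        Post | Delete => false,
    }
}
```

One known trade-off of this approach is that a misspelled or removed variant name in an arm is parsed as a catch-all binding rather than an error, so it is caught only by lints. That is one reason it is not a full substitute for the inferred-path syntax being proposed.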

I've been thinking more about this example, and found something to distinguish at least one part of it: field access syntax is the expression form of an irrefutable pattern, in a way.

So that says to me that one might say that this is unambiguously fine for structs, but it's not necessarily as obvious for enum variants.

Though maybe that's an indication that we're lacking syntax for enums. The existence of that as_fn (and related things like Result::ok) makes me imagine a world where instead of needing to make the method, it's just, say, if let Some(thing) = item.Fn { instead -- it's not like there are any fields on enums right now.

And even if it was only irrefutable patterns where it was ok, that'd still be nice for things like avoiding the Type::Array(ArrayType { element_type, length }) => repetition code from making distinct types to put in the enum variants. Having Type::Array(.{ element_type, length }) => makes that that much less annoying, while still being quite clear.
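As a point of reference, this is the repetition in question as it has to be written today. It is a sketch with hypothetical definitions: `Type`, `ArrayType`, the `Unit` variant, and `describe` are made up for illustration.

```rust
// Hypothetical types: a distinct struct stored inside an enum variant.
struct ArrayType {
    element_type: String,
    length: usize,
}

enum Type {
    Array(ArrayType),
    Unit,
}

fn describe(ty: &Type) -> String {
    match ty {
        // Both the variant and the struct it wraps must be named, even
        // though the inner pattern is irrefutable once the variant matches.
        Type::Array(ArrayType { element_type, length }) => {
            format!("[{}; {}]", element_type, length)
        }
        Type::Unit => String::from("()"),
    }
}
```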

Come to think of it, we already have pattern examples of not needing to specify types for irrefutable patterns: both _ and bindings already work exactly like that!


I do see a bunch of potential practical problems with this, like how it can't be a place projection the way the other things are. And that means the distinction between .as_fn() and .as_mut_fn() and .into_fn() might be trickier to encode. But it's more a thought experiment, not a fleshed-out proposal, so some unresolved questions aren't a problem.
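To illustrate the distinction that a hypothetical `item.Fn` syntax would have to encode, here is a sketch of the three hand-written accessors on a made-up `Item` enum. The type definitions and the `name` field are assumptions for illustration only. Each method differs only in how it borrows or consumes `self`.

```rust
// Hypothetical types for illustration.
struct ItemFn {
    name: String,
}

#[allow(dead_code)]
enum Item {
    Fn(ItemFn),
    Other,
}

impl Item {
    // Shared borrow of the payload.
    fn as_fn(&self) -> Option<&ItemFn> {
        match self {
            Item::Fn(f) => Some(f),
            _ => None,
        }
    }

    // Mutable borrow of the payload.
    fn as_mut_fn(&mut self) -> Option<&mut ItemFn> {
        match self {
            Item::Fn(f) => Some(f),
            _ => None,
        }
    }

    // Consumes the enum and moves the payload out.
    fn into_fn(self) -> Option<ItemFn> {
        match self {
            Item::Fn(f) => Some(f),
            _ => None,
        }
    }
}
```

With ordinary struct fields, the same three modes fall out of writing `&item.field`, `&mut item.field`, or `item.field`, which is what being a place projection buys.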

1 Like

Counter to this example is enum variant types (and even more so types as enum variants, though that has far less support).

If/when Type::Array is a type of its own, I expect new code to have a lot less Type::Array(struct ArrayType { type, length }) and more Type::Array { type, length }. Only time will tell, though.

And an irrefutable pattern type elision will still be useful for other cases of nested structs, even if usage of newtype variants decreases.

variant type vs. type variant

These aren't the point of the thread, but do have an impact on the usage of newtype variants.

"Enum Variant Types" is where Enum::Variant is a proper refinement type of Enum. A value of type Enum::Variant is effectively a value of type Enum that is known to be the Variant variant.

"Types as Enum Variants" is effectively a new kind of enum variant which is a structural (rather than nominal) newtype variant. Ultimately it boils down to a newtype variant with extra sugar, notably in that you pattern match it as Type { .. } rather than Enum::Type(Type { .. }).

It's worth noting that being able to elide the type name for irrefutable struct patterns would all but subsume the benefits of types as enum variants, especially when enum variant types are a thing.
