Auto infer namespaces on struct and enum instantiations

fxdave · May 24, 2021, 2:24am

Motivation

In our cases, we have many big enum-struct combinations, and we follow the "Do not add additional context" rule, and because of that, we have a bunch of similarly named structs and enums. That's a problem because we cannot use them due to the conflicting names. Or I fear to use them because they are too simple, generic names that any module could have. Therefore, even the simplest instantiation looks like this:

use crate::msg::{self, viewport_mouse};

msg::Msg::ViewportMouse(
    viewport_mouse::Msg {
        pos: MousePos
        action: viewport_mouse::Action::Down
    }
)

I want to be able to write this instead:

use crate::msg::Msg;

// Hold the root unchanged because that has to be precise.
Msg::ViewportMouse(
    // viewport_mouse::Msg could be simplified
    // because we know the type by the definition
    Msg {
        pos: MousePos
        // viewport_mouse::Action::Down could be simplified    
        // because we know the type by the definition
        action: Down
    }
);

~~I think it fits well to rust because `match` statements already allow this. The kind of usage and the place of the usage is also similar.~~ - EDIT: `match` doesn't allow that.

EDIT: scottmcm's proposed syntax

use crate::msg::Msg;

// Hold the root unchanged because that has to be precise.
Msg::ViewportMouse(
    // viewport_mouse::Msg could be simplified
    // because we know the type by the definition
    . {
        pos: MousePos
        // viewport_mouse::Action::Down could be simplified    
        // because we know the type by the definition
        action: .Down
    }
);

Pinpoints to look at:

use crate::msg::Msg; I import the root Msg enum.

Msg::ViewportMouse wasn't changed. Rust could infer this from usages (return type, var type, etc...), but that would raise some hard questions. I want to simplify the proposal by not allowing its simplification.
So for the root, it would be required to be explicit.

msg::viewport_mouse::Msg is a struct that was simplified to .

msg::viewport_mouse::Action::Down is an enum variant that was simplified to .Down

Implementation

~~Same as match statements where Rust can eliminate enum namespaces, but this would work for structs too.~~ - EDIT: match doesn't do that.

Rust would allow both full paths and leaves inside any struct and enum instantiations. Examples: some_module::SomeEnum::SomeVariant could be simplified to .SomeVariant some_module::SomeStruct could be simplified to .

Cons

1. Possible ambiguity with variables

This warning is currently reported on match statements: https://doc.rust-lang.org/nightly/error-index.html#E0170 However, I think it is not a problem. Rust could continue reporting this warning for matches - and after this - for instantiating. It's not a problem because most of the rust projects follow the naming conventions. ~~It's good to notify the programmer tho.~~ - EDIT: `match` doesn't allow that.

2. Similarly named structs and enum variants look the same

A (A(A))

This usage is clearly wrong. However, this feature does not necessarily lead the code to be like this, and Clippy could warn developers.

3. Changing the definition could compile false positively, which is currently not the case

4. It is already possible to improve readability

4.1 Use with alias

At least we don't have the double colons.

use some::Thing as SomeThing
use some::other::Thing as SomeOtherThing

However, "re-exports can further obfuscate what is used":

4.1 Into

-- "Not horrible, but also lacks the power of the dot"

5. For an N deep instantiations, It requires N more steps from us to get to the type definitions (without an IDE)

Pros

1. More readable

CAD97 · May 24, 2021, 2:49am

It's been previously proposed that when the type is known, to allow eliding it with _. E.g.

msg::Msg::ViewportMouse(
    _ {
        pos: MousePos
        action: _::Down
    }
)

Could you provide an example? I was unaware that struct pattern matching had no visibility restrictions.

fxdave · May 24, 2021, 3:08am

It's been previously proposed that when the type is known, to allow eliding it with _ . E.g.

Great! Could you give me a link to this RFC?

Could you provide an example? I was unaware that struct pattern matching had no visibility restrictions.

~~This currently works:~~

pub mod a {
    pub enum A {
        Foo,
        Bar
    }
}

fn main() {
    let f = a::A::Foo;
    match f {
        Foo => println!("F"),
        Bar => println!("B"),
    }
}

EDIT: I misunderstood the warning description. This does not work.

RustyYato · May 24, 2021, 3:13am

The second arm is unreachable, because Foo and Bar are not matching on the variants.

fxdave · May 24, 2021, 3:18am

The second arm is unreachable, because Foo and Bar are not matching on the variants.

Hmm, that's sad. It looks like I misunderstood the warning description. I fix the post.

scottmcm · May 24, 2021, 3:43pm

I don't think it's ever made it to an RFC. Here's a recent discussion: Short enum variant syntax in some cases - #5 by scottmcm (with a link to the discussion before that...)

I've also pondered using . for it, kinda like Swift, to avoid it looking like "path elision" (which doesn't exist, and probably never should -- no std::_::Add, for example).

With that your example would then be

msg::Msg::ViewportMouse(.{ pos: MousePos, action: .Down })

fxdave · May 24, 2021, 8:41pm

Thanks, I have read the discussion. This swift syntax is even better than that I've proposed. I update the main post.

I see there aren't any clues about not having this feature, except one: I've read, changing the definition does not always produce errors in instantiations. For example:


// We define two similar but different enums:
enum IsSomething {
    True,
    False
}

enum IsSomethingElse {
    True,
    False
}

// We define a struct with one of the enums
struct A {
   is_something: IsSomething
}

// Later, we change the definition
struct A {
   is_something: IsSomethingElse
}

// The instantiation will not change:
A {
   is_something: .True
}

This is reported currently because we need to change the IsSomething to IsSomethingElse in every instantiation. After this change, this kind of modification will not be reported, which theoretically could be a problem because we said yes to a different subject.

In practice, is_something would possibly be changed as well to is_something_else, which would cause errors, so that would be good. We cannot be 100% sure tho because it is possible to change the type without the parameter name.

Besides this, I think the situation is rare. In contrast, instantiating an enum is something that every Rust program has, so I would still love to see this simplification.

A more common but similar situation is to rename a struct, which would not be a problem because the struct has the same meaning. That would be even better because we only change the definitions.

skysch · May 24, 2021, 10:08pm

The commonality of enums also works against this simplification. Given that most Rust code is written to avoid stuttering (Enum::Variant is preferable to Enum::EnumVariant), name collisions are not uncommon -- infact, very likely -- because people are designing enums under the assumption that the variant need not indicate which enum it belongs to, because it is always namespaced. So you're effectively going to reverse that preference.

('Going to' is perhaps too strong. But this would create a good reason to stutter where few currently exist.)

scottmcm · May 24, 2021, 10:25pm

This is definitely my biggest concern with the enum version. It would be very possible for currently readable things to become a confusing call like process_request(.Skip, .Error) that made sense when the variants had their types there too.

(OTOH, people could currently do that by importing the variants, and they generally don't, so it might be ok to trust people to put the type names there if it's important for readability. And this isn't a problem in struct literals -- ProcessRequest { cookie_handling: .Skip, redirect_handling: .Error } doesn't have the problem that the function call does.)

skysch · May 24, 2021, 10:36pm

I can't speak to what people do, but this is some of my recent enum heavy code:

Example code

impl<T> Expr<T> where T: Eval {
    fn simplify(self) -> Self {
        use Expr::*;
        
        match self {
            Not(p) => match *p {
                Not(q) => q.simplify(),
                q      => Not(Box::new(q.simplify())),
            }
            And(a, b) => {
                let a = a.simplify();
                let b = b.simplify();
                if a == b { a } else { And(Box::new(a), Box::new(b)) }
            },
            Or(a, b) => {
                let a = a.simplify();
                let b = b.simplify();
                if a == b { a } else { Or(Box::new(a), Box::new(b)) }
            },
            _ => self,
        }
    }

    fn pushdown_not(self) -> Self {
        use Expr::*;
        if let Not(expr) = self {
            match *expr {
                Bool(p) => Not(Box::new(Bool(p))),
                Not(p) => p.pushdown_not(),
                Or(a, b) => And(Box::new(Not(a)), Box::new(Not(b))),
                And(a, b) => Or(Box::new(Not(a)), Box::new(Not(b))),
            }
        } else {
            self
        }
    }

    fn distribute_and(self) -> Self {
        use Expr::*;
        if let And(a, b) = self {
            match (*a, *b) {
                (p, Or(q, r)) |
                (Or(q, r), p) => Or(
                    Box::new(And(Box::new(p.clone()), q)),
                    Box::new(And(Box::new(p), r))),
                (a, b) => And(Box::new(a), Box::new(b))
            }
        } else {
            self
        }
    }

    fn distribute_or(self) -> Self {
        use Expr::*;
        if let Or(a, b) = self {
            match (*a, *b) {
                (p, And(q, r)) |
                (And(q, r), p) => And(
                    Box::new(Or(Box::new(p.clone()), q)),
                    Box::new(Or(Box::new(p), r))),
                (a, b) => Or(Box::new(a), Box::new(b))
            }
        } else {
            self
        }
    }
}

It's very hard for me to see any advantage in trying to simplify this. It's already simple. Adding more symbols and contextual prefixes into the mix seems like it would increase the opportunity for beginner confusion and line noise, without making this kind of code any clearer. If I'm not importing the enum variants, it's either because (a) they are ambiguous in the first place, and this feature wouldn't solve that or (b) I'm not using very many variants, in which case, I'd rather have the enum type for clarity rather than save a few chars.

fxdave · May 24, 2021, 11:24pm

Here is an example of our case:

Example

fn handle_mouse_down(e: &web_sys::MouseEvent) -> Option<msg::Msg> {
    e.stop_propagation();
    let pos = get_mouse_point(e);
    let pos_on_window = get_mouse_point_window(e);
    let target = e.target().unwrap().unchecked_into::<web_sys::Element>();

    if let Some(draggable) = web_sys_utils::find_parent(target.clone(), "draggable") {
        let node_id = web_sys_utils::get_data::<usize>(&draggable, "id")
            .expect("Currently, every draggable element should have an id");
        if draggable.class_list().contains("node") {
            Some(msg::Msg::ItemMouse(item_mouse::Msg {
                pos,
                action: item_mouse::Action::Down,
                item_type: item_mouse::ItemType::NameDescriptionSpace,
                id: node_id,
            }))
        } else if draggable.class_list().contains("node__linker") {
            Some(msg::Msg::WindowMouse(window_mouse::Msg {
                pos: pos_on_window,
                action: window_mouse::Action::DownOnLinker(node_id),
            }))
        } else {
            unimplemented!("You can only drag nodes and linkers currently");
        }
    } else if let Some(relation) = web_sys_utils::find_parent(target, "relation") {
        let relation_id = web_sys_utils::get_data::<usize>(&relation, "id")
            .expect("Currently, every relation element should have an id");
        Some(msg::Msg::ItemMouse(item_mouse::Msg {
            pos,
            action: item_mouse::Action::Down,
            item_type: item_mouse::ItemType::Relation,
            id: relation_id,
        }))
    } else {
        Some(msg::Msg::ViewportMouse(viewport_mouse::Msg {
            pos,
            action: viewport_mouse::Action::Down,
        }))
    }
}

which could be:

Example after this feature

fn handle_mouse_down(e: &web_sys::MouseEvent) -> Option<Msg> {
    e.stop_propagation();
    let pos = get_mouse_point(e);
    let pos_on_window = get_mouse_point_window(e);
    let target = e.target().unwrap().unchecked_into::<web_sys::Element>();

    if let Some(draggable) = web_sys_utils::find_parent(target.clone(), "draggable") {
        let node_id = web_sys_utils::get_data::<usize>(&draggable, "id")
            .expect("Currently, every draggable element should have an id");
        if draggable.class_list().contains("node") {
            Some(Msg::ItemMouse(. {
                pos,
                action: .Down,
                item_type: .NameDescriptionSpace,
                id: node_id,
            }))
        } else if draggable.class_list().contains("node__linker") {
            Some(Msg::WindowMouse(. {
                pos: pos_on_window,
                action: .DownOnLinker(node_id),
            }))
        } else {
            unimplemented!("You can only drag nodes and linkers currently");
        }
    } else if let Some(relation) = web_sys_utils::find_parent(target, "relation") {
        let relation_id = web_sys_utils::get_data::<usize>(&relation, "id")
            .expect("Currently, every relation element should have an id");
        Some(Msg::ItemMouse(. {
            pos,
            action: .Down,
            item_type: .Relation,
            id: relation_id,
        }))
    } else {
        Some(Msg::ViewportMouse(. {
            pos,
            action: .Down,
        }))
    }
}

The second case would be much more readable.

In the first case, it is hard to see the logic behind the code, because there are so many meaningless words. When I read the code it is quite similar to when I open /dev/sda1 in less, I would probably see some meaningful ASCII words, but the rest is something like this:

t^@^B^H ^@^B^@^@^@^@<F8>^@^@ 
^@@^@^@^H^@^@^@^@^P^@^@^D^@^@^@^@^@^@^B^@^@^@^A^@^F^@^@^@^@^@^@^
@^@^@^@^@^@^@<80>^A)=<B4>^F<D0>NO NAME    FAT32   ^N^_<BE>w|<AC>"
<C0>t^KV<B4>^N<BB>^G^@<CD>^P^<EB><F0>2<<E4><CD>^V<CD>^Y<EB><FE>T

(Of course, it is much better in our case.)

I think it's worth allowing this. Some other code examples would be beneficial tho.

skysch · May 25, 2021, 12:47am

This is how I would write that code:

Rewritten

use web_sys::MouseEvent;
use web_sys::Element;
use msg::Msg;

// These should be scoped to the function unless they're used in multiple places.
use window_mouse::Msg as WindowMouseMsg;
use window_mouse::Action as WindowMouseAction;
use item_mouse::Msg as ItemMouseMsg;
use item_mouse::Action as ItemMouseAction;
use item_mouse::ItemType;
use viewport_mouse::Msg as ViewportMouseMsg;
use viewport_mouse::Action as ViewportMouseAction;
use web_sys_utils::get_data;
use web_sys_utils::find_parent;


fn handle_mouse_down(e: &MouseEvent) -> Option<Msg> {
    e.stop_propagation();
    let pos = get_mouse_point(e);
    let pos_on_window = get_mouse_point_window(e);
    let target = e.target().unwrap().unchecked_into::<Element>();

    if let Some(draggable) = find_parent(target.clone(), "draggable") {
        let node_id = get_data::<usize>(&draggable, "id")
            .expect("Currently, every draggable element should have an id");
            
        if draggable.class_list().contains("node") {
            Some(Msg::ItemMouse(ItemMouseMsg {
                pos,
                action: ItemMouseAction::Down,
                item_type: ItemType::NameDescriptionSpace,
                id: node_id,
            }))
        } else if draggable.class_list().contains("node__linker") {
            Some(Msg::WindowMouse(WindowMouseMsg {
                pos: pos_on_window,
                action: WindowMouseAction::DownOnLinker(node_id),
            }))
        } else {
            unimplemented!("You can only drag nodes and linkers currently");
        }

    } else if let Some(relation) = find_parent(target, "relation") {
        let relation_id = get_data::<usize>(&relation, "id")
            .expect("Currently, every relation element should have an id");
        Some(Msg::ItemMouse(ItemMouseMsg {
            pos,
            action: ItemMouseAction::Down,
            item_type: ItemType::Relation,
            id: relation_id,
        }))

    } else {
        Some(Msg::ViewportMouse(ViewportMouseMsg {
            pos,
            action: ViewportMouseAction::Down,
        }))
    }
}

This is the rewrite that your proposed feature is supposed to improve. I argue that it doesn't, and that this code is as readable as any Rust code out there.

Rust already has a notoriety for line noise. It doesn't need another noisy syntax and inference mechanism to solve a superficial problem.

Which you've replaced with meaningless symbols. But those words aren't meaningless, they tell the reader what your data looks like and where to go to understand it. Hiding that does not make the code more readable, it just makes it more contextual and more terse.

Where in the code am I supposed to find out what a Msg::ViewportMouse takes for an argument? I can see in your original code that it takes a viewport_mouse::Msg. In your rewrite, I see that it takes a .. Off to read the docs we go.

More over, things like Msg::ItemMouse(ItemMouseMsg { .. }) Should be replaced with Msg::from calls. It is obviously trivial to convert an item_mouse::Msg into a msg::Msg, so implement that conversion. That gets rid of all of your redundant wrapping.

Overall, the point is thus: you have many ways to improve your code. Use them first, then after you can clearly demonstrate that none of them work, propose a language change.

fxdave · May 25, 2021, 1:34am

Thanks for improving my code.

Rust already has a notoriety for line noise. It doesn't need another noisy syntax and inference mechanism to solve a superficial problem.

Is that mean any syntactic sugar proposal would be discarded?

it just makes it more contextual and more terse.

Indeed, however, I think it is part of the readability.

Where in the code am I supposed to find out what a Msg::ViewportMouse takes for an argument? I can see in your original code that it takes a viewport_mouse::Msg . In your rewrite, I see that it takes a . . Off to read the docs we go.

IDE could know that what is the ., and the parent is always there.

Overall, the point is thus: you have many ways to improve your code. Use them first, then after you can clearly demonstrate that none of them work, propose a language change.

Although it works, the proposed syntax is less verbose, and faster than aliasing every type.

skysch · May 25, 2021, 2:16am

None of the code presented in this thread is being read in an IDE. None of the code I write is written in an IDE. If Rust changes such that I can't read or write reliable and clear code without one, I'm not getting an IDE, I'm finding a better language. (Anyway, If your IDE is a solution, make it patch up your unreadable code for you. Make it hide the redundancies and silently swap . for inferred types & variants. You don't need the compiler to do it.) This is a general rule: any argument for improving the language is only weakened by the willingness to assume an IDE is available; because now you can always consider improving the IDE instead.

Besides, you're contradicting yourself. The code cannot be more readable and also need an IDE to better understand. Those are in direct conflict with each other.

CAD97 · May 25, 2021, 3:08am

There's two levels of readability involved here.

I want a high-level overview of what the code is doing. Here, the takes_enum(Enum::Variant(EnumVariant { .. })) is line noise compared to takes_enum(.Variant(_ { .. })), because takes_enum is enough context to infer the enum argument, and EnumVariant is only used for the definition of that one enum variant.
I want to understand the exact types involved, so that I can locate and edit them as necessary, and trace the exact execution path of the code. Here, having the types spelled out helps.

But I think we should career to 1 over 2. Specifically, because we already have type inference. What is the type of eggs in let eggs = foo_and_then_bar(&mut spam);? You don't know until you check the signature of foo_and_then_bar.

It doesn't hurt the explicitness of Rust to extend inference to the types of arguments. Explicit is not noisy, burdensome, not even local, after all. In addition, Rust does not have function overloading, so it's always clear from the function in question what the type of the argument is (otherwise type inference wouldn't kick in in the first place).

It comes down to that if you elide too much information, your code is harder to read. It's not the language's fault if you do that, though. The inference/elision discussed in these proposals is clearly beneficial in some notable cases (the biggest one being giving us effective kwargs for free and in a backwards compatible way), and when names are too generic, continue to spell out the paths/types/etc.

Does a type in the following example add any useful information to the code useful to understanding what it does? I would argue, no, it doesn't, unless your specific purpose is to edit the type, in which case you'll want to edit the function as well, so jumping to the function first doesn't even add any extra steps.

let call = RemoteCall::from_config(_ {
    target: "https://example.com/api",
    fallback: "https://example.com/bpi",
    timeout: Duration::from_secs(10),
    ..
})?;
let result = call.submit().await?;

The best typed without new inference placeholders alternative, without adding Ng builder types

let call: RemoteCall = remote_call::Config {
    target: "https://example.com/api",
    fallback: "https://example.com/bpi",
    timeout: Duration::from_secs(10),
    ..
}.try_into()?;
let result = call.submit().await()?;

Not horrible, but also lacks the power of the dot as well as doc discovery ("I'm on the page for RemoteCall, but how do I make one?").

skysch · May 25, 2021, 3:43am

That's something that could be fixed in the docs. I would say, it should be, because Rust's docs are awful in a number of ways: chief among them is the inability to build a prioritized and cohesive picture of how to use an object or trait, but it is also difficult to hide irrelevant details and show important ones. For instance, I went here to implement this trait, but the declaration is hidden by default, (sometimes doesn't even include the full definition), and I'm stuck staring at a worthless example of how to use it if I had already had an implementation instead. Traits should have an example implementation 'template', impls should be grouped, the sidebar should have better grouping, and I shouldn't have to click "unhide" multiple times to see something new, and anything under a doc header should probably be hidden by default unless there is nothing above it.

CAD97 · May 25, 2021, 4:42am

Traits are used much more often then they're implemented, of course the default page layout is going to be biased for that. Functions on the page are presented in the order they are in the source file (exception: required trait methods are hoisted above provided methods), and (for structs) grouped by what impl block they're in, complete with the comment on said impl block.

Rust's generated documentation is miles above what any scripting language has (basically, just semi structured ways to describe your API in prose), and the only thing I could argue that javadoc/doxygen do better is putting a summary of all the methods' signatures front and center, but that's partly because that's most of the useful information you'll get out of a javadoc/doxygen generated documentation.

I fail to see how both of these things can be true simultaneously. Either it's collapsed by default and you need to unhide it, or you don't need to unhide it, it's uncollapsed.

Either way, this isn't the thread to discuss rustdoc. If you have concrete constructive proposals on how to improve rustdoc and/or concrete stdlib documentation, please do open a new thread, though. As a community we're highly invested in making our docs as good as they can be. (Just do keep in mind that rustdoc docs are API reference docs, and their design caters to that over a more guide style book like is common for JS libs on e.g. readthedocs.)

skysch · May 25, 2021, 2:43pm

See here:

There are two [-] expansion boxes. Clicking on the first reveals nothing but the other. One shouldn't have to expand two different frames to see only a single line of text.

kornel · May 25, 2021, 2:56pm

Yes, I'd love this. Swift has shown that this can be used to make very nice APIs.

mathstuf · May 26, 2021, 11:53am

To echo @skysch's reply, I read quite a lot of code on websites (mainly when reviewing code, but also occasionally found through searching). There are no IDE facilities in such places, so expecting IDE support for readability is not something I'd like to see at least.

Topic		Replies	Views
[Pre-RFC] Inferred Enum Type language design	61	5009	May 16, 2022
Pre-RFC: Struct constructor name inference language design	35	3140	December 18, 2019
Elide the enum name on match language design	6	937	December 13, 2022
[IDEA] Implied enum types language design	83	4965	September 12, 2023
Improving struct init shorthand language design	21	1481	September 9, 2021