[Pre-RFC] Well-known import paths

Motivation

There are some items that are usually imported by the name directly, such as AtomicUsize. And there are some types that are usually imported by module, e.g. atomic::Ordering, or mpsc::channel/mpsc::Sender. Editors are usually designed to autocomplete with the former method, which causes a bit of annoyance for items that are explicitly intended to be imported by module, resulting in unidiomatic code. Furthermore, these types often result in duplicate names, which look terrible on rustdoc output.

Proposal

Importable items can use an attribute like this:

#[well_known($import:path $(as $alias:ident)? $(, ::$usage:path)?)]

Using examples as above:

#[well_known(self, ::Ordering)] enum Ordering {...}
#[well_known(crate::sync::mpsc, ::channel)] fn channel() {...}
#[well_known(crate::result::Result::Ok, ::)] 
  • $import is resolved at the item's module scope, where path aliases like crate, super and self are allowed. This tells the editor that we want to add an import for the item referenced by $import. Path arguments are not allowed in this path.
  • $alias is an alternative identifier that $import should be aliased to. If not provided, it should be the last segment of $import.
  • $usage is the path used to reference the actual item from the import. If omitted, this means $import already resolves to the item.

If the #[well_known] attribute is not provided for a module-level item, the default is #[well_known(self::ItemIdent, ::)].

If the #[well_known] attribute is not provided for a sub-module-level item (e.g. enum variants), the default is #[well_known(self::ContainingType, ::ItemIdent)].
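For reference, here is how the two import styles from the motivation look in today's Rust, which is the output the attribute is meant to steer tooling towards. This is only a sketch: the helper name `bump` is made up for illustration and is not part of the proposal.

```rust
// `AtomicUsize` is imported by name; `Ordering` is reached through its module.
use std::sync::atomic::{self, AtomicUsize};

/// Increments the counter and returns the new value.
/// (`bump` is a hypothetical helper, not something from the proposal.)
fn bump(counter: &AtomicUsize) -> usize {
    // Qualifying with `atomic::` keeps this distinct from `std::cmp::Ordering`.
    counter.fetch_add(1, atomic::Ordering::SeqCst) + 1
}
```

With `#[well_known(self, ::Ordering)]` on `Ordering`, the intent is that an editor would produce `atomic::Ordering` plus a `use` of the `atomic` module, rather than importing `Ordering` by name.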

Compiler changes

When the compiler reads a #[well_known] attribute, it validates that, at the current scope, $import::$usage (or $import if $usage is omitted) resolves to the item that the attribute is applied to.

Lint changes

A new allow-by-default lint well_known_paths is introduced. For each reference to an item annotated with #[well_known(import as alias, ::usage)] from code in some module user, the lint triggers if user is neither import nor a submodule of it (or of the module containing import, if import refers to an item) and the reference is not of the form alias::usage. The fix suggestion changes the reference to alias::usage and inserts use import as alias if it is missing.

Rust-analyzer changes

When autocomplete is executed for an item, instead of importing the item's path into scope directly and completing the bare identifier, complete alias::usage and add use import as alias if missing, similar to the lint behavior.
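To make the alias::usage rewriting concrete, here is a rough sketch of how a tool might turn the attribute's three parts into a `use` line and a reference. Everything here is hypothetical (the `WellKnown` struct and its methods are invented for illustration), and it assumes `import` has already been resolved to an absolute path.

```rust
/// Hypothetical model of `#[well_known(import as alias, ::usage)]`.
struct WellKnown<'a> {
    /// Path to import, e.g. "std::sync::atomic" (assumed already absolute).
    import: &'a str,
    /// Optional `as alias`.
    alias: Option<&'a str>,
    /// Path from the import to the item; `None` if `import` is the item itself.
    usage: Option<&'a str>,
}

impl WellKnown<'_> {
    /// The alias defaults to the last segment of `import`.
    fn alias(&self) -> &str {
        self.alias
            .unwrap_or_else(|| self.import.rsplit("::").next().unwrap_or(self.import))
    }

    /// The `use` statement a fix or completion would insert if missing.
    fn use_line(&self) -> String {
        match self.alias {
            Some(alias) => format!("use {} as {};", self.import, alias),
            None => format!("use {};", self.import),
        }
    }

    /// The text written at the reference site.
    fn reference(&self) -> String {
        match self.usage {
            Some(usage) => format!("{}::{}", self.alias(), usage),
            None => self.alias().to_string(),
        }
    }
}
```

For `atomic::Ordering` this yields `use std::sync::atomic;` and `atomic::Ordering`. A real implementation would also have to handle the exemptions for code inside the import's own module, which this sketch ignores.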

Rustdoc changes

When an item is referenced from another module outside import in a signature (e.g. function inputs/outputs, type bounds, implementors), generate alias::usage as the text, instead of the current behavior of just using the identifier.

When an item is linked by user documentation through intra-doc links in the form [path] (instead of [text](path)), use the well-known name instead of the path as provided by the user.


FWIW, rust-analyzer can complete "through" qualifiers, but yeah, this is not ideal


I'd say that the fundamental capability we need here is custom tool attributes. If something like #[ide(import_qualified)] just didn't cause a compilation error, that would be enough to solve the problem in practice, as IDEs would be able to use that for completion & their own linting without coordination with the compiler. There's no need to annotate std with those attributes; an IDE might include a sort of external "type stub".

Maybe we don't even need custom attributes, and just need to add a new namespace in addition to rustfmt and clippy, for example, literal ide?

Separately, I think it would be good to somehow more explicitly bless conventional imports somewhere. What I've noticed at work is that a half of the people use impl Debug, the other half impl fmt::Debug and the third half impl std::fmt::Debug, and it's rather hard to achieve consistency organizationally without anything with authority to point to (the same goes for foo/mod.rs vs foo.rs). Perhaps this could be in scope for the new style team?


As you've mentioned in the last paragraph, it's better to have a convention of which import style to use. IDE-specific annotations are not really a good solution because that requires libraries to be IDE-aware (which is why things like editorconfig exist).

I love the idea of providing attributes to indicate common convention.

This won't work universally, as there are some types whose import paths depend on the need for qualification or not. For instance, I would import std::io::Write if writing a module that doesn't need fmt::Write, but if I need both then I'll use them as io::Write and fmt::Write. If I used an IDE that included autocompletion and refactoring, I'd want it to notice if I'm importing the other of those two and refactor to qualify both.
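As an illustration of the "need both" case, here is roughly what the qualified style looks like when a module uses both traits. The function names are made up for the example.

```rust
use std::fmt;
use std::io;

/// Writes a greeting into any `fmt::Write` sink (e.g. a `String`).
fn greet_fmt<W: fmt::Write>(out: &mut W) -> fmt::Result {
    write!(out, "hello")
}

/// Writes a greeting into any `io::Write` sink (e.g. a `Vec<u8>`).
fn greet_io<W: io::Write>(out: &mut W) -> io::Result<()> {
    write!(out, "hello")
}
```

Neither `Write` trait is imported by name here, so there is no collision to resolve; the module prefix says which one is meant at each use.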

Cases like that aside (and I don't know offhand of any such cases other than Write), I think a "conventional path" attribute sounds like an excellent idea.

A few examples, from my normal usage:

  • io::Error and io::Result
  • anyhow::Error and anyhow::Result
  • iter::once, iter::repeat, iter::empty, iter::from_fn
  • zip (not iter::zip)
  • ptr::read, ptr::write, ptr::null, ptr::replace, ptr::swap, ptr::copy
  • copy_nonoverlapping (not ptr::copy_nonoverlapping, because it's immediately obvious what this refers to)
  • drop_in_place (likewise)
  • std::process::exit
  • Drop, Seek, Read, Debug, Display, Error, Deref, DerefMut
  • TcpStream, File, Cow, Box, String, OsString, Arc, Vec, Option, Result, Iterator, Command, ExitCode

I think the cases of disambiguating io::Write and fmt::Write, or disambiguation of identifier collisions in general, is something the compiler lint can identify to tolerate non-conventional import styles. There are a number of such cases, others being references from the same module.

why zip not iter::zip?


FWIW, it's always fmt::Debug and fmt::Display for me:

in

impl fmt::Debug for Spam {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Spam")
    }
}

there are three types from the module.

Additionally, Display and Debug both share the fmt name, so it's more readable not to call .fmt and always use fully-qualified syntax, and not importing any of the traits helps there.
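A small sketch of what that looks like in practice, with only the fmt module imported and one impl delegating to the other through fully-qualified syntax. The `Spam` type and its output strings are just placeholders.

```rust
use std::fmt;

struct Spam;

impl fmt::Display for Spam {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("spam")
    }
}

impl fmt::Debug for Spam {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Spam(")?;
        // Naming the trait makes it explicit which `fmt` is being called.
        fmt::Display::fmt(self, f)?;
        f.write_str(")")
    }
}
```

Here `format!("{}", Spam)` gives `spam` and `format!("{:?}", Spam)` gives `Spam(spam)`.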


I have a fairly strong association with zip being a common operation on iterators and not meaning anything else, as compared to things like once or repeat. And it's commonly called directly in the line with for, in which iter::zip takes up precious space.

As mentioned, this is my own usage, and may or may not match common convention.

I really like the idea of being able to declare recommended imports, allowing a module to express which way its namespace is designed to work and creating consistency in IDE-assisted authoring of code that uses that module.

(I'd especially love to see a world where std::io is formally marked as "import this module, not its items" and beginners don't come to us with code saying -> Result<()> and needing an explanation that there's a hidden IO-specific error type in there — or an error type from some other library that we can't identify because we're given a function in isolation without the enclosing scope's uses.)


However, I'm a little concerned that linting against non-recommended paths by default would not be such a benefit. Right now, you're free to choose between importing a module or its functions/types — either can be good. You might choose to import the functions/types when your module is highly focused on using that module (say, a module with lots and lots of file IO might import items from std::io), or you might choose to import the module alone when you are dealing with more than one concern. If there is a warning, then there will be noise (in the form of either compiler warnings or #[allow]s in the source code) when code has reason to deviate from the defaults.

So, I'd suggest that the lint should be allow by default; the only automatic behavior would be to advise IDEs (but also, see below).


Further thoughts on the specific proposal:

  • Annotating every single item seems painfully verbose in many cases. How about an attribute that applies to a module meaning “prefer importing me instead of my items”? This could even be an MVP of the overall functionality by itself.

  • Name bikeshedding: “well-known” doesn't speak to me. I think a better name might be something like prefer_import, prefer_path, or recommended_path.

  • A related purpose that might be addressable with the same tools: when the compiler prints the path of an item, it would be nice to be able to control which path it prints, when there are multiple. In fact, right now the compiler will happily print inaccessible paths to items, when those items have been reexported publicly from private modules. Being able to designate a canonical path to an item for imports would also help fix that.


That seems like a useful default, but I would expect modules to often contain a mix of items that should be imported directly and items that should be used via the module name.


For the specific case of Debug/Display, I'm of the opinion that they should've been in the prelude anyway. Yes, this precludes calling .fmt.

IMHO any trait derive in the prelude should also have the trait in the prelude. Traits and trait derives should always be exposed at the same path and until we possibly get a way to import a name only for a certain namespace kind it shouldn't be possible to have one but not the other.


The only other major case would probably be Error types and Result aliases. If a file is dealing almost exclusively with a specific error type, it IMHO makes sense to import them directly. Otherwise, they should be prefixed with the crate/module name that they're the ubiquitous error/result for.

(And as a side note, it should always be pub type Result<T, E = self::Error> = std::result::Result<T, E>;, not pub type Result<T> = std::result::Result<T, self::Error>;. That way if you import the Result alias you can still write Result<T, E> for some other error.)
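A minimal sketch of the defaulted-parameter alias, assuming a made-up module `mylib` with its own `Error` type (none of these names come from a real crate):

```rust
use std::num::ParseIntError;

/// Hypothetical library module with its own error type and `Result` alias.
pub mod mylib {
    #[derive(Debug, PartialEq)]
    pub struct Error(pub String);

    /// The error parameter defaults to this module's `Error`,
    /// but callers can still override it.
    pub type Result<T, E = Error> = std::result::Result<T, E>;
}

/// Uses the default error type.
fn check_even(n: i32) -> mylib::Result<i32> {
    if n % 2 == 0 {
        Ok(n)
    } else {
        Err(mylib::Error(format!("{} is odd", n)))
    }
}

/// Overrides the error type through the same alias.
fn parse(s: &str) -> mylib::Result<i32, ParseIntError> {
    s.parse()
}
```

With the single-parameter form of the alias, the second signature would not compile, and you would have to spell out std::result::Result again.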


Yes, as mentioned above, this lint should be allow-by-default, mainly serving to enable cargo fix. However, it would be great if we can come up with a sufficiently comprehensive set of exemption rules where violations are allowed, such that it is not counterproductive to enable this lint.

would it be feasible to apply the #[allow] on the use statement instead of the actual usage locations? Does the current compiler design allow this kind of usage?

agreed, "well-known" was just the first term that came to mind, maybe "conventional" or "prefer" is better.

From my experience with dirmod, it is quite common to end up with a module with many mixed item types. For example iter::once vs iter::zip, as josh suggested above.

Would also be very useful in type mismatch messages, especially those enormous paths full of generics.

was this some kind of edition 2015 legacy design choice where (derive) macros are always in scope by default?

Yeah, the default injection is #[macro_use] extern crate std; and

  • originally macros didn't interact with name resolution at all and were purely a syntactical ordering thing
  • originally attributes were all built-in and the built-in derives were part of #[derive] and not macro items of their own

Java has a useful design imo to address this dichotomy. Regular import statements in Java refer to classes (their fundamental code unit), but it also has a syntax variant, static import, which allows importing static members of a class.

The Rust analogue would be to limit use statements to modules only and make all its items available and have a separate use syntax to import specific items within a module.

use foo::bar; // must be a module name
use item foo::baz::Baz; // an item within a module

fn func() {
    let b = Bar::new();
    let c = Baz::new();
}

(The syntax choice doesn't matter, it's just an example of the idea)

This sort of design would have been ideal imo. Currently we only have the direct item variant. We could invert this (as a backwards compatible approach) and introduce a new syntax such as:

use module foo::bar::baz; 

Where the above must be a module path. And maybe in the future have the defaults switched over an edition boundary.

Edit: I forgot to mention this is also a feature in C++. They have

using foo::bar;
using namespace a::b::c; // is an entire namespace

I wonder if there would be appetite for changing this in edition 2024. We now have some macros like std::pin::pin! that aren't #[macro_export]ed. Maybe edition 2024 can drop the #[macro_use] and explicitly put other macros including the derives into internal paths and std::prelude::edition_2024.

Actually I thought this was already the case (that the macros are imported via prelude, not via #[macro_use]). It's definitely true for any of the std macros defined by a macro item rather than #[macro_export] macro_rules!, which I thought was already the case for all of the std macros (that aren't just builtins).

we need here is custom tool attributes

IDEs would be able to use that for completion & their own linting without coordination with compiler

It seems to me like this would force all users touching the same code to use the same (possibly proprietary) tooling, which is not as good as leaving everyone the freedom to choose. Alternatively, supporting every IDE would probably require vendor hell like

#[ide(vscode_well_known(...))]
#[ide(jb_preferred_import(...))]
#[ide(common_alias(...))] // supported by visual studio and sublime
struct EveryItemNeedingAWellKnownImportPath;

I think standardizing these things would be better for consistency and editor-agnosticism.


This might be controversial, but in addition to the main usecase discussed here, I'd like to note that this feature could also be used to give structs long, explicit, clear names while automatically letting the users import them as acronyms.

For example, below are a couple of names that sometimes cause confusion among beginners. In cases like this, we could use longer, more explicit names while keeping the same ergonomics by adding common aliases.

#[well_known(self as Cow)]
pub enum CopyOnWrite ...

#[well_known(self as Arc)]
pub struct AtomicallyReferenceCounted ...

Then, their imports would be automatically more clear, if slightly verbose:

// and no one gets confused about cows (moo)
use std::borrow::CopyOnWrite as Cow;

To be clear, I'm not saying we should rename these items in the standard library, that would cause too much confusion. Just that this feature would allow library authors to use more explicit names in similar situations, without fearing that the verbosity would make things hard to read/write.


I don't think that'll happen given the current IDE landscape: there are only two IDE backends for rust, IntelliJ and rust-analyzer, and they are cooperating (and also both are open source). With an ide prefix, I am fairly sure that we'll see convergence on the same semantics of attribute, with ra-only or IJ-only features at the margin. Overall, it seems like the case where we should focus on enabling, rather than on prevention.

:thinking:

Actually, I think that "enabling" thinking changes my stance on

Maybe we don't even need custom attributes, and just need to add a new namespace in addition to rustfmt and clippy, for example, literal ide ?

We should unlock general custom attributes already -- they are a very annoying stumbling block for whole classes of third-party tooling. E.g., I think an outside-of-compiler static analyzer for panics is worth a couple of kingdoms at this point, but to build such a tool you need syntactic space for annotations to guide the analysis. It's possible to stub out the way analysis works internally, but it's impossible to stub out the "UI" accepted by rustc.


There is a hack to get custom tool attributes right now: define a proc-macro that just consumes the attribute tokens and passes the attributed tokens through unchanged. Then when running the tool you can either make that proc-macro do something to inform the tool (e.g. have it send to an IPC channel when some env-var is set), or search for the attribute while expanding the code somehow.

The main advantage to having proper custom tool attributes is that I guess they would be inert attributes that stay in the expanded code. So you could fully expand the code then parse it to see all the attributes easily.
