Pre-RFC: Nicer Types In Diagnostics

da-x · October 17, 2019, 7:09pm

Hi all, and @pnkfelix (mentioned in one of the issues as someone who might want to collaborate on this). I am drafting an RFC for the continuation of the following:

github.com/rust-lang/rust

The compiler should report publicly exported type names if possible

opened 04:04PM - 04 Feb 15 UTC

pcwalton

E-hard C-enhancement A-diagnostics A-resolve P-medium T-compiler D-newcomer-roadblock

If a private type is reexported at some public location, the Rust compiler will …report the private location when referring to the type. It'd be more helpful if the compiler reported the public alias for it. This was reported by Coda Hale on Twitter. ---- Update by pnkfelix: here is a standalone bit of code illustrating the problem. ([play](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=ba6fa8cd0b3b03f756ebc8cbd95597a4)): ```rust mod a { mod m { pub struct PrivateName; } pub use m::PrivateName as PublicName; } fn main() { // a::m::PrivateName; // would error, "module `m` is private" a::PublicName; a::PublicName.clone(); } ``` As of nightly 2019-06-17, this issues the diagnostic: ``` > error[E0599]: no method named `clone` found for type `a::m::PrivateName` [...] ^^^^^^^^^^^^^^^^^ | (author of `fn main` wants to see `a::PublicName` here) ```

github.com/rust-lang/rust

Shorter type-names on type errors

opened 11:08AM - 29 Apr 18 UTC

closed 10:02AM - 04 Dec 20 UTC

lukaslueg

C-enhancement A-diagnostics T-compiler

Type errors (`E0308`) are a common occurrence in rustc. Whenever I confuse mysel…f about the type of something, I tend to assign it to some `x: i32` and let the compiler blame me with a hint about what type `x` actually is. The `note`-section attached to the error message contains the fully qualified type name, which is technically as correct as one can get. However, this also produces a lot of clutter, because the type-name that I'm actually interested in is much more concise if types in scope of the offending statement are taken into account. For example: ```rust use std::collections::HashMap; fn foobar(x: &i32) { } fn main() { let mut foo = HashMap::new(); foo.insert("foo", Ok(Some("bar"))); foo.insert("oof", Err("Oh noes!")); foobar(&foo); println!("{:#?}", foo); } ``` The `note` says > found type `std::collections::HashMap<&str, std::result::Result<std::option::Option<&str>, &str>>` Again, while this is as precise as it can get, most of this is just clutter, especially with deeply nested types: I know that `HashMap` is from `std::collections`, I also know that `Option` refers to `std::option::Option`. If I had some other type shadowing those names, I'd be aware of it. Reading the note, I have to demangle the type-name to > found type `HashMap<&str, Result<Option<&str>, &str>>` which is what the type-name would be if I wanted to write it down, considering all the `use`s. Could we at least have an option (or an extra note) for `E0308` where the type-names get uncluttered by considering the types that are in scope for the offending statement? This would reduce the intellectual burden and may enable one to just copy&paste the type-name from the note.

It's hard to figure out what diagnostics will be convenient to read, so I've included a draft implementation of this feature.

Summary

When types are printed in diagnostics, each type's fully qualified type incurs a cognitive load on the reader. For example, printing the type std::vec::Vec<std::vec::Vec<std::vec::Vec<u16>>> could have been shortened to Vec<Vec<Vec<u16>>>. This proposal seeks to remedy the situation, by omitted the type's path when possible, however it also seeks to prevent ambiguity.

Motivation

The expected outcome is that it will be easier to read type errors, but without losing the ability to pin-point the exact problem to the user. Currently, a fully qualified type path precisely indicates a type, and we need to make sure that stripping the path does not create ambiguity.

Guide-level explanation

First, for let's define how each non-qualified type falls into either of three categories:

Types that are imported by default from prelude.
Types that are imported in the scope in which the type errors occurs, and are not prelude types.
All other types that don't fall into the first two categories.

In the most reduced from of types printing setting named minimal, we differentiate between the three categories.

For prelude types, as long as the type is not occluded by an import with the same name, we omitted the path entirely.
For imports that are available in the current scope, we also omit the type path, but introduce a via imports line to the type error (see below), illustrative of the use statement that imports the type.
For all other types, as long as the type is not occluded by an import with the same name, we omit the path, but also introduce a similar reachable by line to the error message.

Consider the following example:

fn test3(v: &Vec<Vec<Vec<u16>>>) {}

fn main() {
    use std::collections::HashMap;

    let ip: std::net::IpAddr = "0.0.0.0".parse().unwrap();
    test3(vec![vec![(String::from("x"), ip, HashMap::new())]]);
}

The output will be:

7 |     test3(vec![vec![(String::from("x"), ip, HashMap::new())]]);
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected reference, found struct `Vec`
  |
  = note: expected type `&Vec<Vec<Vec<u16>>>`
             found type `Vec<Vec<(String, IpAddr, HashMap<_, _>)>>`
            via imports `use std::collections::HashMap;`
           reachable by `use std::net::IpAddr;`

Instead of the uniform, default setting thus far:

 --> src/case2.rs:7:11
  |
7 |     test3(vec![vec![(String::from("x"), ip, HashMap::new())]]);
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected reference, found struct `std::vec::Vec`
  |
  = note: expected type `&std::vec::Vec<std::vec::Vec<std::vec::Vec<u16>>>`
             found type `std::vec::Vec<std::vec::Vec<(std::string::String, std::net::IpAddr, std::collections::HashMap<_, _>)>>`

For the transition period, there is also a by-import setting that limits reduction of type paths for imports, preventing the the reachable by segment.

7 |     test3(vec![vec![(String::from("x"), ip, HashMap::new())]]);
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected reference, found struct `Vec`
  |
  = note: expected type `&Vec<Vec<Vec<u16>>>`
             found type `Vec<Vec<(String, std::net::IpAddr, HashMap<_, _>)>>`
            via imports `use std::collections::HashMap;`

Also for the transition period, the three settings can be controlled via a crate-level type_diagnostic feature. For example:

#![feature(type_diagnostic)]
#![type_diagnostic = "minimal"]

More examples in a very early draft rustc implementation of the RFC.

Reference-level explanation

The feature will be implemented via the following steps:

Produce an imports map from librustc_resolve back to librustc.
The imports will be a map from the module concept of resolve to the DefId that are defined in that scope, with their respective Idents.
The parent scope will be included and walkable from each HirId in librustc/ty/print/pretty.rs, so gathering up the uses in scope will be efficient.
Code paths that emit errors will need to pass the HirId of the relevant scope, which will be stored in TLS for pretty.rs to catch.

Very early implementation draft in rustc.

(TBD: corner cases)

Drawbacks

We may choose not to do this because people are already too accustomed to groking the current and long uniform paths.

Rationale and alternatives

One alternative is that instead of by-import line, we quote the actual span of the use statement that brought the type. However this may be hard to read, and the use lines may be too numerous. Also, this does not apply to reachable via.

The impact of not doing this is a newcomer's roadblock issue, as was already realized in previous discussions.

Prior art

(TBD: needs research, am not aware of other languages doing something like this)

Unresolved questions

How those prints should be formatted exactly, to make them most readable?
How do we modify the code for it? Is TLS an okay solution or should we do a fuller rewrite of the pretty-printing engine?
Can or should we rely on Spans instead of HirId?
Should the default for the setting be kept uniform, should we change the default, or get rid of the setting altogether?

Future possibilities

From an implementation standpoint, the import_map can be used by analysis tools for cross-crate important analysis, say, for determining if there is dead code in a set of packages comprising a workspace.

Also, tools such as rustfix may use the extra printed lines to augment the use pseudo-section of a module.

repax · October 17, 2019, 9:42pm

I like this. Can the "via imports" lines be omitted too if the type names are unambiguous?

Centril · October 18, 2019, 12:06am

Awesome RFC!

Maybe use a rustc flag instead? (which could be exposed via cargo eventually on stable)

What about type aliases? I wouldn't want to see them expanded to their RHS when reading diagnostics.

I would look at Haskell (GHC) which AFAIK does this (including for aliases).

I'd say let's iterate during implementation based on real diagnostics.

Imo TLS in a pretty printer is pretty confusing so ideally it would be avoided but perhaps not at first?

Change it! The current outputs with all these long paths are like the worst part of rustc's otherwise good diagnostics.

I can attest to this also being a roadblock issue for experts as well.

da-x · October 18, 2019, 7:26am

Let's say we want to drop the via imports for use std::collections::HashMap; from the example above - we would have to iterate all the importable types from all crates (including the local one) to verify that HashMap is singular. It's interesting, and may be do-able.

However, in this case, the via imports appearing would be an indication that there's more than one HashMap available for import, and it may confuse users causing them to wonder "where is this other HashMap lurking". It's a bit hard to explain to users.

da-x · October 18, 2019, 8:23am

You mean via Cargo's rustflags and not the per-package Cargo.toml? If so, developers can have this globally enabled in their development environment regardless of what they are working on. However we would have to push for people to try this feature out. Maybe we can have it both - enable it implicitly in rustc's source via the feature so that once it hits beta all rustc devs will see it in action. It's a big change that may require more eyes looking at the newer diagnostics before we ship it out.

Me too, however I'm not versed in how the type system is implemented - I guess that meta-data regarding alias expansion will need to be kept somehow so that "de-expansion" can happen during pretty-printing of types. Otherwise, trying to fit unrelated aliases on expanded types in order to compress them may be a hard problem (computationally?). It's also a bit orthogonal to this RFC, and we can do it separately.

Haskell is a good example, however it does so partially, for prelude items and for non-qualified imports (equivalent to use in Rust). However, it doesn't do reduction for types that are not imported (i.e. the reachable by bits of this RFC - for that part I haven't found yet a prior art).

Let's see an example:

import Data.HashMap.Strict (HashMap, fromList)
main = do
    let map = (fromList [(1, "x")] :: HashMap Int String)
    show $ (fromList [(1, map)] :: HashMap Int String)

Results in:

    • Couldn't match type ‘HashMap Int String’ with ‘[Char]’
      Expected type: HashMap Int String
        Actual type: HashMap Int (HashMap Int String)
    • In the second argument of ‘($)’, namely
        ‘(fromList [(1, map)] :: HashMap Int String)’
      In a stmt of a 'do' block:
        show $ (fromList [(1, map)] :: HashMap Int String)
      In the expression:
        do let map = (fromList ... :: HashMap Int String)
           show $ (fromList [(1, map)] :: HashMap Int String)

For qualified imports, the type errors in GHC show the full qualified paths, as long as the types are not accessible via a non-qualified import.

So the following:

import qualified Data.HashMap.Strict (HashMap)
import Data.HashMap.Strict (fromList)

main = do
    let map = (fromList [(1, "x")] :: Data.HashMap.Strict.HashMap Int String)
    show $ (fromList [(1, map)] :: Data.HashMap.Strict.HashMap Int String)

Results in:

    • Couldn't match type ‘Data.HashMap.Strict.HashMap Int String’
                     with ‘[Char]’
      Expected type: Data.HashMap.Strict.HashMap Int String
        Actual type: Data.HashMap.Strict.HashMap
                       Int (Data.HashMap.Strict.HashMap Int String)
    • In the second argument of ‘($)’, namely
        ‘(fromList [(1, map)] :: Data.HashMap.Strict.HashMap Int String)’
      In a stmt of a 'do' block:
        show
          $ (fromList [(1, map)] :: Data.HashMap.Strict.HashMap Int String)
      In the expression:
        do let map
                 = (fromList ... :: Data.HashMap.Strict.HashMap Int String)
           show
             $ (fromList [(1, map)] :: Data.HashMap.Strict.HashMap Int String)
   |
10 |     show $ (fromList [(1, map)] :: Data.HashMap.Strict.HashMap Int String)
   |             ^^^^^^^^^^^^^^^^^^^

Which is similar to what we had so far in Rust. Adding a non-qualified import of HashMap in addition to the qualified import works as expected.

In C++, Clang and GCC differ. Consider the non-qualified import equivalent using namespace, in this example:

#include <map>

using namespace std;
void test(map<char,char> *m) { }

int main() {
    map<char,int> first;
    test(&first);
}

GCC emits:

test.cpp: In function ‘int main()’:
test.cpp:8:10: error: cannot convert ‘std::map<char, int>*’ to ‘std::map<char, char>*’
    8 |     test(&first);
      |          ^~~~~~
      |          |
      |          std::map<char, int>*
test.cpp:4:27: note:   initializing argument 1 of ‘void test(std::map<char, char>*)’
    4 | void test(map<char,char> *m) { }
      |           ~~~~~~~~~~~~~~~~^

Clang 8 emits:

a.cpp:8:5: error: no matching function for call to 'test'
    test(&first);
    ^~~~
a.cpp:4:6: note: candidate function not viable: no known conversion from
      'map<char, int> *' to 'map<char, char> *' for 1st argument
void test(map<char,char> *m) { }

If std::map is used instead, Clang emits std::map in the diagnostics, even if it is imported via using namespace.

Sure.

Agreed. Types are being pretty printed via Display trait, and TLS usage is already rampant there, so adding more fields to it may not be a big deal. The alternative is using a newtype wrapper to add context to the printing.

Yes, but perhaps in stages, following my suggestion to enable this for rustc source itself first?

Same here.

djc · October 18, 2019, 8:32am

I love this, thank you for working on this! Also cc @ekuber as the empathetic compiler czar.

kornel · October 18, 2019, 12:06pm

I fully support the idea, especially showing short types for unambiguous names from the prelude.

I don't see value in distinction of via imports vs reachable. I generally don't care how a name got there, only whether it's the one I expect.

I'd be fine with just a list of paths in case I really needed to double-check them. I'm not sure how to call that list, but let's say:

  = note: expected type `&Vec<Vec<Vec<u16>>>`
             found type `Vec<Vec<(String, IpAddr, HashMap<_, _>)>>`
             these are `std::collections::HashMap`, `std::net::IpAddr`

Centril · October 18, 2019, 12:46pm

I think longer diagnostics as opt-in is something you'd do mostly on a per-developer basis and not something you check into CI, and sometimes you want to switch so I was thinking cargo check --u or something to get uniform errors. (Eventually tho, first it could just be rustc ... -Z uniform-errors.

Agree; I will take any incremental improvement in this area I think mentioning it as future work and thinking about it a bit would be good tho.

Yep; now you have some text to paste into the prior-art

Yep; in my experience, non-qualified imports are more common tho so in practice I usually saw the reduced paths in error messages when I was writing Haskell.

Yeah it was similar for the AST wrt. Display; I've removed some of the Display impls from the AST now and I think it got cleaner overall.

Seems smart to try it out for rustc developers first, agree.

Nokel81 · October 18, 2019, 12:53pm

Excellent RFC, will definitely make compiler type errors easier to read.

I think that there should be a part in the RFC concerning renaming of imports:

use std::collections::HashMap as Maperoony;

I think that the renamed version should be used in diagnostics in the short form and added as an:

imported by `use std::collections::HashMap as Maperoony;`

da-x · October 18, 2019, 2:17pm

kornel:

  = note: expected type `&Vec<Vec<Vec<u16>>>`
             found type `Vec<Vec<(String, IpAddr, HashMap<_, _>)>>`
             these are `std::collections::HashMap`, `std::net::IpAddr`

This is an option I thought about initially, i.e. listing the fully expanded path. However I realized that emitted use statements have the advantage that 1) you can combine imports into "use trees" such as below and thus make the output more concise 2) in the reachable by case, you can copy the use statements into the code.

  = note: expected type `&Vec<Vec<Vec<u16>>>`
             found type `Vec<Vec<(String, IpAddr, HashMap<_, _>)>>`
             via imports `use std::{collections::HashMap, net::IpAddr};`

illicitonion · October 18, 2019, 4:41pm

I really like this

One additional thing to consider: for generating replacements for cargo fix this is also useful information.

I've written clippy lints before where I've wanted to offer some replacement code to auto-fix which uses whatever the best path to a type is in scope, and instead I generally end up just giving people fully qualified paths because they are definitely correct. It would be great to expose this information to that context too, so I can write something like token_for_symbol(expr_ty) in my replacement snippet. e.g. https://github.com/rust-lang/rust-clippy/blob/c0b2411f06a417266baaac2f0ea431138da33bf3/clippy_lints/src/default_trait_access.rs#L55

kornel · October 18, 2019, 8:03pm

The "compression" of use trees is clever, but I'm not sure if that's the right place to do it, if the error needs to clearly communicate types to user who's apparently confused about types.

Use statements are absolute and can be pasted into most scopes, so for purpose of copying into code, there's no difference between via and reachable.

da-x · October 18, 2019, 8:19pm

I assume that we introduced the ability to specify use trees in the code given that it's clear enough to users. If so, it should be clear enough in messages as well?

The difference is that copy paste from a combined reachable and via may not work because this single import statement may be duplicating what's already being imported in the scope. The separate reachable are only stuff that are not already imported in the scope. If we are combining them, then you'd have to remove the duplicates after pasting.

kornel · October 19, 2019, 2:53pm

The difference is that users write trees of types if and when they choose to, rather than be given one to parse. And custom code can use a multi-line syntax (IMHO more readable), but such syntax would be too long for an error code.

I thought the aim was to use short types only when they're unambiguous. If I have std::result::Result and crate::Result in the same file, I would not want to have just Result in the error message. So I don't expect clash to be a problem, because full paths are needed only for names shortened in the message, and only unambiguous names should be shortened in the message.

samsieber · October 19, 2019, 3:03pm

Would it be within the scope of this RFC to disambiguate between incompatible types when the fully qualified type name is the same? I.e. when there are multiple imcompatible versions of a crate. Or should that be it's own thing.

da-x · October 20, 2019, 10:56am

I'll wait for more opinions on this.

Good point. crate:: being absent from types in errors, means that types in errors are not fully qualified even with the current rustc. In fact, when uniform paths were implemented, the diagnostics did not change accordingly (see last note at: Misleading `use foo::bar` error message · Issue #61942 · rust-lang/rust · GitHub ). @petrochenkov perhaps can provide more recent update on this. So to prevent ambiguity, we need to care for types at the crate root which shadow prelude names. For example:

struct Result(u32);
fn test(_: std::result::Result<(), ()>) { }

fn main() {
    test(Result(42))
}

Current rustc is emitting:

  = note: expected type `std::result::Result<(), ()>`
             found type `Result`

Which we may need to change:

  = note: expected type `std::result::Result<(), ()>`
             found type `crate::Result`

With the RFC implemented we can have this as:

  = note: expected type `Result<(), ()>`
             found type `crate::Result`

Note that we cannot shorten crate:: here because of two types bearing the same name.

However if there's no Result defined or imported from a non-prelude definition in the crate's root, then we can shorten the path, but only for the prelude type, in the next example:

mod submod {
    pub struct Result(u32);
    pub fn test(_: std::result::Result<(), ()>) { }
}

fn main() {
    submod::test(submod::Result(42))
}

And get (with the RFC implemented):

   = note: expected type `Result<(), ()>`
              found type `crate::submod::Result`

As a side note, I would have renamed the RFC to 'Non-uniform Type Paths In Diagnostics' if it wasn't already the case .

Anyway, it would be a worthy goal to solidify the rule as to when a non-uniform type is printed in an error without the note that strictly says where the type came from, and my current opinion about it is that only prelude types should be printed shorted and note-less.

da-x · October 20, 2019, 11:12am

We are currently emitting a note such as Perhaps two different versions of crate foo are being used?, so it's less important right now, but I agree it can be made clearer. I'd leave it out for a future change.

kornel · October 20, 2019, 3:52pm

For the Result case IMHO the ideal would be:

   = note: expected type `std::result::Result`
              found type `crate::Result`

i.e. if there are two types called Result involved, don't shorten either of them (no exceptions for prelude, imports, and such).

da-x · October 20, 2019, 4:44pm

Yeah, I see that it makes sense. I'll update this when I post the RFC.

ekuber · October 21, 2019, 3:55am

Personally I would use the locally available name (the one that would work in the code) as the short version and the one being shadowed with the fully qualified path. Also, we should try and leverage span labels when encountering confusing cases with multiple elements named the same thing:

   |
LL | struct Result(u32);
   |        ----------- the found type defined here
<...>
  = note: expected type `std::result::Result<(), ()>`
             found type `Result`

It would also be interesting to carry the crate version information when relevant.

  = note: expected type `foo::bar::Baz` (foo 0.9.0)
             found type `foo::bar::Baz` (foo 1.0.0)

Finally, it would also be interesting to look at how to leverage type arguments to clean up the output:

  = note: expected type `Foo<A, B>`
             found type `Bar<C, B>`
             where A: String,
                   B: usize,
                   C: foo::bar::Baz<usize, usize>

I think the rule of thumb is that we should always use the short name that would work in the code if we were to replace the primary span, and use longer paths only for disambiguation.

Topic		Replies	Views
Should type errors display type synonyms where applicable? language design	3	1281	March 25, 2019
Older RFCs up for discussion this week	4	1734	March 25, 2019
Should we warn about public function uses private type aliases? language design	2	281	February 10, 2025
Private struct returned by public fn	23	3388	March 25, 2019
Type-level programming - how to provide better compiler errors? compiler	2	930	September 23, 2019