I found this article today, lots of great inspiration for improving diagnostics. We do some of this already, but still, a great read that we should take inspiration from. http://elm-lang.org/blog/compiler-errors-for-humans
rustc’s diagnostics already do these things.
I don’t think we have the greatest formatting at the moment (prefixing every piece of a message with path/to/file.rs:line:column: can get a little verbose).
This is true, but at least we print relative paths now, which helps a lot.
For me the hints shown there are overly dumbed down / “polite”. In particular, the compiler shouldn’t refer to itself as “I” (“As I infer the type…”), or state that something “looks like missing a field” - it is missing a field, so tell me that without beating around the bush! (Don’t even ask about the silly message about heterogeneous types in lists.)
I hope Rust won’t switch to this style from the concise, correct messages and hints it has today.
The style of the error messages provided in this page seems quite verbose. Users are going to see these error messages over and over again for a long time, so it does not make much sense to use “I see a conflict” instead of just something like “Conflict”. Also, “As I infer the types of values flowing through your problem” does not seem to provide much information; it is what we expect the compiler to be doing. Style-wise, I like what Rust is currently doing: being as concise as possible, so that the user can get the information with little overhead, while providing a detailed explanation when asked. Otherwise the principles look fine.
I also thought that the “As I infer the types…” error message was stilted, but I think that the points about focusing on the information needed by the user, making suggestions about possible causes of an error, and so on were spot on. I think that where Rust could still use some work would be in lifetime-related error messages, since we’ve still had some odd messages even in the 1.0 release (e.g. suggesting changes that are actually identical to the code as written, meaning that the suggestion trivially won’t solve the problem). I suppose it’s tricky because there are not many languages with a similar system, so there isn’t as much of a tried-and-true vocabulary about how to describe lifetime-related errors.
One advantage of the output exhibited in this article is its use of whitespace.
Today, Rust’s error messages can get a little confusing because they are chained. All errors and notes share the same level of indentation, and there is little separation when going from one to the next, making it difficult to distinguish where one error and its notes stop and the next error begins. Adding a simple blank line between each error would, I think, already be a major gain.
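To make the blank-line idea concrete, here is a rough sketch of what I mean. It is not how rustc renders anything; `ErrorGroup` and `render` are hypothetical names made up for illustration, and the exact prefix format is an assumption. The point is only that each error and its notes form one visual block, with notes indented and a blank line between blocks.

```rust
/// One primary error plus the notes attached to it.
/// (Hypothetical type for illustration; not a rustc data structure.)
struct ErrorGroup {
    location: String, // e.g. "src/main.rs:3:5"
    message: String,
    notes: Vec<String>,
}

/// Render the groups so that each error and its notes read as one block:
/// notes are indented under their error, and a blank line separates blocks.
fn render(groups: &[ErrorGroup]) -> String {
    let mut blocks = Vec::new();
    for g in groups {
        let mut block = format!("{}: error: {}", g.location, g.message);
        for note in &g.notes {
            block.push_str(&format!("\n    note: {}", note));
        }
        blocks.push(block);
    }
    // Joining with "\n\n" is what puts the blank line between errors.
    blocks.join("\n\n")
}
```

Printing the result of `render` for two groups gives the first error followed by its indented notes, then an empty line, then the second error, so it is obvious where one ends and the next begins.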
Then, indeed, come the hints. Coming from Clang, I know for a fact that:
- rustc can definitely progress on typo corrections and diagnosing the exact issue
- it takes a lot of effort to get where Clang is today
Still, I hope that with time this is an area that will see some love.
One problem about changing error message formatting is that it will probably break the emacs, vim, etc. Rust plugins. Every time a style change was made to error messages, we’ve needed to think about how that affects editor support. I personally think that getting these editors to understand error messages for humans would be a difficult task, or would at least restrict some aspects of the error messages’ formatting. Elm solves this by having a flag (--report=json) that outputs JSON error messages, so perhaps Rust could do something similar.
I agree: the Unix philosophy of using plain text for communication is very brittle in the face of changes. Instead tools should parse structured output, and this structured output should:
- be versioned; with the version number being a mandatory compiler argument when structured output is requested
- guarantee backward compatibility within a given version (no removing fields and no changing the format of existing fields; adding new fields is allowed)
Interestingly, the structured format might well be less verbose than regular output because it would not need to quote the source; just having the source range is sufficient for a tool to go and fetch it after all.
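As a minimal sketch of what such a versioned record could look like (every name here is an assumption for illustration: `FORMAT_VERSION`, `Span`, `Diagnostic`, `to_json` and the JSON field names are hypothetical, not an existing rustc format):

```rust
/// The only format version this sketch knows how to emit (hypothetical).
const FORMAT_VERSION: u32 = 1;

/// A source range only; a tool can fetch the text itself if it wants to show it.
struct Span {
    file: String,
    line_start: u32,
    col_start: u32,
    line_end: u32,
    col_end: u32,
}

/// One diagnostic as a tool would receive it (hypothetical shape).
struct Diagnostic {
    level: String,
    message: String,
    span: Span,
}

/// Escape the characters that may not appear raw inside a JSON string.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// The caller must name the version it expects; an unknown version yields
/// `None` rather than output in some other shape.
fn to_json(requested_version: u32, d: &Diagnostic) -> Option<String> {
    if requested_version != FORMAT_VERSION {
        return None;
    }
    Some(format!(
        "{{\"version\":{},\"level\":\"{}\",\"message\":\"{}\",\"span\":{{\"file\":\"{}\",\"line_start\":{},\"col_start\":{},\"line_end\":{},\"col_end\":{}}}}}",
        FORMAT_VERSION,
        escape_json(&d.level),
        escape_json(&d.message),
        escape_json(&d.span.file),
        d.span.line_start,
        d.span.col_start,
        d.span.line_end,
        d.span.col_end
    ))
}
```

Note that the record carries no quoted source text, only the range. Adding a field later (say, a list of notes) would not break a tool that reads only the fields above, which is the kind of compatibility guarantee I have in mind within one version.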