Editor compatibility and the new error format

One of the interesting questions for the new error format is how much compatibility to retain with existing editors. There is a sort of de-facto convention that many compilers and other tools (though certainly not all by any stretch) use, which looks like:

filename:line:col: error: message

Many editors have builtin support to look for this pattern out of the box in compiler output. Many editors are also extensible (e.g., emacs has a list of regular expressions that one can add to). I am not sure how many are not extensible, but even if they are, users may not have installed plugins.
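To make the convention concrete, here is a minimal sketch of the kind of matching an editor does out of the box. The pattern and the `parse_traditional` helper are my own illustration, not any particular editor's regex, and accepting `warning` as a severity is an assumption.

```python
import re

# Pattern for the conventional "filename:line:col: error: message" line.
# Accepting "warning" alongside "error" is an assumption for illustration.
TRADITIONAL = re.compile(
    r"^(?P<file>[^:\s][^:]*):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<level>error|warning): (?P<message>.*)$"
)


def parse_traditional(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    m = TRADITIONAL.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["line"] = int(fields["line"])
    fields["col"] = int(fields["col"])
    return fields


if __name__ == "__main__":
    print(parse_traditional("src/main.rs:3:5: error: mismatched types"))
    print(parse_traditional("  --> src/main.rs:3:5"))
```

Note that this simple pattern does not handle filenames that themselves contain a colon (e.g., Windows drive letters); real editor patterns tend to be messier for that reason.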

The new error format diverges from this format in order to make things more readable. The new format looks like:

error: message
  --> filename:line:col
3 |> quoted text

The advantages of this for humans are that it makes the error message easier to find, and also means that long filenames don't cause the message to be indented way over to the right.

Unfortunately, this breaks that builtin detection. As @josh wrote on GH:

Right now, I have no Rust support in vim (because the Vim mode for Rust isn't packaged in Debian), and I regularly use :set makeprg=cargo and :make build; I can also log the output of a build and run vim -q build.log.

There are a few options on the table:

  1. Detect if stdout is a tty and emit a slightly different format in that case.
  2. Use an environment variable to signal the desire for a different format.
  3. Just adapt editors to the newer format or to use json (perhaps infeasible).

If we do adopt an "editor friendly" variation on the format (i.e., other than JSON), it is also a bit of an open question as to what it should look like. It's clear it should have the filename/line-number easily extractable, and probably the message as well. But what else you want seems to depend on the editor. For example, in emacs, I really want to see the human readable output just as it looks now, perhaps with some prefix, so that in my compilation window I get to enjoy the new output. That led me to suggest a header like:

filename:line:col: error: message
error: message
  --> filename:line:col
3 |> quoted text

But this has the downside that the details are duplicated, and worse, naive tools may count each error twice (once with the filename:line:col: part, and once where --> is considered part of the filename).
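The double-counting problem is easy to reproduce. The sketch below uses a deliberately naive pattern of my own (not taken from any real tool) that accepts anything before `:line:col`, and runs it over the combined output:

```python
import re

# A deliberately naive matcher: any text, then ":line:col".
# This is a hypothetical pattern, chosen to show the failure mode.
NAIVE = re.compile(r"^(?P<file>.+?):(?P<line>\d+):(?P<col>\d+)")

combined = [
    "src/main.rs:3:5: error: mismatched types",
    "error: mismatched types",
    "  --> src/main.rs:3:5",
    "3 |> quoted text",
]

# Collect the "filename" of every line the naive pattern accepts.
hits = []
for text in combined:
    m = NAIVE.match(text)
    if m:
        hits.append(m.group("file"))

if __name__ == "__main__":
    for name in hits:
        print(repr(name))
```

The same error is reported twice, and the second "filename" includes the arrow, so jumping to it would fail.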

One possible fix would be to have a command-line flag that lets users toggle whether or not to use the new header format. Possibly a --compact-header flag.

The current header, as Niko points out, is:

error: message
  --> filename:line:col
3 |> quoted text

In compact header mode, we could just use the old style header:

filename:line:col: line:col error: message
3 |> quoted text

One disadvantage of this approach is that for long filenames, things may become less readable. Whereas before things would align because we prepended the file name, with this fix, you might see:

src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:111:22: 111:26 error: cannot borrow `*foo` as mutable because `foo.bar1.int1` is also borrowed as immutable [E0502]
110 |>     let bar1 = &foo.bar1.int1;
    |>                 ------------- immutable borrow occurs here
111 |>     let _foo2 = &mut *foo; //~ ERROR cannot borrow
    |>                      ^^^^ mutable borrow occurs here
112 |>     *bar1;
113 |> }
    |> - immutable borrow ends here

Still, it might be a reasonable compromise.

In general, I'd prefer something that doesn't require me to edit the command line. For example, if we used an environment variable, then in the rust-mode for emacs we could just set this variable globally so that it is always set when you load rust-mode, and users don't even have to know it exists. But it is almost certainly good to allow the command line to override it as well.

The suggestion by @josh where the default depends on 'isatty' may also be a reasonable compromise, though I am always kind of surprised when redirecting to a file alters the output in a big way.

Yeah, so this has been my traditional concern. Basically in emacs (and I think many other editors) mostly I still read errors in a buffer, so I'd ideally like to get the full readable formatting. But then again repeating information seems pretty bad. So probably this is a better idea, and then I "just" have to adapt the rust-mode to parse JSON output and make it look perty for me. =)

In IntelliJ-Rust we just hyperlink cargo output to the files, like Emacs does. Supporting both output formats is not a problem. I suspect that it actually already works for the new output, but I have not checked it yet :)

Changing output format depending on isatty may break things, but it should be an easy fix.

What are you thinking of when you say "may break things"?

I just realized something. I've been assuming we would want to use this "header" mode in emacs, but... of course there is no reason to do that. After all, I already adapted the rust-mode to the new format. So perhaps we should view the "header" mode as a "compatibility fallback" for uncustomizable editors or specific scenarios, versus something we actually recommend for editors to use.

At the moment we execute cargo as a subprocess (so isatty should be false from cargo’s POV) and present its stderr/stdout to the user. So if the error output is simplified by default when isatty is false, we’ll need to add logic for overriding this behavior (and that reminds me that we may want to add --color always).

I see. So this comes back to the case of “smart enough editors just want the normal output”. I think an important question is how many editors can be customized in this way – I would assume all the major ones, but I may well be wrong.

I think 1 and 2 are both options: support an environment variable to change the format, and have the default value be auto which determines which format to use based on isatty. You can then explicitly set the value to normal (which forces the normal format even if stderr is not a tty) or to traditional (or whatever we call the editor-friendly-format).
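A rough sketch of that selection logic, in Python for brevity. The variable name `RUST_ERROR_FORMAT`, the `choose_format` function, and the optional command-line override are all hypothetical; nothing here is an existing rustc or cargo interface.

```python
import os
import sys

VALID = ("auto", "normal", "traditional")


def choose_format(environ=None, stream=None, cli_value=None,
                  var="RUST_ERROR_FORMAT"):
    """Pick an output format.

    Precedence (an assumption): command line, then environment, then "auto".
    "auto" resolves to "normal" on a tty and "traditional" otherwise.
    """
    environ = os.environ if environ is None else environ
    stream = sys.stderr if stream is None else stream

    value = cli_value if cli_value is not None else environ.get(var, "auto")
    value = value.strip().lower()
    if value not in VALID:
        # Unknown values fall back to autodetection rather than failing.
        value = "auto"

    if value != "auto":
        return value
    return "normal" if stream.isatty() else "traditional"


if __name__ == "__main__":
    print(choose_format())
```

With this shape, an editor mode could set the variable once, and a tool that runs the compiler as a subprocess but wants the readable output could force `normal`.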

I don’t actually think you need to duplicate the filename:line:col and the error message; in fact, doing so doesn’t work, because editors will detect two error messages there: one for filename and one for --> filename.

What I’d suggest instead is that if the format is “traditional”, you drop the arrow in front of the filename, and switch the filename line and the error message line:

filename:line:col:
error: message
3 |> quoted text
     ^^^^^^ further details

That’d be enough to make sure that vim’s :cn and :cope DTRT, as well as almost any other editor’s out-of-the-box behavior. For anything more than that, a Rust-specific editor mode can provide custom handling.
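For comparison, here is a small sketch that renders the same diagnostic both ways. The `render` function and its parameters are hypothetical and only cover the three-line header shown above, not the underlines or notes.

```python
def render(filename, line, col, message, source, style="normal"):
    """Render a diagnostic header in the "normal" or "traditional" layout."""
    location = f"{filename}:{line}:{col}"
    quoted = f"{line} |> {source}"
    if style == "traditional":
        # Location first, no arrow, so the line starts with the filename.
        rows = [f"{location}:", f"error: {message}", quoted]
    else:
        rows = [f"error: {message}", f"  --> {location}", quoted]
    return "\n".join(rows)


if __name__ == "__main__":
    print(render("src/main.rs", 3, 5, "mismatched types", "let x: u8 = 1.0;"))
    print()
    print(render("src/main.rs", 3, 5, "mismatched types", "let x: u8 = 1.0;",
                 style="traditional"))
```

The only differences between the two layouts are the order of the first two lines and the arrow, which keeps the traditional mode cheap to support.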

So vim does not actually care about the error: message part? But it needs the filename to be at the beginning of the line? In that case another possible solution is just to drop the arrow from the new format.

There is a sort of de-facto convention.

as well as almost any other editor's out-of-the-box behavior.

It would be interesting to know which tools other than vim will actually break with the new format (though vim alone is a big one).

I was considering this. There are certainly some tools, such as Kate I think, that would like to easily extract the error/warning/message bit. (cc @eddyb)

Vim and other editors care a little about the error message, in that if they can, they show the error message in the status area; however, they can cope. Editors have understood for a while that error messages may involve multiple lines.

Personally, I kinda like the arrow in the new format, though it is tempting to just remove it since that would fix all the compatibility issues.

I am not sure what @matklad meant, but I would not want to remove the --> in the "default" mode. Maybe in the traditional mode. (Like, that could be the entirety of the difference or something.) The arrow serves a real purpose in terms of "instant decipherability" imo.

This is indeed what I have proposed, but I don't like it myself.

Why is auto a better default than normal? I would say that normal is better (as a default) because

  • it causes fewer surprises (the same error format when redirecting to a file)
  • it looks like most users want normal or don't care

There’s no point in autodetection if it isn’t the default; the whole point is for the out-of-the-box behavior to avoid breaking the expectations of existing programmers’ editors, without requiring additional configuration by the user or changes to the editor.

And yeah, for the traditional mode (or auto if stderr is not a tty), dropping the arrow suffices, though I’d also suggest reversing the order of the first two lines (the file:line:column: and error message lines).

I'm not sure I agree with that; I could imagine it still being useful even if you had to opt into it. But I do think it's worth trying to enumerate the possible workflows for using rustc and seeing if we can find a way to ensure they are all satisfied with as little configuration as possible.

I see the following right now:

  • terminal
  • "highly configurable" editor
  • "unconfigured" editor

We all agree that running from the terminal should look like -->.

I think what a "highly configurable" editor probably wants the most is the ability to easily control what kind of output it gets. Ideally then it can request JSON output, parse it, and give a nice display for the user. Or, with less effort, request the "pretty" output that the terminal gets, detect the --> and adjust things appropriately. Or, request the "traditional" output, parse it, and display helpful things.
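As a sketch of the JSON route: assuming the compiler emits one JSON object per line, with a `message`, a `level`, and a list of `spans` carrying `file_name`, `line_start`, and `column_start` (this shape is an assumption for illustration, as is the `to_location_lines` helper), an editor mode could turn diagnostics into whatever display it likes:

```python
import json


def to_location_lines(json_lines):
    """Convert JSON diagnostics (one object per line) into
    "file:line:col: level: message" strings.

    The field names used here are an assumed schema, not a documented one.
    """
    out = []
    for raw in json_lines:
        raw = raw.strip()
        if not raw:
            continue
        diag = json.loads(raw)
        level = diag["level"]
        message = diag["message"]
        for span in diag.get("spans", []):
            location = "{}:{}:{}".format(
                span["file_name"], span["line_start"], span["column_start"]
            )
            out.append("{}: {}: {}".format(location, level, message))
    return out


if __name__ == "__main__":
    sample = json.dumps({
        "message": "mismatched types",
        "level": "error",
        "spans": [{"file_name": "src/main.rs",
                   "line_start": 3, "column_start": 5}],
    })
    print(to_location_lines([sample]))
```

The point is only that structured output lets the editor decide on presentation, rather than reverse-engineering it from text meant for humans.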

An unconfigured editor, presumably, wants the "traditional" output. Though I will point out that a quick glance at emacs' list of builtin regular expressions makes it clear that the great thing about "standards" is that there are so many to choose from. :)

Does this seem correct?


FWIW, Vim's errorformat is a bit like a scanf string, and it can handle multiline formats. The arrow should not be a problem for a custom format. The default varies by platform.

So, the problem I found in emacs was buffering. That is, you can specify a multi-line regex, but it did not work reliably, because sometimes we would emit the first few lines of a header, flush, and then emit the -->. I considered changing the compiler to flush the entire message at once, but that seems like a fragile hack.

(I could also tweak how emacs works to make it “rescan the last few lines”, but again…hack.)
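The failure mode can be shown without emacs at all. In this sketch, the multi-line pattern is my own approximation of what an editor might use; it matches when it sees the whole message but not when each flushed chunk is scanned separately.

```python
import re

# Multi-line header pattern: "error: ..." followed by "  --> file:line:col".
# An approximation for illustration, not the actual rust-mode regex.
HEADER = re.compile(
    r"^error: (?P<message>.*)\n\s+--> "
    r"(?P<file>[^:\n]+):(?P<line>\d+):(?P<col>\d+)",
    re.MULTILINE,
)

# The same message, delivered in two flushes.
chunks = [
    "error: mismatched types\n",
    "  --> src/main.rs:3:5\n3 |> quoted text\n",
]

# Scanning each chunk as it arrives misses the header entirely.
per_chunk = [HEADER.search(chunk) is not None for chunk in chunks]

# Rescanning the accumulated text finds it.
rescanned = HEADER.search("".join(chunks)) is not None

if __name__ == "__main__":
    print(per_chunk, rescanned)
```

So either the producer has to write each message in one piece, or the consumer has to rescan text it has already seen.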

I think you do really want atomic messages anyway, so parallel rustc invocations don't interleave their output badly. (Hopefully one's code isn't spitting errors all over the place, but it happens.)
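A minimal sketch of what "atomic" could mean on the producing side: assemble the whole diagnostic first, then hand it to the stream in a single write. The `emit` helper is hypothetical, and whether one write stays contiguous across separate processes depends on the operating system and on what the stream is connected to, so treat this as reducing interleaving rather than guaranteeing its absence.

```python
import sys


def emit(stream, lines):
    """Write one diagnostic as a single chunk instead of line by line."""
    text = "".join(line + "\n" for line in lines)
    stream.write(text)   # one write call for the whole message
    stream.flush()       # flush only after the message is complete


if __name__ == "__main__":
    emit(sys.stderr, [
        "error: mismatched types",
        "  --> src/main.rs:3:5",
        "3 |> quoted text",
    ])
```

This also happens to address the partial-header problem, since a consumer never observes the first lines of a message without the rest.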