Syntax highlighting in all compiler output?

I've got a WIP refactor that moves the rustdoc syntax highlighting code into its own crate, so that it can also be used for terminal highlighting. As a proof of concept, I used it to highlight the code in rustc --explain (the color scheme is a placeholder):

@GuillaumeGomez made a comment in #33240 about experimenting with this some time back, but the referenced PR was closed due to inactivity after changing strategies to do what I ended up doing.

A point was made in that PR, which I agree with, that the highlighting should (at least by default) be minimal:

I think that for syntax highlighting, I'd prefer to just find keywords and highlight them in bold (something very minimal). We have to be very wary of drawing attention away. –@nikomatsakis
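
To make the "keywords in bold" idea concrete, here is a minimal sketch. It is not the code from the PR: the `KEYWORDS` list and the `bold_keywords` and `flush` helpers are hypothetical names made up for illustration, and the word splitting is deliberately naive (it would also bold keywords inside strings and comments, which a lexer-based highlighter would not).

```rust
// Hypothetical, incomplete keyword list; a real highlighter would reuse
// the compiler's lexer instead of matching words.
const KEYWORDS: &[&str] = &[
    "fn", "let", "mut", "impl", "struct", "enum", "match", "pub", "use", "return",
];

/// Wraps every whole-word keyword in SGR "bold" (ESC[1m) and
/// "normal intensity" (ESC[22m); everything else is copied unchanged.
fn bold_keywords(src: &str) -> String {
    let mut out = String::with_capacity(src.len());
    let mut word = String::new();
    for ch in src.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            // Still inside an identifier-like word.
            word.push(ch);
        } else {
            // Word boundary: emit the pending word, then the separator.
            flush(&mut word, &mut out);
            out.push(ch);
        }
    }
    flush(&mut word, &mut out);
    out
}

/// Appends `word` to `out`, in bold if it is a keyword, and clears it.
fn flush(word: &mut String, out: &mut String) {
    if KEYWORDS.contains(&word.as_str()) {
        out.push_str("\x1b[1m");
        out.push_str(word);
        out.push_str("\x1b[22m");
    } else {
        out.push_str(word);
    }
    word.clear();
}
```

Since only the intensity attribute is used and no color is set, the result should not depend on whether the terminal background is light or dark.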

Note that I am not parsing markdown from the explanations, just finding code blocks and highlighting them, but a proper solution should probably use hoedown with a custom renderer.

I believe the same syntax highlighting could be used in other compiler output, especially in diagnostics (although foolproof highlighting might require highlighting the entire document before printing out only a few lines of it).

Awesome! I worked on this a very long time ago; it’s nice to see someone finally doing it completely.

@jntrnr let's continue the thread here.

How the current theme looks on a white background is less than ideal:

What would you think of using (but not abusing) a set background for specific tokens?

  • I agree that syntax highlighting in error messages should be minimal.
  • It should also emphasize the error over everything else.
  • Color, if used at all, should be equally readable on a black and a white background (it’s not always possible to tell what color the terminal background is).
  • Color, if used at all, should avoid red-green distinctions, to accommodate colorblindness.

Pardon the nosiness, but:

What’s the fuss about how various colors look against various background colors in a 16-color terminal program? This isn’t just an issue of light background vs dark; you have no reliable idea of what the colors look like, either!

I would hope that in any well-configured terminal, all 16 colors ought to be legible against the background.

I can only really see a case being made for e.g. Windows-targeted hacks, where the default color scheme is a minefield of illegible colors, and it affects many people.

Your points are true for the general case, but a tool like a compiler must be conservative to be usable in as many places as possible, no matter how harebrained their defaults might be :slight_smile:

I would rather argue there are no conservative color choices beyond not using color. All colors bear a risk of looking like crap somewhere.

But I guess I’m just stating the obvious. :stuck_out_tongue: I admit there must be some colors which are at least less likely to look like crap than others on most systems, and I suppose that finding those colors is the goal here. :slight_smile:

I'm glad to see that there's some hype for the feature :smiley:

The tokens being identified for highlighting are:

None => vec![],
Comment => vec![ForegroundColor(color::GREEN)],
DocComment => vec![ForegroundColor(color::MAGENTA)],
Attribute => vec![Bold],
KeyWord => vec![Dim, ForegroundColor(color::YELLOW)],
RefKeyWord => vec![Dim, ForegroundColor(color::MAGENTA)],
Self_ => vec![ForegroundColor(color::CYAN)],
Op => vec![ForegroundColor(color::YELLOW)],
Macro => vec![ForegroundColor(color::RED), Bold],
MacroNonTerminal => vec![ForegroundColor(color::RED), Bold, Underline(true)],
String => vec![Underline(true)],
Number => vec![ForegroundColor(color::CYAN)],
Bool => vec![ForegroundColor(color::CYAN)],
Ident => vec![],
Lifetime => vec![Dim, ForegroundColor(color::MAGENTA)],
PreludeTy => vec![],
PreludeVal => vec![],
QuestionMark => vec![Standout(true), ForegroundColor(color::BRIGHT_GREEN)],

But we could highlight any specific token rustc can identify.

With the above configuration, they look this way for rustc --explain E0040:

I like how it looks so far, but an attempt at minimal syntax highlighting yields:

None => vec![],
Comment => vec![ForegroundColor(color::GREEN)],
DocComment => vec![ForegroundColor(color::MAGENTA)],
Attribute => vec![Bold],
KeyWord => vec![],
RefKeyWord => vec![Dim, ForegroundColor(color::MAGENTA)],
Self_ => vec![],
Op => vec![ForegroundColor(color::YELLOW)],
Macro => vec![Bold],
MacroNonTerminal => vec![Bold, Underline(true)],
String => vec![Underline(true)],
Number => vec![],
Bool => vec![],
Ident => vec![],
Lifetime => vec![Dim, ForegroundColor(color::MAGENTA)],
PreludeTy => vec![],
PreludeVal => vec![],
QuestionMark => vec![Bold],

I don't think that is as nice, though (and I have yet to check how this looks on Windows).
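
For readers unfamiliar with how such a table ends up on the terminal, here is a self-contained sketch that turns a few of the minimal-scheme entries into raw ANSI SGR escape sequences. It does not use libterm: the `Class` and `Attr` enums and the `style`, `sgr`, and `paint` functions are hypothetical stand-ins, and only a subset of the token classes is included.

```rust
/// A subset of the token classes listed above (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Class {
    Comment,
    DocComment,
    Attribute,
    KeyWord,
    RefKeyWord,
    Op,
    String,
    Ident,
}

/// Text attributes; `Foreground` takes a base ANSI color index
/// (0 black, 1 red, 2 green, 3 yellow, 4 blue, 5 magenta, 6 cyan, 7 white).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Attr {
    Bold,
    Dim,
    Underline,
    Foreground(u8),
}

/// Mirrors the "minimal" table for the classes included here.
fn style(class: Class) -> Vec<Attr> {
    match class {
        Class::Comment => vec![Attr::Foreground(2)],
        Class::DocComment => vec![Attr::Foreground(5)],
        Class::Attribute => vec![Attr::Bold],
        Class::KeyWord => vec![],
        Class::RefKeyWord => vec![Attr::Dim, Attr::Foreground(5)],
        Class::Op => vec![Attr::Foreground(3)],
        Class::String => vec![Attr::Underline],
        Class::Ident => vec![],
    }
}

/// SGR parameter for one attribute (assumes a color index in 0..=7).
fn sgr(attr: Attr) -> String {
    match attr {
        Attr::Bold => "1".to_string(),
        Attr::Dim => "2".to_string(),
        Attr::Underline => "4".to_string(),
        Attr::Foreground(c) => (30 + u32::from(c)).to_string(),
    }
}

/// Wraps `text` in the escapes for `class`, then resets all attributes.
/// Unstyled classes are returned unchanged.
fn paint(class: Class, text: &str) -> String {
    let codes: Vec<String> = style(class).into_iter().map(sgr).collect();
    if codes.is_empty() {
        return text.to_string();
    }
    format!("\x1b[{}m{}\x1b[0m", codes.join(";"), text)
}
```

Under these assumptions, `paint(Class::RefKeyWord, "ref")` yields `ESC[2;35mrefESC[0m`, i.e. dim magenta followed by a full reset.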

I like the general idea of adding some syntax highlighting if it makes the output easier to parse. However, let’s make a non-arbitrary decision on what to emphasise.

Underlining strings, for example, seems to me a bit unhelpful. Also, do macros really need to be distinguished?

These all look pretty good to me. I suspect that once we start doing it we’ll wonder how we survived w/o it for so long. =)

Agreed. I would think keywords (all kinds) and comments would be the most important parts to highlight and start there.

BTW, regarding the legibility against white backgrounds: I feel the current output looks good enough on both white and black, but I could set the background for the codeblock to black, either by using Background Color Erase or by writing whitespace with the background set to black to fill the width of the longest line in the code. That would optimize for legibility with only one background and allow us to use some colors that would otherwise be illegible (bright yellow, for example).
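
A rough sketch of the second option (padding with whitespace on a black background), using raw SGR codes rather than libterm. The function name `on_black` is made up for illustration, and the sketch assumes every character occupies exactly one terminal column (no tabs or wide characters).

```rust
/// Pads every line to the width of the longest one and wraps it in
/// "black background" (ESC[40m) and "default background" (ESC[49m),
/// so the code block renders as a solid black rectangle.
fn on_black(code: &str) -> String {
    // Width of the longest line, counted in characters.
    let width = code.lines().map(|l| l.chars().count()).max().unwrap_or(0);
    code.lines()
        .map(|l| format!("\x1b[40m{:<width$}\x1b[49m", l, width = width))
        .collect::<Vec<_>>()
        .join("\n")
}
```

The background is reset at the end of every line so that the newline itself is not painted, which avoids relying on how a given terminal handles Background Color Erase.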

@repax, strings should IMO be differentiated, just like numeric and boolean literals. I used underline because yellow is hard to read on a white background, and I wanted to avoid red for anything (which is a shame, because it is the only color that is easy to read on both black and white backgrounds). Blue is hard to read on black backgrounds. Comments are already blocks of green text, which I’d like to keep easy to tell apart. Magenta is already being used for docstrings. Cyan is hard to read on white backgrounds. Bold would be too distracting. Italics are very sparsely supported. Using an underline seemed like the best alternative available.

PR #39300 with the code used for these screenshots.

I could set the background for the codeblock to black

Please don't. I use light-background terminals because I find light-on-dark significantly harder to read than dark-on-light. People who use dark-background terminals will have exactly the opposite opinion. (If I remember correctly, this is one of those things where it's about fifty-fifty what your preference is, but whatever it is, you have that preference very strongly.)

Seconded; this is, like, the worst of both worlds.

My 2 cents is that comments shouldn’t use any color other than something in between the default background and the default foreground. They’re not code, and using grey for them works well for me in both black-on-white and white-on-black color schemes, while the green is blindingly asking for attention.

Oh, and you have bold, so you might as well use it. The Kate syntax highlighting uses it for keywords in most languages: not distracting when you need to ignore them, but still there to guide perception, just like bumps on a surface.

Doesn’t this violate the UNIX philosophy and add lots of complexity to the compiler?

I also don’t see a wide consensus on what the default settings for this should be, so IMO that entails adding more logic to make this configurable by the end user.

It would be easier to provide a separate utility for this bundled with several preconfigured color themes instead of trying to add complex logic to figure out a default setting that would satisfy everyone.

This utility can then be bundled with the compiler itself (just add it to the installer) or installed with cargo.

Usage would then simply be something like:

rustc foo.rs | colorize --dark

Edit: This is also a generally useful stand-alone utility, for example if I want to highlight Rust code in a terminal on a remote machine that doesn’t have any editors installed.

I’m very much looking forward to this, personally. The color scheme needs to be chosen carefully though.

The limitation I found is that libterm only exposes the base ANSI colors (8 colors * 2 for light/dark) and bold/italics (very rarely supported)/underlined. Because of that, with the requirement to be readable in both white and black backgrounds, the only readable colors in my tests (on a Mac) are Bright Black, Red, Bright Red, Green, Yellow, Magenta, Bright Magenta, and Cyan. But X and Windows have different color schemes. X for example doesn't have the concept of light black and dark white.

Also, these are short code examples. Comments are likely something you should be paying attention to, as they'd normally point out important caveats. That being said, bright black is pretty readable on both backgrounds for comments.

The compiler already has the logic to colorize code (because of rustdoc) and already has the code to colorize term output (in libterm, to allow colorized messages).

The way I see this is to be very conservative. Other tools can take the json output and give you all the customizability you'd want :slight_smile:

I'm not against such a tool, but the compiler should have basic support for this, IMO, even if it isn't with a scheme that everyone loves.

Most of my points would be moot though, if we expand libterm to support 256 colors with fallbacks.
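
As a rough illustration of what "256 colors with fallbacks" could mean, the sketch below emits the extended-color escape when the terminal is assumed to support it and otherwise falls back to one of the 8 base colors. Nothing here is libterm API: `fallback_8`, `fg`, and the `supports_256` flag are hypothetical, the palette layout (a 6x6x6 color cube at indices 16 to 231 and a gray ramp at 232 to 255) is the conventional xterm one, and the per-channel threshold is an arbitrary simplification.

```rust
/// Maps an xterm 256-color index onto one of the 8 base ANSI colors
/// (bit 0 = red, bit 1 = green, bit 2 = blue).
fn fallback_8(index: u8) -> u8 {
    // Channel intensities used by the conventional xterm color cube.
    const LEVELS: [u8; 6] = [0, 95, 135, 175, 215, 255];
    let (r, g, b) = match index {
        // The first 16 entries are the base colors and their bright variants.
        0..=15 => return index % 8,
        // 6x6x6 color cube.
        16..=231 => {
            let i = index - 16;
            (
                LEVELS[(i / 36) as usize],
                LEVELS[((i / 6) % 6) as usize],
                LEVELS[(i % 6) as usize],
            )
        }
        // 24-step gray ramp.
        _ => {
            let v = 8 + 10 * (index - 232);
            (v, v, v)
        }
    };
    // Arbitrary rule: a channel counts as "on" when it is at least half intensity.
    let bit = |c: u8| (c >= 128) as u8;
    bit(r) | (bit(g) << 1) | (bit(b) << 2)
}

/// Foreground escape for `index`, degrading to a base color when the
/// terminal is not known to support 256 colors.
fn fg(index: u8, supports_256: bool) -> String {
    if supports_256 {
        format!("\x1b[38;5;{}m", index)
    } else {
        format!("\x1b[{}m", 30 + fallback_8(index))
    }
}
```

How `supports_256` would actually be detected (terminfo, environment variables, or something else) is left open here.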

Beware, Solarized terminals don't do well with bright black:

I've personally moved to base16's solarized theme, but that requires some shell hacking for extended colors.