Style guide for comments

The topic of stylistic decisions in the source code for rustc recently arose after a large PR of cosmetic and other minor changes that I made. Although I would argue that I made a lot of the comments more readable and more consistently styled/formatted, the point was fairly brought up that others may have different views, explicitly or implicitly, and that without specific guidelines and their enforcement in PR reviews, the codebase may soon drift away from a convention again. Hence, I was hoping to garner the feedback of other rustc devs here, and in particular the compiler and documentation teams. (Is there any way to tag them here?)

Some things I think we need to decide:

  • Whether to enclose all code in comments in backticks.
  • Whether to enclose metavariables in comments in backticks.
  • At what line length to wrap comments, and whether this should be relative to the start of the comment or column 0.
  • Whether to follow all Latin abbreviations (such as “e.g.”, “i.e.”, and “N.B.”) with commas, per American (and possibly other) convention, or without commas, per British (and possibly other) convention.
  • Whether to use American English spelling throughout.
  • Capitalisation rules. For example, whether to start all sentences and phrases with capitals, or just phrases (that is, no verbs).
  • Whether to follow full stops with one or two spaces.
  • Whether to use abbreviations for common words or not.
  • Use of punctuation in general.
  • The desired tone (formal/informal/etc.) of language.
  • The style (and Markdown structure) for file/module headers.
  • Whether (or in which cases) to allow end-of-line comments.
  • When to prefer documentation comments (///) and when to prefer regular comments (//).
  • The use of indented vs. backtick-enclosed code blocks.
  • The use of double-hyphen (--) for the dash.
  • The use of dashes versus colons versus semicolons versus commas.
  • Style/formatting of debug!, assert!, and bug! messages.

Things that probably aren’t important:

  • Whether to use the Oxford comma.
  • Humour in comments.

Good, well-written, and well-formatted comments are important to maintaining a clean codebase, in my view. I’ve seen a large variation in the quality and amount of comments in rustc, but going forwards, we should at least be aiming for consistency. All thoughts and opinions welcome.

N.B., so as not to bias replies either way, I’m withholding my own views for now!


I'm not sure about the feasibility of enforcement, but I love a good bikeshed... so here goes:

Yes, for clarity.

Same as any code, at 80 characters :slight_smile: I expect rustfmt should do the job here?

Yes; sadly. Consistency is key. We don't write code with variable names like colour, so we shouldn't write it in comments either.

Per normal AmE rules you should capitalize sentences, but this isn't German.

One; why would it be two?

Seems fine to use "etc.", "i.e.", "e.g.", "wrt." and so on. Initialisms for common technical terms in rustc also seem fine.

Yes, always.

It seems better to always use /// whenever the compiler permits it to make private APIs as documentation-like as possible. This is especially true on functions.

Formal.

Yes, yes, and yes.

Provided that it is actually funny, yes.


The Rust style guide says non-comment source lines can be up to 100 characters.

It is also 100 in the default configuration for rustfmt.

Quoting the sentence about maximum line width:

The maximum width for a line is 100 characters.

Quoting the paragraph about comment lines:

Source lines which are entirely a comment should be limited to 80 characters in length (including comment sigils, but excluding indentation) or the maximum width of the line (including comment sigils and indentation), whichever is smaller:
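To make the quoted rule concrete, here is a minimal sketch of it as a check on a single line. The function name `line_fits` and both constants are made up for illustration, and this is not how rustfmt implements it. It assumes only `//`-style comments and counts every character (including a tab) as one column.

```rust
/// Maximum total line width, from the quoted sentence (hypothetical constant name).
const MAX_WIDTH: usize = 100;
/// Maximum comment width, measured from the comment sigil (hypothetical constant name).
const COMMENT_WIDTH: usize = 80;

/// Returns `true` if `line` fits the quoted limits.
/// Assumption: a line is "entirely a comment" if it starts with `//` after indentation.
fn line_fits(line: &str) -> bool {
    let total = line.chars().count();
    let trimmed = line.trim_start();
    if trimmed.starts_with("//") {
        let comment_len = trimmed.chars().count();
        let indent = total - comment_len;
        // 80 excluding indentation, or 100 including it, whichever is smaller.
        let limit = COMMENT_WIDTH.min(MAX_WIDTH.saturating_sub(indent));
        comment_len <= limit
    } else {
        // Non-comment lines are only held to the overall maximum width.
        total <= MAX_WIDTH
    }
}
```

Under this reading, a comment indented by 30 columns may be at most 70 characters long, since the 100-column line limit becomes the smaller of the two bounds.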

Testing this in the playground, rustfmt formats the code to fit within a width of 100 characters:

Unformatted:

fn main() {
    let list = [
        1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 
        11, 12, 13, 14, 15, 16, 17, 18, 19, 
        20, 21, 22, 23, 24, 25,26, 27, 28, 29, 30,
    ];
}

Formatted:

fn main() {
    let list = [
        1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
        26, 27, 28, 29, 30,
    ];
}

Yeah I know; it's regrettable :wink: whatever rustfmt does is the way to go.

One thing I used to hate about Go, but which I really like now (I am very, very susceptible to Stockholm syndrome) is how it forces you to write all item comments in the form of // A $item blah. or // $item blahs. (i.e., complete, active-voice sentences with the item as the subject). I don't actually like this style, but having a style is very, very comforting; I often have trouble coming up with sentence structure for writing documentation. Having a documented template for item comments would be awesome; I tend to imitate the style of std, which is basically the Go style but removing the subject:

/// A pointer to heap-allocated memory.
pub struct Box<T>(..);

impl<T> Box<T> {
  /// Allocates memory on the heap and then places `x` into it.
  pub fn new(x: T) { .. }
}

As far as I can tell, this actual style isn't even directly recommended by the Book, in contrast with Go, where the linter will give you a hard time for it. I found that being told to write my documentation like that was very helpful.

You'd be amazed how often I've come across this in C++ comments at work. Not a fan, either.

  • Because someone was taught to type that way, as a hold-over from typewriters.
  • To distinguish between periods used to terminate sentences, and periods used for other purposes e.g. initialisms.
  • To visually space sentences apart to make visual scanning easier, especially when you're dealing with monospacing which can cause text to run together.

The professional central-office editors I used to deal with in the major standards agencies (ISO, IEC, IEEE, etc.) all required single spaces after end-of-sentence periods, question marks, etc. That may be a holdover from minimizing the number of pages required to print the documents, but it’s also a sensible style decision.

I’m on my phone, but we have two RFCs regarding how documentation is formatted in the standard library. They cover many of these questions.


We have the doc_markdown lint in Clippy: Clippy Lints

It's not enabled by default because there's a high risk of false positives due to the nature of natural languages. It currently warns on missing backticks for CamelCase words, words_with_underscores, and words::that::look::Like paths.
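As a rough illustration of the three patterns mentioned above, here is a small heuristic sketch. It is not Clippy's implementation; the function names `missing_backticks` and `looks_like_code` are hypothetical, and the rules are deliberately naive.

```rust
/// Returns the words in `doc` that look like code but sit outside backticks.
/// Hypothetical helper for illustration only, not part of Clippy.
fn missing_backticks(doc: &str) -> Vec<&str> {
    let mut found = Vec::new();
    // Splitting on '`' alternates between prose (even index) and code spans (odd index).
    for (i, segment) in doc.split('`').enumerate() {
        if i % 2 == 1 {
            continue;
        }
        for raw in segment.split_whitespace() {
            // Strip surrounding punctuation such as commas, full stops, and brackets.
            let word = raw.trim_matches(|c: char| !c.is_alphanumeric() && c != '_');
            if looks_like_code(word) {
                found.push(word);
            }
        }
    }
    found
}

/// Naive test for paths, words_with_underscores, and CamelCase words.
fn looks_like_code(word: &str) -> bool {
    let has_alnum = word.chars().any(|c| c.is_alphanumeric());
    if word.contains("::") || (word.contains('_') && has_alnum) {
        return true;
    }
    // CamelCase: starts uppercase, has another uppercase letter later, and some lowercase.
    let starts_upper = word.chars().next().map_or(false, |c| c.is_uppercase());
    let inner_upper = word.chars().skip(1).any(|c| c.is_uppercase());
    let has_lower = word.chars().any(|c| c.is_lowercase());
    starts_upper && inner_upper && has_lower
}
```

Even this tiny version shows where false positives come from: a sentence such as "Hosted on GitHub." gets flagged, because a product name is indistinguishable from a type name at this level.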

Common spelling errors.

We have an open issue with some discussion: Spell checking · Issue #2508 · rust-lang/rust-clippy. My personal opinion is that there are tools that do this better, like codespell, and that spell checking is a very complex topic that warrants its own tools. Feel free to chime in on the issue.

Forgotten backticks around true.

How would you know if true is meant as natural language or meant as a language keyword? For example, in this sentence: It's true that sometimes there are better alternatives to passing around true and false in Rust.


It's debated enough to have its own wiki page:


I agree it would be great to lint on all these points… some of these aren’t so easy though, due to the nature of natural language, as @phansch points out. At the least, I’d like to see them published somewhere and upheld in PR reviews.

Use of the imperative (e.g., “Return”) rather than the present tense (e.g., “Returns”) is something that rather irks me, I must say.

Yeah… I think convention has tipped heavily in favour of a single space since the advent of word processing a few decades ago though. Double spacing is typically proscribed here in Britain, everywhere I’ve seen.


Glad to hear. Probably most of them can’t be automated (put into Clippy or tidy), but if we could a) check that all the points mentioned in this thread are covered there, and b) drill it into PR reviewers (and authors?) to follow them, I think that would be great.

I agree. Style guides from non-technical agencies also tend to recommend it these days, I believe. Double spaces have (rightly in my view) fallen out of favour since the decline of typewriters, although a substantial minority of people still use them, it seems.

I don’t really care about most of these issues, although I do think there should be a “house style” that has rules for them, because consistency makes things easier to read.

But I do care about how wide the source code gets :slight_smile: I have middle-aged eyes, so I use a larger font than a lot of people seem to prefer, and I want to be able to have two editor windows open side by side with different pieces of code in them. This only works if comments are kept to the same overall width as the code. It is better if everything is no more than 80 columns wide. 100 is acceptable. 120 is too wide.

(It’d be okay to make an exception for extremely arcane stuff, like assembly language and TECO macros, where you may want to put a comment on every single line.)

(Middle-aged eyes also appreciate a wider space after sentence-ending punctuation, but I think that’s something the computer ought to do for me automatically, not something I ought to have to type.)


Just dropping this off here:


Nice. Plans on reviving that project? I think @centril would be a fan of that w.r.t. “in order to”. :wink:

I like those guidelines too. I think we could take that document and expand on it, perhaps.

How should we take things forward from here?


I’d personally prefer this not to go forward, unless 1) the style check is automated and 2) the churn is one-time and is grouped together with the great rustfmt run.


You couldn’t automate this all without strong AI probably. :stuck_out_tongue: