This is the issue under discussion. Some folks would like rustfmt to care about the existing formatting, while other folks would like rustfmt to canonicalize code into a fixed format that doesn't depend on existing formatting.
It's not totally clear if this thread is talking about gripes about rustfmt in general, or gripes about rustfmt in the rust-lang repos.
Likewise. I brought up that the original post doesn't seem to lock-in to just rust-lang repos, but that was largely ignored.
Can we get clarity on what exactly this topic should be about?
To me it's both.
-
IMHO rustfmt's "canonicalization" approach is too harsh for
--check
. It's causing well-formatted code to be rejected not because of style or readability, but because of rustfmt's heuristics flip-flopping. -
This problem is amplified in case of rust-lang project, because of long compile times and non-standard
x.py fmt
. This made the spuriousrustfmt
failure into a bigger blocker, and the fix was non-obvious.
FWIW, --check
shouldn't be considered the wrong party here, but rather the actual formatting. It's important that fmt --check
passing means that fmt
is a no-op; the benefit of fmt --check
is that you can run fmt
without changing code you didn't edit when you edit code.
So --check
is doing its job perfectly well: making sure fmt
is safe to use on patches. Strictness concerns should be levied at the actual formatting, not check.
I do hope rust-lang/rust can (soon?) get on stable rustfmt rather than requiring using x.py to marshal the correct bootstrap version. Since "rustfmt 1.0" they've actually done a good job of keeping formatting stable (modulo bugs with edge cases (usually around macros)). Ultimately the bootstrap fmt will be the source of truth because of bugs, but once "Version = "Two"
" is GM I hope the rust-lang/rust can stick to the stable subset of configuration. Then just plain cargo fmt
can work in the 90% case where there isn't a format impacting bug fix between stable and bootstrap rustfmt.
So I think this "issue under discussion" is quite key: an automatic formatter has to be idempotent, for sure, and is expected to be consistent, rather than deterministic.
I'd say that the formatter is already not 100% consistent, by formatting short expressions / statements differently than lengthy ones, and most people would agree that it is for the best (I personally don't, but I acknowledge that I am in a minority in that regard).
A feature the current formatter has, but which I think is more of an implementation detail than a carefully thought design choice, is that it is "initial formatting" agnostic: its input, modulo comments (and that's a huge modulo for implementors, by the way!), consists mainly of an AST that has thus discarded whitespace.
The suggested thing here, and I personally agree that it looks like a desirable feature to make the whole experience smoother, is to change that: for a given AST input + whitespaced in some fashion, then there is still a deterministic formatting of it, but such formatting may differ when given the same AST differently whitespaced.
For those who would not like such "inconsistency", I'd say that "short/long formatting" is kind of already doing that, since a purely 100% consistent formatting should ignore the length of identifiers, for instance. Its not being consistent in that regard while being adamant about other thresholds, is precisely what leads to that "peculiar" property of a rename change potentially affecting more lines than those including the renamed variable.
I expect it's unlikely that rust-lang/rust can use stable format at any point soon, simply because it must be able to format inherently syntactically nightly code within the tree -- that stable rustfmt won't even be able to parse.
However, this should hopefully not be a major blocker to editor support for fmt-on-save and such; at least for me I do let g:rustfmt_command = "/Users/mark/Edit/rust/build/x86_64-apple-darwin/stage0/bin/rustfmt"
in my local .vimrc for just rustc and that works out perfectly. We intentionally update the rustfmt there somewhat rarely (once every 6 weeks, on average, unless we need to do so for bugs in rustfmt).
I think it would be useful to split out two threads from this discussion: rustfmt being too aggressive, perhaps, in formatting code, and improvements we can make in rust-lang/rust specifically to user's workflow. I personally would suspect that making the latter easier would be where we can most easily make the largest wins -- I'm open to suggestions and reviewing patches to the fmt integration if that would be helpful to folks.
I submitted an issue suggesting the rustbuild/CI improvement of not blocking test results on tidy/fmt nits. Discussion of that potential improvement specifically can live there.
Since people brought up gofmt, I figured that this post by Russ Cox on the purpose of gofmt's rigid style was extremely relevant: https://groups.google.com/forum/#!msg/golang-nuts/HC2sDhrZW5Y/7iuKxdbLExkJ
I'll let rsc do the talking here for me, but the key point is that how the code looks is secondary. Code must be formatted for mechanical, whole-project transformation. clang-format serves a similar purpose at my workplace.
Formatters are not "prettifiers". They are meant to canonicalize the encoding of an AST as text, much like DER gives an ASN.1-based structure a canonical encoding.
I think the only points of contention here are rustfmt's heuristics, which is a pure engineering problem, and how it gets executed (which is probably a somewhat interesting problem if you're not a card-carrying member of the "format on save" party, like me).
Is there any real-world example of tools using gofmt's "canonical" format the way Ross Cox suggests?
Because it sounds to me like, whatever the benefits of a canonical format for tooling, you could easily get the same benefits from a "canonical + some input-dependent spacing" formatter.
Oh, sure, the fact that it's as rigid as it is is a matter of taste. The only thing that's truly important is that a unique encoding of your AST exists (though, the readability benefits of "there is one format" cannot be overstated... and I say this as a professional C++ programmer).
I am not a Go programmer, so I'm not quite an expert, but the fixup tools in the go
tool rely very heavily on the assumption that they can mechanically update your code without breaking anything. I've written a number of single-use tools at work that operate on C++ that, similarly, assume that mechanical transformations will be formatted with clang-format.
Edit: Oh, I suppose I should have mentioned one other bit with regards to your note. The reason you don't want it to depend on input is so that tooling can just roundtrip your code through an AST, without your modifications having to keep track of any indentation (or similar) information in the AST when creating or splicing nodes. This also means that extremely stupid but effective tools can be built by doing
perl -pi -e "s///g" **/*.go && go fmt
# or even
perl -pi -e "s///g" **/*.cc **/*.h && clang-format ...
I think being able to do the same for big Rust projects like rustc is very valuable.
One concern I have with automatically formatting all commits in a PR is that it makes it impossible to detect which of my local/remote branches have been merged upstream already. Disposing of old merged branches while keeping unmerged branches around can currently happen automatically; that would then require manual work.
(This would be less annoying if GitHub at least had the GitLab feature to automatically delete a remote branch once its PR got merged...)
Ah, that's a well-hidden option. Thanks! I set it for my fork now, we will see if that does anything.
Forgive me for being harsh, nitpicky and uncompromising, but rustfmt is not 100% bijective to the AST. At least it doesn't reject code for one empty line vs no empty line in the middle of a function between statements.
One hopes that whatever AST rustfmt works on understands empty lines in arbitrary locations the way it needs to understand comments in arbitrary locations. Formatters' ASTs tend to need to be very complicated so that round-tripping actually works.
Yeah, but if it understands empty lines in arbitrary locations, then it should also be able to understand line breaks in arbitrary (or semi-arbitrary) parts of an expression.
That is an excellent point.
The Rust repo already has a bot make a commit before anything gets committed to master. When the idea of rustfmting this repo was proposed I advocated for having bors do it on merge and letting PRs stay how the author formatted them: this causes zero friction for the author (including new or infrequent contributors), and if the styling is really a friction for review a reviewer can request reformatting. I strongly wish the rust project would adopt this policy instead of turning PRs red because they have the wrong number of newlines in them.
Or import their types in non-alphabetic ordering...
Do we have a rough estimate how many CI runs are failing b.c. of rustfmt?