Rust CI and submodule crates

We’re currently adding some tools as submodules in the Rust repo. Cargo and the RLS are in already, and I think it makes sense to add Rustfmt (because of the libsyntax/Syntex issues) and Clippy in the same way. An open question is whether these tools’ tests should block the Rust CI.

AIUI Cargo tests already do. We plan for the RLS tests to do so also, and I think that is a good idea. However, I am not sure if it is a good idea to extend the policy to Rustfmt and Clippy. The consequence of doing so is that making breaking changes to the compiler will force the compiler dev to fix Rustfmt or Clippy, rather than the Rustfmt or Clippy devs. I feel like this is not really a fair burden to place on the compiler team (though as the main Rustfmt developer, I’m also kind of happy about the idea :slight_smile: ).

I’d like to know how the compiler team feel about this. In terms of expected levels of breakage, with Rustfmt only libsyntax changes would be an issue and these tend to be small and fairly easy to fix. Clippy has a larger surface area, effectively the whole compiler, but I’m not sure about frequency of breakages. There is also the issue of having to do a bit of a dance with PRs to make this work (PR to the tool, which breaks the tool’s CI, then PR to Rust which makes the breaking change plus updates the tool’s git submodule).

The alternative is that we do not block Rust’s CI on some tools. Since we want to distribute these tools with Rustup (which is why they end up in the repo in the first place) that would be a bit of a problem. Two alternatives are that we teach Rustup that components can be broken and to not update if that is the case (or at least warn the user and ask), or to move to a less frequent release schedule (say weekly, rather than nightly) where we guarantee that tools are not busted for each weekly release. I think I like the first, but not the second. Both would require some engineering effort.

cc @compiler_subteam @Manishearth

2 Likes

I prefer that the CI test everything, and that we not release broken nightlies, that adding more tools to the product we release requires giving them the same commitment as the existing tooling. Our users, for better or worse, expect nightlies to be highly reliable, and if we get into a mode where nightlies regularly don’t work that will be bad for Rust’s perception.

Another alternative is to pull these tools in-tree to avoid the painful submodule dance.

3 Likes

I'm okay with not blocking CI on clippy.

I'll note that in 90% of the cases when clippy breaks it's due to various renames or something that the breaking PR would have had to fix all over the rest of rustc anyway, so it's not like it's a lot of inconvenience to add another crate to that list (assuming clippy is vendored, not a submodule). Clippy maintainers could help with the rest.

Still, it's nbd if we don't block on clippy. What I would like is that CI still builds clippy all the time, and gives clippy maintainers (and maybe even the PR author, by way of a homu comment on the PR) a notification that it failed so that we can try and catch it before the nightly.

or to move to a less frequent release schedule (say weekly, rather than nightly) where we guarantee that tools are not busted for each weekly release. I think I like the first, but not the second.

There's also a middle ground -- create a new release channel, "weekly", which has this guarantee. I don't like this idea much though.


I'm also okay with @brson's suggestion of just pulling clippy in tree; in fact this was the original plan we had discussed before.

I disagree that we must give the same commitment to tools; if they are optional rustup component add tools they can have differing commitments. I would like to see rustfmt and maybe even clippy as core rust tools much like go fmt and go vet (I'm not certain if this is what you're proposing). But that need not happen now, and for that to happen there are many more discussions that would have to happen (involving RFCs and other things). I'm not fond of parceling larger problems as riders on easier problems -- rustup component add clippy is a matter of a few small decisions and some (nontrivial, sure) engineering, making clippy a rustc-owned and maintained tool is a larger matter involving many decisions and stuff. I would like the latter to happen, but I don't think blocking the former on the latter is a good idea.

The UI of the thing can help dispel perceptions that "nightlies don't work", via the warning system mentioned earlier. Instead of having nonworking nightlies, users will update to the last complete nightly (complete = "includes all optional tools that they have installed"), and get a warning that "versions of the optional X and Y tools do not exist for nightly Z so we have upgraded to nightly W. If you would like to force upgrade to nightly W and allow these tools to break, run rustup update --force-blah".

Users do expect nightlies to be reliable, but there is a very small subset of users (compiler devs) that cares about getting the exact bleeding edge nightly. For everyone else "some nightly from the last few days" is enough, really; I do not feel that folks will be overly bothered by this.

(Also, clippy and RLS are AFAICT the major reason that folks are on nightly. fixing this kinda obviates the problem anyway)

2 Likes

I think instead of making contributing to the compiler harder (especially once submodules are involved), one should fix the problem of broken nightlies for clippy/rls on the rustup side: if you have told rustup that you want clippy or rls, it should only update if the new version has the tools you told rustup you want. I think right now this is a manual process?

4 Likes

Nick’s suggestion of submodules with nonblocking CI does not make contributing to the compiler harder. You don’t have to worry about clippy unless you’re a clippy dev (or someone making a release when the clippy build is broken). You can choose to worry about clippy, and that help would be appreciated, but I assume it would be an off by default option.

I mean, we already have cargo/etc submodules and that hasn’t made compiler dev harder. And those do gate on CI.

If we do block on CI then yes, it gets a bit harder, but only as much harder as it got when librustc_mir was added to the build for example. In this case we’d likely not use a submodule since coordinated updates across submodules are hard.

The solution is on the rustup side here. But you want deterministic releases, so you need some way of pinning to a clippy version in the rustc repo, so the solution is submodules or vendoring.

2 Likes

I think I agree with this, though with a slightly different rationale.

We've been increasingly moving toward a model where the build/CI/release infrastructure is completely standardized: we run the same core process whether we're tryign to land a PR, producing a nightly, or producing a release. A big part of the motivation there was eliminating problems where PR tests would succeed, but production of a nightly would fail; that's literally impossible now, but used to be a very stressful and painful aspect of Rust maintenance.

@alexcrichton has often articulated the principle that if you don't test something, it will break. And our experience has been that trying to coordinate with breakage after the fact, and get "the next nightly" working, tends to cause a lot of pain for everyone. It also increases the risk that there will be breakage at the point that it's time to cut a release.

Yet another angle: we surely want stable releases to test these tools (given the elevated status we're trying to give them); if we don't also test on every PR, then we put ourselves back in a situation where CI differs between different kinds of builds/releases, which seems like a step backward.

TLDR, I think based on the lessons we've learned in the past about CI, on the whole it's going to be less overall pain to bake these tests in everywhere. The fact that this leads to higher quality nightlies is a nice side benefit.

Finally, regarding the burden on those contributing to the compiler: Servo has a somewhat similar problem, where a change to Servo could cause breakage in Gecko, which the contributor knows nothing about. I think a good strategy there is for the Clippy or rustfmt maintainers to offer to take the PR over the finish line, or at least to help mentor the contributor through fixing the problems in the case where it's nontrivial.

3 Likes

I never said that. I only responded to the suggestion to make the CI blocking. Apparently a large part of this thread seems to advocate that.

cargo is a tool that only uses rust as a separate process, and the APIs it communicates with rust are very few. So breakage should be pretty little, and if there is some, its most likely a bug. clippy and rls on the other hand use unstable APIs of the compiler, and any change of them might mean little or big refactoring work to follow in their codebases. Also, rls and cargo are only a part of the rust source tree for a very short time.

I mean I understand that when stabilizing you need to update a couple of repos because of documentation, but this task happens almost never. But I don't really want to be forced to refactor clippy just because the AST broke or something. If it is vendored, its much easier yes, but even then I'd prefer not to.

1 Like

Actually cargo blocked rust#41910 recently due to exact-UI tests, not API usage.

1 Like

I see a number of benefits to moving rustfmt, specifically, in tree:

  • This is a core tool we expect basically every rust developer to be using
    • If we are going to do so, I think we have to step up to the plate and make sure that it’s always working, and that it always covers the same language as the compiler
    • For example, there was a long period where rustfmt rendered any use of impl Trait as TODO. As someone who tries to use rustfmt on personal projects, I found this very annoying, because I was in the habit of running it regularly, and it would break my code every time.
  • If rustfmt were in tree, we could test it on all run-pass tests, just as we do the current pretty-printer:
    • This would ensure that, whenever you add a new language feature, you must also have some modicum of support in rustfmt
  • Hopefully, we can kill the old pretty-printer, and replace it with rustfmt.
    • As I understand it, they’re largely interchangeable.
  • Eventually I predict we are going to want to make various changes for which a more advanced rustfix would be useful:
    • Example might be rewrite function signatures to use elision 2.0.
    • This would require a combination of semantic information and rustfmt, and that will be easier to coordinate with both in tree (and to keep it working as we make other changes).

At the moment, I don’ think clippy has the “official blessing” that rustfmt does, so the case there is somewhat less clear. But clearly it’s a popular project, so at minimum making it easier to find out when there is breakage and ensure it is promptly fixed makes sense.

Another factor: before we move clippy in tree, we would definitely have to develop some policy around lints. This would be useful anyway, though. Right now I feel a lot of uncertainty about what kinds of lints we should add as well as whether they belong in clippy or rustc etc. (Like, if clippy is in tree, why does it exist at all, rather than just being part of rustc?) This feels like a big conversation (bigger than this thread).

One other question: I heard mention of adapting rustup to understand not to upgrade nightly until clippy is working etc. Setting up that tooling sounds itself like a fair amount of work, so that makes me a bit nervous.

7 Likes

I, like Brian and Aaron, feel that if we add tools and expect them to work then we basically have to gate on them working. To me, that implies that adding tools like rustfmt, clippy, rls, etc to the repository entails building them and running their tests. Without this tools will inevitably break, and empirically this can take on the order of weeks to fix to finally catch up to master (after which it all restarts again).

I also see adding a tool that compiles on stable Rust to be very different than adding a tool that compiles on nightly Rust. A tool that compiles on stable, like Cargo, can be updated for changes in the compiler without changing the compiler. If a change is introduced that breaks Cargo for whatever reason (such as https://github.com/rust-lang/rust/pull/41910 that @kennytm mentioned) it’s possible to send a PR to Cargo, get everything fix, and Cargo still works.

For a nightly tool, however, this is not be possible. Changes to the compiler that may break the tool require landing a patch to the tool itself which will break the tool until the patch lands. This creates very complication situations, for example, if the patch itself doesn’t land into the compiler.

Given all that, to me adding a tool that only compiles on nightly implies the following features:

  • We must gate on it, both building and tests.
  • When it breaks, the a branch of the tool itself must go to an irrevocably broken state. It will be broken until changes in the compiler have landed and are published.
  • Developers of rust-lang/rust now must always consider fixing this tool. For example developers who break Cargo are currently the ones expected to send PRs to rust-lang/cargo and then update the submodule in the rust-lang/rust PR.
2 Likes

Right, I addressed that in the other half of my answer, with the suggestion that if it's going to be blocking we should vendor, not submodule it, at which point it's no more a problem than adding any other librustc_ crate.

1 Like

I don't really feel like this is necessary. We can move it in tree to support the rustup component add release case without having that discussion. If moved in tree cargo-clippy would still exist out of tree (open to vendoring that too, but it's not necessary). The current policy has clearly been "no new lints in rustc" for a while AIUI. We can discuss changing it, and perhaps even just merging clippy into rustc's inbuilt lints directly, but that doesn't have to block this. Again, we shouldn't be parceling harder problems with easier ones; these are independent issues and can be solved piecemeal.

(At the same time, I do want this discussion to happen -- would love to see at least some lints upstreaming to rustc, since this was originally one of the goals of clippy, before the "no new lints" policy)

Setting up that tooling sounds itself like a fair amount of work, so that makes me a bit nervous.

Doesn't really feel that way to me -- you check if the optional components have a corresponding file on the server, and if not, try previous nightlies until you're back to the currently installed nightly. It's work, but won't be too complex IMO. I might be missing something of course :smile:


I think a good strategy there is for the Clippy or rustfmt maintainers to offer to take the PR over the finish line,

FWIW this has always been part of the discussion and is IIRC agreed upon by clippy maintainers :smile: . While we would appreciate if rustc folks would help, we'd still be driving clippy breakage fixes, either after the fact in a nongated world, or by working with PR authors in a gated world.

I do think that if we are gating then we should move the clippy lint code into tree, not as a submodule. Submodules requiring coordinated updates are annoying. Clippy devs would be able to work around this, but this would be annoying for rustc folks -- for efficiently working with submodules you need push access to the submodule's repo so that you can make temporary branches, which works for a small set of devs but will be annoying for anyone who doesn't have push and wants to make a PR that breaks clippy.


When we're talking about the old "broken nightly" situation I think there's a bit of a false equivalence being made here. A nightly, or cargo, being broken is a very bad thing. But a clippyless nightly is really fine -- assuming that rustup protects you from updating to it (with the option to force update anyway). In case the beta is broken off a broken clippy nightly there are six weeks to fix that, so we don't have to worry much about this affecting stable (and again, clippy maintainers will help with this). So while I am very happy that all PRs are gated on "can this produce a valid nightly" by being equivalent to a dist build, I don't think this is necessary for things which could be optional in a dist build.

Ultimately I'm fine with gating on clippy per build -- in fact, I prefer it, since it's easier to fix things in the same PR that breaks it instead of trying to figure stuff out after it lands, especially if it removes capabilities we depended on. But I don't think this is absolutely necessary for rustup-packaged stable clippy to work.

(My main concern with only fixating on the "gate on clippy" solution is that there may be opposition from the rustc devs, which ultimately ends up just blocking anything from happening. It's always good to have options open.)

empirically this can take on the order of weeks to fix to finally catch up to master (after which it all restarts again).

(Clippy nightly fixes take like a day or two. Can take up to a week, in rather rare situations, which usually are involving PRs to rustc.)

1 Like

Lots to respond to...

in-tree

I want to make a distinction between vendored into the tree and using a Git submodule. We currently do the former for Rustdoc and the latter for Cargo and the RLS. I would be strongly against the former for Rustfmt and Clippy, I think there is a strong benefit in developing these tools in their own repos - managing issues and PRs is easier, landing times are shorter, less intimidating for new users, puts less pressure on Rust/homu queue, etc. I'm not actually sure if anyone is proposing vendoring, I hope not :slight_smile:

I do think that if we are gating then we should move the clippy lint code into tree, not as a submodule. Submodules requiring coordinated updates are annoying. Clippy devs would be able to work around this, but this would be annoying for rustc folks -- for efficiently working with submodules you need push access to the submodule's repo so that you can make temporary branches, which works for a small set of devs but will be annoying for anyone who doesn't have push and wants to make a PR that breaks clippy.

I do not agree with this. I do agree that the submodule dance is annoying, but for me working with a mega-repo is more annoying. I don't think you need push access to the tool repo to make this work either - you can send PRs and point the submodule at that branch instead of master. You can also just accept that CI for the tools will be broken temporarily while stuff lands on the main repo.

stability

My understanding has been that the stability of tools should be independent of the compiler channel they work with. For example, the RLS is currently riding the trains towards inclusion with the stable channel on Rustup, however, that does not imply that the RLS is ready for 1.0. The same should apply for Rustfmt and Clippy. It occurs to me that from the POV of a user using Rustup this will not be clear, we should think about this in detail before the RLS hits stable (i.e., now).

For me this implies that auditing the lints in Clippy and the compiler (which I strongly agree needs to be done) doesn't need to be done before we distribute Clippy with Rustup (and submodule it into tree). I think we should be moving Clippy soon - not necessarily now, but soon - I feel like it is popular and the time is right, but I don't think there is any technical reason to rush. Rustfmt on the other hand is being slowly broken by Syntex being unmaintained and we need a solution very soon.

quality control

It seems like the general direction here is that we should be strict about ensuring quality by blocking CI on tool tests. I broadly agree and think this is important for core tools (Cargo, RLS, perhaps Rustfmt), however, like Manish, I want to push back a little bit - the option of having nightlies without Clippy or Rustfmt seems not too bad, as long as it is temporary. It also makes landing breaking changes much easier.

Again like Manish, I think it is worth making a distinction between broken nightlies and nightlies with a missing component. The former is bad, the latter is only bad-ish.

empirically this can take on the order of weeks to fix to finally catch up to master (after which it all restarts again)

I'm not sure what has taken weeks to catch up with - I don't think we have experience here with Clippy or Rustfmt, so I assume Cargo?

So, to be concrete let me try to outline the scenarios in detail (I'm only considering new tools here, I'm assuming no changes tot he process around existing tools):

Gate Rust CI on tools

  • Rustup always works for all tools on all channels
  • Rust CI blocked on tool tests (and building)
  • Nightly build is exactly the same as commit build
  • Landing a tool-breaking change to the compiler requires first submitting a PR to the tool which breaks the tool's CI. Submodule is then updated to the 'broken' tip, tool CI works again once the Rust PR lands and hits nightly. Or, submodule points at 'broken' branch, tool lands PR when nightly updates (i.e., the nightly update breaks tool CI). I.e., rustc devs must submit a PR to the tool before/at the same time as the Rust PR (tool devs could help with this)

Don't gate Rust CI on tools

  • Rustup works for all tools on stable and beta, sometimes 'broken' for some tools on nightly
  • Rust CI doesn't run tool tests (or build the tool) on commits
  • Nightly only includes tool if tests pass (i.e., nightly never fails unless commits fail, but nightly may exclude a tool)
  • Landing a tool breaking changes breaks the tool (either silently or with some kind of notification). The tool's CI breaks when the Rust PR hits nightly, the tool devs then fix it (no extra obligation on rustc devs).
1 Like

So was proposing vendoring in case we are gating on CI, mostly because of the submodule dance. I really don't want to impact the rustc dev experience. I don't really like vendoring for the same reasons you mentioned (the clippy dev experience gets worse), but I'm afraid the changes won't be palatable otherwise.

However, you're right, it's easy to get around the submodule dance if you allow for stuff to temporarily point at forks. If we're fine with that, then yes, we should definitely avoid vendoring. If palatable to rustc devs I'm definitely all for submoduling.

(Note that when I said vendoring, I actually meant "move clippy_lints sources in tree, and remove from clippy repo". But yes, this makes the issue tracking story terrible)

My understanding has been that the stability of tools should be independent of the compiler channel they work with.

Big +1

I've been doing this now and then; I did want to have a Clippy 1.0. That said lint stability for clippy is basically "the lint will never go away", not "the lint will not break on #[deny(lint)] if you upgrade". This is the same as the deal for rustc, and I prefer it to be that way.

but I don't think there is any technical reason to rush. Rustfmt on the other hand is being slowly broken by Syntex being unmaintained and we need a solution very soon.

If anything clippy has it far worse than rustfmt; it's just that we're actively maintaining stuff because there's no alternative, whereas rustfmt can still mostly work on a stale syntex.

I'm not sure what has taken weeks to catch up with - I don't think we have experience here with Clippy or Rustfmt, so I assume Cargo?

Last year we had a bunch of broken nightly cases that lasted nearly a week. Like I said before, clippy being broken for more than a day or two is rare.

(tool devs could help with this)

+1, we do want to make this as smooth as possible for rustc devs

1 Like

Thanks for writing this up @nrc! Definitely agreed the stability story should be ironed out, but for now I'll respond a bit specifically on the quality control pieces. I personally find this to be a contradiction:

Rustup works for all tools on stable and beta, sometimes 'broken' for some tools on nightly

This means that whenever we branch beta, all tools must be building on that commit. This is, however, never guaranteed because tools can be broken at any time. This in turn means that a beta could be delayed many days if not weeks, and at the absolute worst case means a stable release could be delayed. Today we've got the guarantee at least that there should be nothing blocking a beta as any changes from the master to beta transition are quite minor. (we know all tests are green)

I'm personally very wary to stake stable/beta releases on "we hope someone will come fix this before the release comes around". We I think have had a relatively bad track record of that in the past. When I say that I predict fixes will take on the order of weeks this is accounting for (a) identification there's a bug, (b) actually fixing the bug, (c) landing the fix in the tool, (d) sending a PR to rust-lang/rust, (e) landing in rust-lang/rust, and (f) getting that fix into a nightly build.

Some of this may mix a lot into the stability story for tools. If there's a nightly-only tool and it's broken every few nightlies that seems fine. If we have a stable tool that we're not actually gating on, that seems bad. I'm currently just projecting that if clippy becomes a stable tool it seems development will be too difficult to work with, but that may be projecting too much perhaps. On the other hand though it's sort of a bummer to keep providing carrots on the nightly channel rather than stable.

2 Likes

There’s an assumption here that fixing a broken tool could take a really long time – emperically this has not been the case for clippy and AFAICT rustfmt. Clippy takes a day or two to fix nightly bustage, usually. The times it has taken a week are when we’ve had to coordinate fixes with rustc and wait for a nightly (which is not a problem here). So the beta problem could be solved by closer scrutiny before the beta branch-off, perhaps even a temporary gating (if your PR breaks clippy, don’t land it till the next Monday).

But yes, this is not ideal. Gating is better for the health of clippy and the release process overall.

The intention of this whole thing is to make clippy work for stable users, so it’s kinda a non-starter to make it nightly only.

Sure yes I have no doubt that a PR can be created relatively quickly, what I’m worried about is actually going through the whole process of getting into nightlies. That takes a long time. We’ve done our damndest to make it as fast, smooth, and reliable as possible but things still happen to drag it out over a long time.

And again for me at least, a “week of closer scrutiny” sounds like “we’ll forget to do this a few months down the road and a release gets delayed”. If crates like clippy and rustfmt end up using significant chunks of the compiler it’s also not clear what this would even mean, possibly blocking a lot of other work just to ensure tools don’t break.

FWIW in clippy we try to limit the surface we use, and generally go through some common APIs we’ve built. We don’t intend to expand tendrils into all of rustc if this integration happens.

That said, totally get your concern here.

(Again, I prefer gating, I’m just afraid that the rest of the compiler folks won’t want that)

Hmm, so I think the uplift story is important to get right. ISTM that for a single tool to should be easy - a few days before the beta uplift date we ensure the tool works, then on the date we pick the most recent commit where the tool still works as the beta commit. However, with multiple tools this gets harder - we have to ensure there is a single, recent commit where all tools work. Which seems like it might sometimes not happen without significant effort.

The only alternative I can imagine is that we uplift to beta with potentially broken tools. The tools teams then must push updates to the tools and directly to beta to ensure they work. Once working, we gate beta on all tests, so we always have a fully working stable. Obviously having incomplete tools on beta sucks. If the tools don’t get fixed within a cycle then stable is missing the tools, but if a tool takes longer than 6 weeks to fix, it probably shouldn’t be part of the Rust distro.

ISTM, that gating on tools tests is better than having non-working tools on beta, but I’m not 100% certain.

That's what I thought it was going to work before this thread. We'd have 6 weeks to get a tool to work on beta.

Couldn't we do a trial run where we gate on a tool and check how it works for a planned period (6 weeks) and if any concerns arise in that timeframe, we remove the gating and go "back" to only adding the tool if it works?