Type inference breakage in 1.80 has not been handled well

Rust 1.80 broke builds of almost all versions of the very popular time crate (edit: please don't shoot the messenger in that GitHub thread!!!)

Rust has left only a 4-month old version working. That was not enough time for the entire Rust ecosystem to update, especially that older versions of time were not yanked, and users had no advance warning that it will stop working.

Even though the Rust doesn't count type inference changes as semver-breaking, this is a poor excuse for releasing a change with a widespread impact without any attempt to soften the blow. It made cargo install --locked defunct for a lot of crates. It breaks building of not-very-old versions of Rust projects with their original Cargo.lock.

A crater run found a regression in over 5000 crates, and that has just been accepted as okay without any further action! This is below the level of stability and reliability that Rust should have.

There has been no public communication about this. There were no future-incompat warnings. The affected crates weren't yanked. There wasn't even a blog post announcing the problem ahead of time and urging users to update the affected dependency. Even the 1.80 release announcement didn't say a word about the incompatibility with one of the most used Rust crates.


I'd like to clarify that I'm not looking to blame anybody specifically for this problem. I'm bringing this up as a feedback for the project as a whole. I hope this will lead to changes in policies/tooling/language to help avoid similar problems in the future.

71 Likes

I don't see why you want the crates to be yanked? That is for severe bugs / broken semver / security issues, and does not seem relevant here.

5 Likes

Hrm yeah, at a minimum it should have gone into the release notes.

The PR was labeled with relnotes, so it should have gone in.

18 Likes

I'm surprised it didn't end up in the release notes, yeah.

Also, if I'm understanding the inference failure correctly, this seems like it was another case of "this only worked because it was the only available trait impl, and having a second trait impl broke things". If that's accurate, then it seems like even more evidence that we should rethink the "only one impl" as a stability hazard.

20 Likes

Agreed. Do we have any analysis of how often code is hitting it? Is it just a few traits, or lots of them?

1 Like

I wonder why the breakage was considered okay for that PR but not for, say, 73390.

6 Likes

Confirming this is the case. If I remember correctly, Esteban has said that he's looking into a lint for the case of Into specifically, as it can be trivially removed as an identity conversion (which is what David did in a PR for time months ago).

As to why the code was present in the first place, I have absolutely no idea :person_shrugging: My best guess is I was debugging something and/or reducing code back to only what was necessary after getting it working and missed removing an identity conversion. I'd be surprised if there isn't more similar code out there in the wild that simply isn't used as much.

For what it's worth, this is the first time I've ever encountered something building with one release and not with the next (excluding lints, of course).

5 Likes

As one of the people who previously tried to add a new Add<_> for String impl, I'm all for this (even if that specific war is forever lost).
However, as one of the maintainers of axum, I want to note that there are some patterns out there like axums Handler trait that just wouldn't work in any reasonable way without it¹. Maybe traits would have to opt into single-applicable-impl-affects-inference behavior starting Edition 2027 or somesuch? Seems like this needs a new Pre-RFC thread :slight_smile:

¹ Specifically, we depend on type inference figuring out the first generic parameter T from the fact that only a single of the Handler impls applying to any specific type you'd want to pass to a routing function. Usually, the handler type is a closure but there's also a use for non-closure types like Layered implementing Handler. Specifying T at the callsite of a routing function would be possible in many or all cases, but would completely destroy ergonomics of this pattern.

12 Likes

Why yank?

  1. These crates no longer work on any supported Rust version (which is 1.80, because the Rust project doesn't support past versions). They're permanently defunct.
  2. It makes Cargo alert users of the affected versions that there's a problem with them.
  3. It prevents new users from locking to the broken versions.

and if yanking of them seems like a too drastic measure or done too soon, then breaking them was also done too hard too soon.

10 Likes

Not everyone is on the latest version. In particular there is Ferrocene, which have a legit reason to trail behind a few versions. (I'm less sympathetic to LTS distros and any other use case where you don't need the certification for legal reasons.)

As for the other reasons, the crate not building when the user updates would also work to alert them to the issue. Since new users develop on the latest (probably) they won't lock to the latest anyways.

In summary yanking adds no value here, just churn. Nor would for example filing rustsec advisories (which to me is similar to yanking in severity level). Neither of these actions make sense in this context.

9 Likes

The docs for cargo yank say the following:

Crates should only be yanked in exceptional circumstances, for example, an accidental publish, an unintentional SemVer breakages, or a significantly broken and unusable crate.

If a crate relies on the "only one impl" footgun and doesn't guarantee forwards compatibility, I could see a reasonable dev considering that "significantly broken and unusable."

In summary yanking adds no value here, just churn.

The churn will become necessary unless if a user plans to stay on <=1.79 forever, which seems unlikely to me. So you can't avoid churn, you can only defer it.

4 Likes

If you checked in Cargo.lock for your project, it will keep working just fine even when a dependency gets yanked. If you don't check in Cargo.lock, it can already break on older rustc versions as you likely have at least one dependency whose (effective) MSRV is the latest rustc version.

That seems likely, yes; some discussion on Zulip led to the same suggestion.

Interesting. It seems like there's enough information there for the compiler to infer the type without using the only-one-impl rule, but that's a good use case to verify when checking that.

4 Likes

Closure type inference always throws a wrench into simple intuition due to the hack that changes inference order for closure arguments. E.g. given fn f(impl Handler<(M, T), S>), calling f(|req| async { … } can start type inference on the closure knowing req: T and $return: impl Future<Output=S> iff there's a single Handler impl which could apply to a closure with that airity. Whereas without that hack (i.e. without the closure being a function argument with known FnOnce family), both start out as fresh type inference variables and e.g. methods can't be called on req syntactically before it is further constrained to a known type.

This inference order inversion is certainly extremely useful, but it can be startlingly surprising at times, especially when interacting with other subtle type inference behaviors (such as single applicable impl inference and/or resolution based on function airity).

Closure inference is already touchy/fragile without a direct (or directly implied) generic bound on a Fn* trait, and anyone relying on such should hopefully have figured that out quickly, so I don't think it'd be too horrible of an idea to keep the single applicable impl inference for closures, even if forbidden on other types/impls without an opt-in attribute. After all, the type information from the Fn* bound technically also relies on knowing only a single Fn impl can apply to the closure (even though that type and impl hasn't been fully synthesized yet).

7 Likes

I wrote such a lint, but I hadn't accounted for a severe limitation: rustc doesn't keep track of cfgd type Aliases, which means that if you have a c_int and call .into() on it to turn it into an i32, the current impl complains about it (when it shouldn't). I'm having trouble coming up with a good strategy to be more accurate so this lint can be turned at least warn-by-default.


I'm also dissapointed in how this change was handled. At the very least I would have expected the impl being delayed for at least one additional cycle to give time to the ecosystem to upgrade to a newer time. I also wish I had seen this before the release so that I could have had time to write a lint or even a better explanation in the diagnostic to reduce the fallout and confusion this release caused. Even then, I am slightly concerned that we might be too cavalier with declaring some technically-allowed-breaking-changes "acceptable", because even if we only accept this every few releases, over time it effectively means that building a older project will invariably require using an older toolchain version. I believe we generally are falling short of our backwards compatibility promise when we do things like this. I hope we can take this one case as inspiration for a better approach, including language changes as mentioned above. I've also toyed with the idea of having cargo have a database/mechanism similar to RustSec to track "known crate-version issues" so that compiler errors can be expanded with additional actionable information, which would at least mitigate the general effect.

36 Likes

(NOT A CONTRIBUTION)

I imagine there's an absolutely huge amount of code that continues to compile only on the basis of this behavior (otherwise it would need type annotations, probably of the most syntactically unpleasant variety), but I would agree that this inference rule is clearly at odds with Rust's other design decisions. Rust has gone out of its way with things like the orphan rules to make sure that adding impls is not a breaking change, and then in this case it goes and throws it all away. It's absurd.

22 Likes

seems like a good idea, i independently came up with the idea of an attribute for opting into this behavior, but it seems like someone else beat me too the punch.

1 Like

This seems to use the exact same reasoning.. maybe the 1.80 time breakage was just a mistake?

On the other hand, @dtolnay, who objected to impl AsRef for Cow<'_, str> on the grounds of type inference breakage, announced that the libs team explictly decided to break time's type inference, which is inconsistent. But if this was deliberate and deemed a good outcome, perhaps that AsRef impl should be reconsidered, after all?

Also: could cargo fix fix such cases of type inference breakage?

Another solution: what if a crate could inform it was compiled with a given Rust version, and type inference used only the impls available on that version? (So that a crate that specified it builds with Rust 1.79 would not have impls defined in Rust 1.80 or later participate in inference, and thus would not be subject to this kind of breakage)

8 Likes

The same person can hold one position, but publish another on behalf of the team consensus. It's not like the supreme court which publishes a dissenting opinion at the same time as a ruling. I also would rather we focused on the process and incentives that led to a situation, as opposed to the individuals. People come and go, their opinions change, but we're all unconsciously affected by systemic incentives. I want bollards around releases, not for the people driving to be very, very careful not to stray onto the sidewalk.

Only if rustc knows about the breakage and emits a structured suggestion for it. rustc today only knows about .rs files, so a change to a Cargo.toml is out of scope (even technically, DiagnosticBuilder only knows about .rs files). I know that there are conversations for cargo to learn to emit its own suggestions and if that were to be implemented, then rustfix should also learn to use that.

16 Likes

Ok, what about this: if some code fails to compile due to not succeeding in type inference, remove the impls added in the previous rustc version and try again (maybe repeat this a few times). If the code compiles, then rustc has some evidence that maybe the code used to work, and broke due to a rustc update that wasn't perfectly backwards compatible. Then emit a suggestion to add some type annotation or turbofish that would fix the problem.

If the code already existed before the breakage, then one can be reasonably sure that rustc suggestion was correct.

2 Likes