Concerned about Rust 2018 stability


#1

Maybe I’m imagining things or overreacting to confirmation bias or cherry-picking my quotes/links here, but there seems to be a recent trend of not finding serious problems until after stabilization, and of features relevant to the new edition either causing those post-stabilization fires or still not having key parts of their design settled this close to the edition release.

So… I’m getting worried. Things seem to be slipping through the cracks, and other things seem to be in danger of being completed and/or reworked so close to the deadline that more things might slip through more cracks. It feels like we might’ve bitten off more than we can chew. Clearly the core team has noticed this, but I’m not sure that “we need more eyes on nightly” is the entirety of the problem here, or that mitigating just that subproblem would be enough to ensure the new edition is rock solid.

…thoughts?


#2

I would like more evidence that we were more stable at a previous point in time. You can’t really draw a conclusion from all of the events you’ve pointed out since April without some evidence of what the baseline was prior to that.

My belief, in general, is that the stabilization process has not changed, and, per feature, we have never had better stability than we do now. I don’t think we have any good evidence either way, but I just think that it’s exceedingly unlikely we were doing better in 2015 than we are now, given that we had fewer users, a smaller team, and less experience.

Also, some of your examples are unfair. Stabilizing and then unstabilizing a feature before it hits stable is the stabilization process working, not something to be concerned about.

There are two things that I would contend have changed, causing a feeling of less stability:

  1. We are stabilizing more features than we were before. This is the result of a few things: first, we have more bandwidth to stabilize features than we’ve ever had. Second, shipping more features is the core of the roadmap for 2018. If every feature we add to stable has an X% chance of having a bug in it, shipping 10x more features means 10x more bugs in the features we shipped to stable. This does not seem like a reason to ship fewer features to me, unless we don’t have the capacity to fix those bugs, but we’ve been fixing them promptly.
  2. We are making more point releases than we were before. This point is more interesting, and I want to dwell on it more, so it’ll be the subject of the rest of my comment.
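
To make the arithmetic in the first point concrete, here is a minimal sketch. The function name, the 5% bug rate, and the feature counts are made-up values for illustration, not measurements of actual Rust releases.

```rust
/// Expected number of buggy features when each of `features` stabilized
/// features independently has probability `bug_chance` of shipping a bug.
/// (Illustrative model only; the inputs used with it are assumptions.)
pub fn expected_buggy_features(features: u32, bug_chance: f64) -> f64 {
    f64::from(features) * bug_chance
}
```

Under this simple model, `expected_buggy_features(10, 0.05)` and `expected_buggy_features(100, 0.05)` differ by a factor of ten even though the per-feature rate is identical, which is the sense in which more bugs in absolute terms need not mean a weaker process.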

I think Ashley Williams summed up the most obvious, and commonly held, explanation for why we have made so many point releases lately:

“We are probably not getting as many eyes on nightly as we want to. As a result, we have seen an uptick in point releases.”

I don’t think there is any good evidence to believe that the stabilization process being weaker is the reason we have seen more point releases! As I argued previously, I don’t know of any direct evidence that our stabilization process has gotten weaker. I think there is good evidence that the reason we are making more point releases is that our release process has gotten better, rather than that our stabilization process has gotten worse.

First, there is good reason to think our release process has gotten better. Shortly before we started making many point releases, we organized a release team and an infrastructure team. I would expect having these teams has significantly improved our release process.

Second, I think there is good evidence that our standard for making a point release has changed a lot. Contrasting two bugs, one from 1.15.1 and one from 1.27.1, is instructive.

  • 1.15.1: IntoIter::as_mut_slice borrows &self. This is a clear and unambiguous soundness bug in a newly stabilized API; the API is just wrong. And yet multiple core team members argued that it wasn’t worth the effort to make a point release for it. We ultimately did, but it’s hard to imagine even considering the question today. This is exactly what point releases are for.

  • 1.27.1: We fixed an ICE as a part of this point release related to match ergonomics. The release notes interestingly say it was backported because it could be related to a soundness hole. I haven’t seen a soundness hole demonstrated anywhere related to this ICE. We’re now releasing patch releases with backports of potential soundness fixes.
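
The 1.15.1 problem comes down to a method signature. Here is a minimal sketch of the distinction, using a hypothetical `OwningCursor` type rather than the real `vec::IntoIter` (whose internals are not reproduced here). A method that hands out `&mut [T]` should borrow `&mut self`; if it borrowed only `&self`, a caller could obtain two overlapping mutable slices at once.

```rust
/// Hypothetical, simplified stand-in for an owning iterator.
/// This is NOT the standard library implementation.
pub struct OwningCursor<T> {
    items: Vec<T>,
    pos: usize,
}

impl<T> OwningCursor<T> {
    pub fn new(items: Vec<T>) -> Self {
        OwningCursor { items, pos: 0 }
    }

    /// Skip one item, if any remain.
    pub fn advance(&mut self) {
        if self.pos < self.items.len() {
            self.pos += 1;
        }
    }

    /// Shared view of the remaining items: `&self` in, `&[T]` out.
    pub fn as_slice(&self) -> &[T] {
        &self.items[self.pos..]
    }

    /// Mutable view of the remaining items. Taking `&mut self` guarantees
    /// that only one mutable slice can exist at a time. The unsound shape
    /// would be `fn as_mut_slice(&self) -> &mut [T]`.
    pub fn as_mut_slice(&mut self) -> &mut [T] {
        &mut self.items[self.pos..]
    }
}
```

In this safe sketch the borrow checker would reject the `&self` variant outright. A signature like that can only slip through when the implementation builds the slice with `unsafe` code, which is why it needs review rather than compiler enforcement.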

What we’ve seen is a huge shift in our capacity to make point releases, rather than a shift in the necessity of making point releases. This doesn’t seem to me like a cause for concern.


That said, I do think our stabilization process could be improved! I believe, as I said earlier, that it has never been better, but that doesn’t mean it’s good enough. As other parts of our process have improved, they have revealed the long dormant inadequacy of this part of our process.

But seeing opportunity to improve is not the same as seeing cause for concern. :slight_smile: I’d be excited to see improvements to our stabilization process, which needs no pretense that we have somehow gone astray.


#3

Some unbaked ideas for somewhat improving the situation re. soundness:

  • Let’s have bors / rust-highfive / insert-other-bot emit a comment warning everyone if:

    • a PR modifies any module with unsafe { .. } in it
    • a PR introduces new unsafe { .. }
  • Let’s have tidy refuse all unsafe { .. } blocks that do not have a comment explaining why they are safe. We could also use some sort of rustc internal attribute instead to enforce this. This will likely make a lot of existing code fail, so the check would have to be enforced only after a grace period.
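
As a rough illustration of the tidy idea, here is a sketch of a line-based check. The `// SAFETY:` marker and the function name are assumptions made for this example; this is not how tidy is actually implemented, and a plain text scan like this would miss cases such as `unsafe` and `{` split across lines.

```rust
/// Returns the 1-based line numbers of `unsafe {` blocks that have no
/// `// SAFETY:` comment on the same line or on the line directly above.
/// (Hypothetical helper; the comment convention is an assumption.)
pub fn undocumented_unsafe_lines(source: &str) -> Vec<usize> {
    let lines: Vec<&str> = source.lines().collect();
    let mut flagged = Vec::new();

    for (idx, line) in lines.iter().enumerate() {
        // Only look at the code part of the line, ignoring any trailing comment.
        let code = line.split("//").next().unwrap_or("");
        if !code.contains("unsafe {") {
            continue;
        }

        let same_line = line.contains("// SAFETY:");
        let previous_line =
            idx > 0 && lines[idx - 1].trim_start().starts_with("// SAFETY:");

        if !same_line && !previous_line {
            flagged.push(idx + 1);
        }
    }

    flagged
}
```

A bot or CI step could call this per changed file and fail the build (or just post a warning) when the returned list is non-empty.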


#4

I have to confess that the recent point releases make me feel uneasy too. Don’t get me wrong; I’m not accusing anyone. But I do think we as a community can do better at finding bugs before they are stabilized.

@withoutboats Do you think the approach I proposed in this comment would help: 1.27.1 prerelease testing


#5

Features are on by default in the period after the stabilization PR has landed but before beta has been cut. Perhaps it would be possible to turn them on by default in nightly only to give them a longer period before they hit beta.


#6

“That said, I do think our stabilization process could be improved! I believe, as I said earlier, that it has never been better, but that doesn’t mean it’s good enough. As other parts of our process have improved, they have revealed the long dormant inadequacy of this part of our process.”

In fact, there have been process improvements floating around within various people’s minds; one thing that we’re hoping to start trying out for point releases and possibly relatively major feature stabilizations is asking the community to help us test them. We’re also unofficially going to try and bake patches to stable/beta on nightly for at least 7 days before releasing the stable backports; you can see an example of this here: https://internals.rust-lang.org/t/patch-testing-for-1-27-2.


We certainly have more capacity and ability to do point releases now than before. That capacity change has not really been met with many process improvements; unfortunately, we are unaware of people with time and interest in proposing and working through a more formal process currently. I think the post-edition state of Rust is likely to have sufficient bandwidth for this, though.


#7

I think this is just a natural consequence of nightly becoming less necessary for bleeding-edge development (as things like impl Trait stabilize), and more and more people doing their everyday work on stable as a result.

The answer will probably be to somehow increase the quality of bot and devteam testing on nightly and beta, in order to reduce the number of bugs that go unnoticed until stable users hit them on release day.

But ultimately, no voluntary quality assurance process ever beats throwing a large user base at a piece of code, and this is what I think we enjoyed before, but will be losing gradually as a result of ecosystem stabilization.


#8

Another one: the “fix” for the third match ergonomics soundness bug also introduced a different bug, which will lead to the 2nd .2 patch release.


#9

I had this idea when I was reading this thread: introduce new features to stable behind their unstable feature flags until the next major release. If no bugs are found, the feature flag would be removed with the next release.

This could encourage experimentation because features are clearly advertised as complete, and if some problem is found, the code that started to use the feature is pretty new and thus more easily changed. Also, feature flags would not pile up in stable, since each one effectively lasts only one cycle.


#10

I also worry a bit about the consequences of the pressure to release Rust 2018 in the announced time frame. I think that an “it’s ready when it’s ready” process, à la C++ 20XX, could be a better approach to editions. In other words, editions could be defined not by a target year but by a set of target features, and only named after the release year.


#11

Those are both excellent points I was not at all consciously aware of, so I am a lot less concerned now. Maybe it’s worth touching on this in the public announcement the next time we have a patch release, since there are probably a lot of people who, like me, are wired to assume more patch releases means more instability.

So I guess “we need more eyes on nightly” is the only problem that might have actually gotten worse, since we’re doing so much better at keeping the ecosystem on stable. I’m not sure how to solve this without eroding the stability we’ve achieved.


#12

I’d like to link to what I said in u.r.l.o back when 1.27 was announced there: https://users.rust-lang.org/t/rust-1-27-0-is-out/18233/12

The gist is that soundness bugs are extra problematic in Rust due to the nature of how most Rust code is written. There are also “optics” around issues like this: they make the adoption and stability argument a bit weaker. I’m personally particularly bothered by this because I never liked match ergonomics to begin with - I’d rather the “bug budget” be spent on more needed features :slight_smile:.

I admire and respect all the work and effort being put into Rust - kudos for that to everyone involved - but I’d be lying if I said I didn’t share some of the stability and “features being rushed” feelings expressed by others.


#13

I wonder if all this is also related to the proportion of nightly vs stable users changing over time. This might be my own bias talking, but I would expect that the stable population increases quite a bit faster than the nightly population, so as that happens the amount of extra scrutiny that will come in once a feature hits stable will also increase. Similarly, it might also be true that crater runs will test a smaller sample of the whole population of Rust code over time.

I have similar concerns. It feels like there is high pressure on the core teams to deliver a whole lotta stuff over the next few months. The bugs found after stable release of match ergonomics indicate that features will get a lot more testing once they hit stable.

Isn’t the entire point of release trains not to try to cram features into a release, because a new one will always be just around the corner? We should try to avoid similar problems from having edition deadlines.


#14

More big features are being implemented at the same time (compared to e.g. last year), which is great and explains why there are more issues in absolute terms. So I don’t worry about stability that much.

However, I agree with the general sentiment that devs should not feel pressured by any hard deadlines.


#15

The lack of easy visibility into the status of these features bothers me somewhat, too. I asked a little about this in the tracking issue, but I think it would be nice if it were easier to see at a glance what questions are still unresolved and whether there are any opportunities for potential contributors to help out on a specific feature.


#16

All soundness bugs fixed in the point releases, except the one in 1.15.1, were due to logic errors inside the compiler. Linting unsafe wouldn’t have prevented any of these soundness bugs. I doubt there would be much benefit in just checking unsafe use in the standard library.


#17

To be clear, I’m not saying that those ideas would have fixed the soundness bugs as of late, but nonetheless, they may catch future ones of the 1.15.1 flavor. It’s not a silver bullet, but with a hail of bullets we can perhaps improve the situation.