Making certain annotations optional for non-public items


#1

I often see questions along the lines of “Why do I need to explicitly write foo here, when it’s obvious the code won’t work without fooing this?” to which the usual explanation is something like “It’s important that foo appear in the function signature/type declaration because it’s part of that type’s/function’s public API”. But for types/functions that are not in any way public, this doesn’t apply.

So I wanted to ask if it seems reasonable to consider proposals to allow downgrading the lack of certain annotations on non-public items from a hard error to a mere warning. The motivation is that, although these annotations should remain mandatory in public APIs, they tend to interfere with prototyping, experimenting and refactoring. Also, while newcomers will eventually have to learn what all of these annotations do and why they exist, turning them into mere “speed bumps” in the form of warnings often seems preferable to a hard error that forces them to go learn about a certain feature/annotation before allowing them to get any further work done.

Some concrete examples to clarify what I’m talking about:

  • #[derive(Debug)]. If all I want to do is a one-off println!("{:?}", x); for debugging purposes, it seems reasonable to auto-derive Debug and emit a warning, as long as x is not in any way public, especially since this is a “viral annotation” that may have to be temporarily added to several types just to make that println! statement compile. (The same argument would apply to Display, although the standard library has no #[derive(Display)] today, so Display currently has to be implemented by hand.)

  • Lifetime parameters in structs. If I want to experiment with changing struct Foo{ x: T } to struct Foo{ x: &T }, I have to temporarily add an <'a> to a lot of the code that mentions this type. This is also “viral”, and leaving the lifetime out would likewise be a bad idea if Foo is public.

  • Unused type/lifetime parameters. I didn’t even realize until very recently that there was a good reason we force users to wrap unused parameters in PhantomData, i.e. allowing the compiler to figure out the correct variance for you. But we could also default to invariance and print a warning saying that you should use PhantomData and it might allow more code to compile if you do.

  • Orphan rule exemptions. Not really an “annotation”, and I feel like I’m missing some really obvious reason why this can’t possibly work, but it seems like I should be able to do things like impl Serialize for Vec<Arc<MyStruct>> as long as MyStruct is not in any way public. Of course, it should still generate a warning so that I know I’m effectively making MyStruct perma-private by doing this.
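For reference, here is a minimal sketch of what the first three annotations look like in today's Rust. The type names (Inner, Outer, Borrowed, Tagged, Meters) and the helper debug_outer are hypothetical and only serve as illustration; they do not come from any real crate.

```rust
use std::fmt::Debug;
use std::marker::PhantomData;

// Debug has to be derived on every type nested inside `Outer`;
// this is the "viral" effect described in the first bullet.
#[derive(Debug)]
struct Inner {
    value: u32,
}

#[derive(Debug)]
struct Outer {
    inner: Inner,
}

// Storing a reference forces a lifetime parameter on the struct
// and on every impl block that mentions it.
#[derive(Debug)]
struct Borrowed<'a, T> {
    x: &'a T,
}

impl<'a, T: Debug> Borrowed<'a, T> {
    fn describe(&self) -> String {
        format!("{:?}", self.x)
    }
}

// An otherwise unused type parameter must be mentioned through
// PhantomData; without the marker field the struct is rejected (E0392).
struct Tagged<Unit> {
    amount: f64,
    _unit: PhantomData<Unit>,
}

// Hypothetical marker type used only as a type argument.
struct Meters;

fn debug_outer() -> String {
    format!("{:?}", Outer { inner: Inner { value: 7 } })
}

fn main() {
    println!("{}", debug_outer());

    let n = 5;
    println!("{}", Borrowed { x: &n }.describe());

    let distance: Tagged<Meters> = Tagged { amount: 1.5, _unit: PhantomData };
    println!("{}", distance.amount);
}
```

Removing either derive on Inner or Outer, the <'a> on Borrowed, or the _unit field on Tagged turns this into a hard error today, which is the behavior the proposal would soften to a warning for private items.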

I’m not necessarily proposing we implement any of these examples. I’m mostly interested in how everyone feels about the general principle of downgrading certain errors into warnings for types/functions that are not in any way public, since I see a lot of potential for improving ergonomics this way, assuming it’s not wildly unsafe to relax these rules.

When I say “not in any way public”, I’m trying to handwave away whatever problems would be caused by private-in-public rules. Unfortunately I don’t understand those rules very well.


#2

Reading through the examples, I get the impression that there are two potential motivations for lifting such restrictions:

Cases where the loosened restriction is simply more ergonomic or enables new patterns.

I welcome such proposals (in principle), but I don’t think they necessarily need to be tied to the type being private (e.g. lifetime elision for structs), nor should they be framed primarily as removing a roadblock for beginners or quick-and-dirty prototypes. As was pointed out several times during discussions of the “Improve ergonomics” item of the road map, many ergonomic improvements can benefit everyone, not just beginners.

Cases where the looser restriction merely allows one to be more sloppy.

Loosening them only for private items feels like a way to “contain” that sloppiness. While that can certainly spare beginners some frustration, it also has negative effects which must be weighed: these liberties will not only be used by beginners in their first weeks, and private code is still code whose quality matters. Then consider how frustrating it is to make just one thing public and have to iteratively add pubs everywhere else because of private-in-public errors. Now imagine having to re-architect your entire code because it turns out a certain impl is fundamentally incompatible with adding the API you want.

Furthermore, as with many “easy mode” proposals, they merely delay having to deal with the problem — in contrast with pure ergonomics improvements, which likewise help beginners but keep being positive as they progress. That is not to say these changes are useless, of course (the other extreme, dumping the entire language on someone when they write their first program, is clearly unacceptable), I’m just saying we need to look at how the shape of the entire learning curve changes. That is, if a feature is adopted,

  • how much further can a beginner get without facing down the roadblock it addresses?
  • does that make it more complicated to teach the rest of the language later, or make it more likely beginners later have to unlearn something?

Regarding the first point, I’d like to point out that quite a few beginners start to write (or port) a library as a learning project, and libraries have plenty of public items. So while there may be fewer instances of a roadblock overall (as most libraries have some private types), it may still hit the same people at around the same time.


#3

There’s no relationship between the orphan rules and your type being public. The impl you describe would conflict with the right of serde to add impl<T> Serialize for Vec<T> (an impl it in fact already has).

Coherence applies whether your types are public or not.
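To make that concrete, here is a small self-contained sketch. It uses std::fmt::Display as a stand-in for serde's Serialize (an assumption made only so the example needs no external crate); MyStruct follows the name in the first post, while MyStructs and render are hypothetical helpers.

```rust
use std::fmt;
use std::sync::Arc;

// A private, crate-local type, as in the first post.
struct MyStruct {
    id: u32,
}

// Rejected today with E0117 even though `MyStruct` is private:
// neither the trait (`Display`) nor the outermost type (`Vec`)
// is defined in this crate.
//
// impl fmt::Display for Vec<Arc<MyStruct>> { /* ... */ }

// The usual workaround: wrap the foreign type in a local newtype,
// which this crate is allowed to implement foreign traits for.
struct MyStructs(Vec<Arc<MyStruct>>);

impl fmt::Display for MyStructs {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let ids: Vec<String> = self.0.iter().map(|s| s.id.to_string()).collect();
        write!(f, "[{}]", ids.join(", "))
    }
}

// Hypothetical helper that builds the wrapper and formats it.
fn render(ids: &[u32]) -> String {
    let items: Vec<Arc<MyStruct>> = ids.iter().map(|&id| Arc::new(MyStruct { id })).collect();
    MyStructs(items).to_string()
}

fn main() {
    println!("{}", render(&[1, 2, 3]));
}
```

The newtype costs some boilerplate, but it keeps the upstream crate free to add or change its own blanket impl for Vec<T> without breaking this code.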

I’m not in principle against any of these, but other than the second point about lifetime params, I don’t think the complexity/convenience trade-off favors changing things.


#4

The reason to be explicit about things (or perhaps a better way to put it is: to have redundancy) is that it allows the compiler to check multiple things that have to align. In a sense, this is like “checks and balances”. For example, it’s impossible to communicate intent with only a single action. People usually communicate intent by verbally stating it along with an action that likely aligns with the statement.

That said, if everything in Rust (or life) required an explicit statement of intent with every action, things would be unmanageable. Requiring explicitness is beneficial when the cost of doing so is low and the benefit of verifying the intent is high. Function signatures are a perfect example of this (at least for functions which are not trying to do lots of metaprogramming).

An interesting side effect of requiring multiple matching data points in a language is that when they don’t match, error messages end up teaching people about the requirements of the language.

One place that I think Rust gets the tradeoff right is with lifetime elision. This is mostly because the compiler can verify that the code inside the function actually obeys the assumed intent. Therefore, the common cases are handled, while complex situations end up resulting in errors when the behavior of the code implementing the function doesn’t match its assumed lifetimes. The interesting thing here is that it works well because the compiler can rely on the function body to detect when the user’s intent and the intent assumed by the elision rules mismatch. Without that, I don’t think it would work as well.
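A short sketch of that interplay, using hypothetical names (first_word, Label, text, text_or): the elided signatures compile as long as the body agrees with the assumed lifetimes, and the commented-out variant shows a body that disagrees.

```rust
// One reference input: elision assumes the output borrows from it,
// i.e. fn first_word<'a>(s: &'a str) -> &'a str.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

struct Label {
    text: String,
}

impl Label {
    // With `&self` present, elision ties the returned reference to `self`.
    // The body agrees with that assumption, so this compiles.
    fn text(&self, _fallback: &str) -> &str {
        &self.text
    }

    // Rejected if uncommented: the body returns `fallback`, but the elided
    // signature promises a reference that borrows from `self`.
    //
    // fn text_or(&self, fallback: &str) -> &str { fallback }

    // Stating the intent explicitly makes signature and body line up again.
    fn text_or<'a>(&'a self, fallback: &'a str) -> &'a str {
        if self.text.is_empty() { fallback } else { &self.text }
    }
}

fn main() {
    println!("{}", first_word("hello world"));

    let empty = Label { text: String::new() };
    println!("{}", empty.text_or("untitled"));

    let named = Label { text: String::from("draft") };
    println!("{}", named.text("untitled"));
}
```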

One place I think Rust gets the tradeoff wrong is with derefs and all of the various rules around them, which I still don’t understand, in part because the compiler always does it for me and therefore I can’t learn by getting it wrong. I remember Niko mentioning this as a ‘success’ because before, he just typed *s between the & and the expression. Unfortunately, what is missing now is that users never have to learn when derefs are required. Adding *s never transitions from a learning experience to a chore; the learning part just never happens. Perhaps if I understood the rules, I’d think that autoderef was awesome, but I don’t know how I’d learn them without Niko (or some other ‘old-timer’) writing a blog post, or without reading the compiler source code.
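As a partial illustration only (this is not the full rule set), the sketch below contrasts three situations: method calls, where the receiver is dereferenced automatically; function arguments, where &String coerces to &str; and compound assignment through a mutable reference, where the * still has to be written. The function names lengths, shout and bump are hypothetical.

```rust
// Method calls: the receiver is auto-dereferenced until a matching
// method is found, so both forms below call String::len.
fn lengths(s: &&String) -> (usize, usize) {
    let implicit = s.len();
    let explicit = (**s).len();
    (implicit, explicit)
}

fn shout(s: &str) -> String {
    s.to_uppercase()
}

// Compound assignment does not auto-deref: `counter += 1` would be
// rejected here, so the explicit `*` is still required.
fn bump(counter: &mut u32) {
    *counter += 1;
}

fn main() {
    let owned = String::from("hi");
    let r: &String = &owned;

    println!("{:?}", lengths(&r));

    // Deref coercion at a function argument: &String is accepted where
    // &str is expected. The second call spells out the same steps by
    // hand: *r is String, **r is str, &**r is &str.
    println!("{}", shout(r));
    println!("{}", shout(&**r));

    let mut count = 0;
    bump(&mut count);
    println!("{}", count);
}
```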

In conclusion, just because you have to type seemingly redundant information doesn’t mean that doing so is useless. And just because errors are annoying, doesn’t mean that they don’t provide value to users.