RFC Mentoring Opportunity: Permit `?` in `main()`

If you've ever had any interest in writing an RFC, I've got a proposal for you. =) The recent release of tokio -- exciting! -- has reminded me about a nagging issue that I think we should try to fix. It's not a big thing, but I think it's an important thing, and it fits squarely into the roadmap's goals around "quality of life" improvements in ergonomics. Take a look at this example from tokio's front page. Do you see anything wrong?

fn main() {
    // Create the event loop that will drive this server
    let mut core = Core::new().unwrap();
    let handle = core.handle();

    // Bind the server's socket
    let addr = "127.0.0.1:12345".parse().unwrap();
    let sock = TcpListener::bind(&addr, &handle).unwrap();
    ...
}

Elegant abstractions? Check. Easy-case-made-easy? Check. But what are all these calls to unwrap() doing here? As sidlls on Reddit put it:

My view is that sample code such as this should be as idiomatic as possible and that means providing a sample demonstrating a typical real use case. So seeing "unwrap" in this context doesn't sit well with me.

I agree! So let's do something about it. This message serves a few purposes. First, it's an advertisement for a mentoring opportunity: I'd love to find someone who wants to work on this RFC with me (and others on the lang-team). Second, it's a vague proposal, making the case for this change and laying out a few different ways we could go about it.

What is the problem exactly?

There is a real danger, I think, that people will copy-and-paste code from these examples, which use unwrap(), into their own functions, where unwrap() is quite likely the wrong thing to do. Accumulated over many projects, this can lead to a lot more panicking and a lot less correct error handling. This applies not only to front-pages, but also to rustdoc documentation, where the use of unwrap() is sadly rampant. It would be so much nicer if it were possible to write example code that used ? instead of unwrap(). As a side benefit, the example code would be more concise and attractive anyhow:

fn main() {
    // Create the event loop that will drive this server
    let mut core = Core::new()?;
    let handle = core.handle();

    // Bind the server's socket
    let addr = "127.0.0.1:12345".parse()?;
    let sock = TcpListener::bind(&addr, &handle)?;
    ...
}

Of course, the code above will not compile. The main() function has a return type of (), but the ? operator wants to propagate the result, and so it expects a function whose return type is Result<(), E> for some E (Box<Error> would likely be a good choice here). But if I try to change the type of main() to match, that will not compile either:

fn main() -> Result<(), Box<Error>> {
    let mut core = Core::new()?;
    ...
}

Now the problem is that main() is required by the language to have the return type (). This makes sense, on the one hand: what would we do with the return value? But it's also an obstacle here.
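For comparison, here is a minimal sketch of the usual workaround under the rules described above: move the fallible code into a helper that returns a Result, and let main() decide what to do with the error. The helper name run() and the exit code 1 are assumptions made for illustration, and the sketch only parses the address from the tokio example rather than binding a socket. It is written with the `Box<dyn Error>` spelling.

```rust
use std::error::Error;
use std::net::SocketAddr;
use std::process;

// Hypothetical helper: all fallible work lives here, so `?` is allowed.
fn run(addr_text: &str) -> Result<SocketAddr, Box<dyn Error>> {
    // A parse failure is converted into Box<dyn Error> and returned to the caller.
    let addr: SocketAddr = addr_text.parse()?;
    Ok(addr)
}

fn main() {
    // main() still returns (), so the error has to be handled by hand.
    match run("127.0.0.1:12345") {
        Ok(addr) => println!("would bind to {}", addr),
        Err(e) => {
            eprintln!("error: {}", e);
            process::exit(1);
        }
    }
}
```

This works, but it is exactly the kind of extra ceremony that front-page examples tend to skip in favor of unwrap().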

But what about non-novice users? Do they benefit?

So far, I have framed this question as being all about novice users who don't know that they should be translating unwrap() calls. But I believe solving this problem would benefit all Rust users at any skill level. If nothing else, it's annoying to have to translate unwrap() calls when copying-and-pasting code from examples (not just examples on the front page; similar concerns apply to many bits of Rustdoc). More deeply, it just simplifies our lives if ? is the uniform way of saying "propagate this error up higher" -- in the case of main(), that means propagating the error up to the standard runtime, which can "handle it" by printing it out to the screen (or, perhaps, in some way that you as a user can customize, read on).

How can we solve this?

I don't have a fully developed solution to this problem. I see two basic approaches to solving it:

  1. Allow main() to return more kinds of things (in particular, results).
  2. Change what ? means when it is used in main().

If you were interested in working on the RFC, you could help decide which one of those makes the most sense. Hint, hint. =) Anyway, let's look at them in a bit more detail.

Allow main() to return more kinds of things

We saw earlier that if main could return a Result<(), Box<Error>>, then we could use ? within the body in a pretty natural way. We might generalize this concept by saying that main() must return some type T that implements a new trait; let's call it Terminate:

trait Terminate {
    fn terminate(value: Self);
}

The standard library (which calls main()) will now call Terminate::terminate(main()) instead. So the terminate method basically defines what happens when main() terminates by returning a value of that type. For example, we would define terminate for () as a no-op:

impl Terminate for () {
    fn terminate(_: ()) { }
}

The standard library could define Terminate for Result<(), Box<Error>> as something like:

impl Terminate for Result<(), Box<Error>> {
    fn terminate(error: Self) {
        error.unwrap();
    }
}

Users could, if they wanted, define their own terminating types. (This, combined with the ? operator supporting a customizable trait, might allow people to customize their error handling more readily as well.) This same trait could be used when spawning threads, so that thread::spawn() could be updated to take a closure which returns some type T where T: Terminate.

This would require defining the trait Terminate (probably in libcore) and making it a lang-item (so that the compiler can check that main() returns a type which satisfies it). But it's otherwise a fairly simple change.
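To make the shape of this concrete, here is a rough, self-contained sketch of the idea that compiles as an ordinary program. Everything outside the trait itself is an assumption for illustration: run_main() is a hypothetical stand-in for the runtime code that calls main(), user_main() plays the role of the user's main(), the `Sized` bound is added only to keep the sketch simple, and the Result impl panics with a message instead of calling unwrap(). It uses the `Box<dyn Error>` spelling.

```rust
use std::error::Error;

// The trait from the proposal (the Sized bound is an assumption of this sketch).
trait Terminate: Sized {
    fn terminate(value: Self);
}

// Returning () means "nothing to report".
impl Terminate for () {
    fn terminate(_: ()) {}
}

// Returning an error means "report it and fail"; here that is a panic.
impl Terminate for Result<(), Box<dyn Error>> {
    fn terminate(result: Self) {
        if let Err(e) = result {
            panic!("main returned an error: {}", e);
        }
    }
}

// Hypothetical stand-in for the runtime: call the entry point, then
// hand whatever it returned to Terminate::terminate.
fn run_main<T: Terminate, F: FnOnce() -> T>(entry: F) {
    T::terminate(entry());
}

// Plays the role of the user's main(); `?` works because it returns a Result.
fn user_main() -> Result<(), Box<dyn Error>> {
    let port: u16 = "12345".parse()?;
    println!("would bind to port {}", port);
    Ok(())
}

fn main() {
    run_main(user_main);
}
```

The same run_main() accepts an entry point that returns (), which is what keeps existing programs working unchanged under this approach.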

One downside is that our example code now needs a more complex signature for main(). It'd be nice to be able to write something more concise -- or maybe just fn main(), as we write now. That brings us to the next thought.

We could also change how the question mark operator works

UPDATE: Everybody agrees this is a bad idea. Let's not do it. =)

I'm not sure if I like this idea, but it's also a possibility to say that ?, if used in the main() function (which is well-known to rustc) and outside the scope of any catch, would desugar differently. Instead of desugaring to something which returns the error, it would desugar to something that invokes a handler in the standard library:

// <expr>? becomes:
match <expr> {
  Ok(v) => v,
  Err(e) => ::std::unhandled_error(e), // or whatever
}

Where unhandled_error() probably looks like:

fn unhandled_error<E: Error>(e: E) -> ! { ... }

The upside of this is that the return type of main() can remain (). The downside is that the return type of main() remains (). In other words, the example code becomes less representative of what users would have to write in their own code, since unless they too are copying-and-pasting this example code into main(), their function would need (and want) a Result return type. So I'm inclined to say this is a bad alternative. But I thought I'd throw it out there.
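For the curious, the alternative desugaring can be written out by hand in ordinary Rust. This is only a sketch under assumptions: unhandled_error() here is a local hypothetical function (nothing like it exists in std), it is assumed to diverge so that the match arms type-check, and it panics rather than exiting so the behavior is easy to observe.

```rust
use std::fmt::Display;

// Hypothetical handler. It must never return, hence the `!` return type.
fn unhandled_error<E: Display>(e: E) -> ! {
    panic!("unhandled error: {}", e)
}

// Hand-written version of what `text.parse::<u16>()?` would expand to
// under this alternative. Note that the function itself returns a plain u16.
fn parse_port(text: &str) -> u16 {
    match text.parse::<u16>() {
        Ok(v) => v,
        Err(e) => unhandled_error(e),
    }
}

fn main() {
    println!("port = {}", parse_port("12345"));
}
```

Seen this way, the objection is easy to spot: parse_port() has no Result in its signature, so code written in this style cannot be pasted into a function that is supposed to propagate errors.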

Conclusions

We should solve this problem. It is eminently solvable and will benefit everyone, in some small way. Is anyone interested?

As for the proposals in this message, the first is probably better: we can consider addressing the wordy return type separately -- at worst with a type alias in the standard library, e.g., std::Fallible<()> -- or maybe it is not such a problem. We should also look into how to make rustdoc permit ? (if it doesn't already; that problem seems a bit easier).

As far as I recall rustdoc just wraps examples into a fn main() with some extra prelude, so the second approach should "just work". It might be possible to easily have the first approach work by changing that to fn main() -> Result<(), Box<Error>> if the example contains the ? character.

I actually think the rustdoc example test cases are the best motivator for this. I have many, many examples where I've used unwrap just because doing anything else is too much effort.

Right. This is roughly what I had in mind. =) We can probably make it customizable via some flags, too, e.g., writing "```?\n...\n```" or something like that.
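A toy version of that wrapping step might look like the following. To be clear, this is not rustdoc's actual code: wrap_example() is a hypothetical function, and checking for a literal ? character is the crude heuristic suggested above (it would also trigger on a ? inside a string or comment).

```rust
// Hypothetical sketch of the wrapping step; not rustdoc's real implementation.
fn wrap_example(body: &str) -> String {
    if body.contains('?') {
        // The example uses `?`, so give the generated main a Result return
        // type and append the Ok(()) the author did not write.
        format!(
            "fn main() -> Result<(), Box<dyn std::error::Error>> {{\n{}\nOk(())\n}}",
            body
        )
    } else {
        // No `?` in sight: keep the plain wrapper.
        format!("fn main() {{\n{}\n}}", body)
    }
}

fn main() {
    println!("{}", wrap_example("let n: u16 = \"12345\".parse()?;"));
}
```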

Alternative: implement catch and write

fn main() {
    catch {
        let mut core = Core::new()?;
        ...
    };
}

in examples (and rustdoc internally).

fn main() {
    catch {
        let mut core = Core::new()?;
        ...
    }.unwrap();
}

surely. Otherwise where does the error go?
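Until something like catch exists, the same scoping can be approximated with an immediately invoked closure. A small sketch, assuming nothing beyond the standard library (the function name report() is made up for illustration); it also makes the "where does the error go?" question explicit, because the closure's Result has to be consumed somehow:

```rust
use std::num::ParseIntError;

// The closure plays the role of `catch { ... }`: `?` returns from the
// closure, not from report(), so report() itself can return a plain value.
fn report(a: &str, b: &str) -> String {
    let outcome = (|| -> Result<i32, ParseIntError> {
        let x: i32 = a.parse()?;
        let y: i32 = b.parse()?;
        Ok(x + y)
    })();

    // The error goes here: it must be unwrapped, matched on, or dropped.
    match outcome {
        Ok(total) => format!("total = {}", total),
        Err(e) => format!("error: {}", e),
    }
}

fn main() {
    println!("{}", report("40", "2"));
}
```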

Thanks for writing this up @nikomatsakis! I think this is one of the highest-profile warts in Rust right now. It hugely affects public perception. Whoever fixes it will be a great hero.

This is an interesting thought. I think I prefer to implement catch{ } and have ? work in main (and threads, rustdoc, etc) – this way you don’t risk people being confused that they should echo the catch { ... } pattern in their own code.

The advantage I see of writing

fn main() -> Result<(), Box<Error>> {
    do_stuff()?;
    Ok(())
}

is that it is exactly what people actually ought to write in their own functions.

I’d sort of like to see an alias for Result<T, Box<Error>>, like Throws<T>, in the prelude, so that we can write

fn main() -> Throws<()> {
}

but one thing at a time. And I guess that name is bad (it sounds like it throws a () value, even overlooking the use of the word “throw”).
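Such an alias is easy to try out locally today. A quick sketch, with the alias defined in the program itself rather than in the prelude, and with a made-up parse_port() function to show it in a signature:

```rust
use std::error::Error;

// Local stand-in for the proposed prelude alias.
type Throws<T> = Result<T, Box<dyn Error>>;

// The alias hides the boxed error type but `?` works as usual.
fn parse_port(text: &str) -> Throws<u16> {
    let port = text.parse::<u16>()?;
    Ok(port)
}

fn main() {
    match parse_port("12345") {
        Ok(port) => println!("port = {}", port),
        Err(e) => eprintln!("error: {}", e),
    }
}
```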

RFC issues discussing this:

https://github.com/rust-lang/rfcs/issues/1176

https://github.com/rust-lang/rust/issues/35946

So no matter what approach, yes please. I’m not a novice and I frequently complain about this.

That being said, and I will likely fall in the minority, in that I think main should be magical (also sorry it already is) and should work with ? or not, without changing the type signature.

Here’s my reasons:

  1. It already is special. The compiler specifically complains about it under certain conditions.
  2. Having the user type main with Result will be esoterica to a novice, and equally magical as a ? working in main. I know what a result type is, and you know what it is and its generic parameters, and a box, but will a novice just wanting to run some code in main? A box, what’s that? Why am I typing this? Imho this is exactly the same as Java’s public static void main(stuff I forgot) - it’s stuff new people type to appease the compiler god(s), which gets explained later on. If they come to the point where they’re like hey, how come main doesn’t need a different type, well that’s awesome, they’ve progressed that far in their understanding of the language, and we can say oh, the compiler treats main specially to make our lives less burdensome, isn’t programming great?
  3. For non novices typing the signature is an extra burden for no real gain. You want to provide type documentation on main? I literally don’t care :slight_smile: . Main is special in Rust. So what. It’s special in almost every language ever, because well, the entry to a program is kind of special. I can live with main being an exception.

That being said, I kind of really like the idea of the Terminate trait…

Oh, and lastly, while we’re at it, please, please do the same thing for #[test]s. I find that substantially more annoying because I write way more of them than main functions…

C11 explicitly allows either int main(void) or int main(int argc, char* argv[]), but it’s not magic – you can only access arguments if you declare the latter form. So if we follow C’s example (:astonished:), we would allow different main signatures but still require it to be explicit. And in general, function signatures are one place where Rust prefers explicitness.

Those are parameters, not return types, which is what is being discussed here.

And yes, as I said, I will be in the minority in thinking that annotating main is not worthwhile, as the Rust majority prefers this kind of explicitness.

And we can argue about magicalness all day long but main in C is highly complicated - it’s not even the program entry point on most platforms, a host of things happen before it’s even called, etc.

My only point is that most programmers understand there is something special about the main function in most languages, and most languages treat it specially.

Someday impl Trait will land, then perhaps we can just write: fn main() -> impl Terminate. Not completely explicit, but still clear to the type system and consistent with how other functions are written.

FWIW that's also true in Rust, with C stuff and more, e.g. a backtrace with linux-gnu is like:

  • crate::main
  • panic_unwind::__rust_maybe_catch_panic
  • std::panicking::try
  • std::panic::catch_unwind
  • std::rt::lang_start
  • __libc_start_main
  • _start

Maybe Unwraps? fn main() -> Unwraps<()> {}

Ah, this is a good point! Yes, I agree.

I know, this is my point that it's special... But good point about impl return that will be awesome!

I guess an error would result in a fail? I do like the current semantics where test failure means panic. Having two ways to signal failure feels kinda unclean. I actually can't say I've ever had this desire, and never considered unwraps in tests to be a problem. Each one is a testable assertion, which is fine.

Not sure which proposal is better but shouldn’t Terminate::terminate return !?

Possibly we could combine the two proposals by having main run Terminate::terminate on every return statement in main.

Not if Self is Ok. (A question we should all ponder from time to time...)

I don’t follow, terminate is run after main returns, why shouldn’t it diverge if main returns Ok?

Ah, you mean like std::process::exit(0)?
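One way to picture a diverging terminate is to split it in two: a step that maps the returned value to an exit code, and a step that always ends the process with that code, even for Ok. The sketch below is only an illustration of that reading. The exit_code() method, the codes 0 and 1, the `Sized` bound, and user_main() are all assumptions, not part of the proposal.

```rust
use std::error::Error;
use std::process;

trait Terminate: Sized {
    // Assumed helper: map the value returned from main to an exit code.
    fn exit_code(self) -> i32;

    // Diverging variant of terminate: always ends the process, so a
    // successful return becomes exit(0).
    fn terminate(value: Self) -> ! {
        process::exit(value.exit_code())
    }
}

impl Terminate for () {
    fn exit_code(self) -> i32 {
        0
    }
}

impl Terminate for Result<(), Box<dyn Error>> {
    fn exit_code(self) -> i32 {
        match self {
            Ok(()) => 0,
            Err(e) => {
                eprintln!("error: {}", e);
                1
            }
        }
    }
}

// Plays the role of the user's main().
fn user_main() -> Result<(), Box<dyn Error>> {
    let _port: u16 = "12345".parse()?;
    Ok(())
}

fn main() {
    Terminate::terminate(user_main())
}
```

Keeping the mapping separate from the exit call also makes the behavior testable, since exit_code() can be called without ending the process.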
