RFC Mentoring Opportunity: Permit `?` in `main()`

@withoutboats I could imagine a scheme where, if we allow ? in main but main keeps the type signature fn(), everything could work out.

That is, today ? is analogous to:

match e {
    Ok(e) => e,
    Err(e) => return Err(e)
}
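
To make the expansion above concrete, here is a small sketch. The function names are made up for illustration. One detail the snippet above leaves out is that the real ? also passes the error through From::from, so the hand-written version below includes that call.

```rust
use std::num::ParseIntError;

// Hypothetical helper: doubles a number parsed from text, using `?`.
fn double_with_question_mark(s: &str) -> Result<i32, ParseIntError> {
    let n = s.trim().parse::<i32>()?;
    Ok(n * 2)
}

// The same logic with `?` written out by hand.
fn double_with_match(s: &str) -> Result<i32, ParseIntError> {
    let n = match s.trim().parse::<i32>() {
        Ok(v) => v,
        // `?` converts the error with `From::from` before returning it.
        Err(e) => return Err(From::from(e)),
    };
    Ok(n * 2)
}

fn main() {
    println!("{:?}", double_with_question_mark("21"));
    println!("{:?}", double_with_match("21"));
    println!("{:?}", double_with_question_mark("not a number"));
    println!("{:?}", double_with_match("not a number"));
}
```

Both functions should return the same value for any input, which is the point of the analogy.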

We could just define it in the main function as:

match e {
    Ok(e) => e,
    Err(e) => std::rt::main_error(e),
}

where main_error is something like fn main_error<E: Error>(e: E) -> ! (or something like that)

In that sense the behavior of ? changes, but the type signature of main remains the same (and faithful to the actual type).

Edit: Sorry if this is a bit rambly; it’s hot and I only just woke up. Brain might not be in completely working order just yet.

I think it would be a mistake to have ? behave differently in main. All that does is trade the possibility that some copy+paste coding will work for having to explain that the behaviour of ? is context-dependent, and for the behaviour of ? actually being context-dependent.

If this is principally based around helping new users and people who are copy+pasting code from examples without understanding what’s going on, and then being confused when it doesn’t work, then I don’t think any of the proposals actually address that.

First of all, let’s say we allow main to have a different signature that permits errors to be returned. Given a user who is blindly copy+pasting code, or trying to write error handling without understanding how error handling works, how do they know they need to change the signature of main? This is almost the same as just using a try_main function and having main call that: it only works if you know to do it.

If we give main special, magical properties (which includes giving ? magical properties inside main), we now have a new problem: the moment the user tries to move code that works in main to some other function, it’ll stop working. At that point, we’re back to square one again. Plus, we now need to explain this magic behaviour, and users need to remember it.

Any solution to this needs to be global and consistent across all functions: it cannot just affect main. It also has to be something that requires no knowledge or action from new users.

Given that, I can only think of one solution to the novice user problem (that doesn’t involve completely changing how error handling works): add a diagnostic that specifically detects “tried to use ? in a function without a Result return type” and tells the user what to do. As in, it flat-out tells the user the return type they should use, as accurately as possible. The user should be able to copy+paste from the error message into their code. This should work everywhere, no matter the function, not just main.

At that point, having main allow for Result<(), _> as a return type is a logical next step, so that users can act on what the diagnostic tells them. I also think it should be Result: even if the compiler internally uses some trait to wire this up, we should probably only ever mention Result to new users, or it might confuse matters (“so I can return ExitStatus from main instead of Result, why can’t I do that for other functions?”).

That said, I think the important part is the diagnostic. Changing the return status of main is a nice little improvement, but it only improves one function, and only helps in the specific case of trying to blindly copy error handling code into main as opposed to any other function the user may be writing.

Changing the return type of fn main also allows an easier and more automatic mapping to a zero/non-zero exit code.

Currently, if an error is handled manually (e.g. with match), the user may call println! (which writes to stdout instead of stderr, by the way) but forget to call ::std::process::exit(1).

I do think requiring users to annotate main with a new return type is not a bad idea.

Assuming we go that route, we should make sure that using ? in a normal main() function prints a very clear error message saying they need to add “-> Fallible”.

So, if you are referring to the alternative proposal that I mentioned, where the meaning of ? changes in main(), then there is no semantic problem here, but it would mean that if main() used ?, then it would be equivalent to unwrap(). But this example does seem to illustrate one of the reasons why having ? mean a different thing in main() might be confusing -- though I think the bigger concern (to me) would be that if somebody copied and pasted the body of main() into another fn that has the same signature, they would get errors in that other function, and that feels surprising. It'd be different if we didn't write fn main() { } but rather main { } or some other distinct item syntax.

OK, so, we all agree it seems that having ? work differently in main() is a bad idea. Forget that I suggested it. =) Let’s focus instead on just permitting main() to have distinct signatures. I would personally be in favor of some type alias like Fallible in that case, probably defined like so:

use std::error::Error;
type Fallible<T = ()> = Result<T, Box<Error>>;

This way you can write -> Fallible or -> Fallible<T> and both work out just fine. This seems like it would be useful not only in main but just more broadly. One quibble I have is that the typical convention that I see is that people use Result<T> for such aliases (e.g., io::Result<T>, the Result<T> type introduced by @brson’s excellent error-chain crate, etc). But I think that convention is sort of confusing and I would argue that introducing a Fallible<T> alias is a superior convention. =)
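
As a rough sketch of how both spellings of the alias would read in practice (the two functions are hypothetical, and the alias is written with dyn, which current compilers expect for trait objects):

```rust
use std::error::Error;

// The alias from the post, written with `dyn` for current compilers.
type Fallible<T = ()> = Result<T, Box<dyn Error>>;

// Hypothetical function: sums whitespace-separated integers.
fn sum_numbers(text: &str) -> Fallible<i64> {
    let mut total = 0;
    for word in text.split_whitespace() {
        // A ParseIntError is boxed automatically by `?`.
        total += word.parse::<i64>()?;
    }
    Ok(total)
}

// With no parameter, `Fallible` means `Result<(), Box<dyn Error>>`.
fn check_positive(text: &str) -> Fallible {
    if sum_numbers(text)? <= 0 {
        return Err("sum must be positive".into());
    }
    Ok(())
}

fn main() {
    println!("{:?}", sum_numbers("1 2 3"));
    println!("{:?}", check_positive("-5 2").map_err(|e| e.to_string()));
}
```

The default parameter is what lets the second signature drop the angle brackets entirely.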

The idea I like is defining a new trait and having main be allowed to return any type that implements that trait. std could define it for () and Result<(), Box<Error>>. For most users, this would mean they can return that result, but frameworks or large applications might define a custom implementation of this trait.

Adding the Fallible alias seems like a fine idea but orthogonal to this change.

What I somehow don’t like about the ‘Fallible’ name is, that it puts the focus on the error case and reading ‘Fallible’ might seem like it fails with ‘T’.

It would have been nice to have from the beginning: =)

pub enum Result<T, E = Box<Error>> { Ok(T), Err(E), }

Hm, would this even be a breaking change?

I believe that’s a non-starter because Result is in core, but Box isn’t. You could have std redefine Result, but then core::..::Result and std::..::Result would be different types.

E = Box<Error> is also not a very good default in the typical case where you can simply return a proper error type directly without pushing it onto the heap. If we did that, we'd end up having to teach everyone that when they want to do "proper error handling" they have to stop using that default.

It does make sense for main() specifically to box its error type by default because that makes it straightforward for all types implementing Error to be compatible with ? in main(), and that cost will only be paid once at the end of a program. That's part of the reason I support a separate type name for what main() returns; it is different from your usual Result type, and for good reason.
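
A minimal sketch of why a boxed error type makes ? accept unrelated error types in one function. The function parse_port is hypothetical and stands in for a main that returns a boxed error; Box<dyn Error> is the current spelling of Box<Error>.

```rust
use std::error::Error;
use std::str;

// Hypothetical stand-in for a `main` that returns a boxed error.
fn parse_port(raw: &[u8]) -> Result<u16, Box<dyn Error>> {
    // Fails with Utf8Error on invalid UTF-8.
    let text = str::from_utf8(raw)?;
    // Fails with ParseIntError on non-numeric text.
    let port = text.trim().parse::<u16>()?;
    Ok(port)
}

fn main() {
    let inputs: [&[u8]; 3] = [b"8080", b"eighty", &[0xff, 0xfe]];
    for raw in inputs.iter() {
        match parse_port(raw) {
            Ok(port) => println!("port {}", port),
            Err(e) => println!("error: {}", e),
        }
    }
}
```

Each ? converts a different concrete error into the same boxed type, and the allocation only happens on the failure path.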

What I somehow don't like about the 'Fallible' name is, that it puts the focus on the error case and reading 'Fallible' might seem like it fails with 'T'.

I agree with this regarding Fallible<T>, and it's one of the reasons I prefer the version with just Fallible and by default no explicit type parameters. The other reason is that T has to be () anyway, so there's not much point forcing the user to type () after it.

Agreed.

Probably Result<(), T> where T: Error but yes.

I agree it's orthogonal but it seems important to reaching the final goal of pretty examples on the front page. =) Without it though we can at least get examples that show off best practices, I guess.

I’m thinking that Fallible might or might not be a good idea (or a good name) but it is not that important. For most examples, after all, there may already be an existing Result alias that one could use. e.g., if the error is an io error, one could use the existing io::Result<T> alias:

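use std::fs::File;
use std::io::{self, Write};

fn main() -> io::Result<()> {
    // File::create opens the file for writing; File::open would be read-only.
    let mut f = File::create("foo")?;
    f.write_all(b"Hello, world!")?;
    Ok(())
}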

Now one might argue that io::Result should default its argument to (), but that seems like a minor thing.

An alternative to Throws<T> was needed. The nice thing about Fallible, I think, is its two literal meanings:

  • Liable to err.
  • Liable to be erroneous (e.g. fallible information).

The first definition matches the imperative case where T=(), i.e. a fallible function. The second definition fits a type-theoretic perspective, in which Fallible<T> is a T that you need to check for errors, analogous to Option<T>.

I'm thinking that Fallible might or might not be a good idea (or a good name) but it is not that important.

If this is mostly about newbies, the documentation and simple code examples, then the naming seems quite important.

For most examples, after all, there may already be an existing Result alias that one could use. e.g., if the error is an io error, one could use the existing io::Result alias:

Especially because this pattern will be almost everywhere (an alias whose name contains Result), it would be nice if simple code examples followed it as much as possible and didn't use a completely different name.

My idea would be to add this to the prelude:

pub fn run<E: Error>() -> Result<(), E> {
    Ok(main())
}

The runtime would call this function instead of main.

  • If main() is used, everything works as expected.
  • If run() -> Result<(), E> is defined, it overrides the default one.

The main “disadvantage” would be having to teach people to define the run() function instead of main(). This would also break existing code that defines a run() function in main.rs.

edited to use ToExitStatus:

pub fn run() -> Result<(), impl ToExitStatus> {
    Result::<(), std::io::Error>::Ok(main())
}

Right now, to write Unix CLI utilities in Rust, you wind up doing something like this:

use std::io::{self, Write};
use std::process;

fn inner_main() -> Result<(), HLError> {
    let args = parse_cmdline()?;
    // all the real work here
    Ok(())
}

fn main() {
    process::exit(match inner_main() {
        Ok(_) => 0,
        Err(ref e) => {
            // report on stderr, then exit nonzero
            writeln!(io::stderr(), "{}", e).unwrap();
            1
        }
    });
}

So I like the fn main () -> something_Result_ish proposal, because it basically paves this cowpath.

However. It is very important for this use case that returning an Err from main does not trigger a panic. It normally would not represent a bug, and in some cases, it needs to produce no output other than the exit code –

$ grep -q root /etc/passwd ; echo $?
0
$ grep -q notthere /etc/passwd ; echo $?
1
$ grep -q notthere /etc/shadow ; echo $?
grep: /etc/shadow: permission denied
2

I would agree to sebk’s suggestion, except that run is a [C-T] polymorphic function (E isn’t even defined in this case). This highlights one of the major differences between Rust’s strict-type error handling and C++'s exceptions (conventions but no actual rules on the return type).

zackw: If you want to reimplement a Unix utility with exact handling of errors, I think you need to do something like this anyway.

IMO this is more about lazily written utilities and example code doing something sensible without any explicit error handling (other than the ?).

dhardy: I don’t see why we couldn’t find a sensible default behavior that is compatible with doing the Right Thing for CLI utilities. Thinking out loud, there are two cases that are common enough that I think libstd should support them:

  • Successful unless an I/O error occurred. Corresponds directly to Result<()>; the runtime should map Ok to exit code 0, and Err(e) to exit code 1 + print the Display of e to stderr.
  • grep-like: three-way distinction (yes, no, I/O error). Result<bool> can represent this; Ok(true) maps to exit 0, Ok(false) to exit 1, and Err(e) to exit 2 + print the Display of e to stderr.

Anything more complicated than that probably does need to be handled by the application, but ideally “handled by the application” would mean “the application implements a trait for the type it’s going to return from main, defining both the mapping to exit status and what, if anything, should be written to stderr for each case.” Hypothetically:

use std::fmt::Display;
use std::io::Write;

trait ToExitStatus {
    fn exit_status(&self) -> u32;
    // Write is a trait, so the stream is taken as a trait object.
    fn report_failure(&self, stream: &mut dyn Write);
}

// libstd provides:
impl<E: Display> ToExitStatus for Result<(), E> {
    fn exit_status(&self) -> u32 { match self { Ok(_) => 0, Err(_) => 1 } }
    fn report_failure(&self, stream: &mut dyn Write) {
        if let Err(ref e) = self {
            writeln!(stream, "{}", e).unwrap();
        }
    }
}

impl<E: Display> ToExitStatus for Result<bool, E> {
    fn exit_status(&self) -> u32 {
        match self { Ok(true) => 0, Ok(false) => 1, Err(_) => 2 }
    }
    fn report_failure(&self, stream: &mut dyn Write) {
        if let Err(ref e) = self {
            writeln!(stream, "{}", e).unwrap();
        }
    }
}

// for compatibility only;
// documentation warns that ? will not work in `main` if it returns ()
impl ToExitStatus for () {
    fn exit_status(&self) -> u32 { 0 }
    fn report_failure(&self, _stream: &mut dyn Write) { }
}
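
To show how a runtime might consume such a trait, here is a hedged sketch. The trait is restated (taking the stream as &mut dyn Write) so the example compiles on its own; finish and contains_line are hypothetical names, not proposed API. The grep-like case is the one exercised.

```rust
use std::fmt::Display;
use std::io::Write;

// The hypothetical trait, restated so this sketch is self-contained.
trait ToExitStatus {
    fn exit_status(&self) -> u32;
    fn report_failure(&self, stream: &mut dyn Write);
}

// grep-like mapping: found = 0, not found = 1, error = 2.
impl<E: Display> ToExitStatus for Result<bool, E> {
    fn exit_status(&self) -> u32 {
        match self { Ok(true) => 0, Ok(false) => 1, Err(_) => 2 }
    }
    fn report_failure(&self, stream: &mut dyn Write) {
        if let Err(e) = self {
            writeln!(stream, "{}", e).unwrap();
        }
    }
}

// Hypothetical runtime hook: report any failure, then yield the exit code.
fn finish<T: ToExitStatus>(outcome: T, stderr: &mut dyn Write) -> u32 {
    outcome.report_failure(stderr);
    outcome.exit_status()
}

// Hypothetical grep-like search used to drive the example.
fn contains_line(haystack: &str, needle: &str) -> Result<bool, String> {
    if needle.is_empty() {
        return Err("empty pattern".to_string());
    }
    Ok(haystack.lines().any(|line| line.contains(needle)))
}

fn main() {
    // A byte buffer stands in for stderr so the output can be inspected.
    let mut err: Vec<u8> = Vec::new();
    let code = finish(contains_line("root:x:0:0", "root"), &mut err);
    println!("exit code {}", code);
}
```

Note that a "no match" result produces exit code 1 with nothing written to the stream, which is the quiet-failure behavior the grep example calls for.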