As @ryanavella mentioned, in many cases this requires a breaking change, or a new function that does the same thing but with a different error message.
Even if that were not an issue: which one should libraries choose? They would either have to select one of them or be generic over the error type, likely resulting in slower compilation and required opt-in on the library side.
At best it solves this specific case of developers wanting different behavior from the stdlib. Think about those who want to avoid std::fmt because of binary size (e.g. for embedded development). They currently need to manually avoid those functions, compile them anyway, and let LLVM optimize them away.
This might not always be necessary. We (kind of) already have 3 versions of the stdlib: core, alloc and std. Someone building with no_std does not need to recompile the stdlib because a preset of functionalities is already available as a precompiled artifact.
It is implemented as different crates rather than one crate with features (not sure whether that means each one has its own precompiled artifact), but I wouldn't be surprised if we end up with multiple precompiled presets of std that enable different features: for example the one we currently have (focused on performance), one for binary size (no fmt, abort instead of unwind), and one that prefers verbosity over performance (i.e. allocating in errors).
Whether this is practical is a different question, of course; building std might be easier (even though it needs the beta? compiler instead of stable).
The bulk of this discussion has been focused on allocation, but...
(e.g. many C errors of this type get confusing when you do not have permissions to one of their parent directories)
... for this particular subproblem I think the solution would also involve stdlib breaking the path into parts and attempting to open path fds (or some equivalent) for each intermediate directory in turn, right?
That seems like quite a heavy performance cost unrelated to allocation, and might also change the details of how the operation behaves if stdlib doesn't exactly match how the kernel parses paths.
Is there a different way to get information about which path step failed that I'm not thinking of?
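For concreteness, the prefix-walking approach could be sketched like this. This is a hypothetical helper (`first_failing_component` is an invented name, not a stdlib proposal), and as noted above a real implementation would need openat-style fds to avoid TOCTOU races and kernel/userspace path-parsing mismatches:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Walk each prefix of `path` and report the first component that
/// cannot be accessed. Sketch only: `fs::metadata` on each prefix is
/// racy and doesn't match kernel path resolution exactly.
fn first_failing_component(path: &Path) -> Option<(PathBuf, io::Error)> {
    let mut prefix = PathBuf::new();
    for comp in path.components() {
        prefix.push(comp);
        if let Err(e) = fs::metadata(&prefix) {
            return Some((prefix, e));
        }
    }
    None
}

fn main() {
    // The first missing prefix is the nonexistent directory itself.
    let (failing, err) =
        first_failing_component(Path::new("/definitely-missing-dir/file.txt")).unwrap();
    println!("failed at {:?}: {}", failing, err);
}
```

Even this naive version does one extra stat per component, which illustrates the performance cost mentioned above.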
But core and alloc are present in std. For targets with std, we only ship std, and even if you build #![no_std], all of std is still there. (That's why extern crate std works, or dependencies that use std, even when the top crate is #![no_std].) The only difference is in what symbols your crate knows how to link against, effectively.
A std profile which puts more information in io::Errors would require shipping a complete separate build of the stdlib, or at least duplicating all symbols that transitively do IO somewhere internally. It's doable, but far from ideal.
Bootstrap was reordered somewhat recently. rustc-stable-N is built from the beta-(N-1) toolchain (with a special magic environment variable to say "this is a toolchain build, allow #![feature] please"), and then rebuilt from the stable-N toolchain. std-stable-N is now always built with the stable-N compiler (with that same special environment flag enabling unstable nightly features).
So build-std "works" today without needing any separate toolchains present. The difficult part comes more from the stdlib using a custom build process; either Cargo needs to learn to orchestrate that build directly, or stdlib needs to reduce its usage of the custom build environment and use the same Cargo environment the rest of the ecosystem uses. The proper solution is likely some of both.
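For reference, the unstable flow for rebuilding std today can already be driven from Cargo config (nightly-only; requires the rust-src rustup component; key names as of recent nightlies):

```toml
# .cargo/config.toml — nightly-only, unstable
[unstable]
# Ask Cargo to rebuild these stdlib crates from source (rust-src component)
build-std = ["std", "panic_abort"]
# Optional: tweak how std itself is built, e.g. abort without unwind machinery
build-std-features = ["panic_immediate_abort"]

[build]
# build-std currently requires an explicit target
target = "x86_64-unknown-linux-gnu"
```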
I know this isn't exactly what @DragonDev1906 suggested, but rather than a compile-time feature flag it could be a runtime check of a static bool (linked in) or static UnsafeCell&lt;bool&gt; (initialized before main).
The primary performance hit would be an easily-predicted branch, which is dwarfed by syscall/IO overhead. Impact on the Rust runtime itself would be negligible too, compared to what currently runs before main.
Rust can't rely on before-main initialization, because the cdylib target where we don't control the entry point exists. An AtomicBool defaulting to current behavior could work, though, if we really want to avoid adding a new required runtime symbol.
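A minimal sketch of the AtomicBool idea in user code (the flag name and wrapper function are invented for illustration; the real proposal would put the check inside std itself, defaulting to current behavior):

```rust
use std::fs::File;
use std::io;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical global flag; defaults to today's behavior (no path).
static STORE_PATH_IN_IO_ERROR: AtomicBool = AtomicBool::new(false);

// Stand-in for what File::open could do internally.
fn open_with_context(path: &str) -> io::Result<File> {
    File::open(path).map_err(|e| {
        // The branch is easily predicted; the allocation only happens
        // on the error path, and only when opted in.
        if STORE_PATH_IN_IO_ERROR.load(Ordering::Relaxed) {
            io::Error::new(e.kind(), format!("{e}: {path}"))
        } else {
            e
        }
    })
}

fn main() {
    STORE_PATH_IN_IO_ERROR.store(true, Ordering::Relaxed);
    let err = open_with_context("/definitely-missing-dir/file.txt").unwrap_err();
    println!("{err}");
}
```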
Hmm, surely this isn't the whole story? How does a cdylib interact with those items in std::{env, io, os, process} which rely on lang_start having already run?
They don't rely on it. std::env::args() can return an empty list when lang_start didn't run, depending on the OS (only some OSes provide a way to get the arguments other than having them passed to the main entrypoint), but everything else has no dependency on lang_start at all.
That is... very surprising. Is that worth documenting?
The std::env::args docs only have this to say regarding cdylibs:
On glibc Linux systems, arguments are retrieved by placing a function in .init_array. glibc passes argc, argv, and envp to functions in .init_array, as a non-standard extension. This allows std::env::args to work even in a cdylib or staticlib, as it does on macOS and Windows.
At first blush, I thought this was saying that only Linux needed a workaround, and that other targets were fine. I read it as: "as it does on macOS and Windows [and all other non-Linux targets not worth enumerating here]."
That paragraph of the docs has way too much emphasis on "here's our clever trick for how we made this work in cdylibs when glibc provides the dynamic loader,"[1] which is not gonna be relevant for most people. I suggest it should be rewritten entirely, along the lines of:
Some platforms only give a program access to its command line arguments via the function arguments of C main. Therefore, std::env::args() may fail when called from a cdylib or staticlib crate.
At present, Rust's stdlib is only able to access the command line arguments from cdylib and staticlib crates when the target OS is Windows, macOS, or Linux with GNU libc.
Linux-the-kernel doesn't appear to have anything to do with it; the platform feature being relied on is entirely implemented within glibc. If Rust becomes usable on Hurd+glibc in the future, I expect it'll work there too. ↩︎
We should document our use of .init_array the same way we document the implementation in some other areas. The vast majority of people won't need to know, but there are edge cases where it won't work.
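For the curious, the mechanism in question looks roughly like this (a simplified sketch of the technique, not std's actual code; linux-gnu only, and the static names here are invented):

```rust
use std::sync::atomic::{AtomicI32, Ordering};

// glibc passes argc/argv/envp to functions in .init_array as a
// non-standard extension, so even a cdylib with no control over
// `main` can capture the arguments before it runs.
static CAPTURED_ARGC: AtomicI32 = AtomicI32::new(-1);

#[cfg(all(target_os = "linux", target_env = "gnu"))]
#[used]
#[link_section = ".init_array"]
static ARGV_INIT: extern "C" fn(i32, *const *const u8, *const *const u8) = {
    extern "C" fn capture(argc: i32, _argv: *const *const u8, _envp: *const *const u8) {
        // Real std stashes argv itself; we just record argc here.
        CAPTURED_ARGC.store(argc, Ordering::Relaxed);
    }
    capture
};

fn main() {
    // On glibc Linux, `capture` already ran before main.
    println!("captured argc = {}", CAPTURED_ARGC.load(Ordering::Relaxed));
}
```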
Intentionally, or just because nobody has submitted patches for other platforms yet? On FreeBSD, OpenBSD, and NetBSD I believe you can get them through the kern.proc.args sysctl.
On Linux with musl you should in theory be able to read /proc/self/cmdline too, but I'm not sure how that interacts with argument splitting (I assume you have to do that yourself).
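The splitting is indeed manual: /proc/self/cmdline stores the raw arguments separated by NUL bytes. A Linux-only sketch (works on both glibc and musl; `args_from_procfs` is an invented name):

```rust
use std::fs;

// Read and split the NUL-separated argument list from procfs.
fn args_from_procfs() -> std::io::Result<Vec<String>> {
    let raw = fs::read("/proc/self/cmdline")?;
    Ok(raw
        .split(|&b| b == 0)
        .filter(|s| !s.is_empty()) // the list is NUL-terminated too
        .map(|s| String::from_utf8_lossy(s).into_owned())
        .collect())
}

fn main() {
    if let Ok(args) = args_from_procfs() {
        // args[0] is the program path, as with regular argv.
        println!("{args:?}");
    }
}
```

Note that no shell-style re-splitting is needed: the kernel records each argv entry verbatim, so the only parsing required is on the NUL separators.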
I think the path should be stored (even at a significant cost) in io::Error by default for File::open/create and fs::read.
For the rare weird edge case where for some bizarre reason failure is the common path and is common enough to be a performance bottleneck, OpenOptions can gain an option to skip storing the path. Or users can call lower-level syscalls themselves (or use a crate for that).
To me it's weird that the standard library optimizes first and foremost for a rare hypothetical case (with, AFAIK, no convincing real-world need) while ignoring the common, obvious deficiency that is a chronic, highly visible problem plaguing Rust programs, and about which users complain over and over again. It's taking imaginary users seriously and dismissing the real ones!
Let's flip the default and have more useful io::Error until authors of high-frequency filesystem-failing code show up and complain as much as sufferers of unhelpful errors do now. It's always possible to add open_with_cryptic_error() functions for such use cases, if a real program needing them is ever discovered.
This is likely to be the best we can do with the current error-handling ecosystem, but if we could go back in time and change the error-handling rules from the start, I suspect the correct approach would be for errors to indicate the problematic piece of input by reference to how the input was obtained. For example, File::open says "my filename argument caused the problem", then when the error bubbles up to the caller, it looks at how it calculated that argument and adjusts the error to refer to that, e.g. "this particular line of the config file that I read the filename from caused the problem". Just storing the path loses information – and although it's often possible to reconstruct that information by grepping for the path, that isn't necessarily the case.
The problem, of course, is that this sort of bubbling can't be done automatically because it depends too much on the intent of the caller.
The anyhow concept of a chain of contexts comes close to this. Unfortunately it becomes very cumbersome to add such ad-hoc info unless you fully opt into one of the type-erased error crates (anyhow, eyre, possibly others). If you want more custom errors (like thiserror), there is no good way to also add additional free-form info "on the side" that is useful for logging/debugging (but not useful to the API consumer in determining how to programmatically handle the error).
There's a fundamental tension between providing lots of information for a human operator (debugging, logging) and providing good errors that an API user can act on programmatically. In the former case I want all the details. In the latter case I only really care about "should I retry or abort?". I don't want dozens of error variants that all boil down to "well, I'd better abort because I can't handle this".
The ideal error handling approach for me would be one where I could have typed errors but attach additional context on the side that doesn't go into the main enum. Thankfully people are still experimenting with new ways of doing error handling in Rust, I see a new error handling crate every few months on the subreddit. So hopefully we will get something better eventually.
In an ideal world, I'd love to see this handled through a generic error type with a default. (Handling the default ergonomically would require some additional type-system features we don't have.)
Imagine if, when File::open encountered an error, it constructed whatever error type the user desired using a trait-based construction mechanism, and called a trait method to attach moderately expensive context to it. The default implementation of the trait method could discard the context, so that monomorphization will result in dead-code elimination and the caller of File::open will pay zero cost. A non-default implementation of the trait method, such as one for an error type that favors more detailed error messages at the expense of having to allocate, could save the context and display it when rendering the error.
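A rough sketch of that shape (all trait and type names are invented for illustration; a real design would need additional type-system support to make the defaults ergonomic):

```rust
use std::fmt;
use std::path::{Path, PathBuf};

// Hypothetical construction trait: errors opt in to expensive context.
trait FromIoError: Sized {
    fn from_raw(err: std::io::Error) -> Self;
    // Default discards the context, so after monomorphization a caller
    // that uses it pays nothing.
    fn attach_path(self, _path: &Path) -> Self {
        self
    }
}

// A "cheap" error that keeps today's behavior.
#[allow(dead_code)]
struct CheapError(std::io::Error);
impl FromIoError for CheapError {
    fn from_raw(err: std::io::Error) -> Self {
        CheapError(err)
    }
}

// A "verbose" error that allocates to store the path.
struct VerboseError {
    err: std::io::Error,
    path: Option<PathBuf>,
}
impl FromIoError for VerboseError {
    fn from_raw(err: std::io::Error) -> Self {
        VerboseError { err, path: None }
    }
    fn attach_path(mut self, path: &Path) -> Self {
        self.path = Some(path.to_owned());
        self
    }
}
impl fmt::Display for VerboseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.path {
            Some(p) => write!(f, "{} ({})", self.err, p.display()),
            None => write!(f, "{}", self.err),
        }
    }
}

// Stand-in for a hypothetical generic File::open.
fn open_generic<E: FromIoError>(path: &Path) -> Result<std::fs::File, E> {
    std::fs::File::open(path).map_err(|e| E::from_raw(e).attach_path(path))
}

fn main() {
    // Cheap caller: the attach_path call compiles away.
    let _cheap: CheapError = open_generic(Path::new("/definitely-missing-dir/f")).unwrap_err();
    // Verbose caller: allocates to keep the path for the message.
    let verbose: VerboseError = open_generic(Path::new("/definitely-missing-dir/f")).unwrap_err();
    println!("{verbose}");
}
```

The missing piece, as noted, is a defaulted error type parameter so that plain `File::open(...)?` keeps working without turbofish annotations.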
We could argue about what the defaults should be, but leaving that aside, if the above system worked ergonomically, and had reasonable defaults so that File::open(...)? worked as expected, then I think that would work for both types of use cases.