Introduction (skip this)
I've (finally!) been trying to get fully up to speed with I/O safety in the context of Windows. Rather than have ad hoc discussions, I think it better to consolidate my thoughts here before adding more to various issues and PRs. So sorry if this is rambling, incomplete or if any details are incorrect. I'll try to get to some sort of concrete proposal by the end, but my main goal is to document the relevant points that have come up in conversations.
I'll start by going over how handles work in Windows and in particular std handles. I'm obviously going to be covering a bit of the same ground as the RFC but specifically from a Windows perspective. Please bear with me if any of this is very familiar to you.
Disclaimer: I've written this after discussions with others but the words herein are my own. As such I take full responsibility for errors, contradictions, omissions, misunderstandings, etc.
Background
Handles
A Windows handle is a pointer-sized value that is used to identify a kernel object. Think of them as being like a key in a HashMap. These handles can be created, destroyed, duplicated and optionally inherited by child processes.
Windows API functions that take or return a handle may also accept magic values in place of a handle. These are known as "pseudo handles". For example, the magic value -1 can mean "the current process" in some contexts. Pseudo handles always have the high bit set, which distinguishes them from real handles. Pseudo handles are static values so can't be created or destroyed, though a couple of pseudo handles can be turned into real handles by calling DuplicateHandle.
Neither a real handle nor a pseudo handle will be NULL. Some functions may have optional parameters where NULL means None and some functions that return handles may return NULL to indicate that an error occurred. However, this depends on the function. Some will return INVALID_HANDLE_VALUE, which confusingly has the same value (-1) as the current process pseudo handle.
At this point it's important to emphasise that the current process pseudo handle is not an I/O handle (in fact no pseudo handles currently are). Therefore there isn't a conflict if you know what kind of handle to expect. Or to put it another way: INVALID_HANDLE_VALUE is never valid for a file handle.
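To make the distinction concrete, here is a minimal sketch of how a raw value could be sorted into the three cases above when a file handle is expected. It models handles as plain isize values so it runs anywhere; the names RawValue, FileHandleValue and classify_file_handle are made up for illustration and are not part of any Windows or Rust API.

```rust
/// A raw handle value, modelled as a pointer-sized integer for this sketch.
type RawValue = isize;

const NULL_HANDLE: RawValue = 0;
/// Same bit pattern as the "current process" pseudo handle.
const INVALID_HANDLE_VALUE: RawValue = -1;

/// What a raw value means *when a file handle is expected* (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum FileHandleValue {
    /// NULL: never a real handle or a pseudo handle.
    Null,
    /// -1: never a valid file handle, even though it doubles as a pseudo handle elsewhere.
    Invalid,
    /// Anything else may be a file handle, but nothing here verifies that it is.
    Candidate(RawValue),
}

fn classify_file_handle(raw: RawValue) -> FileHandleValue {
    match raw {
        NULL_HANDLE => FileHandleValue::Null,
        INVALID_HANDLE_VALUE => FileHandleValue::Invalid,
        other => FileHandleValue::Candidate(other),
    }
}
```

The point of the Candidate variant is that the value alone can't tell you whether it refers to a file-like object. That knowledge has to come from context.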
Handle implementation notes
- Handles are not in fact a key into a HashMap. They're more like an index into an array. But either way it's an implementation detail that shouldn't be relied on.
- Kernel handles also have the high bit set. However, Rust's std targets are not designed for use in drivers or system components and generally expect to be running as a "normal" Windows API application, so I don't feel this is relevant here.
- I'm also ignoring other things that may be called "handles" but have nothing to do with the sorts of handles being discussed here.
Std I/O handles
You can get the std handles using GetStdHandle. E.g.
let handle = GetStdHandle(STD_INPUT_HANDLE);
The constant STD_INPUT_HANDLE is just a value used to select the handle to get; it's not a real handle itself. The other constants are STD_OUTPUT_HANDLE and STD_ERROR_HANDLE.
Std handles can similarly be set using SetStdHandle:
SetStdHandle(STD_INPUT_HANDLE, handle)
But more commonly they are set by the OS or the parent process when spawning a new process.
There are a few important things to note here:
- The std handles should normally be valid file-like handles (aside from the exceptions in the next two bullet points). However, this is not enforced. They can be any random value. And even if they are valid handles, they may not necessarily be handles to file-like objects. But they should be.
- It's fairly common for them to be set to NULL at startup, e.g. in GUI applications that are not attached to a console.
- Another common value is INVALID_HANDLE_VALUE. This can be the result of an error somewhere but Rust also uses it to mean "no handle" when spawning a new process.
Rust's std::io::std*
When using stdin, stdout or stderr, the Rust standard library handles the complexities as follows:
- It assumes any value except NULL and INVALID_HANDLE_VALUE is a valid I/O handle. Currently I do agree that it is safe to assume they're valid in this case. Or more correctly, it's unsafe to set them to something that's not an I/O handle.
- If the std handle is NULL or INVALID_HANDLE_VALUE it silently pretends any operations are successful (a kind of DIY /dev/null, if you will). This prevents panics when using, for example, println! in a GUI application.
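The second bullet can be sketched roughly as follows. This is not the standard library's actual code, just a portable model of the behaviour: StdStream and its open callback are hypothetical stand-ins, and raw values are modelled as isize.

```rust
use std::io::{self, Write};

const NULL_HANDLE: isize = 0;
const INVALID_HANDLE_VALUE: isize = -1;

/// Hypothetical stand-in for a std stream.
/// `inner` is `None` when the process has no usable std handle.
struct StdStream<W: Write> {
    inner: Option<W>,
}

impl<W: Write> StdStream<W> {
    /// `open` is a made-up callback that turns a raw value into a writer;
    /// it is only called when the value is neither NULL nor INVALID_HANDLE_VALUE.
    fn from_raw(raw: isize, open: impl FnOnce(isize) -> W) -> Self {
        let inner = match raw {
            NULL_HANDLE | INVALID_HANDLE_VALUE => None,
            other => Some(open(other)),
        };
        StdStream { inner }
    }
}

impl<W: Write> Write for StdStream<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        match &mut self.inner {
            Some(writer) => writer.write(buf),
            // No handle: claim the whole buffer was written, like /dev/null.
            None => Ok(buf.len()),
        }
    }

    fn flush(&mut self) -> io::Result<()> {
        match &mut self.inner {
            Some(writer) => writer.flush(),
            None => Ok(()),
        }
    }
}
```

With this shape, writing to a stream built from NULL or INVALID_HANDLE_VALUE reports success without touching any handle, which is why printing in a console-less GUI application doesn't panic.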
Additionally, safe code can assume std handles are never closed (i.e. if unsafe code closes a std handle then it has the responsibility to make sure it's not breaking anything by doing so). On Windows, if you close a handle then that handle value is free to be used by the next thing that creates a new handle. So closing std handles is unsafe, because something still using the old value may end up operating on an arbitrary, unrelated object.
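Here's a toy model of why a closed handle value is dangerous. It treats a handle as an index into an array of slots, which is only an illustration: HandleTable is invented for this sketch, and Windows makes no promise about which value gets reused or when. The table reuses the lowest free slot purely to make the effect deterministic.

```rust
/// Toy handle table: a handle is just an index into `slots`.
struct HandleTable {
    slots: Vec<Option<&'static str>>,
}

impl HandleTable {
    fn new() -> Self {
        HandleTable { slots: Vec::new() }
    }

    /// Stores `object` in the first free slot (or a new one) and
    /// returns the slot index as the handle value.
    fn create(&mut self, object: &'static str) -> usize {
        if let Some(index) = self.slots.iter().position(|slot| slot.is_none()) {
            self.slots[index] = Some(object);
            index
        } else {
            self.slots.push(Some(object));
            self.slots.len() - 1
        }
    }

    /// Frees the slot, making the handle value available for reuse.
    fn close(&mut self, handle: usize) {
        if let Some(slot) = self.slots.get_mut(handle) {
            *slot = None;
        }
    }

    /// Returns whatever object currently lives at `handle`, if any.
    fn lookup(&self, handle: usize) -> Option<&'static str> {
        self.slots.get(handle).copied().flatten()
    }
}
```

If some code creates a handle for "console output", another piece of code closes it, and a third creates a handle for an unrelated file, then in this model the first holder's stale value now resolves to the unrelated file. That is the "arbitrary object" problem in miniature.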
The std handles can also be changed during runtime (see SetStdHandle). I'm not sure but I think this is I/O safe so long as the old handles aren't closed (though of course it may not necessarily be a good idea). But if this is right then safe code can't assume that two calls to, e.g. stdout().as_raw_handle() will return the same value. I'm not sure if this is an issue or not in practice but I thought it worth keeping in mind. And I'm not suggesting the standard library expose an API for changing the handles (nor am I suggesting it shouldn't).
Outdated, see discussion
Proposals
I do not claim these as my own original ideas. These have been brought up in discussions on the RFC tracker and elsewhere. I'm merely attempting to consolidate them. They do however reflect my current thinking on the topic.
I/O safe handles should be I/O handles
Rust has traditionally allowed access to raw handles via traits like AsRawHandle. However, this does not encode the type of handle. For example, it's implemented for JoinHandle which is not an I/O handle.
So I think the new, I/O safe, types and traits from the RFC should make it clear they're not for any arbitrary handle but are specifically for file-like handles. They can then be free to act and optimize accordingly. And people can use them with less risk of misuse.
Maybe they also need a name that expresses this intent more clearly.
Try as I/O handle
Some types, like Stdout et al, may or may not have an actual handle behind them. In essence, getting the handle can fail. One way to express this would be to have try_ functions that return an io::Result instead of just returning the handle "raw".
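As a rough sketch of what such a try_ function might look like: the trait name TryAsRawIoHandle, the method name, the StdStream stand-in and the choice of ErrorKind::NotFound are all my own placeholders, not an existing or proposed standard library API. Raw values are modelled as isize so the example is portable.

```rust
use std::io;

type RawValue = isize;

const NULL_HANDLE: RawValue = 0;
const INVALID_HANDLE_VALUE: RawValue = -1;

/// Hypothetical trait; not part of the standard library.
trait TryAsRawIoHandle {
    /// Returns the raw I/O handle, or an error if there is none.
    fn try_as_raw_io_handle(&self) -> io::Result<RawValue>;
}

/// Stand-in for `Stdout` et al: it just remembers the raw std handle value.
struct StdStream {
    raw: RawValue,
}

impl TryAsRawIoHandle for StdStream {
    fn try_as_raw_io_handle(&self) -> io::Result<RawValue> {
        match self.raw {
            // Both values mean "no usable handle", so surface that as an error.
            NULL_HANDLE | INVALID_HANDLE_VALUE => Err(io::Error::new(
                io::ErrorKind::NotFound,
                "no std handle is attached to this stream",
            )),
            raw => Ok(raw),
        }
    }
}
```

The benefit over returning the value "raw" is that callers have to decide what to do when there is no handle, rather than accidentally passing NULL or INVALID_HANDLE_VALUE along.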
Raw handle backwards compatibility
In order to reinstate a safety assert and recover a niche in File, it's been suggested that stdin, stdout and stderr should return INVALID_HANDLE_VALUE when the real value is NULL. This also better reflects how the standard library itself behaves internally. Once a try function has been stabilized, people should be encouraged to use that instead so as to force handling of the failure case.
Further to this, all File functions should endeavour to ensure that INVALID_HANDLE_VALUE is never used for a File (even if only behind an &File reference). The exception would be from_raw_handle, which may need to allow it for backwards compatibility reasons. Again, use of from_raw_handle could be discouraged.
This may mean that File internally uses a slightly more relaxed type than the public I/O safe handle types.
In conclusion...
Others have discussed these topics in both public and private conversations. I mean this to be a consolidation and continuation of that discussion rather than a finalized proposal, though it may form the basis of one after feedback.
cc @sunfishcode