I/O Safety: Windows edition

Introduction (skip this)

I've (finally!) been trying to get fully up to speed with I/O safety in the context of Windows. Rather than have ad hoc discussions, I think it's better to consolidate my thoughts here before adding more to various issues and PRs. So sorry if this is rambling, incomplete or any details are incorrect. I'll try to get to some sort of concrete proposal by the end but my main goal is to try to document the relevant points that have come up in conversations.

I'll start by going over how handles work in Windows and in particular std handles. I'm obviously going to be covering a bit of the same ground as the RFC but specifically from a Windows perspective. Please bear with me if any of this is very familiar to you.

Disclaimer: I've written this after discussions with others but the words herein are my own. As such I take full responsibility for errors, contradictions, omissions, misunderstandings, etc.

Background

Handles

A Windows handle is a pointer-sized value that is used to identify a kernel object. Think of them as being like a key in a HashMap. These handles can be created, destroyed, duplicated and optionally be inherited by child processes.

Windows API functions that take or return a handle may also accept magic values in place of a handle. These are known as "pseudo handles". For example, the magic value -1 can mean "the current process" in some contexts. Pseudo handles always have the high bit set, which distinguishes them from real handles. Pseudo handles are static values so can't be created or destroyed, though a couple of pseudo handles can be turned into real handles by calling DuplicateHandle.

Neither a real handle nor a pseudo handle will be NULL. Some functions may have optional parameters where NULL means None and some functions that return handles may return NULL to indicate that an error occurred. However, this depends on the function. Some will return INVALID_HANDLE_VALUE, which confusingly has the same value (-1) as the current process pseudo handle.

At this point it's important to emphasise that the current process pseudo handle is not an I/O handle (in fact no pseudo handles currently are). Therefore there isn't a conflict if you know what kind of handle to expect. Or to put it another way: INVALID_HANDLE_VALUE is never valid for a file handle.
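To make the distinction concrete, here is a minimal sketch of how a raw value could be classified when only an I/O handle makes sense. The names RawValue, Classified and classify_io are made up for this example, and the handle is modelled as a plain isize so the snippet builds on any platform; real code would work with RawHandle.

```rust
/// Hypothetical stand-in for a raw handle value (a pointer-sized integer).
type RawValue = isize;

/// INVALID_HANDLE_VALUE is -1, the same bits as the current-process
/// pseudo handle.
const INVALID_HANDLE_VALUE: RawValue = -1;

#[derive(Debug, PartialEq)]
enum Classified {
    /// NULL: never a real handle or a pseudo handle.
    Null,
    /// -1: an error sentinel whenever an I/O handle is expected.
    Invalid,
    /// Anything else: assumed to be a handle, though this check cannot
    /// prove that it is open or file-like.
    Candidate(RawValue),
}

/// Classify a value in a context where only an I/O handle makes sense.
fn classify_io(raw: RawValue) -> Classified {
    match raw {
        0 => Classified::Null,
        INVALID_HANDLE_VALUE => Classified::Invalid,
        other => Classified::Candidate(other),
    }
}
```

The point is that -1 is only unambiguous because the caller already knows it is expecting an I/O handle; the same check would be wrong for an API that accepts a process handle.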

Handle implementation notes
  • Handles are not in fact keys into a HashMap. They're more like an index into an array. But either way it's an implementation detail that shouldn't be relied on.
  • Kernel handles also have the high bit set. However, Rust's std targets are not designed for use in drivers or system components and generally expect to be running as a "normal" Windows API application so I don't feel this is relevant here.
  • I'm also ignoring other things that may be called "handles" but have nothing to do with the sorts of handles being discussed here.

Std I/O handles

You can get the std handles using GetStdHandle. E.g.

let handle = GetStdHandle(STD_INPUT_HANDLE);

The constant STD_INPUT_HANDLE is just a value used to select the handle to get; it's not a real handle itself. The other constants are STD_OUTPUT_HANDLE and STD_ERROR_HANDLE.

Std handles can similarly be set using SetStdHandle:

SetStdHandle(STD_INPUT_HANDLE, handle);

But more commonly they are set by the OS or the parent process when spawning a new process.

There are a few important things to note here:

  • The std handles should normally be valid file-like handles (aside from the exceptions in the next two bullet points). However, this is not enforced. They can be any random value. And even if they are valid handles, they may not necessarily be handles to file-like objects. But they should be.
  • It's fairly common for them to be set to NULL at startup, e.g. in GUI applications that are not attached to a console.
  • Another common value is INVALID_HANDLE_VALUE. This can be the result of an error somewhere but Rust also uses it to mean "no handle" when spawning a new process.

Rust's std::io::std*

When using stdin, stdout or stderr, the Rust standard library handles the complexities as follows:

  • It assumes any value except NULL and INVALID_HANDLE_VALUE is a valid I/O handle. Currently I do agree that it is safe to assume they're valid in this case. Or more correctly, it's unsafe to set them to something that's not an I/O handle.
  • If the std handle is NULL or INVALID_HANDLE_VALUE it silently pretends any operations are successful (a kind of DIY /dev/null, if you will). This prevents panics when using, for example, println! in a GUI application.
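The "DIY /dev/null" behaviour can be sketched roughly as follows. This is a simplified model, not the standard library's actual code: write_or_discard and the write_fn callback are hypothetical names, and the handle is modelled as an isize so the example is portable.

```rust
use std::io;

/// Hypothetical stand-in for a raw handle value.
type RawValue = isize;
const INVALID_HANDLE_VALUE: RawValue = -1;

/// Write `buf` through `write_fn`, unless no std handle is attached, in
/// which case report success without touching anything.
fn write_or_discard<F>(handle: RawValue, buf: &[u8], write_fn: F) -> io::Result<usize>
where
    F: FnOnce(RawValue, &[u8]) -> io::Result<usize>,
{
    if handle == 0 || handle == INVALID_HANDLE_VALUE {
        // No handle attached: pretend the whole buffer was written.
        return Ok(buf.len());
    }
    // Any other value is assumed to be a valid I/O handle.
    write_fn(handle, buf)
}
```

With this shape, println! in a GUI application "succeeds" even though nothing is written, which is exactly why callers can end up unaware that the handle is missing.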

Additionally, safe code can assume std handles are never closed (i.e. if unsafe code closes a std handle then it has the responsibility to make sure it's not breaking anything by doing so). On Windows, if you close a handle then that handle value is free to be used by the next thing that creates a new handle. So closing std handles is unsafe because something already using them may end up using an arbitrary object.
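The reuse hazard can be illustrated with a toy handle table. To be clear, this is not how Windows implements handles (that's an implementation detail, as noted earlier); HandleTable and its methods are invented purely to show a stale value ending up pointing at a different object.

```rust
/// Toy model of a per-process handle table, for illustration only.
struct HandleTable {
    slots: Vec<Option<&'static str>>,
}

impl HandleTable {
    fn new() -> Self {
        HandleTable { slots: Vec::new() }
    }

    /// Store an object in the first free slot; the index is the "handle".
    fn create(&mut self, object: &'static str) -> usize {
        if let Some(free) = self.slots.iter().position(|slot| slot.is_none()) {
            self.slots[free] = Some(object);
            free
        } else {
            self.slots.push(Some(object));
            self.slots.len() - 1
        }
    }

    /// Free the slot; the same value may now be handed out again.
    fn close(&mut self, handle: usize) {
        self.slots[handle] = None;
    }

    /// Look up whatever the handle value currently refers to.
    fn get(&self, handle: usize) -> Option<&'static str> {
        self.slots.get(handle).copied().flatten()
    }
}
```

If code holding a cached "stdout" value keeps writing after that slot has been closed and reused for, say, a newly opened file, the writes land in the file. That's the I/O safety violation.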

The std handles can also be changed during runtime (see SetStdHandle). I'm not sure but I think this is I/O safe so long as the old handles aren't closed (though of course it may not necessarily be a good idea). But if this is right then safe code can't assume that two calls to, e.g. stdout().as_raw_handle() will return the same value. I'm not sure if this is an issue or not in practice but I thought it worth keeping in mind. And I'm not suggesting the standard library expose an API for changing the handles (nor am I suggesting it shouldn't).


Outdated, see discussion

Proposals

I do not claim these as my own original ideas. These have been brought up in discussions on the RFC tracker and elsewhere. I'm merely attempting to consolidate them. They do however reflect my current thinking on the topic.

I/O safe handles should be I/O handles

Rust has traditionally allowed access to raw handles via traits like AsRawHandle. However, this does not encode the type of handle. For example, it's implemented for JoinHandle which is not an I/O handle.

So I think the new, I/O safe, types and traits from the RFC should make it clear they're not for any arbitrary handle but are specifically for file-like handles. They can then be free to act and optimize accordingly. And people can use them with less risk of misuse.

Maybe they also need a name that expresses this intent more clearly.

Try as I/O handle

Some types, like Stdout et al, may or may not have an actual handle behind them. In essence, getting the handle can fail. One way to express this would be to have try_ functions that return an io::Result instead of just returning the handle "raw".

Raw handle backwards compatibility

In order to reinstate a safety assert and recover a niche in File, it's been suggested that stdin, stdout and stderr should return INVALID_HANDLE_VALUE when the real value is NULL. This also better reflects how the standard library itself behaves internally. Once a try function has been stabilized, people should be encouraged to use that instead so as to force handling of the failure case.
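A rough sketch of that normalization, assuming handles are modelled as *mut c_void. normalize_std_handle and niche_is_free are hypothetical helpers, not existing std functions; NonNull stands in for whatever non-null representation File would use internally.

```rust
use std::ffi::c_void;
use std::mem::size_of;
use std::ptr::NonNull;

/// -1 cast to a pointer, matching the Windows definition.
const INVALID_HANDLE_VALUE: *mut c_void = -1isize as *mut c_void;

/// Map NULL to INVALID_HANDLE_VALUE so callers only ever see one
/// "no handle" sentinel and the result fits in a non-null type.
fn normalize_std_handle(raw: *mut c_void) -> NonNull<c_void> {
    NonNull::new(raw).unwrap_or_else(|| {
        // INVALID_HANDLE_VALUE is non-null, so this cannot fail.
        NonNull::new(INVALID_HANDLE_VALUE).expect("INVALID_HANDLE_VALUE is non-null")
    })
}

/// The niche being recovered: an Option of a non-null handle is still
/// pointer-sized, because NULL encodes None.
fn niche_is_free() -> bool {
    size_of::<Option<NonNull<c_void>>>() == size_of::<*mut c_void>()
}
```

The trade-off is that the returned value can still be INVALID_HANDLE_VALUE, so this only helps with the niche and the assert; callers that care about a missing handle still need the fallible path.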

Further to this, all File functions should endeavour to ensure that INVALID_HANDLE_VALUE is never used for a File (even if only behind an &File reference). The exception would be from_raw_handle, which may need to allow it for backwards compatibility reasons. Again, use of from_raw_handle could be discouraged.

This may mean that File internally uses a slightly more relaxed type than the public I/O safe handle types.

In conclusion...

Others have discussed these topics in both public and private conversations. I mean this to be a consolidation and continuation of that discussion rather than a finalized proposal, though it may form the basis of one after feedback.

cc @sunfishcode

This seems like a reasonable approach.

How confident are you that:

  1. there is no other kind of "hold this handle for me and give it back to me later" API in Windows that permits NULL/INVALID_HANDLE_VALUE other than SetStdHandle/GetStdHandle, and
  2. no other such API is likely to be added in the future?

If this is the only such API, then I do think we could make the safe handle types just not support NULL or INVALID_HANDLE_VALUE, use exclusively raw handles for stdio, offer FFI-safe types that express "valid handle | NullHandle | InvalidHandle" or similar, and have the non-raw safe handle functions on stdio handles return that type. That said, that'll mean stdio types can't automatically work with APIs that accept "anything that can be turned into an I/O safe handle".

Hm, I do think that there should be an IoHandle type that is only for valid handle values. Because otherwise it can end up being necessary to defensively check (and re-check) on every use. Which may not always happen, especially if downstream users of the API don't realise that failure is a possibility (as has happened with stdio types). So there should be something that can be put into an Option or Result which means "I promise this is a file-like handle".

That said, there are indeed various FFI ways to pass around a "maybe handle, maybe not" value (e.g. it's fairly common in some callback functions) which are effectively similar to the Get/Set std handle case. And stdio types being able to work with APIs that accept "anything that can be turned into an I/O safe handle" is a use case that I think needs to be supported.

I think there could be a TryAsIoHandle trait that can be implemented for all I/O types, such as Stdout and File, and that can be used in more generic contexts (File would always return Ok). Also, an IoHandle::try_from(raw_handle) function would help people to implement this or to otherwise deal with raw handles.
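As a sketch of what IoHandle::try_from(raw_handle) might look like: IoHandle here is the hypothetical type from this discussion, not an existing std type, and the choice of io::Error with ErrorKind::InvalidInput is just an assumption for the example.

```rust
use std::convert::TryFrom;
use std::ffi::c_void;
use std::io;
use std::ptr::NonNull;

/// -1 cast to a pointer, matching the Windows definition.
const INVALID_HANDLE_VALUE: *mut c_void = -1isize as *mut c_void;

/// Hypothetical "I promise this is a file-like handle" type. The promise
/// can only be checked as far as ruling out the two sentinel values.
#[derive(Debug, Clone, Copy, PartialEq)]
struct IoHandle(NonNull<c_void>);

impl TryFrom<*mut c_void> for IoHandle {
    type Error = io::Error;

    fn try_from(raw: *mut c_void) -> io::Result<Self> {
        if raw == INVALID_HANDLE_VALUE {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "INVALID_HANDLE_VALUE is not an I/O handle",
            ));
        }
        // NonNull::new returns None for NULL, which becomes an error here.
        NonNull::new(raw)
            .map(IoHandle)
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "NULL is not an I/O handle"))
    }
}
```

A TryAsIoHandle implementation for Stdout could then be little more than calling GetStdHandle and passing the result through this conversion.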

So, to address your actual point, I'm envisioning that functions that wrap FFI calls should use raw handles in the call itself but immediately convert to/from the safe equivalents as close as possible to the FFI boundary, or at least at the public API boundary.

That said, an additional FFI safe "valid handle | NullHandle | InvalidHandle" type may be more convenient and less likely to be misused. Assuming valid handle has two niches, this would essentially be an FFI-safe enum, right? Or maybe a Result<IoHandle, NullOrInvalidHandle>.

I would love to see that enum constructed such that it naturally works in FFI. That plus a TryFrom impl would make it quite usable.
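One possible shape for this, as a sketch only: a repr(transparent) wrapper that has the same ABI as the raw pointer, plus a TryFrom impl into a validated type. MaybeHandle, ValidHandle, NoHandle and on_handle are all hypothetical names chosen for this example.

```rust
use std::convert::TryFrom;
use std::ffi::c_void;
use std::ptr::NonNull;

/// -1 cast to a pointer, matching the Windows definition.
const INVALID_HANDLE_VALUE: *mut c_void = -1isize as *mut c_void;

/// FFI-safe "maybe a handle" value. repr(transparent) gives it the same
/// ABI as the raw pointer, so it can appear directly in extern signatures.
#[repr(transparent)]
#[derive(Debug, Clone, Copy)]
struct MaybeHandle(*mut c_void);

/// A handle that is known to be neither NULL nor INVALID_HANDLE_VALUE.
#[derive(Debug, PartialEq)]
struct ValidHandle(NonNull<c_void>);

/// Which sentinel was found instead of a handle.
#[derive(Debug, PartialEq)]
enum NoHandle {
    Null,
    Invalid,
}

impl TryFrom<MaybeHandle> for ValidHandle {
    type Error = NoHandle;

    fn try_from(handle: MaybeHandle) -> Result<Self, NoHandle> {
        if handle.0 == INVALID_HANDLE_VALUE {
            Err(NoHandle::Invalid)
        } else {
            NonNull::new(handle.0).map(ValidHandle).ok_or(NoHandle::Null)
        }
    }
}

/// Example callback shape: the wrapper crosses the FFI boundary as-is and
/// is only validated on the Rust side.
extern "C" fn on_handle(handle: MaybeHandle) -> bool {
    ValidHandle::try_from(handle).is_ok()
}
```

This keeps the three states distinguishable without forcing them into Result or Option at the FFI boundary itself; the conversion happens only when safe code wants a real handle.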

I don't think we should try to fit it into the shape of Result or Option.

Thanks for helping think through the design space! This is a tricky design space and I appreciate help and ideas for new approaches.

Most handles are wrapped in higher-level types most of the time, like File and JoinHandle, which cover most use cases. So it isn't clear to me that a file-like vs. non-file-like distinction at the RawHandle/BorrowedHandle level would be worthwhile.

And, there are APIs like WaitForMultipleObjects, which work on both file-like things like standard input ("console input"), and non-file-like things like threads or mutexes. So there's even a risk that making a file-like vs. non-file-like distinction at this level could harm ergonomics in some use cases.

And on the other side, there are also APIs like SetNamedPipeHandleState, which operate on file-like handles, but only work on non-file file-like handles, so file-like vs. non-file-like isn't specific enough to prevent dynamic type errors.

I think, in terms of I/O, WaitForMultipleObjects and SetNamedPipeHandleState are of the same kind. The only "I/O like" type WaitForMultipleObjects is documented to work with is console input. SetNamedPipeHandleState only works for named pipes. So those functions already necessitate knowing the specific type of handle. Using just any handle is at best an error and at worst leads to the INVALID_HANDLE_VALUE problem (where it is a valid value in some contexts).

On the other hand, abstracting over raw handles to I/O devices is a common need (as I think recent events have inadvertently shown).

I do agree that BorrowedHandle and OwnedHandle are useful for adding ownership semantics to any random handle. But beyond that the only thing they can say about the handle they wrap is that it's not null. They can't otherwise say anything about validity because all other values may be used in some context or another. And may have different meanings depending on that context.

I guess an alternative to IoHandle types would be to have a TryAsFileRef trait that could be implemented for Stdout et al and File, etc.

I'm still unclear on what the need is here, or about how it's not already met with the currently-available APIs (not counting the recent bug, which is being fixed).

As I understand it, there is no UB and no broken encapsulation if one calls a Windows API function with a handle of a dynamic type that it doesn't support, or INVALID_HANDLE_VALUE or NULL. As such, the problem seems to be in the domain of ergonomics, rather than safety.

On Unix-like platforms, where I have more experience, "file" descriptors are used to represent all manner of non-file resources, such as processes, timers, signal delivery, and shared-memory objects, and there's not even a way to tell the difference from the raw handle value alone, and it doesn't seem to be a significant ergonomics problem in practice.

Sure, "everything is a file" as they say. If you use the wrong handle with the wrong function it should be an error. My main concern for Windows specifically is the magic values. Using a NULL can mean None, which may be an error or may simply change the behaviour of the function in unexpected ways (if the handle was optional). Using pseudo handles may have different meanings in different contexts.

I don't think the recent "bug" was actually a bug. I think it was a reasonable assertion that highlighted an API defect elsewhere. Or at least it highlighted a mismatch between user expectations and the actual reality. It seems many people were not even aware that Windows programs can not have stdio handles attached. And to be fair, the standard library does attempt to pretend they're always attached even when they're not.

My goal here is to try to find a way to bridge that perception gap. And I've now had some more time to consider your point of view so I'll write some more limited proposals in my next post.

Ok, I've had time to consider this some more. This is my current thinking on how the types should work. This should be very close to how they currently work, albeit with some tweaks.

Generic handles

OwnedHandle:

  • It must not be NULL, which is not a valid handle.
  • It must not be a pseudo handle, which cannot be dropped. [is this too strict? or should they simply not be dropped?]
  • We should document that the handle must be valid for use with CloseHandle.

BorrowedHandle:

  • It must not be NULL, which is not a valid handle.
  • It can be any pseudo handle.
  • We should document that if it's not a pseudo handle then the handle must be valid for use with CloseHandle. [EDIT: hm, that wording isn't quite right. I mean to say it should be a kind of handle that's used with CloseHandle, not necessarily that doing so is valid]
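The two sets of rules can be written down as checked constructors, purely for illustration. OwnedSketch and BorrowedSketch are hypothetical stand-ins for OwnedHandle and BorrowedHandle, the handle is modelled as an isize, and is_pseudo relies on the earlier statement that pseudo handles have the high bit set. The real constructors are unsafe and unchecked; this only spells out the documented invariants.

```rust
/// Hypothetical stand-in for a raw handle value.
type RawValue = isize;

/// Pseudo handles have the high bit set, i.e. they are negative when viewed
/// as a signed pointer-sized integer (this includes -1).
fn is_pseudo(raw: RawValue) -> bool {
    raw < 0
}

#[derive(Debug, PartialEq)]
struct OwnedSketch(RawValue);

#[derive(Debug, Clone, Copy, PartialEq)]
struct BorrowedSketch(RawValue);

impl OwnedSketch {
    /// Rejects NULL and pseudo handles, neither of which can be closed.
    fn new(raw: RawValue) -> Option<Self> {
        if raw == 0 || is_pseudo(raw) {
            None
        } else {
            Some(OwnedSketch(raw))
        }
    }
}

impl BorrowedSketch {
    /// Rejects only NULL; borrowing a pseudo handle is allowed.
    fn new(raw: RawValue) -> Option<Self> {
        if raw == 0 {
            None
        } else {
            Some(BorrowedSketch(raw))
        }
    }
}
```

Note that under these rules a BorrowedSketch can hold -1, which is why the borrowed type alone can't tell INVALID_HANDLE_VALUE apart from the current-process pseudo handle.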

Implement the AsHandle trait for all the std types it currently does, except Stdin, Stdout and Stderr.

Implement a new TryAsHandle trait for T: AsHandle and for Stdin, Stdout and Stderr.
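Here is a portable sketch of how the blanket impl and the stdio impls could coexist. The AsHandle below is a simplified stand-in that returns a raw isize rather than a BorrowedHandle, and MockFile, MockStdout and describe are invented for the example; the choice of ErrorKind::NotFound is also just an assumption.

```rust
use std::io;

/// Hypothetical stand-in for a raw handle value.
type RawValue = isize;
const INVALID_HANDLE_VALUE: RawValue = -1;

/// Simplified stand-in for the RFC's AsHandle: implementors always have
/// a handle.
trait AsHandle {
    fn as_handle(&self) -> RawValue;
}

/// Hypothetical fallible counterpart.
trait TryAsHandle {
    fn try_as_handle(&self) -> io::Result<RawValue>;
}

/// Everything that is AsHandle is trivially TryAsHandle.
impl<T: AsHandle> TryAsHandle for T {
    fn try_as_handle(&self) -> io::Result<RawValue> {
        Ok(self.as_handle())
    }
}

/// Mock file: always backed by a handle.
struct MockFile(RawValue);

impl AsHandle for MockFile {
    fn as_handle(&self) -> RawValue {
        self.0
    }
}

/// Mock stdout: stores whatever GetStdHandle would have returned.
struct MockStdout(RawValue);

// Deliberately no AsHandle impl, so this does not overlap the blanket impl.
impl TryAsHandle for MockStdout {
    fn try_as_handle(&self) -> io::Result<RawValue> {
        match self.0 {
            0 | INVALID_HANDLE_VALUE => {
                Err(io::Error::new(io::ErrorKind::NotFound, "no std handle attached"))
            }
            raw => Ok(raw),
        }
    }
}

/// Generic consumer that accepts files and stdio types alike.
fn describe<T: TryAsHandle>(source: &T) -> String {
    match source.try_as_handle() {
        Ok(raw) => format!("handle {:#x}", raw),
        Err(_) => String::from("no handle"),
    }
}
```

The direct impl for MockStdout only compiles because it does not also implement AsHandle, which mirrors the proposal of dropping AsHandle for Stdin, Stdout and Stderr.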

It would be up to third party wrappers if they want to enforce stronger guarantees than the core types provided by the standard library.

On Windows, a console handle (stdin) cannot be used with overlapped (async) I/O, so there are special functions for dealing with it (such as WaitForSingleObject + PeekConsoleInput + ReadConsoleInput). Also, if stdin is redirected to a pipe and the pipe is closed on the other side, none of the three functions mentioned above can handle this. This is very different from stdin on Unix, where the stdin file handle can be handled like a normal file handle. So it would be nice to somehow mark stdin as a special handle.

It seems many people were not even aware that Windows programs can not have stdio handles attached.

I expect this is true, however I'm unsure how to calibrate my intuition here. windows_subsystem = "windows" and NULL handles have been in Rust for years, and I can't find reports of it being a problem for anyone else. And the code that hit the problem was not typical user code.

As such, I don't know how to evaluate this. Making AsHandle be implemented on a different set of types from AsRawHandle makes it harder to port code from one trait to the other. And making AsHandle be implemented on a different set of types from AsFd on Unix-family platforms makes it less convenient to write Windows/Unix portability abstractions. Are those downsides worth the upside here? I don't know.

Rust's internal get_handle function looks like this:

pub fn get_handle(handle_id: c::DWORD) -> io::Result<c::HANDLE> {
    let handle = unsafe { c::GetStdHandle(handle_id) };
    if handle == c::INVALID_HANDLE_VALUE {
        Err(io::Error::last_os_error())
    } else if handle.is_null() {
        Err(io::Error::from_raw_os_error(c::ERROR_INVALID_HANDLE as i32))
    } else {
        Ok(handle)
    }
}

Whereas the external function we currently provide publicly to users is:

impl AsRawHandle for io::Stdin {
    fn as_raw_handle(&self) -> RawHandle {
        unsafe { c::GetStdHandle(c::STD_INPUT_HANDLE) as RawHandle }
    }
}
// ...and similar for `Stdout`, etc.

I do feel like if I were to write a public wrapper around GetStdHandle, the return value would always look more like the former than the latter.

And in general I'm even less keen on the idea of something called BorrowedHandle which represents a reference to a handle... except in this one weird case where what's called a handle may just be an error sentinel value or other artefact of the underlying function used to get it (edit: and worse, the sentinel value overlaps with an actual handle value). That doesn't sit well with me.

I don't think we should attempt to paper over platform differences when we're talking about something at the level of handles and file descriptors.

I sympathize that things don't sit right here. But some of it comes from the underlying platform, and some from code in std that is difficult to change for compatibility reasons, such as Stdio::to_handle, which is where the conversion to a sentinel that overlaps with an actual handle value happens today. This feels like more than I myself am prepared to take on here, especially as I'm still not clear whether this is a problem in practice, outside of my own patch.

That makes sense. So I'm still concerned that if we don't make corresponding changes to the existing AsRawHandle and/or Stdio, changing AsHandle like this could do more harm than good.

Ok, I think we're at the point where I should open a GitHub issue specifically for the stdio issue and see if there is any wider feedback. And link from the RFC tracker of course.