Pre-RFC: I/O Safety

sunfishcode · April 28, 2021, 12:00am

Summary

Close a hole in encapsulation boundaries in Rust by providing users of AsRawFd and related traits guarantees about their raw resource handles, by introducing a concept of I/O safety and a new IoSafe trait. Build on, and provide an explanation for, the from_raw_fd function being unsafe.

Motivation

Rust's standard library almost provides I/O safety, a guarantee that if one part of a program holds a raw handle privately, other parts cannot access it. FromRawFd::from_raw_fd is unsafe, which prevents users from doing things like File::from_raw_fd(7) in safe Rust and then doing I/O on a file descriptor that might be held privately elsewhere in the program.

However, there's a loophole. Many library APIs use AsRawFd/IntoRawFd to accept values to do I/O operations with:

pub fn do_some_io<FD: AsRawFd>(input: &FD) -> io::Result<()> {
    some_syscall(input.as_raw_fd())
}

AsRawFd doesn't restrict as_raw_fd's return value, so do_some_io can end up doing I/O on arbitrary RawFd values. One can even write do_some_io(&7), since RawFd itself implements AsRawFd.
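
As a minimal sketch of the loophole, assuming a Unix-like target: `fd_seen_by_callee` and `forge_in_safe_code` are hypothetical names, and the stand-in only reports the descriptor it was given instead of making a system call.

```rust
use std::os::unix::io::{AsRawFd, RawFd};

/// Hypothetical stand-in for `do_some_io`: instead of making a system call,
/// it reports which raw descriptor the call would have operated on.
pub fn fd_seen_by_callee<FD: AsRawFd>(input: &FD) -> RawFd {
    input.as_raw_fd()
}

/// Entirely safe code hands the callee an integer it made up.
pub fn forge_in_safe_code() -> RawFd {
    // `RawFd` implements `AsRawFd`, so a bare integer satisfies the bound.
    let forged: RawFd = 7;
    fd_seen_by_callee(&forged)
}
```

Nothing here needs `unsafe`, yet the callee has no way to tell the forged value apart from a descriptor obtained from a live `File`.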

This can cause programs to access the wrong resources, or even break encapsulation boundaries by creating aliases to raw handles held privately elsewhere, causing spooky action at a distance.

And in specialized circumstances, violating I/O safety could even lead to violating memory safety. For example, in theory it should be possible to make a safe wrapper around an mmap of a file descriptor created by Linux's memfd_create system call and pass &[u8]s to safe Rust, since it's an anonymous open file which other processes wouldn't be able to access. However, without I/O safety, and without permanently sealing the file, other code in the program could accidentally call write or ftruncate on the file descriptor, breaking the memory-safety invariants of &[u8].

This RFC introduces a path to gradually closing this loophole by introducing:

A new concept, I/O safety, to be documented in the standard library documentation.
A new trait, std::io::IoSafe.
New documentation for from_raw_fd/from_raw_handle/from_raw_socket explaining why they're unsafe in terms of I/O safety, addressing a question that has come up a few times.

Guide-level explanation

The I/O safety concept

Rust's standard library has low-level types, RawFd on Unix-like platforms, and RawHandle/RawSocket on Windows, which represent raw OS resource handles. These don't provide any behavior on their own, and just represent identifiers which can be passed to low-level OS APIs.

These raw handles can be thought of as raw pointers, with similar hazards. While it's safe to obtain a raw pointer, dereferencing a raw pointer could invoke undefined behavior if it isn't a valid pointer or if it outlives the lifetime of the memory it points to. Similarly, it's safe to obtain a raw handle, via AsRawFd::as_raw_fd and similar, but using it to do I/O could lead to corrupted output, lost or leaked input data, or violated encapsulation boundaries, if it isn't a valid handle or it's used after the close of its resource. And in both cases, the effects can be non-local, affecting otherwise unrelated parts of a program. Protection from raw pointer hazards is called memory safety, so protection from raw handle hazards is called I/O safety.

Rust's standard library also has high-level types such as File and TcpStream which are wrappers around these raw handles, providing high-level interfaces to OS APIs.

These high-level types also implement the traits FromRawFd on Unix-like platforms, and FromRawHandle/FromRawSocket on Windows, which provide functions which wrap a low-level value to produce a high-level value. These functions are unsafe, since they're unable to guarantee I/O safety. The type system doesn't constrain the handles passed in:

    use std::fs::File;
    use std::os::unix::io::{AsRawFd, FromRawFd};

    // Create a file.
    let file = File::open("data.txt")?;

    // Construct a `File` from an arbitrary integer value. This type checks,
    // however 7 may not identify a live resource at runtime, or it may
    // accidentally alias encapsulated raw handles elsewhere in the program. An
    // `unsafe` block acknowledges that it's the caller's responsibility to
    // avoid these hazards.
    let forged = unsafe { File::from_raw_fd(7) };

    // Obtain a copy of `file`'s inner raw handle.
    let raw_fd = file.as_raw_fd();

    // Close `file`.
    drop(file);

    // Open some unrelated file.
    let another = File::open("another.txt")?;

    // Further uses of `raw_fd`, which was `file`'s inner raw handle, would be
    // outside the lifetime the OS associated with it. This could lead to it
    // accidentally aliasing other otherwise encapsulated `File` instances,
    // such as `another`. Consequently, an `unsafe` block acknowledges that
    // it's the caller's responsibility to avoid these hazards.
    let dangling = unsafe { File::from_raw_fd(raw_fd) };

Callers must ensure that the value passed into from_raw_fd is explicitly returned from the OS, and that from_raw_fd's return value won't outlive the lifetime the OS associates with the handle.

I/O safety is new as an explicit concept, but it reflects common practices. Rust's std will require no changes to stable interfaces, beyond the introduction of a new trait and new impls for it. Initially, not all of the Rust ecosystem will support I/O safety though; adoption will be gradual.

The `IoSafe` trait

These high-level types also implement the traits AsRawFd/IntoRawFd on Unix-like platforms and AsRawHandle/AsRawSocket/IntoRawHandle/IntoRawSocket on Windows, providing ways to obtain the low-level value contained in a high-level value. APIs use these to accept any type containing a raw handle, such as in the do_some_io example in the motivation.

AsRaw* and IntoRaw* don't make any guarantees, so to add I/O safety, types will implement a new trait, IoSafe:

pub unsafe trait IoSafe {}

There are no required functions, so implementing it just takes one line, plus comments:

/// # Safety
///
/// `MyType` wraps a `std::fs::File` which handles the low-level details, and
/// doesn't have a way to reassign or independently drop it.
unsafe impl IoSafe for MyType {}

It requires unsafe, to require the code to explicitly commit to upholding I/O safety. With IoSafe, the do_some_io example should simply add a + IoSafe to provide I/O safety:

pub fn do_some_io<FD: AsRawFd + IoSafe>(input: &FD) -> io::Result<()> {
    some_syscall(input.as_raw_fd())
}
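
Putting the pieces together, a self-contained sketch might look like the following. Since IoSafe is not in std, it is declared locally here; the impl for `Stdin`, the `MyType` wrapper around it, and the `my_type` helper are assumptions made for illustration, and `do_some_io` just returns the descriptor rather than calling into the OS.

```rust
use std::io::{self, Stdin};
use std::os::unix::io::{AsRawFd, RawFd};

/// Local stand-in for the proposed `std::io::IoSafe`; not a real std item.
pub unsafe trait IoSafe {}

// Under the proposal, std itself would provide impls like this one.
// Safety (assumed for this sketch): `Stdin` only ever hands out the
// descriptor of the process's standard input.
unsafe impl IoSafe for Stdin {}

/// Hypothetical crate type that wraps a std type and offers no way to
/// close or replace the inner handle.
pub struct MyType {
    inner: Stdin,
}

impl AsRawFd for MyType {
    fn as_raw_fd(&self) -> RawFd {
        self.inner.as_raw_fd()
    }
}

// Safety: `MyType` delegates to `Stdin` and cannot reassign or
// independently drop it.
unsafe impl IoSafe for MyType {}

/// Hypothetical constructor, only here so the sketch can be exercised.
pub fn my_type() -> MyType {
    MyType { inner: io::stdin() }
}

pub fn do_some_io<FD: AsRawFd + IoSafe>(input: &FD) -> io::Result<RawFd> {
    // A real implementation would pass the descriptor to a system call.
    Ok(input.as_raw_fd())
}
```

With the bound in place, `do_some_io(&7)` is rejected at compile time, because the integer type behind RawFd has no IoSafe impl.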

Gradual adoption

I/O safety and IoSafe wouldn't need to be adopted immediately; adoption could be gradual:

First, std adds IoSafe with impls for all the relevant std types. This is a backwards-compatible change.
After that, crates could implement IoSafe for their own types. These changes would be small and semver-compatible, without special coordination.
Once the standard library and enough popular crates utilize IoSafe, crates could start to add + IoSafe bounds (or add unsafe), at their own pace. These would be semver-incompatible changes, though most users of APIs adding + IoSafe wouldn't need any changes.

Reference-level explanation

The I/O safety concept

In addition to the Rust language's memory safety, Rust's standard library also guarantees I/O safety. An I/O operation is valid if the raw handles (RawFd, RawHandle, and RawSocket) it operates on are values explicitly returned from the OS, and the operation occurs within the lifetime the OS associates with them. Rust code has I/O safety if it's not possible for that code to cause invalid I/O operations.

While some OSes document their file descriptor allocation algorithms, a handle value predicted with knowledge of these algorithms isn't considered "explicitly returned from the OS".

Functions accepting arbitrary raw I/O handle values (RawFd, RawHandle, or RawSocket) should be unsafe if they can lead to any I/O being performed on those handles through safe APIs.

Functions accepting types implementing AsRawFd/IntoRawFd/AsRawHandle/AsRawSocket/IntoRawHandle/IntoRawSocket should add a + IoSafe bound if they do I/O with the returned raw handle.

The `IoSafe` trait

Types implementing IoSafe guarantee that they uphold I/O safety. They must not make it possible to write a safe function which can perform invalid I/O operations, and:

A type implementing AsRaw* + IoSafe means its as_raw_* function returns a handle which is valid to use for the duration of the &self reference. If such types have methods to close or reassign the handle without dropping the whole object, they must document the conditions under which existing raw handle values remain valid to use.
A type implementing IntoRaw* + IoSafe means its into_raw_* function returns a handle which is valid to use at the point of the return from the call.

All standard library types implementing AsRawFd implement IoSafe, except RawFd.

Drawbacks

Crates with APIs that use file descriptors, such as nix and mio, would need to migrate to types implementing AsRawFd + IoSafe, use crates providing equivalent mechanisms such as unsafe-io, or change such functions to be unsafe.

Crates using AsRawFd or IntoRawFd to accept "any file-like type" or "any socket-like type", such as socket2's SockRef::from, would need to either add a + IoSafe bound or make these functions unsafe.

Rationale and alternatives

Handles as plain data

The main alternative would be to say that raw handles are plain data, with no concept of I/O safety and no inherent relationship to OS resource lifetimes. On Unix-like platforms at least, this wouldn't ever lead to memory unsafety or undefined behavior.

However, most Rust code doesn't interact with raw handles directly. This is a good thing, independently of this RFC, because resources ultimately do have lifetimes, so most Rust code will always be better off using higher-level types which manage these lifetimes automatically and which provide better ergonomics in many other respects. As such, the plain-data approach would at best make raw handles marginally more ergonomic for relatively uncommon use cases. This would be a small benefit, and may even be a downside, if it ends up encouraging people to write code that works with raw handles when they don't need to.

The plain-data approach also wouldn't need any code changes in any crates. The I/O safety approach will require changes to Rust code in crates such as socket2, nix, and mio which have APIs involving AsRawFd and RawFd, though the changes can be made gradually across the ecosystem rather than all at once.

And, the plain-data approach would keep the scope of unsafe limited to just memory safety and undefined behavior. Rust has drawn a careful line here and resisted using unsafe for describing arbitrary hazards. However, raw handles are like raw pointers into a separate address space; they can dangle or be computed in bogus ways. I/O safety is similar to memory safety, both in terms of addressing spooky-action-at-a-distance, and in terms of ownership being the main concept for robust abstractions, so it's natural to use similar safety concepts.

New types for `RawFd`/`RawHandle`/`RawSocket`

Some comments on rust-lang/rust#76969 suggest introducing new wrappers around the raw handles. Completely closing the safety loophole would also require designing new traits, since AsRaw* doesn't have a way to limit the lifetime of its return value. This RFC doesn't rule this out, but it would be a bigger change.

I/O safety but not `IoSafe`

The I/O safety concept doesn't depend on IoSafe being in std. Crates could continue to use unsafe_io::OwnsRaw, though that does involve adding a dependency.

Define `IoSafe` in terms of the object, not the reference

The reference-level-explanation explains IoSafe + AsRawFd as returning a handle valid to use for "the duration of the &self reference". This makes it similar to borrowing a reference to the handle, though it still uses a raw type which doesn't enforce the borrowing rules.

An alternative would be to define it in terms of the underlying object. Since it returns raw types, arguably it would be better to make it work more like slice::as_ptr and other functions which return raw pointers that aren't connected to reference lifetimes. If the concept of borrowing is desired, new types could be introduced, with better ergonomics, in a separate proposal.

Prior art

Most memory-safe programming languages have safe abstractions around raw handles. Most often, they simply avoid exposing the raw handles altogether, such as in C#, Java, and others. Making it unsafe to perform I/O through a given raw handle would let safe Rust have the same guarantees as those effectively provided by such languages.

The std::io::IoSafe trait comes from unsafe_io::OwnsRaw, and experience with this trait, including in some production use cases, has shaped this RFC.

Unresolved questions

Formalizing ownership

This RFC doesn't define a formal model for raw handle ownership and lifetimes. The rules for raw handles in this RFC are vague about their identity. What does it mean for a resource lifetime to be associated with a handle if the handle is just an integer type? Do all integer types with the same value share that association?

The Rust reference defines undefined behavior for memory in terms of LLVM's pointer aliasing rules; I/O could conceivably need a similar concept of handle aliasing rules. This doesn't seem necessary for present practical needs, but it could be explored in the future.

Future possibilities

Some possible future ideas that could build on this RFC include:

New wrapper types around RawFd/RawHandle/RawSocket, to improve the ergonomics of some common use cases. Such types may also provide portability features as well, abstracting over some of the Fd/Handle/Socket differences between platforms.
Higher-level abstractions built on IoSafe. Features like from_filelike and others in unsafe-io eliminate the need for unsafe in user code in some common use cases. posish uses this to provide safe interfaces for POSIX-like functionality without having unsafe in user code, such as in this wrapper around posix_fadvise.
Clippy lints warning about common I/O-unsafe patterns.
A formal model of ownership for raw handles. One could even imagine extending Miri to catch "use after close" and "use of bogus computed handle" bugs.
A fine-grained capability-based security model for Rust, built on the fact that, with this new guarantee, the high-level wrappers around raw handles are unforgeable in safe Rust.

Thanks

Thanks to Ralf Jung (@RalfJung) for leading me to my current understanding of this topic, and for encouraging and reviewing early drafts of this RFC!

cgwalters · April 28, 2021, 7:50pm

Thanks for posting this! Agree this is indeed a real problem, I have hit "double close" issues myself. Particularly in my case interacting with C libraries via FFI it can get unclear who owns the file descriptor and when. Particularly AsRawFd is definitely a trap. And double close issues are almost like use-after-free except for files, so other code can end up writing the wrong data to fds etc. As you say.

By the time you're passing a memfd to another process it should be sealed though. (Also btw there's a crate for this).

Anyways, I need to think about OwnsRaw a bit more, but I just wanted to say initially we should do something in this area for sure!

sunfishcode · April 28, 2021, 10:00pm

That's a good point! In my use cases for memfd_create, the file descriptor is not sealed or passed to another process, and is instead just encapsulated. But you're right that sealing the file descriptor would avoid this problem. I'll update my draft to mention that.

matklad · April 29, 2021, 10:31am

Is there a way to express this without using unsafe keyword?

We already have a second kind of safety, unwind safety: std::panic::UnwindSafe. It works without using unsafe keyword.

I feel rather strongly that keeping unsafe strictly for memory safety has a lot of value.

matklad · April 29, 2021, 10:40am

How much of a problem is this in practice? One difference between memory safety and IO safety is that we always mess up the former in memory-unsafe languages, but the latter seems to work well enough most of the time? Are there specific examples of bugs that this system would prevent?

I totally see how doing FFI can lead to double closes, but I don't immediately see why this RFC helps. When doing FFI, we often have double frees as well, despite memory safety being in there anyway.

Again, the analogy with panic safety is useful. I think that the current consensus is that panic safety is not a clear win -- often times, it's more annoying than helpful, and people just AssertUnwindSafe.

ckaran · April 29, 2021, 1:19pm

While I do see your point, I strongly prefer what @sunfishcode is trying to do with this proposal. Done correctly, it means that there is one less thing that we need to worry about during a code audit (or worry less about). Since auditing code can be hard, anything that simplifies, speeds up, or better yet, fully automates the task is a +1 in my book.

sunfishcode · April 29, 2021, 6:15pm

I'm open to suggestions, but I don't expect there is. I/O safety is similar to memory safety in that it's built on rules that apply at the lowest level of abstraction. These rules conceptually apply everywhere, even through FFI calls where Rust can't enforce them.

Most programs have a great many times more pointers and dereferences than handles and system calls, so they naturally get more visibility :-).

Use-after-close and double-close are well known. Rust today doesn't see a lot of them, but this may be because it's already mostly enforcing I/O safety, even if it doesn't call it that yet. File, TcpStream, and numerous other types in std and independent crates already do. The motivation here isn't specific bugs, but just to close a hole in a property that Rust in practice already mostly has.

We obviously can't statically enforce rules on FFI code :-). This RFC just says that if you have an FFI call that takes a raw handle, you need to explicitly take responsibility for proving that the code is using it safely.

In Rust, one of the big reasons memory safety is important is that it's the difference between a bug meaning "the program might do the wrong thing" and "it's difficult to bound the set of things the program might do". I/O safety is also in this category. Especially from the perspective of a crate that doesn't know what other resources may exist in programs it's used in, it's impossible to know what one might be corrupting, or exposing. My understanding of Rust's position on unwind safety is that it's not in this category (on its own).

And, unwind safety's usefulness or lack thereof is likely specific to the kinds of things people are doing after a catch_unwind. Ord and Eq are also "assertion" traits, used in different circumstances, and as far as I'm aware, people generally see these as useful (provided one keeps in mind that for floating point, they're only the messengers).

CAD97 · April 29, 2021, 10:36pm

s/difficult/impossible, and you're right.

Is there any OS where improper use of a file descriptor (or equivalent) causes unspecified results? I suppose file descriptor reuse is fairly close in practice, as your FD may be to a closed file or it may be to an arbitrary file in an arbitrary state.

(This indicates an interesting attack against Rust programs that do IO, where you replace an open FD with /proc/self/mem or similar.)

Rust's general mentality when it comes to the unsafe barrier is that unsafe guards against breaking invariants that could lead to Undefined Behavior in safe code, and only those invariants. Closing an FD and replacing it with the FD of the production database is safe in Rust's definition of safe, because the worst thing that could happen is the computer does exactly what you told it to do: overwrite the production database.

On the other hand, though, there is a very good argument for IO safety being protected by unsafe (conventionally, at least): raw FDs are global state, which Rust does consider unsafe, at least before putting a lock on it. Even if concurrent writes to an FD are "safe" in that "only" some of the writes may be lost or otherwise interleaved, I think there's a strong argument that unlocked access to FDs could be considered unsafe global state.

(Still, I think a better solution might just be a new trait rather than AsRawFd that is OS-independent, even if it provides OS-dependent functionality, that can be used in place of AsRawFd and only accepts the "IO safe" types as described in the OP.)

rpjohnst · April 30, 2021, 12:21am

File handles are fundamentally already synchronized in this sense, though- they have to be for the kernel to be able to do anything reasonable with them in response to untrusted userspace.

RalfJung · May 1, 2021, 10:09am

I like the idea of expressing this via a trait, but the current documentation of the trait leaves me quite puzzled -- it talks about some internal property of Self, but what I think you actually care about is a property of as_raw_fd. So maybe it should be stated like that? You later go into more detail in the "reference" section, but I feel like this needs to be in the doc comment.

I think the guarantee you want is something like

When this trait is implemented, as_raw_fd(&self) -> RawFd grants temporary access to a file descriptor that is I/O-safe to use for the duration of the lifetime of &self.

Note that this is slightly different from what you said:

A type implementing AsRaw* + OwnsRaw means its as_raw_* function returns a handle which is valid to use as long as the object passed to self is live. Such types may not have methods to close or reassign the handle without dropping the whole object.

Your statement does not follow the usual Rust "reborrowing" rules -- you are saying that after as_raw_fd, that FD may be used for the lifetime of the underlying object as opposed to the lifetime of the reference. This makes a difference for code like

let f: File = ...;
some_function(&f); // Imagine this calls `AsRawFd` and stores the result in some `static mut`
f.write(...);
some_other_function(); // is it legal for this function to access `f`?

I think the question in the last line should be answered with "no"; this is consistent with the rules for references.

I don't understand the point of this guarantee. As a client of some T: OwnsRaw, what can I do with this? I also think this is redundant -- no type with a safe constructor from RawFd can satisfy the other two constraints (assuming it implements the AsRawFd/IntoRawFd trait). So this is just confusing IMO and should be removed -- or recast as a (non-normative) note after the list of requirements, saying something like "Note that this implies that a type implementing OwnsRaw must not be safely constructible from a Raw*."

I am not particularly happy with the name OwnsRaw. (Part of) The meaning of "raw" is "not owned", so this is somewhat self-contradicting. Also the first question coming to my mind is "owns a raw what?". However, off the top of my head I cannot come up with anything better either...

I don't think we want handle aliasing rules. Aliasing rules are about UB; this RFC does not add any new UB. What it does is it makes the "underlying logic in which Rust types can express invariants" more powerful, it adds some new vocabulary -- Rust type invariants can (and do) already refer to the concept of "owning/borrowing a piece of memory"; now they can also be "owning/borrowing an I/O handle". This logic, however, is not described anywhere in the Rust documentation. (We proposed one possible such logic in the RustBelt work. Sadly, this work is not very accessible to non-researchers, so it is not at all clear how to even have a reasonable discussion about "officially" picking one possible logic as the one that Rust will use -- and given the requirements for a "logic of Rust invariants", I do not have good ideas for how to improve that situation.)

Btw, there is another aspect of I/O safety that is worth documenting: do "I/O references" (aka File Descriptors) follow the usual mutability rules? &File implements Write, so that type certainly does not -- but maybe other types want to do that? If yes, I think we would need to add AsMutRawFd and say that the result of as_raw_fd may only be used for read-only operations. But I am not sure if that would actually be useful.
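
To make the `&File` point concrete, here is a small sketch. `write_through_shared_refs` is a hypothetical helper, and `path` is assumed to be a scratch file the caller is happy to have created or truncated.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes through two shared references to one `File`, then reads the
/// result back. `path` is a caller-chosen scratch file (an assumption of
/// this sketch); it is created or truncated.
pub fn write_through_shared_refs(path: &Path) -> io::Result<Vec<u8>> {
    let file = File::create(path)?;

    // Both bindings are plain `&File`; neither is `&mut File`.
    let mut first: &File = &file;
    let mut second: &File = &file;

    // `Write` is implemented for `&File`, so each shared reference can write.
    first.write_all(b"a")?;
    second.write_all(b"b")?;

    // Close the file before reading its contents back.
    drop(file);
    fs::read(path)
}
```

Both references share one open file and one file position, so the second write lands after the first even though no `&mut File` was ever taken.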

Note that UnwindSafe cannot be used to do anything potentially memory-unsafe, such as the proposed memfd usecase. This problem is related to memory safety. So I think true unsafe is required here.

RalfJung · May 1, 2021, 10:16am

It might be worth figuring out the intended usecases of AsRawFd and friends from when those traits were added. With I/O safety, they become basically useless unless the type also implements OwnsRaw. Essentially what this RFC says is that AsRawFd was a mistake, there is no way to reasonably use it in generic code, and you need some unsafe trait to express the guarantee that generic code actually needs.

sunfishcode · May 1, 2021, 1:04pm

RalfJung:

Your statement does not follow the usual Rust "reborrowing" rules -- you are saying that after as_raw_fd, that FD may be used for the lifetime of the underlying object as opposed to the lifetime of the reference. This makes a difference for code like
let f: File = ...;
some_function(&f); // Imagine this calls `AsRawFd` and stores the result in some `static mut`
f.write(...);
some_other_function(); // is it legal for this function to access `f`?
I think the question in the last line should be answered with "no"; this is consistent with the rules for references.

RawFd corresponds to a raw pointer, rather than a reference/reborrow. as_raw_fd corresponds to functions like Vec::as_ptr, where the documentation talks about the vector itself, rather than &self. So in your example, the answer is yes, some_other_function() can access the stored handle from f.

Good point; I've now removed that requirement in my draft.

What would you think about the name IoSafe?

I think this is where the analogy between pointers and handles starts to break down. For example, a seemingly "readonly" operation like Read::read is actually mutating—it updates the "current position". Also, in general there is no expectation that mutable access to an external resource is exclusive access.

Yes, this is effectively saying it was a mistake that {As,Into}Raw* are safe traits, when from_raw_fd is unsafe. from_raw_fd was made unsafe here; my interpretation is that there was a desire for something like I/O safety, but the subtle consequences for {As,Into}Raw* weren't anticipated. I likely would have made the same mistake.

RalfJung · May 1, 2021, 1:39pm

I agree with the first 2 sentences and strongly disagree with your conclusion. Replace f.write by v.push and it becomes clear that some_other_function can not access the pointer any more if it was obtained by as_ptr!
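
A rough sketch of the v.push case, using only std: `grow_past_capacity` is a hypothetical helper that takes a pointer with as_ptr, pushes until the vector has to grow, and deliberately never dereferences the old pointer.

```rust
/// Pushes until the vector must grow, and reports the capacity before and
/// after. A pointer taken from `as_ptr` before the growth has to be treated
/// as invalid afterwards.
pub fn grow_past_capacity(v: &mut Vec<u8>) -> (usize, usize) {
    let before = v.capacity();
    let stale: *const u8 = v.as_ptr(); // obtaining the pointer is safe

    // Fill the spare capacity, then push once more so the buffer must grow.
    while v.len() <= before {
        v.push(0);
    }

    // Dereferencing `stale` here would require `unsafe`, and may be
    // undefined behavior: the mutation above ended the pointer's validity.
    let _ = stale;
    (before, v.capacity())
}
```

The Vec::as_ptr documentation states the same rule: modifying the vector may reallocate its buffer, which invalidates any pointers into it.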

Certainly an improvement over OwnsRaw.

I also like the trait proposed at the end of this comment. Basically, if we agree that AsRawFd was a mistake, maybe instead of hotfixing the trait with a marker trait, we should have proper replacement traits?

sunfishcode · May 1, 2021, 3:04pm

Right, so there's nothing inherently wrong with squirreling away a *const T in a static mut somewhere, and letting it outlive &self. It really is about the underlying object, and not the lifetime of the reference.

Agreed. I'll update the proposal.

I like the idea here. However, many use cases will still need the RawFd to pass to FFI functions, so it's unclear how ergonomic BorrowedFd would be in practice.

The unsafe-io crate has as_file_view() which returns a View and related functions which are similar, but provide a thing which dereferences to &File. Also relevant is from_filelike, mentioned above. These are automatically implemented for types that implement AsRawFd + OwnsRaw and IntoRawFd + OwnsRaw, and they work well in some real-world use cases. If this RFC is accepted, I could imagine adding such things to std in the future. So it's possible to build convenient and safe interfaces on top of the current AsRawFd and IntoRawFd, with just this one small fix.

RalfJung · May 1, 2021, 4:05pm

From the outside you have no way to tell, though. Here you are exploiting intricate knowledge of the implementation details of Vec. In general, any mutating operation of an object must be treated as invalidating all previously created pointers into that object. That's exactly why Rust references work the way they do.

IMO we should treat "borrowing of an FD" the exact same way we treat "borrowing a pointer", and bound everything by the lifetime -- this is more consistent, and easier to explain. It also means we can exactly characterize these guarantees using Rust types such as BorrowedFd, which is not possible with your proposal.

BorrowedFd would definitely implement AsRawFd, but this would make the ownership story clear: you are only allowed to use that raw FD for the duration that you have the BorrowedFd. It's like being given an &T and passing that on to some C code as a *const T.

Sure, I didn't say it was not possible. Basically, AsRawFd+IoSafe is the same as AsBorrowedFd. But IoSafe feels like a hack developed to rescue the broken/ambiguous AsRawFd trait whereas AsBorrowedFd is properly using types to express the level of ownership of the underlying FD.

sunfishcode · May 1, 2021, 8:46pm

Raw fds in FFI use cases can have entirely dynamic lifetimes, so while this would work in some cases, it wouldn't work in all. In other words, we could add a Vec::as_slice analog, but it wouldn't eliminate the need for a Vec::as_ptr analog.

I agree, but at the same time, if compatibility weren't a concern, I don't think BorrowedFd would be what I'd design. It'd be too specific to be portable to Windows or to have a natural interface, and not specific enough to add value in the main use case, FFI.

And, people will still need the RawFd, with a dynamic lifetime, for FFI. So they'd call BorrowedFd::as_raw_fd(), and then they'd need the same concept of I/O safety I'm proposing to say what they can do with the RawFd, and it'd need to be defined in terms of the underlying resource rather than in terms of a reference/borrow lifetime.

So while BorrowedFd is nice in that it exercises a nice tool in our conceptual toolbox, it's not clear that it would help with the problems at hand.

RalfJung · May 2, 2021, 9:53am

Sure, and data structures can easily also provide an as_ptr analog (in fact they already do, it's called as_raw_fd, they just need to document more precisely what happens). But the default should follow the usual Rust patterns and properly use lifetimes.

Traits like IoSafe describe the minimal guarantees that implementations have to provide. Particular types can always decide to provide stronger guarantees. But your proposal makes the minimal guarantees needlessly strong; it guarantees way more than clients like SockRef need. And it does so without precedent -- the entire point of this proposal, in my eyes, is to handle FD ownership more like memory ownership; currently, it fails to do so. We have no trait on the memory side that would be equivalent to your proposed guarantee for AsRawFd+IoSafe.

What is the problem on Windows? We just do what we did for AsRawFd and also have corresponding BorrowedSomething types (and corresponding traits) for Windows handles.

People can do FFI on memory with raw pointers without a special trait. The same would work here. So yes, they still need RawFd, but nobody needs the AsRawFd trait. I am not arguing against RawFd (that would be like arguing against raw pointers; clearly we need raw pointers). I am arguing against the AsRawFd trait -- it should not be hotfixed with an unsafe IoSafe marker trait, it should be deprecated. (Note that AsBorrowedFd is a safe trait, so this proposal entirely avoids having unsafe traits, which IMO is a big advantage.)

The problem with AsRawFd is that it is like a trait with the signature

trait AsPtr {
  fn as_ptr(&self) -> *const u8;
}

This trait is useless since for all we know, implementations will just always return NULL! So for a trait we need something that actually expresses, in its docs or ideally its type, the general flow of ownership. That's why we don't have an AsPtr trait. Instead we have traits like AsRef and Borrow, that do document the ownership flow.
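A minimal sketch of that point, using a hypothetical `Useless` type and a hypothetical `address_of` helper (neither is from any real library): the trait's signature is satisfied by an implementation that always returns a null pointer, so generic code learns nothing it could rely on.

```rust
// The trait from the text: its signature promises nothing about the pointer.
trait AsPtr {
    fn as_ptr(&self) -> *const u8;
}

// Hypothetical type used only for illustration.
struct Useless;

impl AsPtr for Useless {
    fn as_ptr(&self) -> *const u8 {
        // A perfectly legal implementation: the trait forbids nothing.
        std::ptr::null()
    }
}

// Generic code only sees `T: AsPtr`, so it cannot soundly dereference
// the result; all it can do is pass the address along.
fn address_of<T: AsPtr>(x: &T) -> *const u8 {
    x.as_ptr()
}
```

Because `address_of` has to work for `Useless` too, the trait alone gives it no basis for reading through the pointer.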

However, having as_ptr on a particular type still makes perfect sense; that type can then document what one may do with the resulting raw pointer. Similarly, as_raw_fd on a particular type makes perfect sense.

The FFI case is easily handled by equipping BorrowedFd<'a> with an as_raw_fd method. That method can then precisely document for how long the resulting FD may be used (namely, for the lifetime 'a), and now it is up to the code to ensure that this guarantee is upheld. No IoSafe trait is needed for this, just like we don't need a MemSafe trait to be allowed to convert an &T to *const T and use it for the lifetime of the reference. This works fine when unsafe code passes raw pointers over FFI; I claim it will work just as fine when unsafe code passes raw FDs over FFI.
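For comparison, here is a sketch of the memory-side pattern this paragraph refers to. `sum_bytes` is a hypothetical stand-in for a foreign function, written in Rust so that the example is self-contained. The raw pointer is derived from a live borrow and used only while that borrow lasts, and no marker trait is involved.

```rust
/// Hypothetical stand-in for a C function reached over FFI.
///
/// # Safety
/// `ptr` must be valid for reads of `len` bytes for the duration of the call.
unsafe fn sum_bytes(ptr: *const u8, len: usize) -> u32 {
    let mut total = 0u32;
    for i in 0..len {
        // SAFETY: the caller guarantees `ptr..ptr + len` is readable.
        total += u32::from(unsafe { *ptr.add(i) });
    }
    total
}

/// Safe wrapper: the borrow `data` outlives the call, so the raw pointer
/// derived from it is only ever used within the reference's lifetime.
pub fn checksum(data: &[u8]) -> u32 {
    // SAFETY: `data.as_ptr()` is valid for `data.len()` bytes while `data` is borrowed.
    unsafe { sum_bytes(data.as_ptr(), data.len()) }
}
```

The obligation lives in the documentation of `sum_bytes` and in the `unsafe` block of the wrapper, which is the arrangement proposed here for raw FDs.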

So yes we do need the concept of I/O safety (I never said we didn't, so I am not sure why you bring that up). What we do not need is the IoSafe trait. BorrowedFd::as_raw_fd is a specific concrete method that can be documented to provide I/O safety guarantees without the need for an IoSafe trait.

RalfJung · May 2, 2021, 10:12am

Perfect World

So, to summarize my thoughts of the ideal end state (that we cannot reach because of backwards compatibility): there is no AsRawFd trait, but each type that currently implements AsRawFd has an inherent method as_raw_fd and documents for how long it is I/O-safe to use the resulting FD. This could be "for the lifetime of the reference passed to as_raw_fd" or "for the lifetime of the File object" or whatever makes sense. (RawFd itself would have no such method as it would be silly... but if it did, the documentation would say that the resulting FD is in general not I/O-safe to use.)

To also support code that abstracts over this kind of stuff, we have a BorrowedFd<'a> type describing an FD that is I/O-safe to use for lifetime 'a (and so that's what its as_raw_fd docs say), with a corresponding (safe!) AsBorrowedFd trait. This corresponds to &'a T and AsRef (and ref-to-raw-ptr casts), so given that we have ample evidence that this works well in the space of memory safety, including when interacting with C/C++ libraries via FFI, I am confident the same pattern will work equally well in the realm of I/O safety.
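One way the types described above might look, as a rough sketch rather than a worked-out design. The names `BorrowedFd` and `AsBorrowedFd` come from this thread. The constructor `from_raw`, the field layout, and the helper `fd_number` are assumptions made for illustration, and the example is Unix-only.

```rust
use std::fs::File;
use std::marker::PhantomData;
use std::os::unix::io::{AsRawFd, RawFd};

/// Sketch: a file descriptor that is I/O-safe to use for lifetime `'a`.
#[derive(Clone, Copy, Debug)]
pub struct BorrowedFd<'a> {
    raw: RawFd,
    // Ties this value to the borrow it was created from, like `&'a T`.
    _borrow: PhantomData<&'a ()>,
}

impl<'a> BorrowedFd<'a> {
    /// Hypothetical constructor.
    ///
    /// # Safety
    /// `raw` must remain open for all of `'a`.
    pub unsafe fn from_raw(raw: RawFd) -> Self {
        BorrowedFd { raw, _borrow: PhantomData }
    }

    /// The returned fd may be used for I/O only for the duration of `'a`.
    pub fn as_raw_fd(&self) -> RawFd {
        self.raw
    }
}

/// A safe trait: the return type, not an `unsafe` marker, carries the guarantee.
pub trait AsBorrowedFd {
    fn as_borrowed_fd(&self) -> BorrowedFd<'_>;
}

impl AsBorrowedFd for File {
    fn as_borrowed_fd(&self) -> BorrowedFd<'_> {
        // SAFETY: `self` owns the fd, and the result cannot outlive `&self`.
        unsafe { BorrowedFd::from_raw(AsRawFd::as_raw_fd(self)) }
    }
}

/// Generic code abstracts over the safe trait instead of `AsRawFd`.
pub fn fd_number<T: AsBorrowedFd>(source: &T) -> RawFd {
    source.as_borrowed_fd().as_raw_fd()
}
```

In this sketch the only `unsafe` sits in the constructor and the `File` impl. Code that merely consumes `AsBorrowedFd` stays entirely safe, and the borrow checker rejects a `BorrowedFd` that outlives the `File` it came from.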

All of this generalizes from FDs to Windows handles.

Real world

Now, reality isn't perfect, but we can still see how close we can get to this state.

The "radical" approach would be to deprecate AsRawFd and then basically directly implement the "perfect world".

A more "incremental" approach might be to instead introduce an IoSafe trait and say that AsRawFd+IoSafe is exactly the same as AsBorrowedFd except that we don't have a type like BorrowedFd<'a> making this explicit, so we need to use an unsafe trait instead. We just say in words that with AsRawFd+IoSafe, calling as_raw_fd returns an FD that has all the properties of a BorrowedFd<'a>. This is very close to the currently proposed RFC, except for the point about the lifetime of the FD that we have been discussing -- and I hope with this framing it now also becomes clear why I feel strongly that AsRawFd+IoSafe should only guarantee the returned FD to be valid for lifetime 'a.
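A sketch of what this incremental variant could look like. The `IoSafe` name is from the proposal. The contract spelled out in the doc comment is the borrow-scoped reading argued for here, not the RFC's wording, and `borrowed_raw_fd` is a hypothetical helper. Unix-only.

```rust
use std::fs::File;
use std::os::unix::io::{AsRawFd, RawFd};

/// Sketch of the marker trait. Contract assumed for this example: for
/// `x: &'a T`, `x.as_raw_fd()` returns an fd that stays open for `'a`.
///
/// # Safety
/// Implementors must uphold that contract; the compiler cannot check it.
pub unsafe trait IoSafe {}

// SAFETY: a `File` keeps its fd open for as long as it is borrowed.
unsafe impl IoSafe for File {}

/// With both bounds, generic code gets the guarantee "in words" only:
/// nothing in the `RawFd` return type records the lifetime.
pub fn borrowed_raw_fd<T: AsRawFd + IoSafe>(source: &T) -> RawFd {
    source.as_raw_fd()
}

// `borrowed_raw_fd(&7)` is rejected at compile time in this sketch, because
// `RawFd` implements `AsRawFd` but has no `IoSafe` impl here.
```

Compared with a typed `BorrowedFd<'a>`, the guarantee in this version lives in the trait's documentation, which is why the trait has to be `unsafe`.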

sunfishcode · May 2, 2021, 2:39pm

My primary goal here is a path to being able to say "safe Rust cannot do I/O on forged or closed raw fds".

A borrowing discipline for raw fds is an interesting idea, and I don't wish to preclude it.

However, I interpret your Real world proposal as saying that we should do a borrowing discipline instead of pursuing my primary goal here. That version of AsRawFd + IoSafe would imply the borrowing discipline, which would be ok for many use cases, but not all. That means AsRawFd would need to remain, unchanged, and my primary goal here would not be achieved. If I've misunderstood, please correct me.

At present, the best idea I have for achieving my primary goal here is to continue with IoSafe in the form of my proposal (open to suggestions for improving it, of course), and to say that borrowing can be added in a separate RFC, as a layer on top. This is analogous to how Rust has basic rules for what you can do with raw pointers, and a borrowing system as a distinct layer on top. As a secondary consideration, I also believe that the design for this separate layer should consider the needs of I/O use cases specifically, which have some differences from the needs of memory use cases.

I'm open to other approaches, provided they have a path to being able to say "safe Rust cannot do I/O on forged or closed raw fds".

RalfJung · May 2, 2021, 3:23pm

AsRawFd does not exist in my "ideal world" situation, which is approximated to varying degrees of elegance by my real-world proposals. So I don't understand why you say "it needs to remain" -- of course we cannot actually remove it due to backwards compatibility, but I think this trait is a lost cause, and there is literally nothing we can do for code like

fn foo<T: AsRawFd>(x: &T) {
  let fd = x.as_raw_fd();
  ...
}

This is true in both of my "real world" proposals, and in your proposal. So I am very confused by your statement.

I emphasize again that T::as_raw_fd() is not a problem -- for any concrete T, this function is fine, just like T::as_ptr() is fine. T::as_ptr() doesn't need any accompanying trait to be useful, it needs documentation. The documentation of T::as_raw_fd() can say explicitly for each individual T what one may I/O-safely do with the resulting file descriptor. A problem arises once code abstracts over T via the AsRawFd trait. It is only that code that the RFC needs to be concerned with.

The only difference between my "incremental" proposal and yours is the exact guarantees that a function like this can rely on:

fn foo<'a, T: AsRawFd + IoSafe>(x: &'a T) {
  let fd = x.as_raw_fd();
  // Can we use `fd` only for the duration of `'a`,
  // or "until `*x` is destroyed"?
}

In a function like this, the two options are not even meaningfully different, since foo has to assume that *x is destroyed immediately after foo returns. But in other situations there might be a difference.

That is certainly not what I intended, so either you misunderstood or I am missing an inadvertent consequence of my proposal. What I am suggesting is to establish a borrowing discipline in order to pursue that goal.

We agree with that end goal. All of my proposals are on that path. I have yet to understand why you don't like them.

Do you have a concrete usage scenario in mind where AsBorrowedFd (or, equivalently, AsRawFd+RalfIoSafe as "AsBorrowedFd without type safety") would be insufficient? We do not have any trait on the "memory safety" side that corresponds to AsRawFd+DanIoSafe, which makes it hard for me to imagine a situation where this guarantee is needed. We do have a trait on the memory safety side that corresponds to AsBorrowedFd (or, equivalently, AsRawFd+RalfIoSafe); that trait is called AsRef. That's why I think AsBorrowedFd is a good abstraction -- it is already battle-tested on the memory safety side.

You are treading new ground, deviating from the recipe Rust uses to handle memory safety, and I do not understand why you are doing this, which problem this is solving. I feel very strongly that such a deviation needs at least solid motivation; "by default" we should err on the side of doing what we already have experience with and hence know will work: something like the AsRef/Borrow and ToOwned traits.

