Can std::fs::File statically track permission?

For example, when prototyping I made a mistake similar to the following:

let mut file = File::open("foo.txt")?;

file.write_all(&1234_u32.to_be_bytes())?;
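To make the failure mode concrete, here is a minimal, self-contained sketch of the same mistake. The helper names `write_to_read_only` and `demo` and the scratch-file name are made up for illustration. The point is only that the write type-checks, because `File` always implements `Write`, and then fails when it runs.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Reproduces the mistake: `File::open` yields a read-only handle, yet the
/// call to `write_all` compiles because `File` always implements `Write`.
fn write_to_read_only(path: &Path) -> io::Result<()> {
    let mut file = File::open(path)?; // opened read-only
    file.write_all(&1234_u32.to_be_bytes()) // compiles, but errors at run time
}

/// Creates a scratch file (hypothetical name), attempts the mistaken write,
/// and reports whether the write failed.
fn demo() -> io::Result<bool> {
    let path = std::env::temp_dir().join(format!("ro_demo_{}.txt", std::process::id()));
    fs::write(&path, b"hello")?;
    let result = write_to_read_only(&path);
    fs::remove_file(&path)?;
    Ok(result.is_err())
}
```
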

Assuming backward compatibility were not a concern, File::open could return a different struct, ReadOnlyFile, which doesn't support writing. This mistake would then be detectable at compile time instead of at run time. (Or it could return the same struct, but with a generic parameter that indicates read-only mode.)

I know this is probably not worth it, due to the complexity cost. Maybe it is possible to have a Clippy lint for this?

There's an obstacle right at the beginning here: There is no way to make OpenOptions::open() return a different type depending on the runtime value of self.

But that's actually how we could get around the backward compatibility problems: Define a new set of File-like types, for concreteness let's call them ReadableFile, WritableFile, and UpdatableFile. Have ReadableFile implement io::Read but not io::Write, and so on. Pair these with a set of new file-opening functions that cover all the sensible ways to open a file and return the appropriate member of the set. I think this is an exhaustive set of combinations of the basic Unix open(2) flags that make sense:

new_fs::[function]   returns         Unix open(2) flags
read                 ReadableFile    O_RDONLY
rewrite              WritableFile    O_WRONLY | O_TRUNC
append               WritableFile    O_WRONLY | O_APPEND
update               UpdatableFile   O_RDWR
create_new           WritableFile    O_WRONLY | O_CREAT | O_EXCL
create_or_rewrite    WritableFile    O_WRONLY | O_CREAT | O_TRUNC
create_or_append     WritableFile    O_WRONLY | O_CREAT | O_APPEND
create_or_update     UpdatableFile   O_RDWR | O_CREAT

Each of these would be shorthand for new_fs::OpenOptions::new().[function](path); we'd keep OpenOptions around strictly for the more exotic O_ flags and the mode argument, and their equivalents on non-Unix platforms.
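A rough sketch of what a slice of such a crate could look like, built on today's std::fs::OpenOptions. Only three of the factory functions are shown, UpdatableFile is left out for brevity, and all of the names come from the proposal above rather than from any existing library.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::path::Path;

/// Hypothetical read-only handle: implements `Read` but not `Write`.
pub struct ReadableFile(File);

/// Hypothetical write-only handle: implements `Write` but not `Read`.
pub struct WritableFile(File);

impl Read for ReadableFile {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.0.read(buf)
    }
}

impl Write for WritableFile {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0.write(buf)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.0.flush()
    }
}

/// Corresponds to O_RDONLY.
pub fn read(path: impl AsRef<Path>) -> io::Result<ReadableFile> {
    OpenOptions::new().read(true).open(path).map(ReadableFile)
}

/// Corresponds to O_WRONLY | O_CREAT | O_TRUNC.
pub fn create_or_rewrite(path: impl AsRef<Path>) -> io::Result<WritableFile> {
    OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)
        .map(WritableFile)
}

/// Corresponds to O_WRONLY | O_CREAT | O_APPEND.
pub fn create_or_append(path: impl AsRef<Path>) -> io::Result<WritableFile> {
    OpenOptions::new()
        .append(true)
        .create(true)
        .open(path)
        .map(WritableFile)
}
```

With these types, calling `write_all` on the result of `read(...)` is rejected by the compiler, because `ReadableFile` has no `Write` impl.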

One would also like to expose "is this a seekable file?" in the type system. This is independent of how the file is opened, except that opening a file for appending means it's definitely not seekable. So we would need another three types, SeekableReadableFile etc. distinguished from the unqualified ones by the fact that they implement io::Seek. Whether an OS-level file handle is seekable depends on exactly what you opened, and that can't be put into the type system, but whether the program needs the handle to be seekable can be put into the type system. It's just another batch of factory functions.

new_fs::[function]                returns                 Unix open(2) flags
read_random_access                SeekableReadableFile    O_RDONLY
rewrite_random_access             SeekableWritableFile    O_WRONLY | O_TRUNC
update_random_access              SeekableUpdatableFile   O_RDWR
create_or_rewrite_random_access   SeekableWritableFile    O_WRONLY | O_CREAT | O_TRUNC
create_or_update_random_access    SeekableUpdatableFile   O_RDWR | O_CREAT

These would have to fail if they discover that the thing you opened is not seekable. Fortunately, lseek(h, 0, SEEK_CUR) fails when applied to a non-seekable handle h but has no other side effects, so that's easy to implement.
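The probe could be sketched roughly like this. `SeekableFile` and `try_from_file` are hypothetical names. The sketch relies on `Seek::stream_position`, which as far as I know ends up as an `lseek(fd, 0, SEEK_CUR)` call for a `File` on Unix; treat that mapping as an assumption.

```rust
use std::fs::File;
use std::io::{self, Seek};

/// Hypothetical wrapper that can only be constructed from a handle that has
/// passed the seekability probe.
pub struct SeekableFile(File);

impl SeekableFile {
    /// Asks for the current position without moving it. On a non-seekable
    /// handle (a pipe, for instance) this returns an error, and the wrapper
    /// is never constructed.
    pub fn try_from_file(mut file: File) -> io::Result<Self> {
        file.stream_position()?;
        Ok(SeekableFile(file))
    }
}

impl Seek for SeekableFile {
    fn seek(&mut self, pos: io::SeekFrom) -> io::Result<u64> {
        self.0.seek(pos)
    }
}
```
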

(The append functions never give you a seekable, as I mentioned earlier, and create_new might as well always give you a seekable because the fact that you just created it means it's definitely a "regular file" and therefore seekable. So that does cut down on the combinatorial explosion here a bit.)

This is quite a bit of new complexity, and we'd have to think carefully about all the names and what's left for OpenOptions to do and the interaction with the rest of std::io. But I would encourage you to try prototyping it in a crate anyway. Nothing here requires tight integration with the core language, and it could turn out to be worth the hassle of a changeover.


Sure would be nice if there was a way to not need so many new top-level factory functions, huh? Well, maybe there is. We could always go through OpenOptions, and have it take a type argument that specifies what kind of File we want to get in the end. Let's rename it to just Open, for brevity.

let fh: fs::SeekableReadableFile = fs::Open::new()
    .custom_flags(libc::O_DIRECT)  // for example
    .read(pathname);

The difference between read and read_random_access is handled by using (implicitly) an Open<SeekableReadableFile> instead of an Open<ReadableFile>.

I'm not sure this is actually better. I might need to try writing a bunch of code that uses both versions of the API to get a proper sense of what works more smoothly.
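For anyone who wants to experiment, here is one way the type-directed builder might be wired up. The trait `FromReadHandle` is something I made up to let the requested return type decide whether the seekability check runs; `custom_flags` is left out because it is Unix-specific. This is a sketch under those assumptions, not a worked-out design.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom};
use std::marker::PhantomData;
use std::path::Path;

pub struct ReadableFile(File);
pub struct SeekableReadableFile(File);

/// Hypothetical trait: builds a typed handle from a `File` opened for reading.
pub trait FromReadHandle: Sized {
    fn from_read_handle(file: File) -> io::Result<Self>;
}

impl FromReadHandle for ReadableFile {
    fn from_read_handle(file: File) -> io::Result<Self> {
        Ok(ReadableFile(file))
    }
}

impl FromReadHandle for SeekableReadableFile {
    fn from_read_handle(mut file: File) -> io::Result<Self> {
        file.stream_position()?; // errors on non-seekable handles
        Ok(SeekableReadableFile(file))
    }
}

impl Read for ReadableFile {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.0.read(buf)
    }
}

impl Read for SeekableReadableFile {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.0.read(buf)
    }
}

impl Seek for SeekableReadableFile {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        self.0.seek(pos)
    }
}

/// Builder whose type parameter names the kind of handle to produce.
pub struct Open<T> {
    options: OpenOptions,
    _target: PhantomData<T>,
}

impl<T: FromReadHandle> Open<T> {
    pub fn new() -> Self {
        Open { options: OpenOptions::new(), _target: PhantomData }
    }

    /// Opens for reading; `T` is inferred from the type the caller asks for.
    pub fn read(mut self, path: impl AsRef<Path>) -> io::Result<T> {
        let file = self.options.read(true).open(path)?;
        T::from_read_handle(file)
    }
}
```

Writing `let fh: SeekableReadableFile = Open::new().read(path)?;` then selects the seekable variant purely through the annotation on `fh`.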


I think the best fix might be just to make impl Trait work for locals.

If you could just

let mut file: impl Read = File::open("foo.txt")?;

then that might be a nice way to "give up" the extra File things -- kinda like casting to an interface in something like Java.
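Until something like that exists, a similar effect is available today by going through a small helper that uses return-position impl Trait. The helper name `open_for_reading` is made up for this sketch.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Opens `path` and hides the concrete `File` behind `impl Read`, so callers
/// can only use the reading half of the API.
fn open_for_reading(path: impl AsRef<Path>) -> io::Result<impl Read> {
    File::open(path)
}
```

A caller can use `read_to_string` on the result, but `write_all` no longer compiles, since the opaque type only promises `Read`. The cost is that everything else on `File` (metadata, seeking, and so on) is hidden as well.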


You could "just" lift the options to a type-level enum/bitset and have a single generic File<Caps> or File<R, W, U, S> or whatever.


I don't see how that solves the problem of not being able to return a different type depending on the value of the OpenOptions...

Well, the OpenOptions equivalent would use the typestate pattern:

File::options() // returns OpenOptions<No, No> or whatever default
    .read() // returns OpenOptions<Yes, No>
    .write() // returns OpenOptions<Yes, Yes>
    .open("foo.txt") // returns File<Yes, Yes>

where

impl<W> Read for File<Yes, W> { ... }
impl<R> Write for File<R, Yes> { ... }

and so on.
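Filled in as compilable code, the typestate version might look roughly like the sketch below. The marker types `Yes` and `No`, the wrapper layout, and the restriction to just two capabilities (read and write) are assumptions made for illustration.

```rust
use std::fs::{self, File as StdFile};
use std::io::{self, Read, Write};
use std::marker::PhantomData;
use std::path::Path;

/// Type-level flags recording whether a capability was requested.
pub struct Yes;
pub struct No;

pub struct OpenOptions<R, W> {
    inner: fs::OpenOptions,
    _caps: PhantomData<(R, W)>,
}

pub struct File<R, W> {
    inner: StdFile,
    _caps: PhantomData<(R, W)>,
}

impl File<No, No> {
    /// Starts with no capabilities requested.
    pub fn options() -> OpenOptions<No, No> {
        OpenOptions { inner: fs::OpenOptions::new(), _caps: PhantomData }
    }
}

impl<R, W> OpenOptions<R, W> {
    /// Requests read access and records it in the type.
    pub fn read(mut self) -> OpenOptions<Yes, W> {
        self.inner.read(true);
        OpenOptions { inner: self.inner, _caps: PhantomData }
    }

    /// Requests write access and records it in the type.
    pub fn write(mut self) -> OpenOptions<R, Yes> {
        self.inner.write(true);
        OpenOptions { inner: self.inner, _caps: PhantomData }
    }

    /// Opens the file; the capabilities carry over to the `File` type.
    pub fn open(self, path: impl AsRef<Path>) -> io::Result<File<R, W>> {
        let inner = self.inner.open(path)?;
        Ok(File { inner, _caps: PhantomData })
    }
}

impl<W> Read for File<Yes, W> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.inner.read(buf)
    }
}

impl<R> Write for File<R, Yes> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.inner.write(buf)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}
```

Here `File::options().read().open(p)?` yields a `File<Yes, No>`, so calling `write_all` on it is a compile error rather than a run-time one.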


The new File::open_buffered and File::create_buffered are statically read-only or write-only, as determined by the direction of the buffers they are wrapped in. In either case, though, you could still call into_inner() to get the underlying File without a typed direction.


Even if a new API is created, the old one cannot go away, for compatibility reasons.

So is it worth the effort, churn and maintenance? I haven't really come across any bugs related to this in my programs, but that doesn't mean they don't exist in other types of programs. Some concrete examples of real bugs from real code bases that could not have happened with an approach like this would be helpful as motivation.

But even if this were a clean-slate design, there are enough weird file-like things on Unix (I can't speak about other platforms) that you would likely want or need an escape hatch without type safety. (What happens when you seal a memfd at runtime on Linux? What about FD things that are neither files nor sockets (eventfd, timerfd, inotify, epoll, etc.)? What about O_PATH?[1])

The same goes for use cases where you only know the file opening mode at runtime (when implementing a shell or a scripting language, for example).


  1. Okay, that one probably shouldn't be a File at all, just an OwnedFd.


We could mark the old API as deprecated, but only if there is a nice solution that doesn't increase the complexity of the language, which could be very hard in this case.

Even if it is unlikely to make it to production, I think disallowing this or adding a lint could increase the reliability and learnability of the language. People can make innocent mistakes like this, which may go undetected due to lack of testing and end up in production. Shrinking the surface of possible mistakes by detecting them at compile time instead of at run time is a neat advantage for Rust (compared with Python, for example). I opened this issue because I made this innocent mistake while quickly prototyping, so this is a sample of size one; maybe if other people make the same mistake later, they can discuss it here. But in general, I suspect that few people would report such issues rather than simply moving on, as it is a very minor paper cut in my opinion.

Yes, we would need an escape hatch. It doesn't have to be unsafe, as it simply moves detection from compile time to run time.