Idea aboute BufReaderWriter

We have std::io::BufReader and std::io::BufWriter in the std. However, they cannot be easily composed to create a thing similar to BufWriter<BufReader<T>> for some arbitrary T: Read + Write, because BufReader<T> does not implement Write and BufWriter<T> does not implement Read. Further more, we cannot create a BufWriter<&mut T> and a BufReader<&mut T> at the same time since this violates the borrow check rules.

There are a few types in the std such as File and TcpStream that can be read and written through buffered readers and writers at the same time. Since &File and &TcpStream implement both Read and Write, we can create BufReader<&File> and BufWriter<&File> at the same time without violating borrow check rules. This approach introduces unnecessary unsafe code in certain circumstances, though. For example, if I want to put a BufReader<&File> and a BufWriter<&File>, which read and write into the same underlying File, into a single struct:

struct Foo {
    reader: BufReader<&'static File>,
    writer: BufWriter<&'static File>,
}

impl Foo {
    fn new(f: File) -> Self {
        let boxed_file = Box::leak(Box::new(f));
        Self {
            reader: BufReader::new(boxed_file),
            writer: BufReader::new(boxed_file),
        }
    }
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.writer.flush().unwrap();
        let boxed_file = *self.reader.get_ref();
        unsafe {
            Box::from_raw(boxed_file as *const File as *mut File);
        }
    }
}

I come up with 2 approaches to support buffered reader and writer:

First, we can add a new struct named BufRw or similar to the std, which behaves exactly as the simple combination of BufReader and BufWriter:

pub struct BufRw<T: Read + Write> { /* fields omitted */ }

impl<T: Read + Write> Read for BufRw<T> { /* omitted */ }
impl<T: Read + Write> Write for BufRw<T> { /* omitted */ }
impl<T: Read + Write> BufRead for BufRw<T> { /* omitted */ }
impl<T: Read + Write> Seek for BufRw<T> { /* omitted */ }

Second, we can let BufReader and BufWriter implement Write and Read, correspondingly:

impl<R: Write> Write for BufReader<R> { /* omitted */ }
impl<W: Read + Write> Read for BufWriter<W> { /* omitted */ }

So that BufReader and BufWriter can compose.

Do you think that supporting buffered reader + writer in std is a good idea? If so, which approach do you prefer?

3 Likes

While it makes perfect sense for TcpStream, BufRw<File> is a footgun. The Read impl and the Write impl of the File share same storage and cursor. The purpose of the Buf{Reader, Writer} is to hide actual amount of read/write from the user code, which can be hazardous when they interfere each other. For example, even if write!("foo{}bar", 123) succeeded the file may store bytes like fooxyz123bar, since the .read() call advanced the cursor while remaining writes are buffered.

You can use Arc to share ownerships. I believe it's better approach than the box-leak-and-reclaim approach as the only overhead it pays is to allocate only few byte larger memory. Also never use Box::leak() if you want to reclaim it at some point - that's Box::into_raw()'s job.

1 Like

Maybe we can make BufRw<File> flush the writer's buffer automatically before each read operation. Also, I'd argue that this problem is not specific to BufRw<File>. It's always a bad idea to access a file via BufReader / BufWriter / BufRw if you want precise control over how many bytes and exactly where you want to read from and write into the OS. In other words, the user should be informed that BufRw should only be used for reading and writing IO devices whose read end and write end do not interfere.

AFAIK, Arc<T> implements neither Read nor Write, so it's impossible to have BufReader<Arc<T>> and BufWriter<Arc<T>> for now.

Get it.

You normally want separate buffers for the input and output side of a TCP stream, since their content is unrelated. This is normally implemented in Rust by using the try_clone() method to separate the reader from the writer:

let reader = socket;
let writer = reader.try_clone()?;
let reader = BufReader::new(reader);
let writer = BufWriter::new(writer);

See http://github.com/pdx-cs-rust/net-15 for a worked example.

IMO std should just impl<T: Write> Write for BufReader<T> (and vice versa)

How would work? Surely it would not just forward the write call to T since it may not be in the state the user expects due to the BufReader overfetching.

Could we just instruct the user that it can easily mess up if the read end and write end of T interfere with each other and do it in your own risk? After all, you can consume a BufReader and get the underlying impl Read with the risk that the pre-fetched data loses.

The difference is that getting the inner reader is an explicit operation that's documented to be inadvisable. Implementing a trait however is generally considered to do what you expect. You'll rarely find people reading the documentation of trait implementations, so this has the potential to be a much bigger footgun.

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.