Why are the Read/Write traits so different?


#1

We have a method like this:

fn read(&mut self, buf: &mut [u8]) -> Result<usize>;

I would find it more intuitive if it was something like this:

struct Output {
    buf: [u8],
    count: usize,
}
fn read(&mut self) -> io::Result<Output>;

Why was it designed like this?

As a side note, we could also check the length of buf ourselves to see how many bytes were read, which would simplify the API to something like this:

fn read(&mut self) -> Result<[u8]>;

#2

What is the size of Output?

(Passing the buffer as a parameter permits the caller to control it.)


#3

It was designed like this so that you could reuse the same buffer across multiple calls to read. This avoids unnecessary extra heap allocations.
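
To make the reuse concrete, here is a minimal sketch. The function name `sum_bytes` and the 4-byte buffer size are made up for illustration. The buffer lives on the stack and is handed to every call to read, so nothing is heap-allocated:

```rust
use std::io::{self, Read};

/// Sums every byte from `reader`, reusing one small stack buffer for each call to `read`.
/// (`sum_bytes` is a hypothetical helper, not part of std.)
fn sum_bytes<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut buf = [0u8; 4]; // stack array; the caller picks the size and the storage
    let mut total = 0u64;
    loop {
        // `read` overwrites the front of `buf` and reports how many bytes are valid.
        let n = reader.read(&mut buf)?;
        if n == 0 {
            return Ok(total); // EOF
        }
        total += buf[..n].iter().map(|&b| u64::from(b)).sum::<u64>();
    }
}
```

A byte slice implements Read, so something like `sum_bytes(&[1u8, 2, 3, 4, 5][..])` is enough to try it out.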


#4

Because symmetry is not the only goal of API design. For Rust specifically, performance is another important goal. The current API does not allocate, while your interface would need an allocation, as in Result<Vec<u8>> (Result<[u8]> does not work).


#5

Why won’t Result<[u8]> work?


#6

What is the size of io::Result<[u8]>?


#7

(burntsushi is trying to get you to think carefully about the feasibility of your proposal. If you’re still a bit uncertain, have another look at https://doc.rust-lang.org/book/unsized-types.html )
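
A small sketch of the size question, in case it helps. The function name `return_type_sizes` is made up for illustration. A slice reference and a Vec both have a size known at compile time, while a bare [u8] does not, so it cannot be returned by value:

```rust
use std::io;
use std::mem::size_of;

/// Sizes in bytes of two types that can be returned by value.
/// (`return_type_sizes` is a hypothetical helper for illustration.)
fn return_type_sizes() -> (usize, usize) {
    // `size_of::<io::Result<[u8]>>()` does not compile: `[u8]` is unsized,
    // and `Result<T, E>` requires `T: Sized`.
    (
        size_of::<&[u8]>(),               // fat pointer: address + length
        size_of::<io::Result<Vec<u8>>>(), // sized, because Vec keeps its bytes on the heap
    )
}
```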


#8

I see. Thanks.


#9

Unless this was an equivalent of read_to_end, I now see that the API would have to accept another argument, one that specifies “how much to read at most”.
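
For what it's worth, a "read at most this much" variant of read_to_end can be sketched with what std already provides, using Read::take to cap the reader. The name `read_at_most` is made up for illustration:

```rust
use std::io::{self, Read};

/// Reads at most `limit` bytes from `reader` into a freshly allocated `Vec`.
/// (`read_at_most` is a hypothetical helper, not part of std.)
fn read_at_most<R: Read>(reader: R, limit: u64) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    // `take` wraps the reader so it reports EOF after `limit` bytes.
    reader.take(limit).read_to_end(&mut out)?;
    Ok(out)
}
```

Note that this still allocates a new Vec on every call.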


#10

It’s more than that though. Even if your signature was:

fn read(&mut self, len: usize) -> io::Result<Vec<u8>>

then you’d be forcing an allocation on the caller, which has significant implications for performance. For example, if I don’t need to hold the entire contents of my underlying reader in memory at once, then I can allocate a single buffer and process the input incrementally:

let mut buf = vec![0; 1024]; // process up to 1024 bytes at a time
loop {
    // `buf` keeps its full length; `read` fills at most `buf.len()` bytes
    let n = reader.read(&mut buf)?;
    if n == 0 { // EOF
        break;
    }
    let block = &buf[0..n];
    // do something with the bytes in `block` ...
}

If read always returned a Vec, then I wouldn’t be able to amortize the allocation of buf like in the above code. Instead, I’d be forced to create a new one for every call to read.
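
For contrast, here is a rough sketch of what the allocating signature would look like in use. Both `read_alloc` and `drain_allocating` are hypothetical names; `read_alloc` is emulated on top of the real read, and the counter only exists to make the number of allocated buffers visible:

```rust
use std::io::{self, Read};

/// Hypothetical allocating API: every call returns a fresh `Vec`.
fn read_alloc<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    let mut v = vec![0; len]; // new heap buffer on each call (when len > 0)
    let n = reader.read(&mut v)?;
    v.truncate(n); // keep only the bytes that were actually read
    Ok(v)
}

/// Drains `reader` through `read_alloc`; returns (bytes read, buffers created).
fn drain_allocating<R: Read>(mut reader: R, len: usize) -> io::Result<(usize, usize)> {
    let (mut bytes, mut buffers) = (0, 0);
    loop {
        let block = read_alloc(&mut reader, len)?;
        buffers += 1; // one buffer per call, including the final EOF call
        if block.is_empty() {
            return Ok((bytes, buffers));
        }
        bytes += block.len();
    }
}
```

The allocating version can be layered on top of the buffer-taking one, as above, but not the other way around without paying for the allocation.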