This has bitten me in the butt before and I ended up having to work around it.
I’ve run into a situation where I need to express to BufReader that I need at least X bytes in the buffer without consuming them. Currently, the only way to get BufReader to read more bytes from the underlying stream is to consume all the bytes in the buffer; this is a problem if you want more data but also want to keep what is already in the buffer.
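To make the limitation concrete, here is a minimal sketch of the current behavior. The `Trickle` reader and the `observe_fill_buf` function are hypothetical names made up for this illustration; `Trickle` just imitates a socket that delivers a few bytes per read.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Hypothetical reader that hands out at most `chunk` bytes per `read` call,
/// imitating a socket that delivers data in small pieces.
struct Trickle<'a> {
    data: &'a [u8],
    chunk: usize,
}

impl<'a> Read for Trickle<'a> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        let n = self.chunk.min(self.data.len()).min(out.len());
        out[..n].copy_from_slice(&self.data[..n]);
        self.data = &self.data[n..];
        Ok(n)
    }
}

/// Returns what two `fill_buf` calls see without consuming anything,
/// followed by what a third call sees after the buffer has been emptied.
fn observe_fill_buf() -> io::Result<(Vec<u8>, Vec<u8>, Vec<u8>)> {
    let source = Trickle { data: b"abcdefgh", chunk: 3 };
    let mut reader = BufReader::new(source);

    let first = reader.fill_buf()?.to_vec();
    // Nothing was consumed, so the same bytes come back and the source is not read.
    let second = reader.fill_buf()?.to_vec();

    reader.consume(first.len());
    // Only now, with the buffer empty, does BufReader go back to the source.
    let third = reader.fill_buf()?.to_vec();

    Ok((first, second, third))
}
```

The second call cannot be used to ask for "a bit more"; the only way forward is to consume everything that was returned.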
My use-case involves searching for specific byte sequences in HTTP streams:
The idea is that this Read implementation will yield bytes up until the boundary sequence, which signals the end of a string or byte-stream in the HTTP request. Then, .consume_boundary() must be called on it to move it to the start of the next field.
My problem arises when a partial read cuts off the next boundary. I can’t naively consume all the bytes in the buffer because they may or may not be part of the boundary, but I can’t get any more data unless the buffer is empty.
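As a rough illustration of why the tail of the buffer cannot simply be consumed, the sketch below computes how many leading bytes are definitely not part of the boundary. `safe_prefix_len` is a hypothetical helper written for this example and is not taken from the actual BoundaryReader code.

```rust
/// Hypothetical helper: how many leading bytes of `buf` can be yielded to the
/// caller without any risk that they belong to `boundary`.
fn safe_prefix_len(buf: &[u8], boundary: &[u8]) -> usize {
    // A complete boundary inside the buffer: everything before it is safe.
    // (`windows` panics on a size of 0, hence the guard.)
    if !boundary.is_empty() {
        if let Some(pos) = buf.windows(boundary.len()).position(|w| w == boundary) {
            return pos;
        }
    }

    // Otherwise, find the longest tail of `buf` that matches the start of the
    // boundary. Those bytes might be a boundary cut off by a partial read.
    let max = buf.len().min(boundary.len().saturating_sub(1));
    for k in (1..=max).rev() {
        if buf[buf.len() - k..] == boundary[..k] {
            return buf.len() - k;
        }
    }

    // No part of the boundary is in sight; the whole buffer is safe.
    buf.len()
}
```

With `buf = b"hello--bou"` and `boundary = b"--boundary"`, only the first 5 bytes are safe to yield. The remaining `--bou` has to stay in the buffer until more data arrives, and that is exactly the state in which BufReader refuses to read more.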
The only way I could fix this now is by reading bytes into a temporary buffer, but I’ve tried this before. I ended up going so far as to reimplement the functionality of BufReader as part of the BoundaryReader struct, with the additional methods I needed to make it work. The result was not pleasant to look at or work on.
I’m thinking the BufRead trait needs a method that lets the user ask for more data in the buffer without having to consume what is already there. Perhaps something like:
pub trait BufRead: Read {
    fn fill_buf_min(&mut self, min: usize) -> io::Result<&[u8]>;
}
Here, if the inner buffer does not contain at least min bytes, the implementation should perform another bulk read from the source. The user should still check if the returned buffer is long enough; if it isn’t, they can decide to retry by calling this method again.
Alternately, this can start life as an inherent method on BufReader. Then, if the functionality is desired widely enough, it can be moved to the BufRead trait.
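A minimal sketch of these semantics as an inherent method might look like the following. `MinBufReader` is a hypothetical standalone type used only for illustration; it is not the standard library's BufReader, and growing the buffer when `min` exceeds its capacity is an assumption of this sketch rather than part of the proposal.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for a BufReader with the proposed method.
struct MinBufReader<R> {
    inner: R,
    buf: Vec<u8>,
    pos: usize, // start of the unconsumed data
    end: usize, // end of the valid data
}

impl<R: Read> MinBufReader<R> {
    fn with_capacity(cap: usize, inner: R) -> Self {
        MinBufReader { inner, buf: vec![0; cap], pos: 0, end: 0 }
    }

    /// If fewer than `min` bytes are buffered, performs one more read from the
    /// source. The result may still be shorter than `min`; the caller checks.
    fn fill_buf_min(&mut self, min: usize) -> io::Result<&[u8]> {
        if self.end - self.pos < min {
            // Move the unconsumed bytes to the front so the free space is contiguous.
            self.buf.copy_within(self.pos..self.end, 0);
            self.end -= self.pos;
            self.pos = 0;
            // Assumption: grow the buffer if `min` does not fit in it.
            if self.buf.len() < min {
                self.buf.resize(min, 0);
            }
            let n = self.inner.read(&mut self.buf[self.end..])?;
            self.end += n;
        }
        Ok(&self.buf[self.pos..self.end])
    }

    fn consume(&mut self, amt: usize) {
        self.pos = (self.pos + amt).min(self.end);
    }
}

/// Shows a short first result followed by a successful retry.
fn demo() -> io::Result<(Vec<u8>, Vec<u8>)> {
    // `chain` never merges its two sources in one read, so the first read is short.
    let source = (&b"ab"[..]).chain(&b"cdef"[..]);
    let mut reader = MinBufReader::with_capacity(8, source);
    let short = reader.fill_buf_min(4)?.to_vec(); // b"ab": fewer than 4 bytes
    let full = reader.fill_buf_min(4)?.to_vec(); // b"abcdef": retry kept "ab"
    Ok((short, full))
}
```

The retry keeps the bytes from the first call and appends to them, which is the behavior the existing fill_buf cannot provide.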
Edit: I copied BufReader to my repo in order to implement this method on it. You can see the implementation (minus some unnecessary documentation) here:
After refactoring for the new semantics, the problem has been fixed, along with a couple other bugs.