Reader, stablization, errors and iterators


#1

I think we can all agree the current Reader and Writer traits are clunky. Not only are they clunky, but have a lot of missed potential. The I/O reform RFC doesn’t really help, and makes some parts even uglier by introducing non-atomic error types everywhere instead of simply removing the offending methods (as they really shouldn’t exist, I’ll go into why later).

Reader also has a lot of overlap with a Iterator<Item = u8>, something which is currently completely ignored aside from a small helper function. Instead, I propose the following:

Amend Iterator with a read() method like so:

pub trait Iterator {
    type Item;
    
    fn next(&mut self) -> Option<Self::Item>;
    fn read(&mut self, buf: &mut [Self::Item]) -> Option<usize> { ... };
}

Then Reader can simply be an extension trait around Iterator where Item = u8. Of course what I haven’t mentioned here is the error types. To preserve the functionality of Reader Iterator needs to be able to return an error. But for Iterator to still be able to return an Option the error type would need to be higher kinded. This is where I got stuck. Iterators being able to return errors is also desirable in various other use cases.

As for why read_to_end & co. shouldn’t exist:

  • They’re extremely arbitrary. They will return an error after reading nothing an unspecified number of times, which the documentation nicely specifies as “too many”.
  • They’re dangerous. read_to_end and read_to_string allocate a buffer of 64k, which ties into the above. This makes it easy to unintentionally use copious amounts of memory.

Of course those flaws could be amended, but at that point they’re probably better off reimplemented by the user. With this design read_to_end and read_to_string could easily be replaced with an .extend() call instead (assuming the “no progress” check isn’t needed).


#2

I like this a lot! The overlap between Iterator and Reader has also bugged me too. Hopefully rust supports mutually recursive default methods so either read or next can be defined alone. To solve your HKT problem, @chris has an excellent proposal for why iterator should return result anyways, see https://github.com/rust-lang/rfcs/pull/352 and http://chrismorgan.info/blog/rust-proposal-result-based-iteration.html .


#3

I strongly agree with both bringing the iterator and reader near together and having iterators returning results! I thing both changes would really have a lot of benefits and no big mali.


#4

As an aside, most of the problems raised with the IO reform RFC have now been addressed: https://github.com/rust-lang/rfcs/pull/576


#5

Hmm I like the new changes, but they also should make combining Read and Iterator even easier! What if this was done, and for loops were amended to only work where the error type is () or Void? Then we have all the benefis of fewer traits and impls, while not needing to amend the language with “functional break” etc. Those changes could be dealt with later in a backwards comparable way.


#6

IMO the main problem with combining Reader & Iterator is how errors are handled. In the reader case, errors are often transient. Aka, read -> error, do something, read -> success. Iterator does not at all cover this sort of behavior.


#7

I agree that is generally a point of distinction between instances of the two, but nothing in the current trait definitions enforce that. On one hand, there is BufReader and friends. While I can’t think of a transient None from an iterator, the iterator docs specifically say:

The Iterator protocol does not define behavior after None is returned. A concrete Iterator implementation may choose to behave however it wishes, either by returning None infinitely, or by doing something else."

For a concrete example of where this would help, http://doc.rust-lang.org/std/io/struct.IncomingConnections.html really wants to be an iterator returning Result, and those errors could be transient.

If the newly proposed WriteAllError is used for reading too (something I mentioned on the RFC comments) It, or something similar, could be used to delineate between fatal error and transient error, benign EOF, etc.


#8

I have thought about it a bit and I thing it would be really useful if iterators would return a IteratorResult witch is a tristate like

pub enum IteratorResult<P, N> {
    Ok(P),
    End,
    Err(N)
}

With this you could implement a kind a not fail save iterators without panic. Note that this also means you should have some kind of recovery/errorhandle branch entered if Err is returned (or panic if such a branch doesn’t exist). For example fore something like this:

let iter = some_source.iter();
for item in iter {
   /* do something with the content of the returned OK*/
} recover {
   println!(”error: {}", item);
   iter.reconnect()
    if iter.connected() {
        true
     } else {
        false
    }
}

Where in the case of a true returned in the recovery branche the iteration will continue or else end.

Not that on the other hand it is probably to late to change the return value of a iterator and that this can always be archived by iterating over a Iterator returningResult and then matching it for Ok /Err manually maybe with help of a macro resulting in code not much more verbose than the one shown above