Pre-RFC: Migration of Read and Write to core/alloc

mettke · December 13, 2020, 11:19am

A few years ago there was a discussion here (rust-lang 1314) about moving the Read and Write Traits to libcore.

The design in this RFC is similar to the one of in core-io and provides blanked implementations to provide some form of backward compatibility. A working POC can be found here: mettke/rust_read_write_core_rfc.

Summary

This RFC aims to provide Read and Write Traits for the core and alloc module while trying to keep backward compatibility with the current Read Trait.

This RFC requires:

Motivation

Currently, there are no official Read and Write traits available for no_std environments. These traits, however, are crucial for providing general layouts developers agree on when transferring data via a network, storage, or other sources.

Guide-level explanation

There are two new traits called something like ReadCore and ReadExt which the user can implement.

ReadExt resides in alloc and provides methods that require Vec or String. Its functions are automatically implemented for all ReadCore Types.

The ReadCore Type provides methods that do not rely on any allocation and thus provide a structured way to read into a Stack allocated array.

Also, the Read trait implements ReadCore to allow using Read in methods that require a ReadCore implementation.

The implementation for Write is similar to the one for Read.

Reference-level explanation

There will be five changes in core, alloc and std:

Core gets a new trait called something like ReadCore:

pub trait ReadCore {
    type Error;

    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error>;

    fn read_exact(&mut self, buf: &mut [u8]) -> Result<(), Self::Error> {
        // Implementation from std::io::Read
    }

    fn by_ref(&mut self) -> &mut Self
    where
        Self: Sized,
    {
        // Implementation from std::io::Read
    }
}

Alloc gets a new trait called something like ReadExt. In addition it gets a blanket implementation for every type implementing Read:

use alloc::string::String;
use alloc::vec::Vec;
use core::io::ReadCore;

pub trait ReadExt: ReadCore {
    fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize, Self::Error>;

    fn read_to_string(&mut self, buf: &mut String) -> Result<usize, Self::Error>;
}

impl<T: ?Sized> ReadExt for T where T: ReadCore {
    default fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize, Self::Error> {
        // Implementation from std::io::Read
        unimplemented!();
    }

    default fn read_to_string(&mut self, buf: &mut String) -> Result<usize, Self::Error> {
        // Implementation from std::io::Read
        unimplemented!();
    }
}

And in std the Read trait is adapted to require ReadCore. ReadCore and ReadExt get a blanked implementation for every Read:

use alloc::io::ReadExt;
use core::io::ReadCore;
use std::io::Error;

pub trait Read: ReadCore<Error = Error> {
    [...]
}

impl<T: ?Sized> ReadCore for T where T: Read {
    type Error = Error;

    fn read(&mut self, buf: &mut [u8]) -> Result<usize, Self::Error> {
        Read::read(self, buf)
    }

    fn read_exact(&mut self, buf: &mut [u8]) -> Result<(), Self::Error> {
        Read::read_exact(self, buf)
    }
}

impl<T: ?Sized> ReadExt for T where T: Read {
    fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize, Self::Error> {
        Read::read_to_end(self, buf)
    }

    fn read_to_string(&mut self, buf: &mut String) -> Result<usize, Self::Error> {
        Read::read_to_string(self, buf)
    }
}

Drawbacks

While it is possible to use a Read in a function requiring ReadCore or ReadExt (due to blanket implementation), it is not possible to use a ReadCore for a function requiring a Read. This, however, can be solved by using a Compatibility Layer like the following:

use alloc::io::ReadExt;
use core::io::ReadCore;
use std::error::Error as StdError;
use std::fmt;
use std::io::{Error as IoError, ErrorKind, Read};

#[derive(Debug)]
struct LegacyError<E: fmt::Debug>(E);

impl<E: fmt::Debug> StdError for LegacyError<E> {}

impl<E: fmt::Debug> fmt::Display for LegacyError<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Unable to read from Reader. Error: {:?}", self.0)
    }
}

struct LegacyRead<R: ReadExt>(R);

impl<Error, Reader> Read for LegacyRead<Reader>
where
    Error: fmt::Debug + Send + Sync + 'static,
    Reader: ReadExt + ReadCore<Error = Error>,
{
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, IoError> {
        self.0
            .read(buf)
            .map_err(|err| IoError::new(ErrorKind::Other, LegacyError(err)))
    }

    fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize, IoError> {
        self.0
            .read_to_end(buf)
            .map_err(|err| IoError::new(ErrorKind::Other, LegacyError(err)))
    }

    fn read_to_string(&mut self, buf: &mut String) -> Result<usize, IoError> {
        self.0
            .read_to_string(buf)
            .map_err(|err| IoError::new(ErrorKind::Other, LegacyError(err)))
    }

    fn read_exact(&mut self, buf: &mut [u8]) -> Result<(), IoError> {
        self.0
            .read_exact(buf)
            .map_err(|err| IoError::new(ErrorKind::Other, LegacyError(err)))
    }
}

Rationale and alternatives

Alternatives are using or writing crates like not-io, which, however, do not sufficiently provide commonly agreed traits and thus may require Transformation types.

The impact of not providing common types for core and alloc is, that e.g. microcontroller vendors will create custom traits making interoperability difficult and painful.

Future possibilities

Long term, this RFC paves the road to deprecating Read and Write and fully porting them to core and/or alloc, bringing the full power of rust to the system developing world.

It also may lead the way to breaking off more functionality from std and putting them into dedicated sub-crates like net and io, which would allow a target to exactly specify what is supported and what not (one example is web assembly which works with std, but uses unimplemented in a lot of methods).

RustyYato · December 13, 2020, 12:51pm

If you're creating new traits, can we also fix the uninit buffer problem with the current Read?

Right now it's not possible to safely pass in an uninit buffer to read because it takes &mut [u8], and MaybeUninit wasn't around at 1.0. But if we're making new traits now, we can fix that by providing the read_buf interface instead.

mettke · December 13, 2020, 1:46pm

That does sound interesting. But I'm wondering, is that really relevant for this RFC? As far as I understand, the RFC is about adding new methods and not about changing current methods. Or am I missing a crucial point?

djc · December 13, 2020, 2:00pm

In my understanding, the ReadBuf RFC is about replacing the Read trait to the extent possible within the backwards compatibility constraints.

RustyYato · December 13, 2020, 2:36pm

Yes, we can't actually remove methods from Read, so we have to settle for purely adding new methods.

Nemo157 · December 13, 2020, 4:59pm

How do you plan to deal with EOF here? A fully opaque error means you have no way to construct an instance to return.

mettke · December 13, 2020, 5:11pm

That's a very good point. After taking a look at core-io I think there are two ways to deal with that.

The core-io | lib.rs#L111 way is:

fn read_exact<E>(&mut self, mut buf: &mut [u8]) -> Result<(), E>
        where E: From<Self::Err> + From<EndOfFile>

which requires that the Error Type implements a From for the EndOfFile Type.

Alternatively one could create a basic error type similar to the existing taking a generic type argument like:

pub enum IoError<E: ...> {
   UnexpectedEof,
   Inner(E),
}

Personally, I would prefer the first, as the second introduces a second enum layer and a From using a ZST most likely turns to a no-op.

gThorondorsen · December 13, 2020, 5:53pm

Another pain point of the std::io::{Read, Write} traits that could be fixed at the occasion of adding new I/O traits is that the methods read_exact and write_all should be renamed to read and write, respectively. That would probably point people to the correct way of handling partial reads and writes when they don't have a reason to stray from the default.

mettke · December 14, 2020, 7:18am

How would you call the existing read and write methods?

Lonami · December 14, 2020, 7:32am

Something like read_partial or read_up_to, probably.

kornel · December 14, 2020, 2:40pm

The problem is caused by the std/alloc/core divide. Adding two new traits to solve just one of the symptoms seems like tech debt to me. There are other similar problems caused by this crate split, and it would be better to tackle that problem head-on, instead of patching it trait by trait. For example, if std could be made to support Cargo features, then it could be one monolithic crate with alloc as a feature.

Nemo157 · December 14, 2020, 3:01pm

Continuing this line of thinking, core::io::Error could be just the parts of std::io::Error that support being constructed from a (ErrorKind, &'static str) (or something similar), activating the alloc feature would then extend that to (ErrorKind, Box<dyn Error>) and activating the std feature would enable all the errno and other platform-specific APIs. Then there wouldn't be the need to separate out an associated error type (though there are other reasons we may want this, but it would be nice to get that in the same trait via associated type defaults rather than having a trait hierarchy for it).

mettke · December 14, 2020, 3:13pm

I'm currently trying to design the traits and after all that time I think that splitting the error type into those three pieces is not enough.

My Point is that the box was made to allow custom error messages which weren't thought of while creating the error type. Its very hard to create a Type implementing Read if we are restricted to the existing error types and without the Box (as it would be in core) we are either able to map our situation to the available error cases or have to take shortcuts. And that is a very dangerous road to take.

I'm also don't think that we are creating a technical dept, because, at least for Read and Write, a new definition gives us the opportunity to create Read and Write Traits which are a lot better than the existing ones. One example is the Interrupt Error which always bothered me. We have an Error case, which is not an error. Instead of treating it as one, we should completely redesign the return value, to further improve usability. Just take a look at this as an early draft:

/// Response of a Read or Write Operation
pub enum OpRes {
    /// Operation did not complete and should be retried.
    Retry,
    /// Render was completly read and does not have any more data.
    /// To signal an zero sized buffer, use `OpRes::Completly`
    Eof,
    /// Buffer was completly filled. There may or may not be data left.
    Completly,
    /// Buffer was partial filled. There may or may not be data left.
    Partial(NonZeroUsize),
}

pub trait ReadCore {
    type Err: From<UnexpectedEndOfFile> + From<InvalidUtf8>;

    fn read(&mut self, buf: &mut [u8]) -> Result<OpRes, Self::Err>;

    [...]
}

system · March 14, 2021, 3:14pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Core::io outside of official rust libs	1	636	June 28, 2023
[Idea] Lowering `std::io` to core libs	4	1121	March 25, 2019
Idea: trait std::io::Communicate libs	5	201	October 7, 2024
Let trait aliases be used in impls, or how to add io::{Read, Write} to libcore libs	3	894	October 5, 2021
Where's the catch with Box<Read + Write>? language design	39	6328	March 25, 2019