pre-RFC FromBits/IntoBits

joshlf · May 23, 2018, 7:53am

OK, I’ve started working on a draft RFC. The first draft is not complete, but you folks may have feedback nonetheless. https://github.com/joshlf/rfcs/blob/joshlf/from-bits/text/0000-from-bits.md

A few things to note:

I’ve decided to include SizeLeq and AlignLeq in the proposal because I think they give us a lot more power, but I could be convinced to remove them if folks think the proposal encompasses too much. A good compromise might be to just remove SizeLeq since it’s less useful than AlignLeq, although my preference is to keep them both.
I’ve opted to include a derive for FromBits in addition to it being an auto trait. This is so that authors of types with private fields can still opt in if they want.

newpavlov · May 24, 2018, 11:30am

Can’t we use the From trait for this problem by making a wrapper type Bits<T> with appropriate From implementations for numeric types? So instead of f32::from_bits(0x0_i32) we would write f32::from(Bits(0x0i32)).
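A minimal sketch of what such a wrapper could look like, assuming a hypothetical tuple struct named Bits and hand-written impls for a few numeric pairs (none of this is part of the draft RFC). Note that the standard f32::from_bits takes a u32, so the i32 impl below goes through a same-width cast first:

```rust
/// Hypothetical wrapper that marks "reinterpret these bits" as opposed to
/// "convert this value". The name and shape are assumptions for illustration.
pub struct Bits<T>(pub T);

impl From<Bits<u32>> for f32 {
    fn from(bits: Bits<u32>) -> f32 {
        // Reinterpret the 32-bit pattern as an IEEE 754 single.
        f32::from_bits(bits.0)
    }
}

impl From<Bits<i32>> for f32 {
    fn from(bits: Bits<i32>) -> f32 {
        // i32 -> u32 with `as` keeps the bit pattern (same width).
        f32::from_bits(bits.0 as u32)
    }
}

impl From<Bits<f32>> for u32 {
    fn from(bits: Bits<f32>) -> u32 {
        // The inverse direction: expose the raw bit pattern.
        bits.0.to_bits()
    }
}
```

With these impls, both f32::from(Bits(x)) and let y: f32 = Bits(x).into(); type-check, at the cost of one hand-written impl per pair of types.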

On a side note, I would really like to see the as operator become sugar for the Into trait. For example, it could be really convenient with units of measure: let dist_meters = dist_miles as Meter;. What is the main reason for not doing it?

gnzlbg · May 24, 2018, 11:47am

I think that might work. Users would need to impl From<Bits<T>> for U and the transitivity magic would need to work on that.

On a side note, I would really like to see the as operator become sugar for the Into trait. [...] What is the main reason for not doing it?

These are just different kinds of conversions: as performs potentially lossy, non-value-preserving "conversions" (zero/sign extension and truncation) that silently change the value rather than report a failure, while From and Into perform value-preserving, infallible conversions.
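To make the distinction concrete, here is a small sketch using only standard library conversions; the three function names are made up for illustration:

```rust
use std::convert::TryFrom;

/// `as` never fails, but it may change the value: narrowing truncates.
pub fn truncate_with_as(x: i32) -> u8 {
    x as u8
}

/// `From` exists only where every input value is preserved exactly.
pub fn widen_with_from(x: u8) -> u32 {
    u32::from(x)
}

/// Where the value might not fit, the trait-based route is `TryFrom`,
/// which reports the failure instead of silently truncating.
pub fn checked_narrow(x: i32) -> Option<u8> {
    u8::try_from(x).ok()
}
```

For example, truncate_with_as(300) yields 44 (300 modulo 256), whereas checked_narrow(300) yields None. There is deliberately no impl From<i32> for u8 in the standard library.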

joshlf · May 24, 2018, 11:55am

Hmmm interesting. You could do something like #[repr(transparent)] struct Bits<T>(T) and then give Bits<T> two from_ref(&T) -> &Bits<T> and from_mut(&mut T) -> &mut Bits<T> constructors so it'd work for references too.
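A sketch of those two constructors, assuming the wrapper is exactly the #[repr(transparent)] tuple struct described above. The pointer casts rely on repr(transparent) guaranteeing that Bits<T> has the same layout as T:

```rust
/// Hypothetical wrapper; `repr(transparent)` guarantees that `Bits<T>` has
/// the same size, alignment, and ABI as `T`.
#[repr(transparent)]
pub struct Bits<T>(pub T);

impl<T> Bits<T> {
    /// View a `&T` as a `&Bits<T>` without copying.
    pub fn from_ref(t: &T) -> &Bits<T> {
        // SAFETY: identical layout, and the returned reference borrows `t`
        // for the same lifetime.
        unsafe { &*(t as *const T as *const Bits<T>) }
    }

    /// View a `&mut T` as a `&mut Bits<T>` without copying.
    pub fn from_mut(t: &mut T) -> &mut Bits<T> {
        // SAFETY: identical layout, and exclusivity is inherited from `t`.
        unsafe { &mut *(t as *mut T as *mut Bits<T>) }
    }
}
```

Writes through the reference returned by from_mut are visible in the original value, since both references point at the same memory.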

The big question I'd have is: how do you construct the From impl? One of the things that I like about this proposal is that, if we go with having either compiler assistance or a custom derive, the user doesn't have to reason about the (very subtle and complex) memory safety themselves.

Also, since From and From::from are safe, there's nothing stopping somebody from implementing From<Bits<T>>::from in a way that doesn't actually depend on the argument, but instead produces some default value. That, in turn, means that you can no longer use U: From<T> as a signal that it's safe to coerce a reference to T into a reference to U.
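To illustrate the concern, here is a sketch of a perfectly legal, safe From impl that ignores its argument. Bits and Percent are hypothetical types invented for this example:

```rust
/// Hypothetical wrapper, as discussed above.
pub struct Bits<T>(pub T);

/// Hypothetical type with an invariant: the inner value is always <= 100.
pub struct Percent(u8);

impl Percent {
    pub fn get(&self) -> u8 {
        self.0
    }
}

impl From<Bits<u8>> for Percent {
    fn from(_bits: Bits<u8>) -> Percent {
        // Safe code may ignore the input entirely and return a default.
        Percent(0)
    }
}
```

Nothing here is wrong as far as From is concerned, yet reinterpreting a &u8 holding 200 as a &Percent would break the invariant. The existence of the impl therefore says nothing about whether a reference-to-reference coercion is sound.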

gnzlbg · May 24, 2018, 1:52pm

If coerce would work with references then maybe something like:

impl From<Bits<T>> for U where U: FromBits<T> {
    #[inline]
    fn from(x: Bits<T>) -> Self {
        coerce(x.0)  // EDIT: fixed bug, had coerce(x) before
    }
}

joshlf · May 24, 2018, 1:54pm

gnzlbg:

If coerce would work with references then maybe something like:
impl From<Bits<T>> for U where U: FromBits<T> {
    #[inline]
    fn from(x: Bits<T>) -> Self {
        coerce(x)
    }
}

So how does coerce work then? It looks like it's a safe function, which means that there's some mechanism for deciding whether coerce::<T, U> is valid for T and U.

gnzlbg · May 24, 2018, 2:14pm

The same way that it is currently implemented:

fn coerce<T, U>(x: T) -> U where U: FromBits<T>

For references one needs to provide different blanket impls of From. I don't know if that can be done with specialization or not.

EDIT: had a bug in the way coerce is called, but like this it should work because the constraints are the same:

impl From<Bits<T>> for U where U: FromBits<T> {
    #[inline]
    fn from(x: Bits<T>) -> Self {
        coerce(x.0)
    }
}
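For reference, a rough sketch of how a safe coerce could be built on an unsafe marker trait. The trait definition, the two impls, and the equal-size check are assumptions made for this example, not the draft RFC's actual design:

```rust
use std::mem::{self, ManuallyDrop};

/// Hypothetical marker trait: implementing it asserts that every valid bit
/// pattern of `T` is also a valid bit pattern of `Self`. It is `unsafe`
/// because the compiler cannot check that claim in this sketch.
pub unsafe trait FromBits<T>: Sized {}

// Hand-written impls for illustration. The proposal would have these come
// from the compiler or from a derive.
unsafe impl FromBits<u32> for f32 {}
unsafe impl FromBits<[u8; 4]> for u32 {}

/// Safe to call: the `FromBits` bound carries the proof obligation.
pub fn coerce<T, U>(x: T) -> U
where
    U: FromBits<T>,
{
    // Assumption of this sketch: only equal-size coercions are supported.
    assert_eq!(mem::size_of::<T>(), mem::size_of::<U>());
    // Prevent `x` from being dropped after its bits have been moved out.
    let x = ManuallyDrop::new(x);
    // SAFETY: sizes are equal and `U: FromBits<T>` promises validity.
    unsafe { mem::transmute_copy::<T, U>(&*x) }
}
```

A blanket impl From<Bits<T>> for U written outside the standard library would probably be rejected by the coherence rules, so this sketch stops at coerce itself.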

joshlf · May 24, 2018, 2:16pm

Oh, I read @newpavlov’s proposal as a way to replace FromBits, not augment it.

gnzlbg · May 24, 2018, 2:19pm

If anything it could be a way to replace coerce AFAICT. That way you can do let y: U = Bits(x).into(); or let y = U::from_bits(Bits(x));. Whether that’s better than let y: U = coerce(x); or not… is debatable.

One thing that allows is for users to provide their own From<Bits<T>>::from implementations (if not right now, via specialization later), which might be something that is or isn’t desired.

RalfJung · May 27, 2018, 1:36pm

(Sorry it took me a while to respond, this thread also moved so fast I couldn't just do this quickly on the side.^^ I'm probably late to the party but here you go.)

I usually consider padding to be uninitialized memory. This arises naturally because when you put a struct into uninitialized memory and then initialize it by writing to all fields, the padding will remain uninitialized. AFAIK C treats both as indeterminate values, and LLVM treats both as undef.

So, I think it is fair to consider these the same problem. That also reduces our number of problems by one.

On IRC, @hanna-kruppe asked about what is okay here from a pure language operation perspective, not just from a type perspective. Ignoring types, I think reading uninitialized memory is fine, but you can expect any operation on it to be immediate UB -- this includes bit-masking, or multiplying by 0. The only thing you can do with such an uninitialized value is store it back to memory. Moreover, conservatively, better assume that when you load a u32 and any byte is uninitialized, then the entire value you are loading is uninitialized. We may end up allowing more, but if you follow these rules you should be fine from an operations/LLVM standpoint.

Now, types may place additional restrictions, like e.g. &T cannot be NULL. From what I can tell, this proposed FromBits instance would end up with a &[u8] that contains uninitialized memory, and the question is whether that's okay? Essentially, this amounts to the question of whether "uninitialized" is a valid value for u8. I think the answer should be "no", and with my "Types as contracts" that's certainly the intention (though that is not implemented currently). I think when safe code calls a function returning u8, it should be able to rely on the fact that this u8 is initialized. Everything else is just a big hazard. With this interpretation, there would be UB the moment you load the uninitialized data into the u8, because the intention is that all types in scope are always valid. That's just like it's not okay to load the value 3 into a bool even if you never look at the bool.

These issues are the reason why the MaybeUninit type is being introduced. So, turning any sequence of bytes into a &[MaybeUninit<u8>] should be okay because you are no longer claiming that this is a valid u8.
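A sketch of such a view, assuming T has no interior mutability. The helper as_maybe_uninit_bytes and the Padded struct are hypothetical names used only for this example:

```rust
use std::mem::{self, MaybeUninit};
use std::slice;

/// View the bytes of `value`, padding included, without claiming that any
/// of them is initialized.
/// Assumption: `T` contains no interior mutability (no `Cell`, etc.).
pub fn as_maybe_uninit_bytes<T>(value: &T) -> &[MaybeUninit<u8>] {
    let len = mem::size_of::<T>();
    let ptr = value as *const T as *const MaybeUninit<u8>;
    // SAFETY: `ptr` is valid for `len` bytes for the lifetime of the borrow,
    // `MaybeUninit<u8>` has alignment 1, and it places no validity
    // requirement on the bytes it covers.
    unsafe { slice::from_raw_parts(ptr, len) }
}

/// With `repr(C)`, there are padding bytes between `a` and `b` on targets
/// where `u32` is aligned to more than one byte.
#[repr(C)]
pub struct Padded {
    pub a: u8,
    pub b: u32,
}
```

Individual bytes known to be initialized, such as the byte backing field a at offset 0, can still be read with assume_init. The padding bytes must not be.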

C's "character types" rules are a crazy hack that I'd rather not replicate in Rust. Also, the "character types" exception is about TBAA/strict aliasing, which Rust doesn't have anyway. That's independent of whether, in C, a value of a type can be a "bad" indeterminate value even if it does not have a trap representation. In that regard, all the integer types are likely the same in C.

But anyway, we don't need such a strange hack in Rust. We have MaybeUninit, and if you implement memcpy in Rust you should do it by reading and writing MaybeUninit<u8>.
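A minimal sketch of that idea: a byte-wise copy that only ever loads and stores MaybeUninit<u8>, so it never asserts that padding is initialized. The names copy_bytes, Padded, and clone_via_bytes are made up for this example, and real code would use ptr::copy_nonoverlapping instead:

```rust
use std::mem::{self, MaybeUninit};

/// Byte-wise copy that treats every byte as possibly uninitialized.
///
/// # Safety
/// `src` must be readable and `dst` writable for `len` bytes, and the two
/// regions must not overlap.
pub unsafe fn copy_bytes(
    src: *const MaybeUninit<u8>,
    dst: *mut MaybeUninit<u8>,
    len: usize,
) {
    for i in 0..len {
        // Loading a MaybeUninit<u8> is fine even for padding bytes.
        unsafe { dst.add(i).write(src.add(i).read()) };
    }
}

#[repr(C)]
#[derive(Debug, PartialEq)]
pub struct Padded {
    pub a: u8,
    pub b: u32,
}

/// Copy a struct, padding included, through the byte-wise routine.
pub fn clone_via_bytes(p: &Padded) -> Padded {
    let mut out = MaybeUninit::<Padded>::uninit();
    unsafe {
        copy_bytes(
            p as *const Padded as *const MaybeUninit<u8>,
            out.as_mut_ptr() as *mut MaybeUninit<u8>,
            mem::size_of::<Padded>(),
        );
        // SAFETY: every field byte was copied from a valid `Padded`.
        out.assume_init()
    }
}
```

The same loop written with plain u8 loads would, under the interpretation above, be UB as soon as it reached a padding byte.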

Yes. Essentially there is a special exception if you write &... as *..., and we pretend the reference never existed and you directly created a raw pointer. For now, better assume you have to exactly write this, syntactically.

I think there is one "bad" value, called "poison", that represents uninitialized data. Also see this paper that defines LLVM with poison (and without undef).

system · March 25, 2019, 8:29am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
