[Pre-RFC] `field_projection`

y86-dev · September 21, 2022, 10:38pm

To be clear, I am not preferring #[pin], I am still torn whether to use it or #[unpin].

If you place no #[pin] on your fields, then all projections would result in unwrapping. So there would be projections.

TBH I am not very fond of the example. Many things become unsound when you expose internal fields that have invariants associated with them. That is the reason why I was talking about encapsulation.

I think I understand where you are coming from: the proposal removes the need for unsafe at the projection site. Hence projections are safe. It now enables me to exhibit UB without writing any additional unsafe code.

Would this produce a compile error then?

let u: Pin<&mut Unmovable> = ...;
let _ = &mut u.data;
//           ^^^^^^ error cannot access field `data`

y86-dev · September 21, 2022, 10:56pm

Ok I think I finally got what you wanted to tell me... (sorry that it took so long)

I have following library:

pub struct BufPtr {
    buf: [u8; 64],
    // null, or points into buf
    ptr: *const u8,
    // for some legacy reasons this is `pub` and cannot be changed
    pub len: u8,
    _pin: PhantomPinned,
}

impl BufPtr {
    pub fn new(buf: [u8; 64]) -> Self {
        Self {
            buf,
            ptr: ptr::null(),
            len: 0,
            _pin: PhantomPinned,
        }
    }

    pub fn next(self: Pin<&mut Self>) -> Option<u8> {
        // SAFETY: we do not move out of `ptr`
        let this = unsafe { Pin::get_unchecked_mut(self) };
        if this.ptr.is_null() {
            let buf: *const [u8] = &this.buf[1..];
            this.ptr = buf.cast();
            // here we set the len and it cannot be changed, since we are `!Unpin` and we are pinned.
            this.len = 1;
            Some(this.buf[0])
        } else if this.len >= 64 {
            None
        } else {
            this.len += 1;
            // SAFETY: `ptr` is not null, so it points into `buf`
            let res = Some(unsafe { *this.ptr });
            this.ptr = this.ptr.wrapping_add(1);
            res
        }
    }
}

And now the feature gets stabilized. Now users can write:

fn main() {
    let mut buf = Box::pin(BufPtr::new([0; 64]));
    loop {
        // field projection used here:
        buf.as_mut().len = 0;
        buf.as_mut().next(); // at some point we read some bad address
    }
}

y86-dev · September 21, 2022, 11:07pm

Thanks for bringing this up!

I guess our only options are:

require opt-in to pin projection (like you suggested), we could remove this in an upcoming edition (I think),
require explicit #[pin] and #[unpin] on fields (and not project the rest at all).

pitaj · September 22, 2022, 1:16am

Glad we're finally on the same page.

Yes, I suggest this to be a compile error. After thinking about it a bit, I think turning off field access entirely may be too restrictive. Instead, I'd propose that "excluded" fields can only be projected within an unsafe context. This way, you still get to use all of that nice syntax.

I think it would have to be field specific. And I support the form of requiring marking which fields get projected with wrapping (#[pin]), projected without wrapping #[unpin], and projected only in unsafe (no attribute).

I've also come around on the "importable attributes" idea. It could allow for some custom checks, like what I suggest below. I would probably name #[pin] / #[unpin] something like #[pin::project_wrapped] / #[pin::project_unwrapped] instead, though. I just think having use std::pin::{pin, unpin}; is pretty weird, but that's debatable.

I think that #[pin] / #[pin::project_wrapped] should trigger a warning (#[warn(useless_projection_wrapping)]) if used on a field with an Unpin type, since it will essentially allow moving out that value through projection anyways.

y86-dev · September 22, 2022, 11:10am

Let's continue the unsoundness discussion on zulip.

RalfJung · September 29, 2022, 12:37pm

I agree that this API satisfies the property that one cannot cause UB using this API and only safe code. That is certainly a necessary condition for soundness, but it is not entirely clear that it is sufficient. (A precise definition of soundness is super hard -- a first version of that definition takes more than 100 pages in my PhD thesis -- so we usually just say that one must not be able to cause UB from safe code. But in this case I think that leads to the wrong conclusion.)

For example (not sure if this example was already brought up here; it is by Quy Nguyen), a client might call Pin::get_unchecked_mut and then re-assign the len field. This needs unsafe code, but all requirements of get_unchecked_mut are satisfied -- if we had a formal system for this, we could prove this code correct.

So clearly, if we combine the unsafe module that declares this pinned type, and the unsafe module that uses get_unchecked_mut, we can have UB. Which of the two modules is buggy? I think a good argument can be made that the get_unchecked_mut module is perfectly fine, since it diligently added safety comments to each unsafe operation and everything checks out. Therefore it must be the other module which is buggy -- the "soundness proof" of that module relied on an invariant involving the len field, but public fields cannot carry custom invariants.

Therefore I don't think pin projections are to blame here, this module was unsound to begin with.

pitaj · October 1, 2022, 6:03pm

I'm not convinced. I suppose it may not be a soundness issue but this seems like it will be a huge footgun for self-referential structs in their own impls, even if they don't depend on Pin to prevent mutation in general, but just moves:

use std::pin::Pin;
use std::marker::PhantomPinned;
use std::ptr::NonNull;

// This is a self-referential struct because the slice field points into the data field.
// We cannot inform the compiler about that with a normal reference,
// as this pattern cannot be described with the usual borrowing rules.
// Instead we use a raw pointer, though one which is known not to be null,
// as we know it's pointing into the string.
struct Unmovable {
    data: String,
    slice: NonNull<u8>,
    _pin: PhantomPinned,
}

impl Unmovable {
    // To ensure the data doesn't move when the function returns,
    // we place it in the heap where it will stay for the lifetime of the object,
    // and the only way to access it would be through a pointer to it.
    fn new(data: String) -> Pin<Box<Self>> {
        let res = Unmovable {
            data,
            // we only create the pointer once the data is in place
            // otherwise it will have already moved before we even started
            slice: NonNull::dangling(),
            _pin: PhantomPinned,
        };
        let mut boxed = Box::pin(res);

        let slice = NonNull::from(&boxed.data.as_bytes()[0]);
        // we know this is safe because modifying a field doesn't move the whole struct
        unsafe {
            let mut_ref: Pin<&mut Self> = Pin::as_mut(&mut boxed);
            Pin::get_unchecked_mut(mut_ref).slice = slice;
        }
        boxed
    }
    
    // Currently, I can rely on the compiler to prevent me from violating
    // the `Pin` contract in safe code, even on my own types.
    //
    // But with auto pin projection, this is no longer the case.
    fn violate_invariants(self: Pin<&mut Self>) {
        // desugar of `self.data` pin projection
        let mut data = unsafe { self.map_unchecked_mut(|s| &mut s.data) };
        
        // Woops! Just accidentally violated an invariant of this type with safe code
        data.push_str("This additional string is long enough to cause a re-allocation.");
        
        // Woops! Just accidentally violated the Pin invariant with safe code
        println!("{}", std::mem::take(data.get_mut()));
    }
}

I think one the main purposes of this proposal is to enable a cleaner syntax for working with pinned structs, and I feel like a large portion of the use will be in impls. If you're essentially losing all of the protection that Pin offers within the module a struct is defined, that seems harmful.

RalfJung · October 2, 2022, 10:38am

This is an extreme hyperbole. You are losing almost none of the protection. You are in particular losing none of the intended protection, that actually has to do with pinning. The entire purpose of the map_unchecked_mut method is to let you go from Pin<&mut Struct> to Pin<&mut Field> for any field that you can access in the current module -- the only reason this function is unsafe is that given the &mut Struct, it could do other things like call mem::replace. Never was there the intention that accessing a field of a pinned struct should be unsafe, that is just an accident caused by not being able to express "this is safe if it only projects to a field".

Rust has no protection of private fields in a module. This affects pinned and unpinned code equally, and is not a new observation. I am totally a fan of tackling that problem, but it should not be tackled in a pin-specific way -- it should be tackled in a way that works for all code equally (e.g., including Vec, where set_len is the most famous example of this problem). 'unsafe fields' have been proposed several times to attack this very problem properly, someone 'just' needs to work out all the details and have enough patience to see through an appropriate RFC.

Using Pin to protect local fields with invariants is a crude hack to work around the lack of unsafe fields, and the hack only works for some situations (e.g. it doesn't work for set_len). We should not promote this hack or contort new APIs in a bad way to still make the hack work; instead if we as project feel like this is a problem worth solving, then we should do it in a principled way, such as unsafe fields.

2e71828 · October 2, 2022, 10:48am

On the other hand, opening a safety hole in existing code that is de facto, if not de jure, sound is something to be avoided— There’s a tricky balancing act here between supporting the existing (ab)uses of a stabilized API and creating the best experience for future programmers.

RalfJung · October 2, 2022, 11:10am

It's not opening a new safety hole, that code is already unsound if people use map_unchecked_mut while following all the rules of that function. So such code is already, at best, 'dancing the edge' of soundness and relies on users knowing that this field is protected and not using map_unchecked_mut in the intended and documented way.

From what I understood, this is an entirely theoretical existing abuse of stable API. Or is there an example of code in the wild that has a public field that users must not changed after pinning the value, and which is 'sound' due to there being no pin-projection?

system · December 31, 2022, 11:10am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
[Idea] Pointer to Field Unsafe Code Guidelines	81	5145	September 3, 2019
Generic Field Projection language design	8	1401	May 29, 2022
[post] Safe pin projections through view types language design	12	1361	June 3, 2022
Pre-Pre-RFC: Field offsets language design	7	1790	March 25, 2019
Idea: DerefPin/DerefCell language design	12	2362	March 25, 2019

[Pre-RFC] `field_projection`

Related topics