TaintedCell - interior mutability with shared mutable access

memoryleak47 · October 14, 2024, 12:47pm

A few years ago, phlopsi made this post (Exploring Interior Mutability, Auto Traits and Side Effects) about a cell with interior mutability that can be freely shared. The remaining issue there was that storing this cell inside a static can cause UB, because you can then get mutable access to the data inside that static multiple times.

My idea is to prevent it from being stored in a static, using a reference to the never-'static LocalToken.

If any unsafe wizards see why this still doesn't work, I'd love to know!

I could see this being useful for caches & interners where you typically want everyone to have access to it, often resulting in using &RefCell<_> to make it possible.

Cheers

elidupree · October 14, 2024, 2:41pm

Very clever! My mid-level unsafe-rust skills haven't spotted any holes in it, although it seems tricky enough that I wouldn't count on it.

That said, if you're going to accept the boilerplate of passing around a token to everywhere you construct such a cell, I feel it's worth mentioning the stricter approach of passing around a &mut AccessToken, where:

constructing an AccessToken uses a one-time check of a thread-local to make sure there's only one AccessToken at a time in each thread
you can only call access by passing an &mut AccessToken
(you can also get shared/immutable access by passing a &AccessToken)
there are no restrictions on creating a cell

(With this approach, you can also make cells and AccessToken carry a type-level disambiguator so that you can use nested callbacks as long as they access cells with different disambiguators. TaintedCell can't do this for as long as auto traits can't have generic parameters.)
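A minimal sketch of the token approach described above, using only std. The names `AccessToken::acquire`, `TokenCell`, `access_shared` and `TOKEN_TAKEN` are made up for illustration and are not taken from any crate; it leaves out the type-level disambiguator and has not been audited.

```rust
use std::cell::{Cell, UnsafeCell};
use std::marker::PhantomData;

thread_local! {
    // True while this thread already holds a live AccessToken.
    static TOKEN_TAKEN: Cell<bool> = Cell::new(false);
}

/// Hypothetical token: at most one exists per thread at any time.
pub struct AccessToken {
    // The raw-pointer marker makes the token !Send and !Sync,
    // so it cannot migrate to a thread that already has its own token.
    _not_send: PhantomData<*const ()>,
}

impl AccessToken {
    /// One-time thread-local check; returns None if a token is already live.
    pub fn acquire() -> Option<AccessToken> {
        TOKEN_TAKEN.with(|taken| {
            if taken.replace(true) {
                None
            } else {
                Some(AccessToken { _not_send: PhantomData })
            }
        })
    }
}

impl Drop for AccessToken {
    fn drop(&mut self) {
        // Release the slot so a new token can be created later.
        let _ = TOKEN_TAKEN.try_with(|taken| taken.set(false));
    }
}

/// Hypothetical cell: no restrictions on creation, access needs the token.
pub struct TokenCell<T> {
    value: UnsafeCell<T>,
}

impl<T> TokenCell<T> {
    pub fn new(value: T) -> Self {
        TokenCell { value: UnsafeCell::new(value) }
    }

    /// Mutable access: the `&mut` borrow of the unique token rules out
    /// any other access on this thread for the lifetime 'a.
    pub fn access<'a>(&'a self, _token: &'a mut AccessToken) -> &'a mut T {
        // SAFETY (sketch): the cell is !Sync via UnsafeCell, and the only
        // token of this thread is exclusively borrowed for 'a.
        unsafe { &mut *self.value.get() }
    }

    /// Shared access: many of these may coexist, but none alongside `access`.
    pub fn access_shared<'a>(&'a self, _token: &'a AccessToken) -> &'a T {
        // SAFETY (sketch): a shared token borrow excludes `access` for 'a.
        unsafe { &*self.value.get() }
    }
}
```

The cost is visible in the signatures: every function that touches a cell has to take the token as a parameter, which is the boilerplate mentioned above.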

I've implemented this once for a project of mine, but not made a polished crate of it. Having done a quick search just now, I see that the crate ghost-cell uses a very similar approach. (It does this same thing but using lifetimes as the disambiguator, which saves you the type-level bureaucracy but does mean you can only construct a cell while a token is live, which mattered for my particular project – my particular cells needed to be Send.)

RalfJung · October 14, 2024, 5:59pm

Ah, I see -- clever indeed.

However... I think there might be a soundness conflict with crates that provide "thread-local state for non-'static data". This crate is not quite enough, but this one might be? Generally it is considered sound to store non-'static data in a global variable and retrieve it later as long as we are still inside that lifetime, and sadly that can be used to share a &TaintedCell with a closure without going through the closure environment.
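To make the concern concrete, here is a hand-rolled sketch of a scoped thread-local slot, using only std. `SLOT`, `set_scoped`, `with_scoped`, `callback` and `smuggle` are hypothetical names, and this only imitates the general pattern of such crates, not their actual APIs. It uses a plain `i32` as a stand-in for the cell.

```rust
use std::cell::Cell;
use std::ptr;

thread_local! {
    // Hypothetical slot: points at a value living on some caller's stack.
    static SLOT: Cell<*const i32> = Cell::new(ptr::null());
}

/// Publishes `value` for the duration of `f`, then restores the old pointer.
fn set_scoped<R>(value: &i32, f: impl FnOnce() -> R) -> R {
    // Guard that resets the slot even if `f` unwinds.
    struct Reset(*const i32);
    impl Drop for Reset {
        fn drop(&mut self) {
            SLOT.with(|slot| slot.set(self.0));
        }
    }
    let _reset = Reset(SLOT.with(|slot| slot.replace(value as *const i32)));
    f()
}

/// Hands the currently published value (if any) to `f`.
fn with_scoped<R>(f: impl FnOnce(Option<&i32>) -> R) -> R {
    let p = SLOT.with(|slot| slot.get());
    // SAFETY (sketch): a non-null pointer is only present while the
    // `set_scoped` call that installed it is still on the stack, and the
    // higher-ranked lifetime keeps the reference from escaping `f`.
    f(unsafe { p.as_ref() })
}

// An ordinary fn item: it captures nothing, so no trait bound on the
// callback type can notice that it reaches borrowed data.
fn callback() -> i32 {
    with_scoped(|v| v.copied().unwrap_or(-1))
}

fn smuggle() -> i32 {
    let local = 42; // lives on the stack, not 'static
    set_scoped(&local, callback)
}
```

Here `smuggle()` hands `callback` a reference to `local` without that reference ever appearing in a closure environment. Replace the `i32` with a cell type and an auto-trait bound on the callback has nothing to reject.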

SkiFire13 · October 14, 2024, 6:08pm

For completeness, here's the same post on Reddit.

memoryleak47 · October 14, 2024, 9:31pm

Thanks to SkiFire13 and RalfJung for bringing up the scoped-tls-hkt / make_static issues!

However, I think there might still be a way to get a weaker version of this Cell to work, basically by replacing Untainted with const.

The core soundness guarantee is that access is never allowed to call itself in the callback. In a const fn we are not allowed to call non-const functions, which means that if we force our callback to be a const fn (while access is not), it can never call access a second time.

This is sadly more restrictive, because

const functions are very restricted (e.g. no heap allocations yet)
we can't have different brands of taintedness; there's just one: 'const'

Here's a rough implementation:

Code

#![feature(const_trait_impl, effects)]

use std::cell::UnsafeCell;

// A hack to simulate `const FnOnce(T) -> R`
#[const_trait] trait ConstCall<T, R> {
    fn const_call(self, t: T) -> R;
}

struct UnconstCell<T> {
    cell: UnsafeCell<T>,
}

impl<T> UnconstCell<T> {
    fn new(t: T) -> UnconstCell<T> {
        UnconstCell {
            cell: UnsafeCell::new(t)
        }
    }

    fn access<R, F>(&self, f: F) -> R
        where F: for<'a>       ConstCall<&'a mut T, R>,
              F: for<'a> const ConstCall<&'a mut T, R>
    {
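        // SAFETY (claimed): `f` has to be callable in const context, so it
        // cannot call the non-const `access` again and create a second `&mut T`.
        let r = unsafe { &mut *self.cell.get() };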
        f.const_call(r)
    }
}

Example Usage

struct MyVec {
    // no const heap allocations thus far. :/
    data: UnconstCell<[i32; 64]>,
}

// Note that this API does not require "const"! And accepts "&self".
impl MyVec {
    pub fn new(data: [i32; 64]) -> MyVec {
        MyVec { data: UnconstCell::new(data) }
    }

    pub fn set(&self, i: usize, v: i32) {
        self.data.access(SetCall(i, v))
    }

    pub fn get(&self, i: usize) -> i32 {
        self.data.access(GetCall(i))
    }
}

struct SetCall(usize, i32);

impl<'a> const ConstCall<&'a mut [i32; 64], ()> for SetCall {
    fn const_call(self, arg: &mut [i32; 64]) -> () {
        let SetCall(i, v) = self;
        arg[i] = v;
    }
}

struct GetCall(usize);

impl<'a> const ConstCall<&'a mut [i32; 64], i32> for GetCall {
    fn const_call(self, arg: &mut [i32; 64]) -> i32 {
        let GetCall(i) = self;
        arg[i]
    }
}


fn main() {
    let x = MyVec::new([0; 64]);
    x.set(3, 5);
    dbg!(x.get(3));
}

(sorry for the raw code, I can't reach the Rust Playground right now)

SkiFire13 · October 15, 2024, 7:15am

This might be sound, but forcing const seems very restricting to me. I wonder if there's something useful it can do that e.g. Cell can't.

steffahn · October 15, 2024, 9:34am

At first glance, seeing the F: Untainted pattern here, this will most likely run into the same soundness issues as pyo3ʼs Ungil attempts, e. g.:

github.com/PyO3/pyo3

`Python::allow_threads` is unsound in the presence of `scoped-tls`.

opened 01:53PM - 11 Dec 23 UTC

steffahn

In analogy to * #2141 we can smuggle `Ungil` data across `Python::allow_th…reads` using `scoped-tls`, too.

```rs
use pyo3::prelude::*;
use pyo3::types::PyString;
use scoped_tls::scoped_thread_local;

fn main() {
    Python::with_gil(|py| {
        let string = PyString::new(py, "foo");

        scoped_thread_local!(static WRAPPED: PyString);
        WRAPPED.set(string, || {
            py.allow_threads(|| {
                WRAPPED.with(|smuggled: &PyString| {
                    println!("{:?}", smuggled);
                });
            });
        });
    });
}
```

(results in segfault in my test)

***Unlike #2141, this issue is virtually unsolvable***, i.e. even the auto trait approach with the `feature="nightly"` enabled cannot catch this at all. There’s no property in the callback to `allow_threads` that can catch this. Really, the callback doesn’t even capture _anything at all_:

```rs
use pyo3::prelude::*;
use pyo3::types::PyString;
use scoped_tls::scoped_thread_local;

scoped_thread_local!(static WRAPPED: PyString);

fn callback() {
    WRAPPED.with(|smuggled: &PyString| {
        println!("{:?}", smuggled);
    });
}

fn main() {
    Python::with_gil(|py| {
        let string = PyString::new(py, "foo");

        WRAPPED.set(string, || {
            py.allow_threads(callback); // callback is an ordinary `fn() -> ()` item.
        });
    });
}
```

Again, there’s a “whose fault” question to be asked, whether the fact that *the standard library*’s thread-locals require `T: 'static` means there’s any guarantees that non-`'static` data isn’t allowed to be “smuggled” through thread-local storage. In my view, if you look at what kind of things it offers, `scoped-tls` seems _even less_ unreasonable to be called “sound”, compared to `sync_wrapper`. Yet the consequences for `Ungil` are more detrimental.

memoryleak47 · October 15, 2024, 10:16am

That's fair! I think this might become more useful once const heap allocations are a thing, as you could then hopefully create interior mutability versions of Vec or even BTreeMap with them (similar to the example above with [i32; 64]).

But one advantage already is that you can "update" large plain-old data objects without needing to copy the whole thing (there is no Cell::update_in_place(impl FnOnce(&mut T))).
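For comparison, here is roughly what a single-element update looks like with a plain `Cell` today, using only `Cell::get` and `Cell::set`. The helper name `set_via_copy` is made up for this illustration.

```rust
use std::cell::Cell;

/// Sets one element of an array stored in a `Cell` by copying the whole
/// array out, patching the copy, and writing it back.
fn set_via_copy(cell: &Cell<[i32; 64]>, i: usize, v: i32) {
    let mut tmp = cell.get(); // copies all 64 elements out of the cell
    tmp[i] = v;
    cell.set(tmp); // copies all 64 elements back in
}
```

The optimizer may well remove the copies in a small case like this, but the API itself only offers whole-value reads and writes, whereas `access` above hands out a `&mut` to the data in place.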

RalfJung · October 16, 2024, 5:52am

Wow that's cute, and scary... I didn't think you could use const fn "staging" as a soundness mechanism.
