Having Rc<T> point into the current process's address space is exactly what I want. The object of type T would be stored on the heap, but the Rcs pointing to it would be stored on disk (in the wide vec). This can be useful where you have a couple of frequently accessed objects stored on the heap and a bunch of small objects (stored on disk) that want to keep references to these objects.
I’m not suggesting re-using Vec as-is (that wouldn’t be possible), just implementing a vector-like container. (Also, you’d have to be pretty crazy to forget 2^32 Rcs…)
Is it acceptable to just treat storing things in memmapped space as unsafe?
It certainly seems fraught with danger. This would allow Rc to not worry about it.
So the new leak rule would be: every leaked resource must consume at least one address? What about the leak amplification you noted?
Anyway, this wouldn’t spell the end for safe memmapped storage; it just means that all objects stored in a memmapped file must be addressable. Unfortunately, this also means that memmapped files can never be unmapped, because they might contain leaked resources (which must remain addressable).
In debug code, it should be a no-brainer – as was noted before, using up 31 bits of refcount is bad news, so we might as well give the coder a heads-up. In release / bench code, I'll concede that just leaking is probably good enough.
This hinges on the idea of somehow keeping LLVM from munging all our increments together. Has this already been implemented?
@Gankra I’m trying to decide how fundamental this problem is – this “amplification effect” that you achieved with drain, is that specific to the drain API or do you think you could achieve it with other (stable) vec APIs?
As far as I recall, it’s currently fairly unique to Vec::drain in that it:

- moves data out of the collection without consuming the collection
- via a proxy value
- and the leak doesn’t itself leak the collection’s allocation

However there may be other cases where a collection chooses to handle unwinding-safety like this: mark the collection as empty temporarily so that a panic just leaks all the elements.
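That behavior is observable directly in current Rust: forgetting the Drain proxy leaks the drained elements, but because drain truncates the vector up front, the collection and its allocation stay valid. A minimal sketch (the helper name is mine):

```rust
use std::mem;

// Drain a vector and forget the Drain proxy, returning the vector.
fn drain_and_forget() -> Vec<String> {
    let mut v = vec![String::from("a"), String::from("b"), String::from("c")];

    // Drain borrows `v` and moves elements out through a proxy value.
    let drain = v.drain(..);

    // Forgetting the proxy leaks the three Strings, but `drain` already
    // set the vector's length to zero, so `v` remains a valid, empty,
    // usable vector: the leak does not leak the allocation.
    mem::forget(drain);
    v
}

fn main() {
    let mut v = drain_and_forget();
    assert!(v.is_empty());
    // The allocation itself was not leaked: the vector is still usable.
    v.push(String::from("still usable"));
    assert_eq!(v.len(), 1);
}
```

Note the amplification: forgetting one small proxy value leaks an arbitrary number of elements, which is exactly the effect under discussion.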
Wrong, Vec::drain dates back to at least RFC 509 in December 2014 (I think the idea is even older). The implementation has changed to not be unsafe when the drain iterator is leaked, but the API didn’t change. The drain iterator’s destructor would free the remaining values in the vector, so even if leaking was unsafe, leaking the drain iterator would also leak the remaining values in the vector; only it wouldn’t free the objects that were moved out either, so it was completely unsafe in addition to amplifying leaks.
Actually, Vec::drain was designed unaware that leaks were possible in safe code (remember, those were always possible!), and that the API works nevertheless, albeit with a slightly different implementation, is a happy accident discovered during the leakpocalypse. The API dates back to a happier, “leak-free” time.
Specifically, I was trying to understand whether it would be possible to deprecate safe forget and add the condition that it must consume memory. It seems awfully subtle though and I second the general concern that we're forgetting things (no pun intended) that could permit leak amplification in other cases.
So I was thinking about performance costs, particularly for Arc, and I was wondering if this is a good idea:
The idea is not to use a single sentinel value for “wrapped around” but rather an entire range. The motivation is that then we don’t need a compare-and-swap; we can use a plain atomic increment and just check whether the previous value was in the range deemed “too high”. In this particular code, I used an isize and said that the ref count should never go negative (actually, we should never inc the ref when the previous value was already negative). Seems like 2^31 (or 2^63) is high enough, but you could use a usize and set the threshold wherever you like. Anyway, the idea is that even if we have a ton of CPUs executing in a very unfortunate scheduling, at least one of them is going to observe the increment start from an invalid state.
I haven’t thought super hard about this so possibly there is a very obvious flaw. Certainly the performance is very good when I measure it locally (no difference from a straight addition, basically).
I chose to have the code just abort the process. I’m generally against aborting (vs. panicking), but:

- this seems like a distributed invariant across many threads, so panicking one thread isn’t clearly enough
- aborting is cheaper because there’s no need for landing pads (right?)
It is probably possible to adapt the setup to leak the value instead, e.g. by setting some boolean flag to true (“count may have wrapped”).
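A minimal sketch of the scheme described above, assuming my hypothetical `RefCount` type (not the real Arc internals): a plain atomic increment followed by a range check on the previous value, with the whole negative range serving as the danger zone.

```rust
use std::process::abort;
use std::sync::atomic::{AtomicIsize, Ordering};

// Hypothetical reference count using the "danger zone" check.
pub struct RefCount {
    count: AtomicIsize,
}

impl RefCount {
    pub fn new() -> Self {
        RefCount { count: AtomicIsize::new(1) }
    }

    pub fn inc(&self) {
        // Plain atomic increment -- no compare-and-swap loop.
        let prev = self.count.fetch_add(1, Ordering::Relaxed);
        // If the previous value was already negative, some earlier
        // increment pushed the count past 2^63 (or 2^31) live references;
        // abort well before the count could ever wrap back to a small,
        // reusable value.
        if prev < 0 {
            abort();
        }
    }

    /// Returns true when this was the last reference.
    pub fn dec(&self) -> bool {
        self.count.fetch_sub(1, Ordering::Release) == 1
    }
}

fn main() {
    let rc = RefCount::new();
    rc.inc();
    assert!(!rc.dec());
    assert!(rc.dec());
}
```

The point of checking the previous value rather than the new one is that every thread whose increment starts from the invalid range will notice, so at least one observer is guaranteed even under adversarial scheduling.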
To clarify something: the reason to use a range, rather than just checking if you added one to USIZE_MAX, is to eliminate the “window of unsafety”. That is, you will abort well before the counter would ever actually overflow. I think you would need 2^31 processors to do otherwise…? In any case, it seems vanishingly unlikely (though of course I’d love for someone to point out the flaw in my thinking if there is one).
Ah, another case of leak amplification I just recalled: My hypothetical Mox type would leak its allocation if it ever was dropped, but it’s intended to be used with a pool allocator which can easily be cleared to “reclaim” all the memory. Generally local allocators are ripe for introducing massive leak amplification sites with bounded memory consumption.
Here are some benchmarking results for a simple case of spawning a bunch of threads, each of which do increments. (There is more of a slowdown than I saw at first, but it is still relatively small, certainly compared to CAS.)
“cas” is using compare-and-swap, “inc” is just a plain increment, and “check” is the “danger zone” variant I described.
UPDATE: @alexcrichton saw much better results (check and inc virtually identical) on his machine, so I imagine this will vary by machine somewhat.
Given that the situation occurs due to leaking references, leaking the value seems much more intuitive than an abort to me. One way to do this without introducing an extra flag might be to have both inc and dec set the count to the middle of the negative range (-1073741824 for 32-bit) if the previous value was negative. This gives headroom of 2^30, which is plenty.
The real limitation is address space: even if one had 2^31 (or 2^30) processors, there wouldn't be enough address space for them to all have a distinct Arc on which to operate.
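The leak-instead-of-abort variant could be sketched like this (a hypothetical `LeakyCount` type, not real std code): once the count enters the negative range, both inc and dec pin it to the middle of that range, so it can never reach zero again and the value is simply never freed.

```rust
use std::sync::atomic::{AtomicIsize, Ordering};

// Midpoint of the negative range (-1073741824 on 32-bit targets,
// matching the value quoted above).
const LEAKED: isize = isize::MIN / 2;

pub struct LeakyCount {
    count: AtomicIsize,
}

impl LeakyCount {
    pub fn new() -> Self {
        LeakyCount { count: AtomicIsize::new(1) }
    }

    pub fn inc(&self) {
        if self.count.fetch_add(1, Ordering::Relaxed) < 0 {
            // Pin the count deep in the negative range. The 2^62 (or 2^30)
            // headroom on either side of LEAKED means concurrent inc/dec
            // calls cannot move the count across zero before every racing
            // thread has re-pinned it here.
            self.count.store(LEAKED, Ordering::Relaxed);
        }
    }

    /// Returns true when this was the last reference.
    pub fn dec(&self) -> bool {
        let prev = self.count.fetch_sub(1, Ordering::Release);
        if prev < 0 {
            self.count.store(LEAKED, Ordering::Relaxed);
            return false; // leaked: never report "last reference dropped"
        }
        prev == 1
    }
}

fn main() {
    let c = LeakyCount::new();
    c.inc();
    assert!(!c.dec());
    assert!(c.dec());
}
```

The appeal is that no extra flag is needed: the count itself records the “may have wrapped” state.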
On another note, would this problem also apply to RefCell? It seems that one could take a shared borrow, take and forget enough additional ones to overflow the count, and then take a mutable borrow, yielding both a shared and a mutable borrow to the same memory. What about RwLock?
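The first step of that attack can be seen in miniature today: forgetting a Ref guard leaves RefCell’s borrow count permanently elevated, because the guard’s destructor (which decrements the count) never runs. Actually overflowing the count would take on the order of 2^N forgotten guards, but the mechanism is the same:

```rust
use std::cell::RefCell;
use std::mem;

fn main() {
    let cell = RefCell::new(0i32);

    // Take a shared borrow and forget the guard: the borrow count was
    // incremented by borrow() and is never decremented.
    mem::forget(cell.borrow());

    // Further shared borrows still work...
    assert!(cell.try_borrow().is_ok());
    // ...but a mutable borrow now fails forever.
    assert!(cell.try_borrow_mut().is_err());
}
```

To actually reach unsoundness one would still need the count to wrap all the way around to zero, which is the same “must consume at least one address per leak” question as with Rc.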