[Pre-RFC] Fixed-capacity view of Vec


#41

An interesting consequence of a fixed-capacity view of a vector, is that, because the vector cannot grow through the view, inserting element through it never invalidates references to elements in the vector. That is, one could iterate over the elements of such a view, while inserting elements through it, something that cannot be done with just Vec.


#42

This could also be expanded to a fixed-capacity view of a String.

Additionally, there could be two variants: An owned and a borrowed one: FixedVec/String and FixedVec/StringView. The owned variant is constructed with a fixed capacity while the borrowed one is a view into another non-fixed one.


#43

I wrote a similar abstraction in the buffer crate. It allows you to pass uninitialized buffers of bytes around. This allows things like

let mut vec = Vec::with_capacity(1024);
if reader.read_buffer(&mut vec)?.len() != 0 {
    if vec[0] == 0 {
        // ...
    }
}

in safe Rust. It actually does have one reverse dependency now. :open_mouth: But I use it extensively in crates not published on crates.io, it makes for a very good abstraction for 1) networking, 2) compression/decompression, 3) building protocol packets, all with no or minimal allocations. (You can use an uninitialized ArrayVec as well as a Vec with spare capacity.)

This all works without (or at least with little) monomorphisation cost, because each function taking a B: Buffer can immediately convert it into a BufferRef, so the function won’t be instantiated for each possible buffer type. The conversion makes sure that the underlying buffer gets resized before the function returns.

What do you think about this?


#44

Can’t OP problem be solved by something like VecCapView? It will provide opaque view into capacity space and will allow to push data into it. Something like this:

let mut buf = Vec::with_capacity(12);
buf.extend_from_slice(b"hi");
// `cap_view` is somewhat similar to `split_at`, we also can add `cap_view_mut`,
// which will return `&mut [T]` instead of `&[T]`.
// Under the hood `VecCapView` contains mutable reference to the parent `Vec`
let (data: &[u8], mut cap_view: VecCapView) = buf.cap_view();
// values are written into memory, but not committed yet (i.e. `len` field
// is left unchanged, so if we'll forget the view, vector will not change),
// each method checks if there is enough capacity is left in the view
// returning an error otherwise
cap_view.push(b':')?;
cap_view.extend([b'f', b'o', b'o', b'='].iter().cloned())?;
cap_view.extend_from_slice(data)?;
// now it's time to "commit" data, this method consumes the view
// and simply bumps `len` field of the parent vector
cap_view.commit();

// assuming `data` is out of scope
assert_eq!(&buf[..], b"hi:foo=hi")

#45

I like the simplicity of the VecCapView idea. Why not automatically commit the length on dropping the view though?


#46

Buffer crate looks interesting. I don’t think you can actually read bytes from it without allocating another buffer and copying there, though? So you can’t implement RLE without copying?

Also, I have some safety concerns about BufferRef::new() - shouldn’t it be unsafe fn if setting the initialized parameter lets you view uninitialized memory? I’d split it in two constructors - make the current new() an unsafe fn and provide another safe one that always sets initialized to 0.

There’s also bytes crate that alread does all we need with BytesMut.split_to(), but has extra overhead due to atomics and reference counting.


#47

Yes, auto-commits are viable alternative, but I thought it could be useful to leave vector unchanged until we are sure that there is enough capacity for our data. I guess it’s up to discussion.


#48

This is basically what the buffer crate does, except that it autocommits on drop and supports more than just Vec.


#49

I’m not sure we’re talking about the same thing. Buffer allows you to pass an uninitialized buffer to a function, some space where it can write to. It’s not a read-write-view of data, it always assumes its data is uninitalized, so you can’t read from it.

It’s not an unsafe function — if you pass it a valid &mut [u8] all the bytes are already initialized.

If you don’t, that’s also fine, but you’re already in unsafe code by providing the slice.


#50

I suppose adding an unsafe constructor taking *mut [u8] might be good.


#51

Okay, this is very interesting! I’d very much like to see one of the motivating examples rewritten to use the buffer crate to gauge safety and performance of this solution. But I already have two Rust projects in active development stage, so I’m not sure when I’ll be able to get around to it, if at all.