Vec is asymmetric in its memory handling

This thread is hosting the discussion split from the following section of a reply from @tczajka in What is the purpose of std? - #9 by tczajka


1 Like

It depends a lot on whether your constrained resource is memory or CPU time. I will say, though, that I don’t know of any stdlib, in any language, where shrinking the default growable contiguous storage type gives up capacity, either on every call or amortized. That doesn’t mean your idea is a bad one, just that no one else has decided to make it the default, either.

It does help that you can build remove-and-shrink on top of remove, but not the other way around.
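For what it's worth, a remove-and-shrink helper is easy to sketch on top of the existing API (the 1/4 threshold and the slack factor here are arbitrary illustrations, not anything std prescribes):

```rust
/// Remove the element at `index`, then give capacity back once the Vec has
/// become mostly empty. The 1/4 threshold is an arbitrary choice for
/// illustration; `shrink_to` may keep more capacity than requested.
fn remove_and_shrink<T>(v: &mut Vec<T>, index: usize) -> T {
    let item = v.remove(index);
    if v.len() < v.capacity() / 4 {
        v.shrink_to(v.len() * 2); // keep some slack to avoid realloc churn
    }
    item
}
```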

5 Likes

That's the behaviour that makes sense in 99% of cases.

You certainly don't want your vecs to unpredictably cause two reallocs where zero would suffice because you've popped one element too many before pushing again.

13 Likes

The behaviour of not shrinking automatically makes sense most of the time, e.g. when using a Vec as a scratch buffer that you grow and shrink a lot. You don't want your stack of pending elements in a depth-first search to shrink mid-algorithm, for example.
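For instance (a toy sketch, not code from this thread), an iterative DFS uses a Vec as its stack; the stack's length oscillates constantly, and shrinking on every pop would move the allocation right in the hot loop:

```rust
// Toy iterative DFS where `graph[v]` is the adjacency list of vertex `v`.
// The `stack` Vec shrinks and regrows constantly, so keeping its capacity
// between pops is exactly what you want here.
fn dfs(graph: &[Vec<usize>], start: usize) -> Vec<bool> {
    let mut visited = vec![false; graph.len()];
    let mut stack = vec![start];
    while let Some(v) = stack.pop() {
        if visited[v] {
            continue;
        }
        visited[v] = true;
        for &next in &graph[v] {
            if !visited[next] {
                stack.push(next);
            }
        }
    }
    visited
}
```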

I think that the "grow the list really big once, then shrink it and keep it shrunk for a long time" use case is really rare. For those cases you can manually shrink the Vec. Arguably you should have used a memory arena (that you reuse) and copied the final, shrunk Vec out of it once you know the actual size.

7 Likes

This seems like a micro-optimization. I can see use cases for non-releasing buffers, but I just don't think it should be the default because it's not the conservative choice. We're talking about constant-time (amortized) overhead for shrinking vs possibly quadratic (or worse) memory overhead.

Example: You have n vectors, and you pass n elements through all of them. You might end up with O(n^2) memory allocated even though you only ever have n elements.

2 Likes

I'd say quite the opposite: trying to conserve memory for 1% of use cases is the micro-optimization.

If you only ever have n elements, you should just into_iter() + collect().
If you have an unrestricted stream of elements, you won't ever grow the first n-1 vectors to n elements.

5 Likes

No no, worst-case guarantees are super important. You can't call an asymptotic worst-case improvement a "micro-optimization".

I think you may have misunderstood my example. You have n vectors, and n elements. You move those elements in some unpredictable fashion between the vectors. Each vector may grow to n elements at some point in the process. The sum of lengths is always bounded by n.
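Something along these lines shows the effect (a sketch with a deliberately simple move order; the exact numbers depend on the allocator and growth factor):

```rust
// n elements migrate through n Vecs in turn. Each Vec's capacity grows to
// about n and, by default, never comes back down.
fn main() {
    let n = 1000;
    let mut vecs: Vec<Vec<u64>> = vec![Vec::new(); n];
    vecs[0].extend(0..n as u64);
    for i in 1..n {
        // Move everything from the previous Vec into this one.
        let moved: Vec<u64> = vecs[i - 1].drain(..).collect();
        vecs[i].extend(moved);
    }
    let total_len: usize = vecs.iter().map(|v| v.len()).sum();
    let total_cap: usize = vecs.iter().map(|v| v.capacity()).sum();
    println!("total length   = {total_len}"); // n
    println!("total capacity = {total_cap}"); // on the order of n * n
}
```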

To me this issue makes Vec a much lower-level tool than what I'd expect from a general-purpose collection type. You basically have to worry about memory management manually to some extent whenever you're removing elements, or accept what amounts to memory leaks.

3 Likes

Shrinking and growing may both need to move the allocation, which means the behavior you want can inflict quadratic time overhead on some algorithms (for instance, imagine a bag-of-pages allocator and a stack whose size oscillates across a size bucket boundary).

11 Likes

Shrinking can be done in amortized constant time per operation (same as for growing), as long as the number of inserts/removals between any two capacity changes is proportional to the length. For example, doubling capacity when full and halving capacity when length < capacity/3 works fine. You’re right that in the scenario @tczajka is talking about one may spend O(n^2) time on shrinking. But even without shrinking, getting each of n vectors to O(n) capacity requires O(n^2) steps of inserting/removing elements, so the time complexity wouldn’t get worse from shrinking.
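A pop that follows such a policy could look like this (constants as described above; a sketch of the policy, not how Vec behaves, and `shrink_to` is free to keep more capacity than requested):

```rust
/// Pop with automatic shrinking: halve the capacity once the length drops
/// below a third of it. The gap between growing when full and shrinking
/// only below one third is what keeps the cost amortized O(1) per call.
fn shrinking_pop<T>(v: &mut Vec<T>) -> Option<T> {
    let item = v.pop();
    if v.capacity() > 4 && v.len() < v.capacity() / 3 {
        v.shrink_to(v.capacity() / 2);
    }
    item
}
```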

However, I agree that the current default behavior is fine: in many common use cases shrinking is unnecessary and costs performance, while the use cases that benefit from shrinking can still easily build it on top of what Vec provides. It’s possible to get bad asymptotic complexity by using the methods that exist, but ultimately that’s the responsibility of the programmer. The documentation clearly states that Vec never shrinks its capacity automatically, just like it says that remove takes O(len) time in the worst case.

4 Likes

Python list will give up capacity.

I'm not suggesting one should release capacity every time the length decreases. That would be a bad idea indeed. Release only when the length decreases sufficiently, e.g. double the capacity when len would exceed capacity, halve it when len < capacity / 4. This would ensure good amortized behavior for any sequence of calls and any allocator.

6 Likes

TIL, thank you!

I find it quite common to use a vec as a scratchpad that gets cleared and then refilled with a similar amount of items in a loop. Far more common than removing just some items.
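The pattern being described is roughly this (a sketch; after the first few iterations the buffer has reached its working-set size and no further allocation happens):

```rust
fn main() {
    let mut scratch: Vec<u32> = Vec::new();
    for batch in 0..100u32 {
        scratch.clear(); // length -> 0, capacity is kept
        scratch.extend((0..1_000u32).map(|i| i * batch));
        // ... use `scratch` for this iteration ...
    }
    println!("final capacity: {}", scratch.capacity());
}
```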

9 Likes

Sure. Optimally you'd have two types: one type for such optimized scratchpads with manually controlled capacity, and one collection type that automatically manages capacity.

1 Like

As someone who has been bitten badly by this, I don't agree with you that it's a bad default. As others have said, I think it's far more common to want the extra capacity to be there. It's somewhat rarer to need to care about excess capacity from a memory usage perspective. The aho-corasick example I linked is, I think, the only time this has been relevant to me (that I know about) in ten years of using Rust. In contrast, I've relied on excess capacity hanging around for "scratch buffer" use cases countless times.

The suggestion to "just have two different types" likely underestimates the effects that such designs have on decision paralysis, especially when it's as subtle as small tweaks to the capacity amortization trade-off. So I disagree that having two widespread types is "optimal."

16 Likes

Can you provide a concrete example of this problem?

My example is from when I was working on a big integer library that used Vec internally and had to be careful to shrink the vectors when the numbers got smaller. I ended up implementing a wrapper Vec-like type.

I would think that in most use cases the memory allocation strategy isn't that important. malloc is not that slow. So you have to count many of these as a tie.

It's interesting that for HashMap the opposite choice was made. It uses a slower hasher by default, giving up some performance to avoid very troublesome edge cases where the complexity might blow up. Similarly, rand uses a good generator by default, with slower performance, to avoid troublesome cases.

Perhaps this could be solved similar to HashMap, where Vec is parametrized with some "growth strategy" parameter that you can tweak according to your use case.
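Purely hypothetically (nothing like this exists in std), the shape could mirror HashMap<K, V, S>: a strategy type parameter with a default that matches today's behavior:

```rust
// Hypothetical sketch only; Vec has no such parameter.
trait GrowthStrategy {
    /// Capacity to reserve when pushing into a full buffer.
    fn grow(len: usize, capacity: usize) -> usize;
    /// New capacity after a removal, or `None` to leave the buffer alone.
    fn shrink(len: usize, capacity: usize) -> Option<usize>;
}

/// Today's behavior: grow geometrically, never shrink automatically.
struct AmortizedGrowOnly;

impl GrowthStrategy for AmortizedGrowOnly {
    fn grow(_len: usize, capacity: usize) -> usize {
        (capacity * 2).max(4)
    }
    fn shrink(_len: usize, _capacity: usize) -> Option<usize> {
        None
    }
}

// A Vec-like type could then be spelled `MyVec<T, S = AmortizedGrowOnly>`.
```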

2 Likes

You would think so, but it's not true in practice. I've done this optimization so many times I've lost count. It's so common that it makes its way into API design. (e.g., "Provide a &mut borrow to this thing so that it can be reused.") So no, I absolutely do not count these as a tie.
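The API shape being alluded to is roughly this (a generic sketch, not any particular crate's signature):

```rust
// Returning a fresh Vec allocates on every call ...
fn matches_alloc(haystack: &str, needle: char) -> Vec<usize> {
    haystack
        .char_indices()
        .filter(|&(_, c)| c == needle)
        .map(|(i, _)| i)
        .collect()
}

// ... whereas taking a &mut buffer lets the caller reuse its capacity.
fn matches_into(haystack: &str, needle: char, out: &mut Vec<usize>) {
    out.clear();
    out.extend(
        haystack
            .char_indices()
            .filter(|&(_, c)| c == needle)
            .map(|(i, _)| i),
    );
}
```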

18 Likes

Yes, I personally think this was the wrong choice. For my use cases I don't need DoS resistance at all, and I have seen significant speedups from switching hash functions a couple of times. Rustc itself uses a different hasher, for example.

4 Likes

I have a third opinion; you could call it the middle ground.

I think the default hasher doesn't need to be cryptographically secure, but it should be a randomized universal hash function. Those can be very simple and fast, much faster than SipHash, and they still prevent DoS attacks in many scenarios (depending on the exact attack). Importantly, they come with mathematical upper bounds on the expected number of collisions, independent of the input, as opposed to the arbitrary hash functions C++ uses, which can easily cause lots of collisions even without an attacker trying.
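For concreteness, multiply-shift hashing is one classic member of such a family (a sketch, not what std uses): pick a random odd multiplier per map instance, and the probability that two distinct keys land in the same one of 2^b buckets is provably small over that random choice, whatever the keys are:

```rust
// Multiply-shift hashing of u64 keys into 2^b buckets (assumes 1 <= b <= 63).
// `a` is a random odd multiplier chosen once per map; the collision bound
// holds over that random choice, independent of the input keys.
struct MultiplyShift {
    a: u64, // random odd number
    b: u32, // number of bucket bits
}

impl MultiplyShift {
    fn bucket(&self, key: u64) -> usize {
        (self.a.wrapping_mul(key) >> (64 - self.b)) as usize
    }
}
```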

Isn't that what SipHash is? It's essentially a randomized function, randomized by its key.