Pre-RFC: .collect::<Box<[T]>>()

I’d like a way to collect an iterator of type I: std::iter::ExactSizeIterator<Item = T> into a std::boxed::Box<[T]>, because it’s hard to create dynamically sized Box<[T]> arrays right now. Maybe we could add that once specialization is stable?

Isn’t Box<[T]> what Vec<T> is effectively?

Box<[T]> has no capacity. A Box<[T]> is to a Vec what a dynamic array is to a std::vector in C++. It’s lighter but not designed to grow.
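
To put a rough number on "lighter", here is a small sketch that compares the size of the two handles themselves (not the heap allocation). The helper name handle_sizes is made up for this post. A Box<[T]> is a pointer plus a length; a Vec<T> additionally carries a capacity, although its exact layout is not something std promises.

```rust
use std::mem::size_of;

/// Returns (size of a Box<[T]> handle, size of a Vec<T> handle) in bytes.
/// Only the handle is measured, not the heap buffer it points to.
fn handle_sizes<T>() -> (usize, usize) {
    (size_of::<Box<[T]>>(), size_of::<Vec<T>>())
}
```

On a typical 64-bit target I would expect this to show a one-word difference, which is the capacity field being discussed.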

Hm, I think you can’t implement this more efficiently than .collect::<Vec<T>>().into_boxed_slice()? Given that this seems to be a relatively rare use-case, calling into_boxed_slice yourself does not seem too bad?

I don’t know. The Vec capacity gets optimized away often, but does it always?

Does it matter or are you engaging in "Premature Optimization"? Have you benchmarked it? I'm just curious because I have a hard time imagining how the extra usize for the capacity on the stack is any meaningful burden on performance even if it isn't always optimized away.

No, I have to admit I didn’t benchmark it.

I just often see structures with a capacity used where a Box<[T]> would be enough, and I think that adding this as a standard library feature might make people think more carefully about whether they actually need the capacity. I write a lot of toy code, so it might differ a bit from your code, but in my code, half of my data structures don’t actually need a capacity.

Hmmm....Interesting. Need to ponder that a bit.

One technical difficulty is that impl FromIterator<T> for Box<[T]> can't be restricted to work with only ExactSizeIterator iterators. So code like string.chars().collect::<Box<[char]>>() would be allowed at compile time. Should it panic at run-time, or should it collect into a Vec and then call into_boxed_slice?

(For iterators that do return an exact size, it's probably reasonable to panic at run-time if the size is wrong.)
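
To illustrate the difficulty, here is a minimal sketch using a hypothetical newtype BoxedSlice<T> as a stand-in for Box<[T]> (so it doesn't depend on what std itself provides). The FromIterator signature is fixed by the trait, so the impl has to accept every iterator, and this version takes the "collect into a Vec first" option:

```rust
use std::iter::FromIterator;

/// Hypothetical newtype standing in for Box<[T]>; not a std type.
struct BoxedSlice<T>(Box<[T]>);

impl<T> FromIterator<T> for BoxedSlice<T> {
    // The trait dictates this signature: any IntoIterator<Item = T> must be
    // accepted, so an extra ExactSizeIterator bound cannot be added here.
    fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
        // Fallback strategy: collect into a Vec, then convert it.
        BoxedSlice(iter.into_iter().collect::<Vec<T>>().into_boxed_slice())
    }
}
```

With this, something like "hello".chars().collect::<BoxedSlice<char>>() compiles even though Chars is not an ExactSizeIterator, which is exactly the case in question.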

That’s why I would wait for specialization (https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md) before adding this feature. Or did I get something wrong and specialization doesn’t solve the problem you pointed out?

[EDIT] I read the RFC now and I see what you mean. What prevents us from putting a std::compile_error! into the general impl?

[YET ANOTHER EDIT] Oops, found out what prevents us: https://play.rust-lang.org/?gist=23dad0385a3bf847335be9036d6948d7&version=stable&mode=debug&edition=2015 But just panicking is also fine. Or making this a separate function called into_boxed_slice instead of collect::<Box<[T]>>.
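
A rough sketch of the "separate function" variant, as an extension trait that only exists for ExactSizeIterator. The trait name IntoBoxedSlice is invented for this post, and this version still goes through a Vec internally; it only shows how the bound and the run-time length check could look:

```rust
/// Hypothetical extension trait; not part of std.
trait IntoBoxedSlice: ExactSizeIterator + Sized {
    fn into_boxed_slice(self) -> Box<[Self::Item]> {
        // Trust the reported length for the single allocation.
        let expected = self.len();
        let mut v = Vec::with_capacity(expected);
        v.extend(self);
        // Panic at run time if the iterator lied about its length.
        assert_eq!(v.len(), expected, "ExactSizeIterator reported a wrong length");
        v.into_boxed_slice()
    }
}

// Blanket impl: every ExactSizeIterator gets the method.
impl<I: ExactSizeIterator> IntoBoxedSlice for I {}
```

With the trait in scope, (0u32..3).map(|x| x * 2).into_boxed_slice() works, while "abc".chars().into_boxed_slice() is rejected at compile time because Chars does not implement ExactSizeIterator.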

Box<[T]> could implement FromIterator and an optimization specialization (which we already allow in std) could implement it in a way that allocates the exact amount and doesn’t go through Vec. I don’t know of anything stopping this except that no one has made the PR. :slight_smile:

Wonderful, so I’ll try to make the PR. I’m a newbie, so let’s see if I can do it.

I don’t think specialization is needed? Just use size_hint for the initial Vec::with_capacity.
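
Roughly what I have in mind, as a sketch (the function name collect_with_hint is made up):

```rust
/// Collects any iterator into a boxed slice, using the lower bound of
/// size_hint for the initial allocation.
fn collect_with_hint<I: Iterator>(iter: I) -> Box<[I::Item]> {
    let (lower, _) = iter.size_hint();
    let mut v = Vec::with_capacity(lower);
    // If the hint was exact, this never reallocates.
    v.extend(iter);
    // If len == capacity, no shrinking reallocation is needed here.
    v.into_boxed_slice()
}
```

This works for every iterator; a correct hint just makes it cheaper.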

But for writing collect::<Box<[T]>> without a temporary Vec, it is. And if a temporary Vec is needed, I want to make that explicit so that you have to call collect::<Vec<_>>().into_boxed_slice().

How do you create a Box<[T]> without a temporary Vec?

That isn't possible (or desirable in my opinion). But you can use specialization to guarantee the temporary is not created (and there are never any reallocs, etc) with ExactSizeIterator.

With unsafe magic and a safe API. That’s something you wouldn’t do in production code but in the standard library.

Ok, I see.

Why do you only want this for ExactSizeIterator? Couldn’t it just as well work, via Vec, for things that aren’t that? Vec already uses the hint; if it’s correct already it doesn’t matter whether it was labelled ESI or not.

Also, the only thing this would theoretically be avoiding compared to .collect::<Vec<_>>().into_boxed_slice() is the length in the Vec, but it can’t actually avoid that: Iterator::next() can panic (yes, even for ESI) so you’d need to track the number of items you’ve stored so that you can destruct them correctly on panic anyway.
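
A small sketch of the panic case, with made-up names (Noisy, Flaky, drops_after_panic). The iterator claims an exact size of 5 but panics on its third call to next(); the two items already stored still have to be dropped, and it is the Vec's running length that makes this possible:

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many Noisy values have been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy(u8);

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Claims to yield exactly 5 items, but panics on the third call to next().
struct Flaky {
    yielded: usize,
}

impl Iterator for Flaky {
    type Item = Noisy;

    fn next(&mut self) -> Option<Noisy> {
        if self.yielded == 2 {
            panic!("next() failed midway");
        }
        self.yielded += 1;
        Some(Noisy(self.yielded as u8))
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let left = 5 - self.yielded;
        (left, Some(left))
    }
}

impl ExactSizeIterator for Flaky {}

/// Runs the collection, catches the panic, and reports how many of the
/// already-stored items were dropped during unwinding.
fn drops_after_panic() -> usize {
    DROPS.store(0, Ordering::SeqCst);
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        Flaky { yielded: 0 }.collect::<Vec<_>>().into_boxed_slice()
    }));
    assert!(result.is_err());
    DROPS.load(Ordering::SeqCst)
}
```

Any direct-to-Box implementation would need the same bookkeeping (a count of initialized elements plus a drop guard) to avoid leaking or double-dropping in this situation.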