Pre-RFC: .collect::<Box<[T]>>()

TimonPasslick · October 5, 2018, 2:02pm

I’d like to have a way to collect an iterator of type I: std::iter::ExactSizeIterator<Item = T> into a std::boxed::Box<[T]>, because it’s hard to create dynamically sized Box<[T]> arrays right now. Maybe we could add that once specialization is stable?

gbutler · October 5, 2018, 2:05pm

Isn’t Box<[T]> what Vec<T> is effectively?

TimonPasslick · October 5, 2018, 2:11pm

Box<[T]> has no capacity. A Box<[T]> is to a Vec what a dynamic array is to a std::vector in C++. It’s lighter but not designed to grow.
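To make the "no capacity" point concrete, here is a minimal sketch that measures both handles in machine words. The helper names `boxed_slice_words` and `vec_words` are made up for illustration; only `std::mem::size_of` is a real std function.

```rust
use std::mem::size_of;

/// Size of the `Box<[T]>` handle in machine words (pointer + length).
/// Hypothetical helper, named only for this sketch.
pub fn boxed_slice_words<T>() -> usize {
    size_of::<Box<[T]>>() / size_of::<usize>()
}

/// Size of the `Vec<T>` handle in machine words (pointer + capacity + length).
/// Hypothetical helper, named only for this sketch.
pub fn vec_words<T>() -> usize {
    size_of::<Vec<T>>() / size_of::<usize>()
}
```

Calling `boxed_slice_words::<u32>()` should give 2 and `vec_words::<u32>()` should give 3, so the difference is a single `usize` in the handle, not in the heap allocation.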

matklad · October 5, 2018, 2:14pm

Hm, I think you can’t implement this more efficiently than .collect::<Vec<T>>().into_boxed_slice()? Given that this seems to be a relatively rare use-case, calling into_boxed_slice yourself does not seem too bad?
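For readers who have not used it, the workaround mentioned here looks roughly like the sketch below. The wrapper name `to_boxed_slice` is an assumption for illustration; `collect` and `Vec::into_boxed_slice` are the real std calls.

```rust
/// Collects any iterator into a boxed slice by going through a `Vec`.
/// `to_boxed_slice` is a hypothetical name used only in this sketch.
pub fn to_boxed_slice<I: Iterator>(iter: I) -> Box<[I::Item]> {
    // `into_boxed_slice` drops any excess capacity and keeps the elements.
    iter.collect::<Vec<_>>().into_boxed_slice()
}
```

For example, `to_boxed_slice((0..4u32).map(|x| x * x))` yields a `Box<[u32]>` holding `0, 1, 4, 9`.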

TimonPasslick · October 5, 2018, 2:17pm

I don’t know. The Vec capacity gets optimized away often, but does it always?

gbutler · October 5, 2018, 2:20pm

Does it matter or are you engaging in "Premature Optimization"? Have you benchmarked it? I'm just curious because I have a hard time imagining how the extra usize for the capacity on the stack is any meaningful burden on performance even if it isn't always optimized away.

TimonPasslick · October 5, 2018, 2:21pm

No, I have to admit I didn’t benchmark it.

TimonPasslick · October 5, 2018, 2:24pm

I just often see structures with a capacity used where a Box<[T]> would be enough, and I think that adding this as a standard library feature might make people think more carefully about whether their capacities are actually needed. I write a lot of toy code, so it might differ a bit from your code, but in my code, half of my data structures don’t actually need a capacity.

gbutler · October 5, 2018, 2:27pm

Hmmm....Interesting. Need to ponder that a bit.

mbrubeck · October 5, 2018, 4:39pm

One technical difficulty is that impl FromIterator<T> for Box<[T]> can't be restricted to work with only ExactSizeIterator iterators. So code like string.chars().collect::<Box<[char]>>() would be allowed at compile time. Should it panic at run-time, or should it collect into a Vec and then call into_boxed_slice?

(For iterators that do return an exact size, it's probably reasonable to panic at run-time if the size is wrong.)
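For what it is worth, std as it stands today does provide `impl FromIterator<T> for Box<[T]>`, and it accepts any iterator, not only `ExactSizeIterator`. As far as I can tell it takes the second option from the question above (collect into a `Vec`, then convert) rather than panicking. A minimal sketch, where the function name `boxed_chars` is made up for illustration:

```rust
/// `Chars` is not an `ExactSizeIterator` (the number of chars in a UTF-8
/// string is not known up front), yet collecting into `Box<[char]>` compiles.
/// `boxed_chars` is a hypothetical name used only in this sketch.
pub fn boxed_chars(s: &str) -> Box<[char]> {
    s.chars().collect::<Box<[char]>>()
}
```

`boxed_chars("héllo")` gives a five-element slice even though the string is six bytes long, which is exactly the case where the size is not known before iterating.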

TimonPasslick · October 5, 2018, 4:49pm

That’s why I would wait for specialization (https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md) before adding this feature. Or did I get something wrong and specialization doesn’t solve the problem you pointed out?

[EDIT] I read the RFC now and I see what you mean. What prevents us from putting a std::compile_error into the general impl?

[YET ANOTHER EDIT] Oops, found out what prevents us: https://play.rust-lang.org/?gist=23dad0385a3bf847335be9036d6948d7&version=stable&mode=debug&edition=2015 But just panicking is also fine. Or making this a separate function called into_boxed_slice instead of collect::<Box<[T]>>.

withoutboats · October 5, 2018, 5:05pm

Box<[T]> could implement FromIterator, and an optimization specialization (which we already allow in std) could implement it in a way that allocs the exact amount and doesn’t go through Vec. I don’t know of anything stopping this except that no one has made the PR.

TimonPasslick · October 5, 2018, 5:06pm

Wonderful, so I’ll try to make the PR. I’m a newbie, so let’s see if I can do it.

jethrogb · October 5, 2018, 5:08pm

I don’t think specialization is needed? Just use size_hint for the initial Vec::with_capacity.
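A rough sketch of the `size_hint` approach described here, with no specialization involved. The function name `collect_with_hint` is an assumption for illustration; `Iterator::size_hint`, `Vec::with_capacity`, `Extend::extend` and `Vec::into_boxed_slice` are the real std calls.

```rust
/// Collects into `Box<[T]>` using the iterator's lower size bound as the
/// initial capacity. `collect_with_hint` is a hypothetical name.
pub fn collect_with_hint<I: Iterator>(iter: I) -> Box<[I::Item]> {
    // The lower bound is only a hint, so the Vec must still be able to grow.
    let (lower, _upper) = iter.size_hint();
    let mut buf = Vec::with_capacity(lower);
    buf.extend(iter);
    // If the hint was exact, capacity already matches the length and this
    // conversion has nothing to shrink.
    buf.into_boxed_slice()
}
```

This works the same for `collect_with_hint(1..=3)`, where the hint is exact, and for a filtered iterator, where the lower bound is usually 0 and the `Vec` grows as needed.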

TimonPasslick · October 5, 2018, 5:12pm

But for writing collect::<Box<[T]>>() without a temporary Vec, it is. And if a temporary Vec is needed, I want to make that explicit, so that you have to call collect::<Vec<_>>().into_boxed_slice().

jethrogb · October 5, 2018, 5:13pm

How do you create a Box<[T]> without a temporary Vec?

withoutboats · October 5, 2018, 5:14pm

That isn't possible (or desirable in my opinion). But you can use specialization to guarantee the temporary is not created (and there are never any reallocs, etc) with ExactSizeIterator.

TimonPasslick · October 5, 2018, 5:14pm

With unsafe magic and a safe API. That’s something you wouldn’t do in production code, but would do in the standard library.

TimonPasslick · October 5, 2018, 5:21pm

Ok, I see.

scottmcm · October 8, 2018, 4:17am

Why do you only want this for ExactSizeIterator? Couldn’t it just as well work, via Vec, for things that aren’t that? Vec already uses the hint; if it’s correct already it doesn’t matter whether it was labelled ESI or not.

Also, the only thing this would theoretically be avoiding compared to .collect::<Vec<_>>().into_boxed_slice() is the length in the Vec, but it can’t actually avoid that: Iterator::next() can panic (yes, even for ESI) so you’d need to track the number of items you’ve stored so that you can destruct them correctly on panic anyway.
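The panic point can be observed from safe code. The sketch below collects an `ExactSizeIterator` whose closure panics partway through and counts how many already-produced elements get dropped during unwinding. `Counted` and `drops_after_panic` are hypothetical names for illustration; the cleanup itself is done by `Vec`, which can only do it because it tracks how many elements it has stored so far.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Element that bumps a shared counter when it is dropped.
pub struct Counted<'a>(&'a AtomicUsize);

impl Drop for Counted<'_> {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Collects `total` elements, panicking while producing index `fail_at`
/// (if `fail_at < total`). Returns how many elements were dropped once the
/// panic, or the finished boxed slice, has been cleaned up.
pub fn drops_after_panic(total: usize, fail_at: usize) -> usize {
    let drops = AtomicUsize::new(0);
    let result = catch_unwind(AssertUnwindSafe(|| {
        (0..total)
            .map(|i| {
                if i == fail_at {
                    panic!("iterator failed at item {}", i);
                }
                Counted(&drops)
            })
            .collect::<Vec<_>>()
            .into_boxed_slice()
    }));
    // On success this drops the whole boxed slice; on panic the partially
    // filled Vec was already dropped during unwinding.
    drop(result);
    drops.load(Ordering::SeqCst)
}
```

With `drops_after_panic(5, 3)` the three elements stored before the panic are each dropped once, which is only possible if something remembers that exactly three slots were initialized.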
