`Vec::append` but by value

dpc · June 14, 2024, 3:18am

Am I the only one that keeps needing it?

impl<T> Vec<T> {
    pub fn concatenate(mut self, mut other: Vec<T>) -> Self {
        self.append(&mut other);
        self
    }
}

I'm open to naming suggestions, as naming seems to be the hardest problem here.

Is there some fundamental issue with it? If I were to just file a PR against rust-lang, would it be accepted, or does it require an RFC?

Edit:

It seems:

fn main() {
    let a = vec![1u8, 2];
    let b = vec![3u8, 4];
    assert_eq!([a, b].concat(), &[1, 2, 3, 4]);
}

is what I needed.

Jules-Bertholet · June 14, 2024, 3:34am

You want Extend::extend() (see the Vec documentation in std::vec).

dpc · June 14, 2024, 3:36am

It does not seem to take self by value. Am I missing something?

ryanavella · June 14, 2024, 3:41am

Extend::extend takes &mut self because then it can be used in contexts where only &mut self is available. Requiring &mut self is more general than requiring self.

dpc · June 14, 2024, 3:53am

Sure it's "more general", but it's also typically a PITA to use. If I need to pass a single vec to foo that just concats two vectors together:

existing.extend(new);
foo(existing);

while I could:

foo(existing.chained(new));

If existing is a reference it's even worse

let mut both = existing.clone();
both.extend(new);
foo(both);

vs:

foo(existing.clone().chained(new));

Anyway - IMO, the semantics, readability and everything else are much better, and performance is the same.
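For anyone who wants this today without a std change, a minimal sketch is an extension trait. The names `Chained` and `chained` are made up for illustration (they are not in std), and `foo` is only a stand-in for any function that takes a Vec by value:

```rust
// Hypothetical extension trait; `Chained` / `chained` are not part of std.
trait Chained<T> {
    fn chained(self, other: Vec<T>) -> Self;
}

impl<T> Chained<T> for Vec<T> {
    fn chained(mut self, mut other: Vec<T>) -> Self {
        // `append` moves the elements out of `other`; nothing is cloned.
        self.append(&mut other);
        self
    }
}

// Stand-in for the `foo` above: any function taking a Vec by value.
fn foo(v: Vec<String>) -> usize {
    v.len()
}

fn main() {
    let existing = vec!["a".to_string()];
    let new = vec!["b".to_string(), "c".to_string()];
    assert_eq!(foo(existing.chained(new)), 3);
}
```

Since the trait is implemented only for `Vec<T>`, it does not collide with `Iterator::chain`, but the method is only visible where the trait is in scope.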

Neutron3529 · June 14, 2024, 4:55am

Here is the question: why is the Vec<T> needed?

IMHO, it is more common that a function could receive a &(mut) [T] (needs to read (or modify) the slice) or a T (want to drop the value). It is not flexible to accept a Vec<T>, since it is unnecessary.

If foo accepts T directly, you could use existing.into_iter().chain(new.into_iter()).for_each(|t|foo(t)) directly, without needing to allocate extra memory only to drop it very soon.

dpc · June 14, 2024, 5:10am

If a function accepts &[T] and you have two &[T]s that you need to concat first, then it's the same story except with foo(&...).

Taking a single T is completely different, not relevant.

dpc · June 14, 2024, 5:12am

Huh.

I just discovered:

fn main() {
    let a = vec![1u8, 2];
    let b = vec![3u8, 4];
    assert_eq!([a, b].concat(), &[1, 2, 3, 4]);
}

which I guess is exactly what I need. For some reason I thought it worked on strings only, and I see some Concat etc. unstable traits around it now, so I wonder if it changed recently, or did I just miss it for all this time.

mbrubeck · June 15, 2024, 1:00am

Unfortunately, <[Vec<T>]>::concat clones the T elements, so it is not a suitable replacement in cases where T::clone is expensive or unimplemented.
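To make the Clone requirement concrete, here is a small sketch with a type that is deliberately not Clone. `Token` and `concat_by_move` are hypothetical names used only for this example; the helper moves the elements instead of cloning them:

```rust
// A type that deliberately does not implement Clone.
#[derive(Debug, PartialEq)]
struct Token(u32);

// Concatenate by moving the elements out of each inner Vec.
fn concat_by_move<T>(parts: Vec<Vec<T>>) -> Vec<T> {
    parts.into_iter().flatten().collect()
}

fn main() {
    let a = vec![Token(1), Token(2)];
    let b = vec![Token(3)];
    // `[a, b].concat()` would not compile here, because Token is not Clone.
    let all = concat_by_move(vec![a, b]);
    assert_eq!(all, vec![Token(1), Token(2), Token(3)]);
}
```

The trade-off is that `flatten` does not know the total length up front, so the resulting Vec may reallocate as it grows.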

scottmcm · June 15, 2024, 8:46am

One option is to use the free function proposed in "Add `core::iter::chain` method" (Issue #154, rust-lang/libs-team on GitHub):

foo(chain(existing, new).collect());

which is by-value, has a good size hint, and does no cloning.
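If that free function is not available in your toolchain, a local stand-in is only a few lines. This is a sketch, not the signature from the issue: the `chain` helper below is defined locally on top of `Iterator::chain`, and `foo` is again a placeholder:

```rust
use std::iter::Chain;

// Local stand-in for the proposed free function; built on Iterator::chain.
fn chain<A, B>(a: A, b: B) -> Chain<A::IntoIter, B::IntoIter>
where
    A: IntoIterator,
    B: IntoIterator<Item = A::Item>,
{
    a.into_iter().chain(b)
}

// Placeholder for any function taking a Vec by value.
fn foo(v: Vec<String>) -> usize {
    v.len()
}

fn main() {
    let existing = vec!["a".to_string()];
    let new = vec!["b".to_string()];
    // Both vectors are consumed; the Strings are moved, not cloned.
    assert_eq!(foo(chain(existing, new).collect()), 2);
}
```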

the8472 · June 15, 2024, 9:27am

This isn't specific to Vec. What you want boils down to method chaining by self-passing for a case where only a &mut method is available. tap has a general solution for this.
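As a rough illustration of the idea, without depending on the crate: the trait below is a minimal local stand-in, and `TapMut` is a name invented for this sketch. It turns any `&mut self` call into something usable in a by-value chain:

```rust
// Minimal local stand-in for the kind of helper the tap crate provides.
// `TapMut` is a made-up name for this sketch.
trait TapMut: Sized {
    fn tap_mut(mut self, f: impl FnOnce(&mut Self)) -> Self {
        // Run the `&mut` operation, then hand the value back by value.
        f(&mut self);
        self
    }
}

// Blanket impl so every sized type gets the method.
impl<T> TapMut for T {}

fn main() {
    let existing = vec![1u8, 2];
    let new = vec![3u8, 4];
    let both = existing.tap_mut(|v| v.extend(new));
    assert_eq!(both, [1, 2, 3, 4]);
}
```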

jjpe · June 15, 2024, 1:19pm

One way to deal with that is to iterate over both/all Vecs:

let v1 = vec!["hello".to_string()];
let v2 = vec!["world".to_string()];

let v3: Vec<_> = v1.into_iter()
    .chain(v2.into_iter())
    .collect();

That does allocate for the resulting Vec, but doesn't need to clone the elements.

And it's not clear that appending directly to the Vec by value would perform better than that, because under normal circumstances any old vec.push(...); can cause the entire Vec to reallocate anyway.

SkiFire13 · June 15, 2024, 2:24pm

It can reuse the first Vec's allocation if it's big enough (and the same for Vec::append)
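A quick way to see this for Vec::append is to compare the buffer pointer before and after. `append_keeps_buffer` is just a helper name for this sketch:

```rust
// Returns true if appending `b` to `a` kept `a`'s original buffer.
fn append_keeps_buffer(mut a: Vec<u32>, mut b: Vec<u32>) -> bool {
    let before = a.as_ptr();
    a.append(&mut b);
    a.as_ptr() == before
}

fn main() {
    // Capacity 8, length 2: enough spare room for two more elements.
    let mut a: Vec<u32> = Vec::with_capacity(8);
    a.extend([1u32, 2]);
    assert!(append_keeps_buffer(a, vec![3, 4]));
}
```

When the spare capacity is too small, the Vec has to grow, and whether the buffer stays in place then depends on the allocator.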

the8472 · June 15, 2024, 6:07pm

That's not implemented for chain and it'd only help if the first vec has enough spare capacity to absorb the second, otherwise a new allocation will be needed anyway. For large allocations and allocators with in-place realloc it might be worth it.

But it does impl TrustedLen, so it'll only be a single allocation.

DragonDev1906 · June 17, 2024, 7:28am

Since half the discussion is about method chaining: What if rust had something like the following (placeholder syntax for sake of argument, I'm not trying to suggest using », but simply using . would be ambiguous):

struct A {}

impl A {
    fn x(&self) -> usize {}
    fn y(&mut self) {}
    fn z(self) -> usize {}
}

fn main() {
    let mut a = A{};
    // New operator for chaining methods that don't return self
    my_func(a.z()»y()»x())

    let mut b = A{};
    // Also allowed (even though x and y don't take ownership and normal
    // method chaining wouldn't work, as we the caller can take it), the
    // same goes for mutability vs non-mutability.
    my_func(a.x()»y()»z())
}

The last line would be equivalent to the following:

a.x();
a.y();
let value = a.z();
my_func(value)

The return value of x is dropped, so this would even allow chaining methods that return something else, as long as you don't care about its value.

Motivation

When writing library code you currently have to consider whether method chaining should be allowed/possible (and then decide whether you return Self, &Self or &mut Self), for example when implementing a builder pattern. The user/caller is then somewhat forced into using that system. With this, the user/caller can choose between using the return value (.) and method chaining (»), even if the library developer never considered method chaining in the first place.

Is this worth adding to the language? I don't know, but I think it cleanly solves the issue of method chaining (e.g. in a builder pattern) at the cost of adding another operator to the list of how you call functions: . ::.

Ratatouille · June 17, 2024, 8:26am

jjpe · June 17, 2024, 12:56pm

Technically, with a slight abuse of syntax, this already exists:

my_func(a.z()»y()»x())
    // vs. 
my_func({ a.z(); a.y(); a.x(); })

Note that neither version would compile if your example is parsed as left-associative, because z() takes its receiver by value.

DragonDev1906 · June 17, 2024, 1:07pm

Good catch, can't use self after you moved it into z() of course.

SkiFire13 · June 17, 2024, 1:07pm

Note that this works only if a is a variable. If it's some more complex expression you will be recomputing it multiple times.

jjpe · June 17, 2024, 1:33pm

As-is, absolutely. At the core though it wouldn't be difficult to update the example such that expr evaluation happens only once:

my_func(a.z()»y()»x())
    // vs. 
my_func({ let mut val = a; val.z(); val.y(); val.x(); })
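For reference, here is a version of that block-expression trick that compiles. All names (`Counter`, `peek`, `bump`, `finish`, `make_counter`, `my_func`) are hypothetical. The by-value method has to come last, and the final expression has no trailing semicolon so that its value is what gets passed on:

```rust
// Hypothetical type with one method of each receiver kind.
struct Counter {
    n: usize,
}

impl Counter {
    fn peek(&self) -> usize {
        self.n
    }
    fn bump(&mut self) {
        self.n += 1;
    }
    fn finish(self) -> usize {
        self.n
    }
}

// Stands in for "some more complex expression" that should run only once.
fn make_counter() -> Counter {
    Counter { n: 0 }
}

fn my_func(v: usize) -> usize {
    v * 10
}

fn chained_call() -> usize {
    my_func({
        let mut val = make_counter(); // evaluated exactly once
        val.peek(); // return value discarded
        val.bump();
        val.finish() // consumes `val`; no trailing semicolon
    })
}

fn main() {
    assert_eq!(chained_call(), 10);
}
```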
