`Vec::append` but by value

Am I the only one that keeps needing it?

impl<T> Vec<T> {
    pub fn concatenate(mut self, mut other: Vec<T>) -> Self {
        self.append(&mut other);
        self
    }
}

I'm open to naming suggestions, as naming seems to be the hardest problem here.

Is there some fundamental issue with it? If I were to just file a PR against rust-lang, would it be accepted, or does it require an RFC?
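
Until something like this exists in std, the same shape can be tried out in user code with an extension trait. This is only a sketch: the trait name `Concatenate` and the method name `concatenate` are illustrative assumptions, not anything the standard library provides.

```rust
/// Hypothetical extension trait; `Concatenate` and `concatenate` are
/// illustrative names, not part of the standard library.
trait Concatenate<T> {
    fn concatenate(self, other: Vec<T>) -> Self;
}

impl<T> Concatenate<T> for Vec<T> {
    fn concatenate(mut self, mut other: Vec<T>) -> Self {
        // `append` moves every element out of `other`, leaving it empty.
        self.append(&mut other);
        self
    }
}

fn main() {
    let a = vec![String::from("a")];
    let b = vec![String::from("b")];
    // Both vectors are consumed; no element is cloned.
    let both = a.concatenate(b);
    assert_eq!(both, ["a", "b"]);
}
```

Because it takes both vectors by value, it works for element types that are not `Clone`.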

Edit:

It seems:

fn main() {
    let a = vec![1u8, 2];
    let b = vec![3u8, 4];
    assert_eq!([a, b].concat(), &[1, 2, 3, 4]);
}

is what I needed.

You want Extend::extend(); see "Vec in std::vec - Rust".

It does not seem to take self by value. Am I missing something?

Extend::extend takes &mut self because then it can be used in contexts where only &mut self is available. Requiring &mut self is more general than requiring self.

3 Likes

Sure it's "more general", but it's also typically a PITA to use. If I need to pass a single vec to foo that just concats two vectors together:

existing.extend(new);
foo(existing);

while I could:

foo(existing.chained(new));

If existing is a reference, it's even worse:

let mut both = existing.clone();
both.extend(new);
foo(both);

vs:

foo(existing.clone().chained(new));

Anyway - IMO, the semantics, readability and everything else are much better, and performance is the same.

1 Like

Here is the question: why is the Vec<T> needed?

IMHO, it is more common for a function to receive a &(mut) [T] (it needs to read or modify the slice) or a T (it wants to consume the value). Accepting a Vec<T> is less flexible and usually unnecessary.

If foo accepts T directly, you could use existing.into_iter().chain(new.into_iter()).for_each(|t| foo(t)) directly, without allocating extra memory only to drop it very soon afterwards.

If a function accepts &[T] and you have two &[T]s that you need to concat first, then it's the same story except with foo(&...).

Taking a single T is completely different, not relevant.

Huh.

I just discovered:

fn main() {
    let a = vec![1u8, 2];
    let b = vec![3u8, 4];
    assert_eq!([a, b].concat(), &[1, 2, 3, 4]);
}

which I guess is exactly what I need. For some reason I thought it worked on strings only, and I see some Concat etc. unstable traits around it now, so I wonder if it changed recently, or did I just miss it for all this time.

3 Likes

Unfortunately, <[Vec<T>]>::concat clones the T elements, so it is not a suitable replacement in cases where T::clone is expensive or unimplemented.

6 Likes
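
To make the cloning point concrete, here is a small sketch with an element type that deliberately does not implement `Clone`. The type `Token` and the helper `concat_by_value` are made up for this illustration; the helper moves elements instead of cloning them.

```rust
// A deliberately non-`Clone` element type, used only for illustration.
#[derive(Debug, PartialEq)]
struct Token(u32);

/// Hypothetical helper: moves the elements of every inner vector into one `Vec`.
fn concat_by_value<T>(parts: Vec<Vec<T>>) -> Vec<T> {
    parts.into_iter().flatten().collect()
}

fn main() {
    let a = vec![Token(1), Token(2)];
    let b = vec![Token(3)];
    // `[a, b].concat()` would not compile here, because `Token` is not `Clone`.
    let all = concat_by_value(vec![a, b]);
    assert_eq!(all, vec![Token(1), Token(2), Token(3)]);
}
```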

One option is to use the free function proposed in "Add `core::iter::chain` method · Issue #154 · rust-lang/libs-team · GitHub":

foo(chain(existing, new).collect());

which is by-value, has a good size hint, and does no cloning.

3 Likes
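
For anyone who wants to try that shape without depending on a particular toolchain, a free function like this can be written locally in a few lines. The `chain` below is a local stand-in defined for this sketch, and `foo` is a hypothetical consumer that wants an owned `Vec`.

```rust
use std::iter::Chain;

/// Local stand-in for the free function proposed in the linked issue;
/// defined here so the sketch does not depend on a particular toolchain.
fn chain<A, B>(a: A, b: B) -> Chain<A::IntoIter, B::IntoIter>
where
    A: IntoIterator,
    B: IntoIterator<Item = A::Item>,
{
    a.into_iter().chain(b)
}

// Hypothetical consumer that wants an owned `Vec`.
fn foo(v: Vec<String>) -> usize {
    v.len()
}

fn main() {
    let existing = vec![String::from("hello")];
    let new = vec![String::from("world")];
    // Both vectors are moved into the chain; nothing is cloned.
    assert_eq!(foo(chain(existing, new).collect()), 2);
}
```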

This isn't specific to Vec. What you want boils down to method chaining by self-passing for a case where only a &mut method is available. tap has a general solution for this.

7 Likes
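
The idea can be sketched without pulling in the crate. The trait below is a local stand-in written for this example (the name `TapMut` and the blanket impl are assumptions, not the crate's actual definitions): it takes `self` by value, runs a closure on `&mut self`, and hands `self` back.

```rust
/// Local stand-in modeled on the idea behind the `tap` crate; the trait name
/// `TapMut` and this blanket impl are assumptions made for this sketch.
trait TapMut: Sized {
    fn tap_mut(mut self, f: impl FnOnce(&mut Self)) -> Self {
        f(&mut self);
        self
    }
}

impl<T> TapMut for T {}

// Hypothetical consumer that wants an owned `Vec`.
fn foo(v: Vec<u8>) -> usize {
    v.len()
}

fn main() {
    let existing = vec![1u8, 2];
    let new = vec![3u8, 4];
    // A `&mut self` method (`extend`) used inside a by-value call chain.
    let n = foo(existing.tap_mut(|v| v.extend(new)));
    assert_eq!(n, 4);
}
```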

One way to deal with that is to iterate over both/all Vecs:

let v1 = vec!["hello".to_string()];
let v2 = vec!["world".to_string()];

let v3: Vec<_> = v1.into_iter()
    .chain(v2.into_iter())
    .collect();

That does allocate for the resulting Vec, but doesn't need to clone the elements.

And it's not clear that appending directly to the Vec by value would perform better than that, because under normal circumstances any old vec.push(...); can cause the entire Vec to reallocate anyway.

It can reuse the first Vec's allocation if it's big enough (and the same for Vec::append)

2 Likes
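
A quick way to observe the reuse for `Vec::append` is to compare the buffer pointer before and after. The helper name `append_reuses_buffer` is made up for this sketch, and the check is only meaningful when the first vector already has enough spare capacity.

```rust
/// Hypothetical helper: reports whether `append` kept the first buffer.
/// `spare` is the capacity reserved beyond the two initial elements.
fn append_reuses_buffer(spare: usize) -> bool {
    let mut first: Vec<u8> = Vec::with_capacity(2 + spare);
    first.extend([1, 2]);
    let mut second = vec![3u8, 4];

    let before = first.as_ptr();
    first.append(&mut second);

    assert_eq!(first, [1, 2, 3, 4]);
    assert!(second.is_empty());
    first.as_ptr() == before
}

fn main() {
    // With room for two more elements, no reallocation is needed,
    // so the buffer address stays the same.
    assert!(append_reuses_buffer(2));
}
```

With no spare capacity the pointer may or may not change, depending on whether the allocator can grow the block in place, so the sketch makes no claim about that case.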

That's not implemented for chain and it'd only help if the first vec has enough spare capacity to absorb the second, otherwise a new allocation will be needed anyway. For large allocations and allocators with in-place realloc it might be worth it.

But it does impl TrustedLen, so it'll only be a single allocation.
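
The exact length is visible through `size_hint`, which is a stable way to see what `collect` has to work with. A small sketch (the helper name `chained_hint` is made up for this example):

```rust
/// Hypothetical helper: returns the size hint of two chained owned vectors.
fn chained_hint<T>(a: Vec<T>, b: Vec<T>) -> (usize, Option<usize>) {
    a.into_iter().chain(b).size_hint()
}

fn main() {
    let v1 = vec!["hello".to_string()];
    let v2 = vec!["world".to_string(), "!".to_string()];

    // Lower and upper bound agree, so the final length is known up front.
    assert_eq!(chained_hint(v1.clone(), v2.clone()), (3, Some(3)));

    let v3: Vec<String> = v1.into_iter().chain(v2).collect();
    assert_eq!(v3.len(), 3);
    assert!(v3.capacity() >= 3);
}
```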

Since half the discussion is about method chaining: what if Rust had something like the following (placeholder syntax for the sake of argument; I'm not trying to suggest using », but simply using . would be ambiguous):

struct A {}

impl A {
    fn x(&self) -> usize { 0 }
    fn y(&mut self) {}
    fn z(self) -> usize { 0 }
}

fn main() {
    let mut a = A {};
    // New operator for chaining methods that don't return self
    my_func(a.z()»y()»x())

    let mut b = A {};
    // Also allowed: x and y don't take ownership, so normal method
    // chaining wouldn't work, but the caller still owns the value and can
    // keep calling methods on it. The same goes for mutability vs.
    // non-mutability.
    my_func(b.x()»y()»z())
}

The last line would be equivalent to the following:

b.x();
b.y();
let value = b.z();
my_func(value)

The return value of x is dropped, so this would even allow chaining methods that return something else, as long as you don't care about that value.

Motivation

When writing library code you currently have to decide whether method chaining should be possible (and, if so, whether to return Self, &Self or &mut Self), for example when implementing a builder pattern. The caller is then more or less forced into that design. With this operator the caller can choose between using the return value (.) and chaining (»), even if the library author never considered method chaining in the first place.


Is this worth adding to the language? I don't know, but I think it cleanly solves the issue of method chaining (e.g. in a builder pattern) at the cost of adding another operator to the existing ways of calling functions, . and ::.

4 Likes

Technically, with a slight abuse of syntax, this already exists:

my_func(a.z()»y()»x())
    // vs.
my_func({ a.z(); a.y(); a.x() })

Note that neither version would compile if your example is parsed as left-associative, because z() takes its receiver by value.

1 Like

Good catch, can't use self after you moved it into z() of course.

Note that this works only if a is a variable. If it's some more complex expression you will be recomputing it multiple times.

1 Like

As-is, absolutely. At the core though it wouldn't be difficult to update the example such that expr evaluation happens only once:

my_func(a.z()»y()»x())
    // vs.
my_func({ let mut val = a; val.z(); val.y(); val.x() })
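
For reference, here is a version of the block-expression form that compiles today, with the by-value method last so nothing is used after a move. The `Builder` type, its methods, and `make_builder` are all hypothetical names invented for this sketch. The receiver expression is evaluated once, intermediate return values are dropped, and the last call has no trailing semicolon so its value becomes the value of the block.

```rust
// Hypothetical builder type, invented for this sketch.
#[derive(Debug, Default)]
struct Builder {
    name: String,
    retries: u32,
}

impl Builder {
    fn set_name(&mut self, name: &str) {
        self.name = name.to_string();
    }

    // Returns the previous value, which the caller is free to ignore.
    fn set_retries(&mut self, n: u32) -> u32 {
        std::mem::replace(&mut self.retries, n)
    }

    fn build(self) -> String {
        format!("{} x{}", self.name, self.retries)
    }
}

// Stands in for "some more complex expression" producing the receiver.
fn make_builder() -> Builder {
    Builder::default()
}

fn describe() -> String {
    {
        let mut val = make_builder(); // evaluated exactly once
        val.set_name("job");
        val.set_retries(3); // return value dropped
        val.build() // no trailing semicolon: this is the block's value
    }
}

fn main() {
    assert_eq!(describe(), "job x3");
}
```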