Vector Concatenation

gbutler · February 3, 2021, 11:02pm

If you target a new edition, old code must continue to compile with only automated changes required. For example, a new keyword may require escaping existing variable names that match it, which can be fixed automatically with rustfix (cargo fix). My understanding is that, barring that kind of automatic "upgrade", breaking changes are not permitted in editions.

burakumin · February 3, 2021, 11:24pm

I don't think this would cause an issue. The 2018 crate would still be able to provide String. The 2021 crate would just not be able to internally refer to String + &str explicitly, and the doc would mark it as deprecated.

Here:

mjbshaw · February 3, 2021, 11:26pm

Not that I'm arguing in favor of anything here, but I'm pretty sure cargo fix could auto-translate String + &str to something like {string_var.push_str(str_var); string_var}.
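To illustrate the kind of rewrite mentioned above, here is a minimal sketch comparing the operator form with an in-place append. It assumes the append is done with push_str; the function names with_plus and with_append are made up for the example and are not anything cargo fix actually emits.

```rust
// Operator form: consumes `s` and appends `t` to it.
fn with_plus(s: String, t: &str) -> String {
    s + t
}

// Hypothetical mechanical rewrite: append in place, then yield the string.
fn with_append(mut s: String, t: &str) -> String {
    s.push_str(t);
    s
}

fn main() {
    let a = with_plus(String::from("foo"), "bar");
    let b = with_append(String::from("foo"), "bar");
    // Both forms produce the same string.
    assert_eq!(a, b);
    println!("{}", a);
}
```

Note that the rewritten form needs a mutable binding, which the operator form does not.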

mbrubeck · February 3, 2021, 11:41pm

However, cargo fix could not do anything to fix code that passes (String, &str) to a generic T: Add<U> function, still leaving us with at least some cases where x + y compiles despite x being a String. And the impl itself would have to remain because editions cannot make breaking changes to the standard library.
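A small sketch of the generic case described above. The helper add_generic is hypothetical; it only requires T: Add<U>, so nothing at the call site spells out String + &str for a tool to rewrite.

```rust
use std::ops::Add;

// Generic helper that knows nothing about strings; it only requires `T: Add<U>`.
fn add_generic<T, U>(x: T, y: U) -> <T as Add<U>>::Output
where
    T: Add<U>,
{
    x + y
}

fn main() {
    // Resolves to `impl Add<&str> for String` without a literal `String + &str`.
    let s = add_generic(String::from("foo"), "bar");
    assert_eq!(s, "foobar");

    // The same function also works for numeric types.
    assert_eq!(add_generic(1, 2), 3);
}
```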

If there's anything further to be said about this, please start a separate thread for it, since this is a very different proposal from the one this thread is about.

burakumin · February 4, 2021, 12:22am

As I see it the OP is less about a proposal than about an expression of surprise about an obvious inconsistency:

In order to mitigate it, the discussion has evolved from "Should we apply this pattern to other types?" to "Should we remove it entirely?", which does not look off-topic to me.

skysch · February 4, 2021, 12:45am

Perfection is not the goal of the standard library. The goal is to solve common problems with common solutions, provide crate-unifying interface types, and never break anyone's build. Deprecating things because someone occasionally has a moment of surprise is unnecessary, especially given that momentary surprise is a normal and expected part of learning.

In order to mitigate it

While doing so may alleviate OP's concern, there's regardless going to be regular surprise for anyone who then thinks "why doesn't Rust use + to concat strings like all my favorite languages do?" Not every complaint points at a problem. This is particularly true of complaints about what kind of syntax people like to use; you create just as many problems as you solve by changing things blindly (that is, without doing appropriately relevant case studies. On actual code bases.)

Aloso · February 4, 2021, 1:05am

I sometimes use String + &str, because it is so convenient:

my_iter.fold(String::new(), |acc, s| acc + s + "\n")

vs

my_iter
    .fold(String::new(), |mut acc, s| {
        acc.push_str(s);
        acc.push('\n');
        acc
    })

I'd rather use a String method for this which is more convenient to use in this case, e.g.

impl String {
    fn concat(self, other: impl AsRef<str>) -> String {...}
}

This method can be chained and doesn't require making the string mutable.
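Since inherent methods cannot be added to String outside of std, one way to try this idea today is an extension trait. The following is only a sketch: the trait ConcatExt, the method name concat_with, and the helper fold_lines are all hypothetical, and the body simply appends with push_str.

```rust
// Hypothetical extension trait; not part of the standard library.
trait ConcatExt {
    fn concat_with(self, other: impl AsRef<str>) -> String;
}

impl ConcatExt for String {
    fn concat_with(mut self, other: impl AsRef<str>) -> String {
        // Append in place and hand the string back so calls can be chained.
        self.push_str(other.as_ref());
        self
    }
}

// Same shape as the fold above, with chained method calls instead of `+`.
fn fold_lines(lines: &[&str]) -> String {
    lines
        .iter()
        .fold(String::new(), |acc, s| acc.concat_with(s).concat_with("\n"))
}

fn main() {
    let out = fold_lines(&["a", "b", "c"]);
    assert_eq!(out, "a\nb\nc\n");
}
```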

toc · February 4, 2021, 2:48am

I use my_iter.join("\n"), which is from itertools, but itertools might as well be part of std in my headcanon.
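For comparison, a std-only sketch of the same idea: the standard library defines join on slices rather than on iterators, so the items are collected first. The helper name join_lines is made up for the example.

```rust
// Collect first, because the std `join` is defined on slices, not iterators.
fn join_lines<'a>(items: impl Iterator<Item = &'a str>) -> String {
    items.collect::<Vec<_>>().join("\n")
}

fn main() {
    let joined = join_lines(["a", "b", "c"].iter().copied());
    // The separator goes between items only, so there is no trailing newline.
    assert_eq!(joined, "a\nb\nc");
}
```

Unlike the fold shown earlier, this puts the separator only between items, so the result does not end with a newline.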

Kixunil · February 5, 2021, 5:49pm

It also suggests inefficient code.

a.to_owned() + b can allocate twice. OTOH:

[a, b].join("") reserves the right amount upfront and allocates only once.
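A rough sketch of the alternatives being compared, with a manual with_capacity variant added for reference. The function names are hypothetical, and the comments describe the allocation behaviour claimed above rather than measuring it.

```rust
// Variant 1: copy `a`, then append `b`; the append may need to grow the buffer.
fn concat_plus(a: &str, b: &str) -> String {
    a.to_owned() + b
}

// Variant 2: slice `join`, which sizes the result before copying.
fn concat_join(a: &str, b: &str) -> String {
    [a, b].join("")
}

// Variant 3: reserve the total length by hand, then append both pieces.
fn concat_reserved(a: &str, b: &str) -> String {
    let mut out = String::with_capacity(a.len() + b.len());
    out.push_str(a);
    out.push_str(b);
    out
}

fn main() {
    let (a, b) = ("hello, ", "world");
    // All three variants build the same string; they differ only in how
    // the backing buffer is sized along the way.
    assert_eq!(concat_plus(a, b), concat_join(a, b));
    assert_eq!(concat_join(a, b), concat_reserved(a, b));
    println!("{}", concat_reserved(a, b));
}
```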

pickfire · February 7, 2021, 2:33pm

What about soft deprecation by having warnings when String + &str is used? That isn't considered a breaking change, right? Code will still compile, except warnings will show up.

skysch · February 7, 2021, 4:08pm

Why would the compiler emit that warning? It's just a style preference, right?

pickfire · February 8, 2021, 5:31am

It's not just a style preference: it encourages inefficient use of concatenation and very possibly induces extra unnecessary allocations, which beginners will often fall into. There is also the inconsistency between vector concatenation and string concatenation, which makes it non-intuitive (like the current &str vs &[u8] Pattern/Needle API, see find).

skysch · February 8, 2021, 1:42pm

Ok, well if you don't want to call it a style preference, call it a design preference. If someone is using an inefficient algorithm or the wrong data structure, that's still not a compiler problem. String::add isn't inefficient unless you misuse it, and the compiler can't judge that. You can write safe, correct, and efficient code with it, so leave it be. Don't throw warnings at people who are doing exactly that.

toc · February 8, 2021, 2:09pm

_.to_owned() + _ might be appropriate as a clippy lint.

pickfire · February 8, 2021, 4:23pm

The language and library should try to prevent these. Rust is built for safety, which should reduce footguns for everyone here and there; pushing the responsibility onto developers' experience with the language and onto implementation details in the docs is, I think, not worth it. Even so, it is (not very) easy to misuse in this case, which is also why there is still no std::io::input() (I wrote an RFC back then, but did not continue once I realized it could easily be misused), since it would always allocate.

The biggest drawback would be that it may require more code where people are using the current construct, but it can prevent people from easily doing extra allocations without thinking, as well as API inconsistencies between String and Vec. If String + &str works, why wouldn't Vec<u8> + &[u8] work?
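To make the comparison concrete, here is a sketch of what a + for byte vectors could look like. Since std does not implement Add for Vec<u8> and the orphan rules prevent adding that impl from outside std, the example uses a hypothetical newtype, Bytes; it is not an existing API.

```rust
use std::ops::Add;

// Hypothetical newtype; std does not implement `Add` for `Vec<u8>`.
#[derive(Debug, PartialEq)]
struct Bytes(Vec<u8>);

impl Add<&[u8]> for Bytes {
    type Output = Bytes;

    // Mirrors the shape of `String + &str`: consume the left side, append the right.
    fn add(mut self, rhs: &[u8]) -> Bytes {
        self.0.extend_from_slice(rhs);
        self
    }
}

fn main() {
    let joined = Bytes(vec![1, 2]) + &[3u8, 4][..];
    assert_eq!(joined, Bytes(vec![1, 2, 3, 4]));

    // What std offers today for plain vectors and slices.
    let v = [&[1u8, 2][..], &[3, 4][..]].concat();
    assert_eq!(v, vec![1, 2, 3, 4]);
}
```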

mjbshaw · February 8, 2021, 5:58pm

I think before anyone can take deprecating/warning on String + &str in rustc seriously, a full case study would have to be done to show:

That programs really are bottlenecked by String + &str. That is, show that alternatives aren't just premature optimization.
That the performance increase of alternatives is worthwhile across the whole ecosystem.
That users who use String + &str are more likely to do so in an incorrect way (read: a bug that causes the program to misbehave). And show that alternatives to + are more likely to avoid incorrect usage and thus more likely to prevent bugs (not just performance issues).

When someone says "Why don't we just do X?", where X is something that would have a massive impact/churn on the whole Rust ecosystem, my instinctive reaction is "Please provide sufficient evidence to justify the cost here." The answer "String + &str might be suboptimal in some situations" is quite insufficient, in my opinion.

Maybe because other languages often support + with strings, but less often support it with arrays/vectors? And it's ambiguous whether you want to concatenate or do a per-element addition? Anyway, that's orthogonal to strings.

skysch · February 8, 2021, 5:58pm

String::add is (usually) nothing but syntax. It's not unsafe at all, so I have no idea why you brought safety up. It doesn't perform worse than it would if you did what it does manually. I'm certainly not trying to argue that it is good; I'm only saying that emitting warnings and/or deprecating it for insufficient reason is worse than just having it around and choosing not to use it. Rust's stability guarantees and overall usability are more important than perfect consistency in std.

system · May 9, 2021, 5:58pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
