Implement Add for String + String

Related: https://github.com/rust-lang/rust/pull/36430

Note that I didn’t suggest adding String to Cow yet (although that could be a logical next step). Perhaps more importantly, if we make Cow have better ergonomics than String, more people will use it?

1 Like

I know. I’m just saying it’s related.

Is it possible to add a lint to this behavior saying you should do s1 + &s2 if you want to keep using s2?

Sure, but what for? You’d need to actively .clone() the second String anyway, or the best thing the lint would do is improve on the error reporting the borrow checker gives anyway.

It should warn if you’re clone()ing things that don’t need to be cloned. I’ve seen a lot of clone() abuse by beginning Rust programmers that can’t get their program to compile otherwise.

This is not a great way to communicate! Your opinion is that the + operator shouldn’t be overloaded for the concatenation operation, but many people disagree. I recommend trying to express subjective ideas with a subjective tone - like “I think the concatenation add for String that already exists is a mistake, and I think adding this would further compound the problem that it presents.”

The choice to express yourself as if your position is inarguably objective may not seem like a big deal, but it’s the accumulation of small tone choices that push the community toward toxic acrimony.

24 Likes

I like concatenative add, but I also don’t like adding this impl for the reason @hanna-kruppe points out.

Exact messaging could be improved, but if you String + String right now, what the compiler can suggest is that you reference the second String. But if you add String and then use the second String again, getting use after move errors, there’s a non-trivial chance the user will solve the problem with string + string.clone() instead of string + &string.

4 Likes

In Erlang, the append operator for “strings” is ++ not + and so is unambiguous and can be caught at compile time. Worth considering?

3 Likes

Only if the Rust team wants to deprecate the Add impl on String + &str

Edit: I can’t get spaces to work around the word on

Otherwise it’s pointless

@iopq - pointless is a bit strong - what if it proves truly a worthwhile avenue to explore? I’d rather respond to a fault in the idea than some restriction in a current implementation.

I think Add should not be implemented for String. String + &str should be depricated.

2 Likes

If the only point is to guide people away from doing a clone, perhaps we should consider using a lint for that purpose. i.e., I think we could detect common scenarios where people are cloning when a reference would have done.

13 Likes

A lint is certainly a possibility, though I have trouble imagining what rules it could apply. Part of the problem is that the by-value and by-reference versions will generally call different functions or use different impls, which are all user defined, so I suspect there will either be false positives or a non-exhaustive list of special cases (which could be challenging to create). Curiously, it seems there are no lints along these lines in clippy (the closest thing is replacing .map(|x| x.clone()) to .cloned() which is just a style issue), despite numerous lints targeting other kinds of “works but is not as fast as it could be”.

I would love to see such lints regardless of whether String + String happens. As I said, I’m not even sure if this concern outweighs the ergonomics increase, and there are plenty of other places where unnecessary clones can happen.

1 Like

I’d be in favour of introducing a new operator for concatenation rather than extending +, and deprecating String + &str.

+ only makes sense to concatenate when used on types where Add doesn’t make any mathematical sense. If we allow those meanings to overlap, it’s bound to become a source of confusion. For some types, Add and Concatenate might be both valid but different operations, e.g. for numeric vectors. I also think Concatenate could interact well with type level integers when we implement them.

As for the operator, good candidates would be

  • ++ (Haskell, Erlang, Elixir, …)
  • <> (strings in Elixir)
  • ~ (D) (pro: single character – con: usual meaning is approximation)
  • are there others ?
9 Likes

To be very exact, strings form a well-defined mathematical object named a free monoid. So it is just a different mathematical sense :wink: Also,

Could it? Type level integers form a ring, just like an ordinary integer. I argue that the similarly-looking and similarly-meaning types should have the same set of operators and operations.

Here goes a complication with extending lexical tokens. I think that, as far as I know, introducing a new multi-letter token which doesn't contain any previously invalid letter breaks a backward compatibility with macros by examples (i.e. macro_rules! style):

macro_rules! t {
    (+ $x:tt) => ()
}

// used to compile, but no longer compiles with the addition of ++
t!(++);

Note that we already have similar problems with the treatment of >>, && and ||---they have to be split as the parser requires, and the macros cannot directly request the split. In that regard, a single-letter token is much better candidate than others.

This may help a lot. It seems that there are not much alternatives there (except that ~ is also used by Perl 6).

6 Likes

My opinion is that (a) an operator is convenient but not super important and (b) String + &str should be deprecated in Rust 2.0.

1 Like

This is Rust, not Haskell ;). We don’t have Monoid, SemiGroup, Functor or Monad traits and we, so far, haven’t stuck too close to category theory. Rust has always kept a low-level, practical approach to things so far. I see no reason not to continue.

As for backward compatibility with macros, I’m not convinced the breakage would be extensive. We could measure that first. Plus there are steps we can take to reduce/prevent it. And macros are being redesigned anyway.

I know :smiley: What I meant is that the concatenation is quite different from the addition and your comparison seemed not aligned (e.g. type-level integers).

I agree that the turnout may be insignificant (breaking brainfuck macros would be unfortunate though). That said, not having to deal with the potential breakage (and to decide whether to ignore it or not) is a clear advantage. Also the compatibility breakage of deprecated features also counts as breakages…

1 Like

So I still want this. Here’s the reason:

I have code that is generic between String and Cow<str> for the purpose of benchmarking the performance difference between them. I can run this code with each Rust version to see if either one had a performance regression (and in fact, there was one on Windows when Rust on Windows switched allocators).

The code in question looks like this:

//does the monoid operation on the slice of tuples if the closure evaluates to true
fn accumulate<'a, T: Monoid>(tuples: &[(&'a str, &Fn(i32) -> bool)], i: i32) -> T
		where T: From<&'a str> + From<String> {

	tuples.iter()
		.filter(apply(second, i))
		.map(first)
		.cloned()
		.map(<&str>::into)
		.fold1(T::op)
		.unwrap_or_else(|| i.to_string().into())
		//op just concatenates, but String does not satisfy Add
}

See, I had to actually make my own Monoid trait in order to call .fold1() with T::op

Another idea was to do this:

trait MaybeCollect: Iterator {
    fn maybe_collect<B>(self) -> Option<B> where B: FromIterator<Self::Item>;
}

impl<T> MaybeCollect for T where T: Iterator {
    fn maybe_collect<B>(self) -> Option<B> where B: FromIterator<Self::Item> {
        let mut iter = self.peekable();
        if iter.peek().is_none() {
            None
        } else {
            Some(iter.collect())
        }
    }
}

this works for String BUT my code is generic over String and Cow<str>

using .collect() on Cow<str> leads to allocations in the case where Cow<str> would just return a reference! Which means the performance benefits of using that type are eliminated.

I have to jump through extra hoops (making a new trait and having to use it to concatenate Strings) just because Add is not implemented on String already

3 Likes