Pre-RFC: std::string::String::replace_and_count

Summary

This is a pre-RFC for a replace_and_count function to complement the existing replace function for strings. On more than one occasion I have wanted to know the number of replacements made, usually to validate that number_of_replacements > 0; at present I check separately whether the searched string exists.

It is rather easy to adapt the replace source code for doing this yourself, which I plan to do for my use case, but I also wanted to ask if it makes sense to add a replace_and_count function to the String type.
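
For illustration, here is a minimal sketch of what such an adaptation might look like as a free function. The name and the (String, usize) return type follow the proposal; restricting the pattern to &str is an assumption made so the code compiles on stable Rust, and this is not the actual standard library source.

```rust
/// Hypothetical free-function version of the proposed method.
/// Follows the general shape of `str::replace`, but also counts matches.
/// Assumption: the pattern is a plain `&str` rather than a generic `Pattern`.
fn replace_and_count(haystack: &str, from: &str, to: &str) -> (String, usize) {
    let mut result = String::with_capacity(haystack.len());
    let mut last_end = 0;
    let mut count = 0;
    // A single scan: each match is replaced and counted as it is found.
    for (start, part) in haystack.match_indices(from) {
        result.push_str(&haystack[last_end..start]);
        result.push_str(to);
        last_end = start + part.len();
        count += 1;
    }
    // Copy whatever follows the last match.
    result.push_str(&haystack[last_end..]);
    (result, count)
}
```

Calling replace_and_count("a-b-c", "-", "+") would give back the new string together with a count of 2, so the count > 0 check needs no second search.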

This is my first time proposing any feature, so if there are any problems with my post, please notify me and I'll be more careful.

It's a one-liner to call matches and get the count (playground). Why modify replace when there's a lightweight way of solving this?

That would search for matches twice, rather than once.
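
To make the concern concrete, the one-liner approach presumably looks something like the sketch below (the helper name count_then_replace is made up for illustration). Both matches and replace walk the whole input, so the search work is done twice.

```rust
/// Two-pass approach: one scan to count, a second scan to replace.
/// `count_then_replace` is a hypothetical helper name.
fn count_then_replace(s: &str, from: &str, to: &str) -> (String, usize) {
    let count = s.matches(from).count(); // first full scan
    let replaced = s.replace(from, to); // second full scan
    (replaced, count)
}
```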

In most cases to specifically check whether there are any matches, you can do s.matches(…).next().is_some() before the replace to avoid going through the whole string twice.
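
A small sketch of that check, assuming a &str pattern (the helper name replace_if_present is hypothetical). The matches iterator is lazy, so next() stops at the first hit; str::contains should express the same test.

```rust
/// Returns `Some(replaced)` if `from` occurs at least once, else `None`.
/// `replace_if_present` is a hypothetical helper name.
fn replace_if_present(s: &str, from: &str, to: &str) -> Option<String> {
    // `matches` is lazy: `next()` only searches up to the first match.
    if s.matches(from).next().is_some() {
        Some(s.replace(from, to))
    } else {
        None
    }
}
```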

A more general alternative could be a str::replace_with method that takes an FnMut that’s called for each match to determine the replacement.

That could still (in the worst case) traverse the string twice when there is a match, though, right?

Seems like the problem with replace_with could be the performance hit if it's not inlined.

Yes, good point about the worst case.

I would presume a hypothetical replace_with method to be inlined and optimized to remove essentially all overhead, like Iterator methods are, among others.

replace_with is certainly general enough, but using it for this purpose would require something like:

let mut count = 0;
let replaced = some_str.replace_with("pattern", |something| { count += 1; "replacement" });

For the proposed use case, this seems like a substantially more cumbersome solution.

I'm not sure if replace_and_count is a sufficiently common use case to motivate putting it in the standard library, but I think replace_with may be excessively general for that use case.
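
For anyone who wants to try the closure-based shape today, here is a rough sketch of replace_with as a free function. The signature is an assumption (nothing like it exists in std as far as this thread establishes): the pattern is a &str, and the closure receives each matched slice and returns anything that is AsRef<str>.

```rust
/// Hypothetical `replace_with`: calls `f` once per match to get the replacement.
/// The signature is an assumption for illustration, not an existing std API.
fn replace_with<F, R>(haystack: &str, from: &str, mut f: F) -> String
where
    F: FnMut(&str) -> R,
    R: AsRef<str>,
{
    let mut result = String::with_capacity(haystack.len());
    let mut last_end = 0;
    for (start, part) in haystack.match_indices(from) {
        result.push_str(&haystack[last_end..start]);
        // The closure decides the replacement for this particular match.
        result.push_str(f(part).as_ref());
        last_end = start + part.len();
    }
    result.push_str(&haystack[last_end..]);
    result
}
```

With this, counting works exactly as in the snippet above: declare let mut count = 0 and increment it inside the closure passed to replace_with.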

I would think that replace_with would serve for programmatically changing the pattern, which could indeed be desirable in some cases. As a more general version of replace_and_count, we could have replace_and, which still uses a static pattern but allows for extra computation.

Another option could be replace_iter(pattern, new), which could return an iterator instead of calling a closure. Not sure which is better, probably replace_with or replace_and, as it's easier to understand and the iterator one does not allow changing the replace value (just thought I'd mention that as an option).

Not sure if replace_iter can be sound without a LendingIterator, since every replace needs an exclusive borrow for the whole string.

Would it be silly if replace_with also returned the count?

The downside is that it is redundant, and could introduce overhead when the user doesn't need the count. Though I do think this is trivially optimized with inlining and DCE passes, to the point where I'd expect no overhead in release.

It's kind of astonishing that String::replace doesn't return the number of replacements already, since this information is useful, readily available, but promptly discarded.

But searching the net, I found that JavaScript, Python, Java, C#, and other languages suffer from the same issue. Most "solutions" I find make people scan the string twice. In JavaScript and Java at least, it's possible to write stateful code that counts the replacements as they are made (like the solution with a hypothetical String::replace_with), which is better but looks too convoluted.

I think that not only is this common, but it is also the kind of problem for which people often write suboptimal solutions, such as scanning the string twice.

I think that both replace_and_count and replace_with are good additions to String.

replace comes from str so has to return a new string, and returning something like (result, count) would be awkward. But String could perhaps have a "replace_in_place" method that returned the count.
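
A rough sketch of what that could look like, written as a free function over &mut String (the name replace_in_place and the signature are hypothetical). It uses the existing String::replace_range; note that each replacement may shift the tail of the string, so this version is simple rather than fast, and it deliberately treats an empty pattern as "no matches", unlike str::replace.

```rust
/// Hypothetical in-place replace that returns the number of replacements.
/// Assumption: an empty `from` is treated as "no matches" to avoid looping.
fn replace_in_place(s: &mut String, from: &str, to: &str) -> usize {
    if from.is_empty() {
        return 0;
    }
    let mut count = 0;
    let mut search_from = 0;
    while let Some(offset) = s[search_from..].find(from) {
        let start = search_from + offset;
        s.replace_range(start..start + from.len(), to);
        // Resume after the inserted text so replacements are not rescanned.
        search_from = start + to.len();
        count += 1;
    }
    count
}
```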

I don't think that returning (result, count) is awkward, this means that rather than

let replaced_string = x.replace(..);

If replace returned the count, you would write either

let (replaced_string, count) = x.replace(..);

Or, if you didn't want the count,

let (replaced_string, _) = x.replace(..);

Which doesn't look too bad.

But okay, given that replace already discards the count, I think the next best thing is to make replace_and_count return (String, usize).

In other words, I don't think that counting replacements should be tied to whether the replace is done in place or not, just for the sake of having a tidier API. (an in place replace, if it's ever done, should return the count though)

Often you'd probably do x.replace(..).0 in the real world though, as a part of a larger expression, which doesn't fill me with joy. Could use a named result struct instead, but that would in turn make the destructuring case more verbose. Not sure it would be worth the noise, given that needing the count is likely to be rare.
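
To show the trade-off, here is a sketch of the named-struct variant. The struct name Replaced, its field names, and the function name replace_counted are all made up for illustration; the body uses two passes purely to keep the example short.

```rust
/// Hypothetical named result type, as an alternative to a `(String, usize)` tuple.
#[derive(Debug, PartialEq)]
struct Replaced {
    text: String,
    count: usize,
}

/// Hypothetical function returning the named struct.
/// Uses two passes for brevity; a real version would count during replacement.
fn replace_counted(s: &str, from: &str, to: &str) -> Replaced {
    Replaced {
        count: s.matches(from).count(),
        text: s.replace(from, to),
    }
}
```

Inside a larger expression, replace_counted(..).text reads more clearly than .0, but destructuring becomes let Replaced { text, count } = replace_counted(..); which is wordier than the tuple form.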

Of course you could do something like this to get the best of both worlds, at the expense of some type annotations or turbofish applications here and there.

So, what should be the next step? Should I write the several variations (replace_and_count, replace_with, etc.) and open a PR, or write more detailed documentation?

Open an ACP

Posted! ACP: Add `std::string::String::replace_and_count` and/or `replace_with` · Issue #344 · rust-lang/libs-team
