Getting rid of String slices for better ergonomics

Would String literals help enough?

This is one obvious ergonomic deficiency, which I think everyone can agree on.

It also makes novice users' first encounter with strings based on &'static str, which is sort of an edge case for strings and lifetimes.

That makes me think of a similar proposal on one of the Rust forums that turned out to be an April Fools' Day joke. So I gotta ask: what would that look like?

As for ergonomics issues: I haven't seen a concrete example yet in this thread, but the only place where it can sometimes get a little messy for me is when I use the + operator to concatenate a string slice to the contents of a mutable String buffer. Even then it's just an extra & to first borrow and then force a deref to &str. So the mess mostly looks like this:

// Owned String creation could use literals I suppose, but IMO not big enough to be worth
// arguing over, and there's something to be said for uniformity with other types:
let mut foo = String::from("hello world, "); 

let bar = String::from("StringUser");

let concatenated = foo + &bar; // This is slightly annoying but again not that big of a deal compared to:
// let concatenated = foo + bar;

Those are my biggest annoyances with the String system, so I'm wondering what exactly the big ergonomic issues are, other than the discussion about string names.

String literals might mean that, as a beginner, you wouldn't be exposed that much to string slices, and you might still be able to produce some code without knowing about slices. You basically smooth the learning curve.

That being said, it can also introduce even more confusion with the current &str literal.

Regarding examples, concatenation is a good one. So is the initialisation of a struct containing a String. I have a fair number of to_owned calls in my code.
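For anyone following along, here is a minimal sketch of the struct case. The User type and its fields are hypothetical, made up purely for illustration. It shows the explicit conversions at each field, plus one common way to push them into a constructor with impl Into<String>:

```rust
// Hypothetical struct, used only to illustrate the point above.
#[derive(Debug, PartialEq)]
struct User {
    name: String,
    email: String,
}

// Every String field needs an explicit conversion from the &'static str literal.
fn make_user() -> User {
    User {
        name: "StringUser".to_owned(),
        email: String::from("user@example.com"),
    }
}

impl User {
    // Accepting `impl Into<String>` moves the conversion into the constructor,
    // so callers can pass either a literal or an already-owned String.
    fn new(name: impl Into<String>, email: impl Into<String>) -> Self {
        User {
            name: name.into(),
            email: email.into(),
        }
    }
}
```

The conversion still happens, it is just written once inside new instead of at every call site.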

The fact that you need something like .into() or .to_owned() in Rust is not a bug, or even an issue. It’s an explicit design decision to force the user to write an explicit marker where potentially expensive conversions occur. While it might be less convenient to write, I can tell you from experience that it sure is a big help in understanding the performance of code you wrote a year ago, or someone else’s code for that matter.

Similar things happen when you go from a &[T] to a Vec<T>, for example, or when cloning non-Copy values.
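A rough sketch of what those explicit markers look like side by side. The function name is made up for illustration; the calls themselves (to_vec, to_owned, clone) are standard library methods:

```rust
// Each conversion from borrowed to owned data is spelled out at the call site.
fn explicit_conversions() -> (Vec<i32>, String, String) {
    let slice: &[i32] = &[1, 2, 3];
    let owned_vec: Vec<i32> = slice.to_vec(); // allocates a new Vec

    let text: &str = "hello";
    let owned_string: String = text.to_owned(); // allocates a new String

    // String is not Copy, so duplicating it needs an explicit clone().
    let duplicate = owned_string.clone();

    (owned_vec, owned_string, duplicate)
}
```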

Please don’t expect Rust to behave like Java or Python. Rust’s memory management fundamentals (i.e. ownership-based) as well as the use cases for the language as a whole are sufficiently different from those languages that expecting this kind of thing to behave similarly is setting yourself up for failure, a lot of anger, or both.

String::from("") is not too bad, but it does make docs a bit more verbose. I fought for changing the std HashMap example to owned strings (since that’s the usual/easy way), but the extra calls in the examples were a downside.

I’m not sure about the syntax. There is one for Vec, but I don’t see a way to be consistent with it. So maybe S"hello world"?
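For comparison, something closer to the vec! precedent can already be approximated with a small macro. This is only a sketch: the s! name is hypothetical, it is not part of std, and it is not the S"..." literal syntax floated above.

```rust
// Hypothetical helper macro, loosely modelled on how vec! reads at the call site.
// It only accepts literals and simply forwards to String::from.
macro_rules! s {
    ($lit:literal) => {
        String::from($lit)
    };
}

fn greeting() -> String {
    // Expands to String::from("hello world"), so the allocation is still there.
    s!("hello world")
}
```

The allocation is merely spelled differently, which is the same trade-off a dedicated literal syntax would have.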

I never said to_owned and into shouldn’t exist in general… I’m just saying that I’m doing it a lot to convert from one string type to another. And Rust memory management itself has nothing to do with having a unified string type.

I really don’t buy the argument “it’s a really big help that the language doesn’t make it convenient, so you can understand the performance of code”. It’s like the opposite of zero cost abstraction. A language is all about abstracting things away. Here most users won’t care whether the string lives on the stack or on the heap, as long as the compiler chooses the right one, makes it transparent, and allows enough customisation when you actually need it.

I actually think comparing with other languages is very relevant because, knowing what the Rust compiler knows, you should be able to make something which is both more convenient and more performant.

@kornel This would be workable, but of course then we need to explain why one literal type allocates while another one doesn't.

@tibotz As explained before in this thread, there are tradeoffs to be made when it comes to string design. If there is an abstraction that lets a programmer have it all, it has yet to be invented.

The string behavior observed in languages such as Java and Python (which you seem to be arguing for) is built on the fact that those strings are garbage-collected. Rust has no such concept, and from that basic premise flows a different design. Now, it is possible to abstract over both &str and String with e.g. Cow<str>, but that induces a runtime penalty, as explained in an earlier post.
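To make the Cow<str> option concrete, here is a minimal sketch. The normalize function is a made-up example; std::borrow::Cow is the real standard library type. The function borrows when it can and allocates only when it has to, and every later use of the result has to check which variant it holds:

```rust
use std::borrow::Cow;

// Returns the input unchanged (borrowed) when no work is needed,
// and an owned, modified copy otherwise.
fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        // Allocation happens only on this path.
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}
```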

Calls that turn a borrow into an owned value can never be zero cost. As for the "syntactic overhead", that's the point :) The code is supposed to make the reader aware that expensive operations are happening. Rust has a different niche as a programming language than e.g. Python or Java, and it is designed as such. You wouldn't want to write a kernel in Python/Java, for example. That has to be as feasible in Rust as it is in C/C++, and more so where possible. Trading performance-by-default for convenience seems like the wrong tradeoff for such a language.

I'd say that a language is about abstracting certain things away. What exactly is abstracted by that language depends on the purpose of the language. Rust, for example, is a language that puts performance, safety and concurrency above programmer convenience. When all 4 are possible at the same time, that's great, and it'll happily take all 4 into account. But when tradeoffs need to be made, convenience is easily the first thing to go. That makes Rust in a sense a lower level language than Java (at least when it comes to things like strings), and that's perfectly fine.

that’s funny, I use Rust as a scripting language all the time and I never had an issue with strings.

anyway, you probably want str methods that produce Box<str>.

fn concat(&str, &str) -> Box<str>

this makes str inefficient but easy to use. use String for performance.
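A minimal sketch of what such a function could look like today as a free function. It is not a str method in std, and the parameter names are my own:

```rust
// Concatenates two string slices into a freshly allocated Box<str>.
// Every call allocates, which is the "inefficient but easy" part of the trade-off.
fn concat(a: &str, b: &str) -> Box<str> {
    // Reserve the exact size up front so into_boxed_str has no spare capacity to trim.
    let mut buf = String::with_capacity(a.len() + b.len());
    buf.push_str(a);
    buf.push_str(b);
    buf.into_boxed_str()
}
```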

Lua makes a similar tradeoff: use … if you want something inefficient but easy to use (well, sometimes this can be more efficient, but it’d need to take an array tbh), use table.concat for performance.

(technically, Rust String provides the worst of both worlds… but it generally works well enough, better than repeated concatenation but worse than table.concat. I don’t think Rust has any APIs comparable to table.concat…)

slice::join?
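For reference, a quick sketch of join and its separator-less sibling concat on slices of &str. The wrapper function names are just for illustration:

```rust
// Joins all parts with a separator in one call, similar in spirit to
// Lua's table.concat(t, sep).
fn join_words(words: &[&str]) -> String {
    words.join(", ")
}

// Same idea without a separator.
fn concat_all(parts: &[&str]) -> String {
    parts.concat()
}
```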

Note that if the RHS is already a string literal, you don’t need to (and for performance and complexity reasons, probably shouldn’t) go through a String buffer. let concatenated = foo + "StringUser" should work fine.
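As a small sketch of that (the function name is made up for illustration):

```rust
// The right-hand side of + only needs to be a &str, and a literal already is one,
// so no intermediate String is required.
fn greet() -> String {
    let foo = String::from("hello world, ");
    // `foo` is moved into the addition and its buffer is reused for the result.
    foo + "StringUser"
}
```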

Agreed, the reason I wrote it that way was simply to demonstrate how String allocation works, which is seen as a papercut by those who want String literals.

That outputs a String, not a Box<str>

And why would you prefer to get Box<str> in this case?

And slice.join("").into_boxed_str() will literally just discard the capacity field in String anyway, as <&[&str]>::join(&self,&str)->String doesn’t over-allocate.
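Spelled out as a sketch, with a hypothetical wrapper name:

```rust
// Joins the parts and converts the resulting String into a Box<str>.
// into_boxed_str drops any excess capacity, so the result is just pointer + length.
fn join_boxed(parts: &[&str]) -> Box<str> {
    parts.join("").into_boxed_str()
}
```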

Eh, still dislike it… but then I’m comparing it to Lua, which doesn’t have mutable character buffers.

I like Rust and I generally like how strings work in Rust. But maybe we can simply get rid of String… Lua doesn’t have it.

Box<str> is nothing like Lua’s strings, which are more like Gc<str>. You need an entire runtime to make that happen.

Mutating strings is an uncommon need, to be sure (unless you’re actually writing a method like join), but I don’t see why it should bother you so much that the option is there. The borrow checker forbids you from doing anything you’ll regret.
