The current situation is that there are a few main ways to create String instances:
"foobar".to_string()
"foobar".to_owned()
"foobar".into()
…and some others by using constructors. Of these the last version probably has the best performance, but might need type annotations in some situations. .to_string() is the most obvious / easy to remember approach, but it invokes the formatting machinery and offers less performance. (Also, it’s not meant for creating String instances from 'static strings, which is confusing to newbies. OTOH it’s the way shown in most of the book and the tutorials.)
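For reference, here is a minimal sketch of the three spellings side by side. The function name make_strings is only there for illustration; the point is that all three produce equal String values and that only .into() needs the target type spelled out.

```rust
// Three common ways to build an owned String from a string literal.
// `make_strings` is a hypothetical helper used only for this illustration.
fn make_strings() -> (String, String, String) {
    let a = "foobar".to_string(); // via the ToString trait
    let b = "foobar".to_owned(); // via the ToOwned trait
    // `.into()` has to be told the target type, hence the annotation.
    let c: String = "foobar".into();
    (a, b, c)
}

fn main() {
    let (a, b, c) = make_strings();
    // All three values compare equal; they differ only in how they are spelled.
    assert_eq!(a, b);
    assert_eq!(b, c);
    println!("{} {} {}", a, b, c);
}
```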
Another way to solve the issue is to create interfaces that take T: Into<String> as parameters, but that quickly results in hard to read interfaces, as Strings are usually used all over the place.
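A small sketch of what such an interface looks like in practice. The User type and its fields are made up for this example; with two string parameters the signature already carries two extra type parameters, which is the readability cost mentioned above.

```rust
// Hypothetical type, used only to illustrate the `Into<String>` pattern.
struct User {
    name: String,
    email: String,
}

impl User {
    // Accepts &str, String, or anything else convertible into a String.
    // Note how each string argument adds its own type parameter.
    fn new<N: Into<String>, E: Into<String>>(name: N, email: E) -> User {
        User {
            name: name.into(),
            email: email.into(),
        }
    }
}

fn main() {
    let owned = String::from("alice@example.com");
    // A literal and an owned String can be mixed freely at the call site.
    let user = User::new("Alice", owned);
    println!("{} <{}>", user.name, user.email);
}
```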
Proposal
Why not introduce a prefix to create String literals? In order to match byte literal syntax (b"foobar") a s"foobar" (for string) or a u"foobar" (for unicode, like in Python) prefix could be used.
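To get a feel for how the proposed prefix would read, it can be approximated today with a macro. This is only a sketch: the macro name s is an assumption chosen to mirror the s"foobar" spelling, and it is not part of the standard library.

```rust
// Hypothetical macro: `s!("...")` stands in for the proposed s"..." prefix.
// It only accepts literals and allocates a new String on every use.
macro_rules! s {
    ($lit:literal) => {
        String::from($lit)
    };
}

fn main() {
    let greeting: String = s!("foobar");
    println!("{}", greeting);
}
```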
Downsides
It would probably hide the cost of allocating a new String.
The byte literal creates a static reference, while the string literal would create a non-static owned object. (see also)
Newer code using the literals would probably be incompatible with previous Rust versions.
I would be happy about any comments. Maybe it’s a terrible idea (I only started doing Rust in January, so I don’t know much about the internals), but I hope the current situation can be improved somehow.
The real issue is that strings are usually not thought of as composite values. They’re kind of like numbers, but with a complex allocation story. Sure, I’m mostly talking about the usage story.
In most cases, we use either a static literal or a produced, owned string, which is in fact covered nicely by Cow<'static, str>. Based on this, I’d rather have two simple things:
Type constraint alias, which resolves to something like IntoCow<'static, str>
Alias for Cow<'static, str>
And promote their use for common string handling as good practice. This way, in many cases we won’t care whether a particular string is a static literal or an owned string. And we’ll be able to pass both &str and String as function arguments, without syntactic clutter.
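A rough sketch of how the two aliases could look with what the standard library offers today. The alias name Str and the function label are invented for this example, and Into<Cow<'static, str>> is used as a stand-in for the IntoCow-style constraint, relying on the existing From<&str> and From<String> impls for Cow.

```rust
use std::borrow::Cow;

// Hypothetical alias name; the post does not propose a concrete one.
type Str = Cow<'static, str>;

// `Into<Str>` plays the role of the IntoCow-style constraint here.
fn label<S: Into<Str>>(text: S) -> Str {
    text.into()
}

fn main() {
    // A static literal is stored as a borrow; nothing is allocated.
    let a = label("static literal");
    // An already owned String is moved in as-is.
    let b = label(format!("owned {}", 42));
    assert!(matches!(a, Cow::Borrowed(_)));
    assert!(matches!(b, Cow::Owned(_)));
    println!("{} / {}", a, b);
}
```

The caller writes the same thing in both cases, and the callee can still find out whether it holds a borrow or an owned value.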
This will not change existing API, but will influence future ones.
Long ago, we had this in Rust: ~"string". It caused a ton of people to overallocate when they didn't need to because they'd just toss ~ onto things until things worked, rather than understanding what was going on. I know, I was one of them.
Heap allocation is explicit in Rust, and it's really nice. Making it syntactically more lightweight is a mis-step imho.
I don’t think there would be any conflicts with current impls from an impl<T, U> From<Vec<U>> for Vec<T> where T: From<U>, and that does seem like a reasonable thing to have. You’d need to use vec!["aa", "bb"].into(), but that’s still not bad and adding impls involving arrays is annoying at the moment.
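As far as I know that impl is not in the standard library, so here is a sketch of the same element-wise conversion written as a free function instead. The name convert_vec is made up for this example.

```rust
// Hypothetical helper: converts every element of a Vec through `From`,
// which is what the impl discussed above would do behind `.into()`.
fn convert_vec<U, T: From<U>>(items: Vec<U>) -> Vec<T> {
    items.into_iter().map(T::from).collect()
}

fn main() {
    // The annotation on `names` picks the target element type.
    let names: Vec<String> = convert_vec(vec!["aa", "bb"]);
    println!("{:?}", names);
}
```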
When "overloaded box" is implemented, could it be possible to do let s: String = box "hello";? If so, this would make it clear that an allocation is taking place. Combined with box patterns, this could also allow very ergonomic pattern matching on String values.
This might look weird at first to modern Rust programmers, but remember that ~5 became box 5, so it seems consistent that ~"hello" would become box "hello". Vec and String are smart pointers to [T] and str respectively (for example, they implement Deref), so constructing them should be consistent with constructing other smart pointers like Box<T> and Rc<T>.
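The smart pointer analogy can be checked with stable Rust, without any box syntax. In this sketch the helper functions byte_len and sum are made up; they take the borrowed forms so that the Deref coercions from String and Vec are visible next to Box and Rc.

```rust
use std::rc::Rc;

// Hypothetical helpers that only accept the borrowed forms.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn main() {
    // Ordinary smart pointers are built with explicit constructors.
    let boxed: Box<i32> = Box::new(5);
    let counted: Rc<i32> = Rc::new(5);
    assert_eq!(*boxed + *counted, 10);

    // String and Vec deref to str and [T], just as Box<T> derefs to T.
    let string = String::from("hello");
    let vector = vec![1, 2, 3];
    assert_eq!(byte_len(&string), 5); // &String coerces to &str
    assert_eq!(sum(&vector), 6); // &Vec<i32> coerces to &[i32]

    // A str can also live directly in a Box.
    let boxed_str: Box<str> = Box::from("hello");
    println!("{} {}", boxed_str, byte_len(&boxed_str));
}
```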
Making heap allocations invisible is a bad idea in Rust. But what matters is that the allocation is visible. Having a heavy syntax for something frequently used is not a good idea.
I think the issue with a syntax like s"..." isn’t that it’s lightweight, but that it’s unique to strings and that it’s non-obvious that an allocation is being performed. Everyone would have to learn what that s meant.
On the other hand box is / will be more generally applicable and is clearly associated with allocations, making box "..." easier to pick up and easier to guess at if you don’t already know.
If box "..." is a string, though, box [...] should be a Vec<T>.
Another possible solution would be to extend the grammar with user-defined literals (see C++14 2.13.8). To avoid conflicts, these should only be allowed as a suffix after the literal, like this: "UserDefined IsoString"iso1 or "UserDefined Big5String"big5
Edit: Would one write use box as s to alias box to s, as in s "will alloc String"? (Grammar)
I'm not a big fan of that syntax, mostly for aesthetic reasons, but also because it adds more complexity when reading code that contains non-standard user-defined literals.
The box syntax is very elegant indeed. It's obvious that there will be an allocation, but the syntax is much nicer than having to call a method on a string literal.
I question the necessity of syntactic sugar for heap-allocated literals. To me this sounds like a major code smell.
Well designed code should avoid magic numbers (or any other literals).
Rust has distinct types that denote a view into a container (slices for contiguous memory containers, str for Strings, and general ranges and iterators), and these are usually used in APIs to abstract over the container, so I don’t see a compelling use case to sweeten the syntax so much.
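As a quick sketch of that point: a function that takes &str works with literals, owned Strings, and sub-slices alike, and the caller never has to allocate just to call it. The function count_words is made up for this example.

```rust
// Hypothetical API that only needs a view into string data.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

fn main() {
    let literal = "no allocation here";
    let owned = String::from("heap allocated text");

    // The same function accepts a literal, a borrowed String, and a sub-slice.
    println!("{}", count_words(literal));
    println!("{}", count_words(&owned));
    println!("{}", count_words(&owned[5..]));
}
```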
What’s so bad about "foo".into() or the other variants?
Edit: To clarify, I’d expect literals to usually be stack allocated and as such