String literal applied to owned type (String) and boxed Any

By default, a string literal has type &'static str. With type inference, it would be interesting if you could do this:

fn take(s: String) {
}

take("zxc");
take("zxc".into()); // more verbose

There are no breaking changes, as far as I can tell. This may not be useful for everyone, but I think String is often used more than &str. For example:

  • Some types store a dynamic string without a borrow: instead they own it.
  • If a type that holds an owned string needs to hand that string out by value, it has to return a clone of it; in that case it can't return &str, since there's no lifetime it can use for that return.
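For comparison, here is a minimal sketch of what stable Rust already allows for this case. The `User` type and its methods are hypothetical names made up for illustration; the only thing taken from the standard library is the `Into<String>` conversion for `&str`.

```rust
/// Hypothetical type that owns its string instead of borrowing it.
struct User {
    name: String,
}

impl User {
    /// Accepts anything convertible into a String, so a literal can be
    /// passed directly without writing `.into()` at the call site.
    fn new(name: impl Into<String>) -> Self {
        User { name: name.into() }
    }

    /// Hands out an owned copy of the stored string (this allocates).
    fn name_owned(&self) -> String {
        self.name.clone()
    }
}

fn main() {
    let user = User::new("zxc"); // the conversion happens inside `new`
    let copy = user.name_owned();
    println!("{copy}");
}
```

The conversion is still explicit, it has just moved from the caller into the function signature.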

Any

This could work too:

let _: Box<dyn Any> = "zxc"; // yields a Box with string

And other types too: Rc and Arc:

let _: Arc<dyn Any> = "foo";
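For reference, a sketch of how the same values are built in stable Rust today, where the allocation is spelled out. The helper names `boxed_string` and `shared_string` are hypothetical; `Box::new`, `Arc::new`, and `downcast_ref` are the standard library calls.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical helper: boxes an owned String behind `dyn Any`.
fn boxed_string(s: &str) -> Box<dyn Any> {
    Box::new(String::from(s))
}

/// Hypothetical helper: the same, but behind an Arc.
fn shared_string(s: &str) -> Arc<dyn Any> {
    Arc::new(String::from(s))
}

fn main() {
    let boxed = boxed_string("zxc");
    let shared = shared_string("foo");

    // The concrete type has to be recovered with a downcast.
    println!("{:?}", boxed.downcast_ref::<String>());
    println!("{:?}", shared.downcast_ref::<String>());
}
```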

(NOT A CONTRIBUTION)

There was once an idea bouncing around not only to do this, but to do so without allocation (by special-casing a String with a capacity of 0). It was discussed at some length in this thread: [pre-RFC] Allowing string literals to be either &'static str or String, similar to numeric literals

No one ever really pursued it seriously, unfortunately. I think it would be a great change.


I think it's interesting to support boxed types of Any as well.

This would make allocations implicit, which I think runs counter to a fundamental goal of Rust.


Right... I guess this is not a very needed feature maybe...

I mean I kind of would love to be able to do

let s: String = "abc";

and I don't think the allocation is horrifically hidden in this case (though I don't understand the implicit Box<dyn Any> conversion). I'm kind of hoping that when a reasonable confluence of features enables a const FromLit trait, we will get some simpler literals.


This equivalent code would be hidden:

let s = "abc";
...
// later: use s in a way that lets the compiler infer that s must be a String

Correct. I don't think it's horrifically hidden in that case either. I'd also like to be able to

let x = 123...99100
// infer that x is a BigInt

There is value in having some operations be syntactically noisy. In the case of allocation, though, pushing code authors towards less wasteful ways of using strings is something I'd rather have a linter cover. It is already insanely easy to hide allocations in Rust at present: just call a function.

I don't think that's a good idea, since it would introduce an edge case that would break the API of String and would add error handling everywhere.

I really dislike the choice of names here, by the way, and I objected at the time this decision was made. String is actually a mutable owned string buffer, which most other languages would call a StringBuilder or something similar.

str is usually used behind a reference and causes ergonomic pain due to lifetime annotations.

What we really need and what is the design of choice in most other languages is an immutable owned type such as the String class in Java. That allows it to behave like a value.

Ideally this ought to be a general purpose feature, maybe like the &'owned T that people have been discussing for years.

Isn’t that just Box<str> (or Arc<str> if you want to throw it through an interner)? (And the latter could support being an Arc where the backing data is a string literal, there have been musings towards supporting this with better const-eval).
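A small sketch of those two types as they work in stable Rust today. The helper names `to_boxed` and `to_shared` are hypothetical; note that both conversions currently allocate and copy the literal's bytes, so this does not show the const-eval idea mentioned above.

```rust
use std::sync::Arc;

/// Hypothetical helper: an owned string slice with no spare capacity.
fn to_boxed(s: &str) -> Box<str> {
    Box::from(s)
}

/// Hypothetical helper: a shared, reference-counted string slice.
fn to_shared(s: &str) -> Arc<str> {
    Arc::from(s)
}

fn main() {
    let boxed = to_boxed("zxc");
    let shared = to_shared("zxc");
    // Cloning the Arc bumps a counter instead of copying the bytes.
    let alias = Arc::clone(&shared);

    println!("{boxed} {shared} {alias}");
}
```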

(NOT A CONTRIBUTION)

In the thread I linked, Gankra extensively analyzed the changes necessary to support this and found that for String it would add a branch to a few specific methods, which did not seem to be a serious cause for concern. This is specifically because we have fairly limited mutable access to Strings, owing to the fact that they're UTF-8. (For vectors, the branch would be substantial.) You can go and read her post, which involved a lot more effort than your comment.

Of course, her analysis is now 6 years old, because like so many efforts this proposal stalled out from lack of project management. There may be some new APIs that would require a change, but I doubt they are significant. An actual analysis which reached a different conclusion from Gankra's would definitely be interesting to read.


(NOT A CONTRIBUTION)

The proposals in the past have been to make string literals coerce to String by having String support pointing to rodata instead of the heap, so they do not "implicitly" eagerly allocate when the coercion happens. They give String semantics more similar to Cow<'static, str>.
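To illustrate the Cow<'static, str> semantics being referred to, here is a minimal sketch using the existing standard library type. The `label` function is a hypothetical example; it shows a value that either points at a literal or owns a heap allocation, and it is not the proposed String change itself.

```rust
use std::borrow::Cow;

/// Hypothetical function: returns a literal when it can, and allocates
/// only when the text has to be built at run time.
fn label(id: Option<u32>) -> Cow<'static, str> {
    match id {
        // Borrowed: points at the literal, no heap allocation.
        None => Cow::Borrowed("default"),
        // Owned: a freshly allocated String.
        Some(n) => Cow::Owned(format!("item-{n}")),
    }
}

fn main() {
    let a = label(None);
    let b = label(Some(7));
    println!("{a} {b}");
}
```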


Not quite. Box<str> represents a mutable heap allocation still whereas string literals are in rodata and Arc<str> would have atomic reference counting.

I would like to see a separate vocabulary type for this. Probably reusing the underlying Unique<T>.

You are right that there's a difference between a Vec and a String, but even so, these are core vocabulary types in the language and we should strive for clear and orthogonal definitions for them without any edge cases since they are core to all Rust code.


What are you going to do with your immutable String type?

  • Does it unconditionally free on deallocation? Then it’s the same as String.
  • Does it unconditionally not free on deallocation, only guarantee existence? Then it’s the same as &str or &'static str, depending on which guarantees you want.
  • Does it conditionally deallocate? Then it’s the same as Cow<str>, though I admit that one doesn’t have great ergonomics as a string.
  • Is it shareable (and may free on final deallocation)? Then you need that reference-counting.
  • Do you want it to be implicitly copied? Then it has to not free on deallocation; that’s just a rule of Rust and you cannot avoid it with any kind of type.

Rust doesn’t usually need dedicated-immutable types because it has compiler-enforced “shared xor mutable”—even types with mutating methods can “behave like a value”. Java doesn’t have that, but it does have implicit sharing thanks to its garbage collector. Different goals and different constraints lead to different designs.


That misses the point entirely. I'm talking about immutable, functional-style types and collections where modifications create new instances instead of modifying in place. String is just one case in that group. Rust enforces shared xor mutable at the call site, which serves a different purpose entirely from a type-level invariant.

Cow<str> would be closest to this, I reckon. It seems to me that, just as String is a specialised version of Vec because of its UTF-8 semantics, we should probably consider a specialised wrapper type here as well, for similar ergonomic reasons.

Also, to be fair, I keep forgetting about Cow.. it's a silly name that doesn't stick in my head..


I feel like a wrapper type around Cow<str> wouldn't improve the ergonomics much, if you want to create a new modified instance you just do something like let mut new = old.clone(); new.to_mut().remove_matches("foo");.
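Spelled out as a sketch: as far as I know `remove_matches` is still nightly-only, so this version assumes a stable method (`push_str`) in its place. The `with_suffix` helper is a hypothetical name.

```rust
use std::borrow::Cow;

/// Hypothetical helper: returns a modified copy and leaves `old` untouched.
fn with_suffix(old: &Cow<'static, str>, suffix: &str) -> Cow<'static, str> {
    // Cloning a borrowed Cow only copies the reference.
    let mut new = old.clone();
    // to_mut() converts to an owned String if needed, then hands out &mut String.
    new.to_mut().push_str(suffix);
    new
}

fn main() {
    let old: Cow<'static, str> = Cow::Borrowed("foo");
    let new = with_suffix(&old, "bar");
    println!("{old} -> {new}");
}
```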

But really, I don't know that I've ever seen Cow used much in this way. In my experience &str and String generally suffice in situations where you're doing modifications; the borrow checker negates a lot of the reasons to use unshared immutable functional-style types. Shared immutable types like those in im can be useful, but they can't map to the same semantics as str, since they don't give you a contiguous slice.

Right, unless you are going to share some intermediate state there is no reason to do this in Rust, because of shared xor mutable… except that sometimes it is more ergonomic for the programmer. That is why str does have some operations that produce a String, like replace.
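A quick sketch of that functional-style operation: `str::replace` borrows its input and returns a new String. The `strip_foo` wrapper is a hypothetical name used only to make the signature visible.

```rust
/// Hypothetical wrapper: the input is only borrowed, the result is a new String.
fn strip_foo(input: &str) -> String {
    input.replace("foo", "")
}

fn main() {
    let original = "foobar foo";
    let stripped = strip_foo(original);
    // `original` is still usable and unchanged here.
    println!("{original:?} -> {stripped:?}");
}
```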

I think the only edge case would be that capacity() would be smaller than len(). It is hard to say how often that assumption is made. I believe only a crater run would tell. I believe the biggest benefit would be for beginners, who usually struggle to understand the difference between buffer types and static slices.
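The assumption in question can be written down as a small check against today's String. This is only a sketch of the current behaviour, with `capacity_covers_len` as a hypothetical helper name; under the proposal, a String pointing at a literal would fail it.

```rust
/// Hypothetical helper: the invariant that code may be relying on today.
fn capacity_covers_len(s: &String) -> bool {
    s.capacity() >= s.len()
}

fn main() {
    // String::new() does not allocate and reports a capacity of 0.
    let empty = String::new();
    let owned = String::from("abc");

    println!("{}", empty.capacity());
    println!("{}", capacity_covers_len(&owned));
}
```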

In my own learning experience, for example, I read a recommendation somewhere to use &str instead of String for function parameters. I took it to mean using &str instead of String everywhere, so I used it in my structs as well, littering my code with unnecessary lifetimes.
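A sketch of the difference, with hypothetical type names: the borrowed version needs a lifetime parameter wherever it appears, while the owned version does not, and a function taking &str works with both.

```rust
/// Hypothetical struct that borrows its string: it needs a lifetime parameter.
struct BorrowedName<'a> {
    name: &'a str,
}

/// Hypothetical struct that owns its string: no lifetime needed.
struct OwnedName {
    name: String,
}

/// For function parameters, &str is the flexible choice: it accepts both.
fn greet(name: &str) -> String {
    format!("Hello, {name}!")
}

fn main() {
    let borrowed = BorrowedName { name: "zxc" };
    let owned = OwnedName { name: String::from("abc") };

    println!("{}", greet(borrowed.name));
    println!("{}", greet(&owned.name)); // &String derefs to &str
}
```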

