Syntactic sugar for &str to String conversion

Hello,

I've noticed that we often write:

"xxxx".to_string() /* or */ String::from("xxxx") /* or */  "xxxx".into()

to convert from &str to String.
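For reference, here is a minimal sketch of those three spellings side by side. The function name `three_ways` is made up for this illustration; the point is only that all three produce the same owned String.

```rust
// Three existing spellings of the same &str -> String conversion.
// `three_ways` is a hypothetical helper, not part of any library.
fn three_ways() -> (String, String, String) {
    let a = "xxxx".to_string(); // via the ToString trait
    let b = String::from("xxxx"); // via From<&str> for String
    let c: String = "xxxx".into(); // via Into<String>; needs a type annotation
    (a, b, c)
}

fn main() {
    let (a, b, c) = three_ways();
    assert_eq!(a, b);
    assert_eq!(b, c);
}
```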

I have a proposal to simplify life for developers: why not add a little syntactic sugar to shorten the whole thing? Like this:

S"xxxx"
// or
@"xxxx"
// or
s#"xxxx"

The compiler would automatically transform the literal into: "xxxx".to_string() / String::from("xxxx").

What do you think? :thinking:

8 Likes

I am not really bothered by writing String::from("xxxx"), but why not.

2 Likes

This has already been asked in the past in various forms. There are two important issues with it:

  • it wouldn't always work (e.g. in #[no_std] environments without alloc support);
  • it introduces hidden allocations, going against Rust's principle of being explicit when possible.
12 Likes

Not to weigh in heavily one way or the other, but...

  • Lots of things don't work with #[no_std], hopefully by that point one has learned which string to use though.
  • The proposed syntax is explicit, just terse. Forcing verbosity in certain cases is helpful when it pushes users towards an alternative with less friction (like Cow!), but honestly I've worked with some string heavy codebases where having an S"hello" syntax would have helped me relax.
19 Likes

BTW, a long, long time ago Rust had ~ for heap allocations, so ~"string" and ~[] would give you a boxed type (I don't remember whether that was a String or Box<str> equivalent).

2 Likes

I totally forgot about the possibility of not having std :face_with_raised_eyebrow: You're right, if you look at it from that point of view.

Isn't the comfort of our eyes worth breaking the rules? :laughing:

It looks more like Box<X>. Do you know why they removed it?


Otherwise there are only two possibilities: either generate an error if you use this syntax with #[no_std], or create an S! macro that does this.
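The macro option can already be tried today with macro_rules!. The following is only a sketch under assumptions: the name `S` and the helper `owned_hello` are made up, and the macro hard-codes std's String, so an alloc-only crate would need to point it at alloc::string::String instead.

```rust
// Hypothetical `S!` macro: turns a literal into an owned String.
// It accepts any literal token, but only string literals will type-check,
// because String::from has no impl for numeric literals.
macro_rules! S {
    ($lit:literal) => {
        ::std::string::String::from($lit)
    };
}

// Hypothetical helper showing the macro in use.
fn owned_hello() -> String {
    S!("hello")
}

fn main() {
    let greeting: String = owned_hello();
    assert_eq!(greeting, "hello");
}
```

The allocation stays visible at the call site through the macro name, which is the property the "hidden allocation" objection is about.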

Well, I would argue that it would generally be nice to have a dedicated symbol for .into(), for example @. Then, @ "I am a string" would be desugared to "I am a string".into(), and there are many other examples where it would be nicer to have it too. But I guess it is far from the first time something like that was suggested.
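To make the desugaring concrete, here is a small sketch of what .into() already does for a few target types. The `greet` function is hypothetical and exists only for this example. Note that .into() needs the target type to be known from context, which a sigil like @ would presumably need as well.

```rust
use std::path::PathBuf;

// Hypothetical function used only for illustration: it accepts anything
// convertible into a String, so callers can pass a &str directly.
fn greet(name: impl Into<String>) -> String {
    let owned: String = name.into();
    format!("Hello, {owned}")
}

fn main() {
    // What `@ "I am a string"` would mean under the proposal.
    let s: String = "I am a string".into();
    // The same call converts to other types when the annotation changes.
    let p: PathBuf = "some/dir".into();
    let n: u64 = 7u32.into();

    assert_eq!(s, "I am a string");
    assert_eq!(p, PathBuf::from("some/dir"));
    assert_eq!(n, 7);
    assert_eq!(greet("world"), "Hello, world");
}
```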

5 Likes

Very good idea: @ = into(), so it would work for all conversions.

1 Like

I know this has been somewhat brought up when the syntax was reserved in the 2021 edition.

I like the idea of adding these kinds of things to Rust, though I care more about the suggested f"" syntax/string interpolation. It's a really nice feature in Scala and I think it would work well in Rust as well.

4 Likes

Thank you for this link. Here is the part that concerns this syntax:

Some new prefixes you might potentially see in the future (though we haven't committed to any of them yet):

  • k#keyword to allow writing keywords that don't exist yet in the current edition. For example, while async is not a keyword in edition 2015, this prefix would've allowed us to accept k#async in edition 2015 without having to wait for edition 2018 to reserve async as a keyword.
  • f"" as a short-hand for a format string. For example, f"hello {name}" as a short-hand for the equivalent format!() invocation.
  • s"" for String literals.
  • c"" or z"" for null-terminated C strings.

FWIW, c"foo" for CStr literals will be stable in 1.77, to be released next week. It's pure syntax sugar in the sense that there are no hidden allocations or anything, but still, it's precedent for a library type getting its own literal syntax.
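As a rough illustration of the "no hidden allocations" point, this sketch spells out with the older API what a c"foo" literal denotes: a borrowed &'static CStr over bytes stored in the binary. The function name `foo_cstr` is made up, and the literal form itself appears only in a comment because, as far as I know, it requires edition 2021 and Rust 1.77 or later.

```rust
use std::ffi::CStr;

// Hypothetical helper: the pre-1.77 spelling of what `c"foo"` stands for.
// The NUL terminator is written by hand and checked at run time, whereas
// the literal form has the compiler append it.
fn foo_cstr() -> &'static CStr {
    CStr::from_bytes_with_nul(b"foo\0").expect("literal ends with exactly one NUL")
}

fn main() {
    let s = foo_cstr();
    // No heap allocation is involved: `s` borrows static data.
    assert_eq!(s.to_bytes(), b"foo");
    assert_eq!(s.to_bytes_with_nul(), b"foo\0");
}
```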

4 Likes

S!("foo") is a bit too hard to both type and read. But if macros could accept " as their delimiters, then we could have S!"this is String", f!"format {s}", c!"null-terminated" without the separate string prefix feature.

Additionally, the #[no_std] case would be solved, since f! or S! could be present in std but not in core.

Additionally, users could define their own "prefixes".

11 Likes

This looks like a nice idea, but with c"some c string" and b'c' already in the language, I think that s"String" would be the more consistent choice.

2 Likes

Great idea since macros are already an extension point and strings are such a ubiquitous representation format. It may expand the compile time functionality seamlessly.

s! " " 
s! """ """
s! r#" "#
regex! ".*"
math! "(sqrt 2) * pi +$x"

...come to think of it, the following macros might be cool.

math! { sqrt(2) * PI + x }
matlab! { ... }
maple! { ... }
haskell! { ... }
c! { ... }
cc! { ... }
fsharp! { ... }
1 Like

Note that format!("Hello world") compiles to the same assembly as String::from("Hello world"), so if f"Hello {name}" is implemented, you get String literals for free. No need to make a new s"Hello" syntax for string literals that cannot contain interpolation.
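A small sketch of the value-level side of this point. It does not check the assembly claim, only that the results are equal, and all four function names are made up for the example. The last function shows how a literal that must contain braces without interpolation is written with format! today.

```rust
// All function names here are hypothetical and exist only for illustration.
fn via_format() -> String {
    format!("Hello world") // no placeholders: behaves like a String literal
}

fn via_from() -> String {
    String::from("Hello world")
}

fn greet(name: &str) -> String {
    format!("Hello {name}") // inline captured argument
}

fn literal_braces() -> String {
    format!("Hello {{name}}") // doubled braces are emitted literally
}

fn main() {
    assert_eq!(via_format(), via_from());
    assert_eq!(greet("Ferris"), "Hello Ferris");
    assert_eq!(literal_braces(), "Hello {name}");
}
```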

I think the case where the light syntax matters and the literal also needs to contain interpolation syntax without being interpolated is quite niche.

11 Likes

f"" strings will likely desugar to format_args!(""), not format!("")

6 Likes

Ah, interesting. I see format_args! suggested in this thread, with the benefit of being usable in core. I suppose one could imagine F"Hello {name}" as having format! semantics (making a String instead of fmt::Arguments).
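The difference between the two semantics can be sketched with today's macros. This assumes nothing about how f"" or F"" would actually be specified; `write_greeting` is a hypothetical helper that takes prebuilt format arguments and lets the caller choose where the text goes.

```rust
use std::fmt::{self, Write};

// Hypothetical helper: accepts prebuilt format arguments and writes them
// into any fmt::Write sink without building an intermediate String itself.
fn write_greeting<W: Write>(out: &mut W, args: fmt::Arguments<'_>) -> fmt::Result {
    out.write_fmt(args)
}

fn main() {
    let name = "Ferris";

    // format! semantics: allocates and returns an owned String.
    let owned: String = format!("Hello {name}");

    // format_args! semantics: only borrows its inputs; the sink is chosen
    // by the caller (a String here, but any fmt::Write implementor works).
    let mut sink = String::new();
    write_greeting(&mut sink, format_args!("Hello {name}")).unwrap();

    assert_eq!(owned, sink);
}
```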

When the syntax was reserved in edition 2021, the prefixes quoted earlier in this thread (k#keyword, f"", s"", and c"" or z"") were suggested.
1 Like

I want to raise a few points:

Format args usefulness

How useful is it really to have syntax for format_args!? Anything that takes an Arguments tends to be a macro already.

String literals and formatted string literals are a much more common problem, and I think that it's worthwhile to have syntax for that, especially an elegant solution that covers both.

New syntax

I'm usually opposed to new syntax, as it complicates the language and makes it harder to learn, but I think that in this case it will simplify the language and make it easier to learn.

This will also solve the problem of "what way of creating Strings from a &str should I use?", since this syntax would clearly be the best and most concise one.

Prefix

Regarding the prefix, I personally think that s"str" is better than f"str" in order to both create owned Strings and formatted strings.

Next steps

If this were to be accepted, I think that it's reasonable to expect p"path" for &Path / PathBuf, though this raises the question of which one it should desugar to. This would cover all string types in std.

no_std

There is no prior art of syntax that doesn't work on core, but I don't think that's really an issue.

Macro syntax

I don't see the point of having s!"str" outside of some feeling of correctness. I think it would overcomplicate the language. A beginner might ask "Why does b"byte literal" work but I have to do s!"String literal"?"

1 Like

fmt::Arguments is a Display type, so it's useful for that. The temporary lifetime limitation is restrictive, but it's still useful in a number of applications. Any time you'd use a temporary String buffer that's cheap to recreate you can usually use fmt::Arguments instead.
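A minimal sketch of that substitution, assuming a hypothetical function `bracketed` that accepts any Display value. The first call builds a temporary String just to pass it along; the second passes fmt::Arguments directly and skips that intermediate allocation.

```rust
use std::fmt::Display;

// Hypothetical function for illustration: formats any Display value.
fn bracketed(value: impl Display) -> String {
    format!("[{value}]")
}

fn main() {
    let id = 7;

    // With a temporary String buffer.
    let a = bracketed(format!("item-{id}"));

    // With fmt::Arguments: no intermediate String. The value must be used
    // within the same expression, which is the temporary lifetime
    // limitation mentioned above.
    let b = bracketed(format_args!("item-{id}"));

    assert_eq!(a, b);
}
```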

It would for literals, but any time you actually have &str, I'd argue that S"{var}" is actively a poor way of spelling it compared to the existing options. Terse is not necessarily better.

I don't think this is a good idea if it includes formatting. Path manipulation is limited the way it is in order to encourage thinking about potential injection points.

Everyone loves to forget about OsStr/OsString.

6 Likes

fmt::Arguments is a Display type, so it's useful for that

It's still rare to encounter a call to format_args! in the wild or to use Arguments outside a macro.

It would for literals, but any time you actually have &str, I'd argue that S"{var}" is actively a poor way of spelling it compared to the existing options. Terse is not necessarily better.

You're right. It's only better for literals.

Everyone loves to forget about OsStr/OsString.

My bad! os"os string" has a really cool prefix. Whether it returns an owned formatted version or a static literal is a different question.

To me it would make a lot of sense to have the uppercase version for the owned part:

  • p"path": &'static Path
  • P"path": PathBuf - Formatting TBD
  • c"cstr": &'static CStr - Stable as of 1.77
  • C"cstr": CString
  • os"str": &'static OsStr
  • OS"str": OsString
  • S"formatted {string}": String
  • s"str": &'static str - This would make sense for symmetry, but I don't like it at all
  • B"byte literal": Vec<u8> - ??
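For comparison, here is a sketch of how the values in this list are spelled today, without any of the proposed prefixes. The function names `borrowed_forms` and `owned_forms` are made up for the example, and the c"cstr" literal is left out because it depends on the edition and compiler version.

```rust
use std::ffi::{CString, OsStr, OsString};
use std::path::{Path, PathBuf};

// Hypothetical helper: today's spellings of the borrowed forms in the list.
fn borrowed_forms() -> (&'static Path, &'static OsStr, &'static str) {
    (Path::new("path"), OsStr::new("str"), "str")
}

// Hypothetical helper: today's spellings of the owned (uppercase) forms.
fn owned_forms() -> (PathBuf, CString, OsString, String, Vec<u8>) {
    let string = "text";
    (
        PathBuf::from("path"),
        CString::new("cstr").expect("literal has no interior NUL"),
        OsString::from("str"),
        format!("formatted {string}"),
        b"byte literal".to_vec(),
    )
}

fn main() {
    let (p, os, _s) = borrowed_forms();
    let (pb, c, oss, s, b) = owned_forms();
    assert_eq!(p, pb.as_path());
    assert_eq!(os, oss.as_os_str());
    assert_eq!(c.as_bytes(), b"cstr");
    assert_eq!(s, "formatted text");
    assert_eq!(b.len(), 12);
}
```

Every owned form allocates, while the borrowed forms only reinterpret static data, which is the distinction the uppercase/lowercase split is meant to signal.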
2 Likes