Raw r#"..."# string literals in Rust vs named R"abc(...)abc" strings in C++

Hi all,

In C++, there is a raw string syntax that looks like this:

std::string query = R"sql(
  SELECT email
  FROM Users
  WHERE username = "foo";
)sql";

The sql part serves as a delimiter, similar to how # is used in raw Rust string literals:

let query = r#"
  SELECT email
  FROM Users
  WHERE username = "foo";
"#;

In addition to making it unnecessary to escape things like double-quotes, the string between R" and ( also serves to describe the content of the string: in this case, readers will know that the string is an SQL query.

Auto formatters like clang-format can use this information to format the code inside the string literal (see RawStringFormats in the style options as well as the example in this StackOverflow answer).

This has been a super useful feature in practice for me. Has something similar been discussed for Rust?

When searching to see if this had been discussed already, I learned that Rust supports arbitrary suffixes on literals. So it's possible to write

macro_rules! blackhole { ($tt:tt) => () }
blackhole!("string"suffix); // OK

today. However, this is very limited since you have to use the syntax inside a macro call. So this doesn't work:

let x = "string"nope;
blackhole!(x);

Ah, it seems the discussion was here:

and the feature was then implemented in

My question now is how people feel about raw strings, now that we've had them for a few years.

Personally, I'm repeatedly bitten by quoting strings which start or end with a double-quote: r#""hello world""#. The doubled double-quote is hard to decipher when that happens. In addition, the # character is quite a "hard" or "noisy" character to me. So overall, I feel that Rust raw strings are harder to read than C++ raw strings.

Have others felt the same?

I start with zero #'s and add more as necessary. I usually start the actual string after a newline, and also put a newline before the end, so I don't find it particularly clumsy.
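For readers less familiar with the rule, here is a minimal sketch of how the number of #'s grows with the content. The function names are made up for illustration, and the `.trim()` in the last function is an assumption about how one might drop the surrounding newlines, not something stated in the thread.

```rust
/// Zero hashes: enough as long as the text contains no double quote.
fn no_quotes() -> &'static str {
    r"C:\temp\log.txt"
}

/// One hash: needed once the text contains a `"`.
fn with_quotes() -> &'static str {
    r#"WHERE username = "foo";"#
}

/// Two hashes: needed once the text contains the sequence `"#`.
fn with_quote_hash() -> &'static str {
    r##"let s = r#"nested"#;"##
}

/// The hard-to-read case: the text itself starts and ends with a quote.
fn quoted_hello() -> &'static str {
    r#""hello world""#
}

/// Same text, but laid out on its own line; `.trim()` (an assumption here)
/// removes the leading and trailing newline again.
fn quoted_hello_on_own_line() -> &'static str {
    r#"
"hello world"
"#
    .trim()
}

fn main() {
    println!("{}", no_quotes());
    println!("{}", with_quotes());
    println!("{}", with_quote_hash());
    println!("{}", quoted_hello());
    println!("{}", quoted_hello_on_own_line());
}
```

The last two functions return the same text; only the layout in the source differs.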

Yeah, I do the same. But how does it compare with C++ for you? Assuming you use C++, of course :slight_smile:

I've now read through the discussion in #9411 and the use of delimiters to carry semantic information does not seem to have been discussed. Perhaps this was not a thing in C++ back in 2013?

FWIW, the way to add semantic language injection information to Rust source would be one of three things, currently:

  • stable: comments

    process(
        // #ide:inject-language=sql
        r##"
            ...
        "##,
    );
    
  • unstable: expression attributes

    process(#[ide::inject_language(sql)] r##"
        ...
    "##);
    
  • stable (with major caveats): proc macros

    process(sql!(
        ...
    ));
    

IIRC, the comment method is used by IntelliJ IDEA for language injection in Java and Kotlin sources.
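To give a feel for the macro option and some of its caveats, here is a rough sketch. It uses a declarative `macro_rules!` macro as a stand-in for a real proc macro, and the `sql!` name and its behaviour are hypothetical: it only turns its tokens back into text with `stringify!` and does no SQL parsing or checking.

```rust
// Hypothetical `sql!` macro: a declarative stand-in for a real proc macro.
// It only converts the tokens it receives back into text.
macro_rules! sql {
    ($($tt:tt)*) => {
        stringify!($($tt)*)
    };
}

fn user_query() -> &'static str {
    // Caveat 1: the body has to lex as Rust tokens, so SQL's usual
    // single-quoted 'foo' would be rejected; a double-quoted literal is used.
    // Caveat 2: the original whitespace and line breaks are not preserved,
    // and the exact spacing of `stringify!` output is not guaranteed.
    sql!(
        SELECT email
        FROM Users
        WHERE username = "foo"
    )
}

fn main() {
    println!("{}", user_query());
}
```

Because the macro body is ordinary token input, editors tend to indent and highlight it as if it were Rust, which helps for Rust-like languages and less so for others.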


Thanks for that overview! I've used the proc-macro approach recently via the quote! macro for embedding Rust code. It's been awesome to have proper indentation support as well as syntax highlighting inside the macro.

I use Emacs and I think it's more by accident than by design that I get those things, but it's nevertheless been very helpful :smile:

I hardly use C++ for the things I now use Rust for, so I've barely used C++ raw strings (only once, IIRC). I didn't realize the delimiter text could carry semantic meaning, so I just wrote some gibberish :slight_smile:
