Request for a smarter lifetime: format!(...).as_str() should be valid

Problem

I often find myself writing this pattern of code:

// verbose code
let text: String = expression_that_creates_string;
do_something(text.as_str());

The above code could have been shortened (see below) if not for lifetime restrictions.

// shortened code
do_something(expression_that_creates_string.as_str());
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ error: does not live long enough

Request

"shortened code" above should be transformed into "verbose code" by the compiler.

The shortened code is equivalent to

do_something({
    let _1 = expression_that_creates_string;
    _1.as_str()
})

which doesn't work because _1 is dropped at the end of the block. To make it work, the compiler could implicitly change it to

let _1;
do_something({
    _1 = expression_that_creates_string;
    _1.as_str()
})

(this would actually happen in the MIR)
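
For what it's worth, the hoisted form can already be written by hand with a deferred let. A minimal sketch, assuming hypothetical definitions of expression_that_creates_string and do_something (the thread never defines them) and using the name hoisted in place of _1:

```rust
// Hypothetical stand-ins for the functions named in the thread.
fn expression_that_creates_string() -> String {
    format!("a {} string", "new")
}

fn do_something(text: &str) -> usize {
    text.len()
}

fn call_with_hoisted_temporary() -> usize {
    // Declared outside the block, initialized inside it: the String now
    // lives until the end of this function rather than the end of the block.
    let hoisted;
    do_something({
        hoisted = expression_that_creates_string();
        hoisted.as_str()
    })
}

fn main() {
    assert_eq!(call_with_hoisted_temporary(), 12);
}
```

Moving the let hoisted; declaration inside the block brings back the "does not live long enough" error, which is the difference the proposal is about.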

For backwards compatibility, the compiler can only do this in situations that would otherwise fail to compile.

I'm concerned that this might add more complexity to the language, which makes it more ergonomic but also more difficult to comprehend. With this proposed feature, it's no longer obvious where a value is dropped.

If _1 is only dropped at the end of a function or a closure, then the behavior would be consistent. I don't think there is any gain in dropping at expression/block level.

Side note: if you expect that you'll operate with the result of formatting frequently, consider making your functions accept any T: Display or T: ToString instead of &str. In this manner you could pass the result of format_args!(…) directly, without having to allocate an intermediate string.
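
A rough sketch of that idea, assuming a hypothetical do_something that writes its argument into a caller-provided buffer (the out parameter and the bracket formatting are made up for illustration):

```rust
use std::fmt::{self, Display, Write};

// Hypothetical stand-in for do_something: accepts anything that implements
// Display and writes it straight into `out`.
fn do_something(out: &mut impl Write, message: impl Display) -> fmt::Result {
    write!(out, "[{}]", message)
}

fn main() -> fmt::Result {
    let mut out = String::new();

    // format_args! produces a fmt::Arguments value; no intermediate String
    // is allocated for the formatted message itself.
    do_something(&mut out, format_args!("a {} string", "new"))?;

    // A plain &str implements Display as well.
    do_something(&mut out, "plain")?;

    assert_eq!(out, "[a new string][plain]");
    Ok(())
}
```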

You don't want to take something like ToString, because then you have to convert a &str to a String before you can use it, which allocates and copies.

Instead, take an AsRef<str>, which lets you take a String or &str and use it directly.
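
As a small sketch (do_something here is a hypothetical stand-in that just returns the length of its input):

```rust
// Hypothetical stand-in for do_something: generic over anything that can be
// viewed as a &str, so no conversion to String is needed.
fn do_something(text: impl AsRef<str>) -> usize {
    let text: &str = text.as_ref();
    text.len()
}

fn main() {
    let owned: String = format!("a {} string", "new");

    assert_eq!(do_something("literal"), 7); // &str
    assert_eq!(do_something(&owned), 12); // &String, borrowed
    assert_eq!(do_something(owned), 12); // String, moved into the call
}
```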

Alternatively, even if you take a &str, the caller can pass &format!(...) or format!(...).as_str(), as long as the lifetime doesn't need to last longer than the call to do_something. The compiler already extends the lifetime of temporaries to the end of the statement.

There are several reasons to drop values as early as possible, and not just at the end of a function, for example:

  • freeing heap memory as early as possible reduces the total memory usage
  • not dropping a MutexGuard or RefMut can cause deadlocks or panics.
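
To illustrate the second point with RefCell, here is a minimal sketch (the function names are made up for the example). It uses try_borrow_mut so that the conflict shows up as an Err instead of a panic:

```rust
use std::cell::RefCell;

// Each RefMut temporary is dropped at the end of its own statement, so the
// two mutable borrows never overlap.
fn sequential_borrows() -> Vec<i32> {
    let cell = RefCell::new(vec![1, 2]);
    cell.borrow_mut().push(3);
    cell.borrow_mut().push(4);
    cell.into_inner()
}

// Keeping the guard alive in a named binding makes a second mutable borrow
// fail; a plain borrow_mut() would panic at that point.
fn second_borrow_fails_while_guard_lives() -> bool {
    let cell = RefCell::new(vec![1, 2]);
    let guard = cell.borrow_mut();
    let failed = cell.try_borrow_mut().is_err();
    drop(guard);
    failed
}

fn main() {
    assert_eq!(sequential_borrows(), vec![1, 2, 3, 4]);
    assert!(second_borrow_fails_while_guard_lives());
}
```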

This can be achieved by looking up the last reference to a variable (inside the scope of a function), which would allow dropping even sooner than end-of-block.

The above block is only executed once (unlike functions, closures, and loop bodies).

Drop order is stable, and rustc has explicitly committed to not changing drop order.

Dropping values with nontrivial Drop implementations early, as you seem to be suggesting, would be a hugely destructive breaking change.

The problem becomes evident in this example:

fn deadlock(mutex: Mutex<Foo>) {
    do_something(mutex.lock().unwrap().foo());
    do_something(mutex.lock().unwrap().bar());
}

If the call to foo() borrows the MutexGuard, and the compiler therefore drops it only at the end of the function, it will result in a deadlock (I didn't test it, please correct me if I'm wrong).

I'm not sure which comment you are referring to. I don't think the drop order should be changed.

Yes, thinking about it, I should have written Display only (and your comment about lifetimes also applies indeed, fair enough).

This works already:

do_something(&expression_that_creates_string());

For this particular example, yes.

What about a.as_ref().foo().as_ref().bar()?

&new_string() should be valid in the exact same situations where new_string().as_ref() is - neither has special rules. If I recall correctly, the temporary will stay alive for the entire statement, but not longer than that.

By that I mean, as long as do_something has no particular lifetime requirements on the &str it consumes, even this is valid:

do_something(format!("a {} string", "new").as_str());

For example, this compiles in the playground:

fn do_something(v: &str) {}

fn main() {
    do_something(format!("a {} string", "new").as_str());
}

So if there were extra restrictions on the lifetime do_something can accept that would cause the example in the OP to fail, then do_something(&format!("")) would fail as well.


Responding to the OP, I think something similar to what you're suggesting already happens - it just isn't as broad. Rather than transforming your code into

let text: String = expression_that_creates_string;
do_something(text.as_str());

Rust transforms it into

{
    let text: String = expression_that_creates_string;
    do_something(text.as_str())
};

The temporary is guaranteed to live as long as the statement, but it lives only that long, not as long as the entire block the statement is in.
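
One way to observe this is to log when the temporary is dropped. A sketch, assuming a hypothetical Noisy type that stands in for the String and records its drop in a shared log:

```rust
use std::cell::RefCell;

// Hypothetical helper type: records its own drop in a shared log.
struct Noisy<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl Noisy<'_> {
    fn as_str(&self) -> &str {
        "text"
    }
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("temporary dropped");
    }
}

fn do_something(log: &RefCell<Vec<&'static str>>, _text: &str) {
    log.borrow_mut().push("do_something ran");
}

fn trace() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    // The Noisy temporary lives until the end of this statement.
    do_something(&log, Noisy { log: &log }.as_str());
    log.borrow_mut().push("next statement");
    log.into_inner()
}

fn main() {
    assert_eq!(
        trace(),
        vec!["do_something ran", "temporary dropped", "next statement"]
    );
}
```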

As others have stated, this is preferable in general, and allows much nicer usage with mutexes and such. For instance, how long would you expect this code to hold the mutex for?

let x = calc_a;
mutex.lock().unwrap().insert(x);
let y = calc_b;

Under Rust's current rules, the mutex is unlocked directly before the let y = ... line. If we went with your suggestion instead, then the mutex would remain locked through calc_b. I'd argue the confusion in this case wouldn't be worth the extra simplicity when dealing with things like strings.
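
A quick way to check the current behavior is try_lock, which returns an error instead of blocking when the lock is held. A sketch, assuming std::sync::Mutex around a HashSet (the thread doesn't say what the mutex protects) and made-up function names:

```rust
use std::collections::HashSet;
use std::sync::Mutex;

// The guard is a temporary here, so it is released at the end of the
// insert statement.
fn locked_after_temporary_guard() -> bool {
    let mutex = Mutex::new(HashSet::new());
    let x = 1;
    mutex.lock().unwrap().insert(x);
    let still_locked = mutex.try_lock().is_err();
    still_locked
}

// With the guard in a named binding, the lock is held until the guard is
// dropped, which is roughly what extending the temporary would do.
fn locked_while_guard_is_named() -> bool {
    let mutex = Mutex::new(HashSet::new());
    let mut guard = mutex.lock().unwrap();
    guard.insert(1);
    let still_locked = mutex.try_lock().is_err();
    drop(guard);
    still_locked
}

fn main() {
    assert!(!locked_after_temporary_guard());
    assert!(locked_while_guard_is_named());
}
```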

There could be a way around that: only extending the drop if the lifetime required it. I think this would probably also be bad, though, since it'd become even harder to analyze the behavior of any particular code segment. In addition, all valid Rust code can currently be compiled without understanding lifetime bounds, and breaking that for ergonomics probably wouldn't be a good idea.

Hope that wasn't overly harsh! I think looking for ways to improve the way Rust handles things is good, but there are a lot of pitfalls when dealing with drop rules. Plus, as CAD97 pointed out, these rules are stable. Changing the drop order so that intermediate values are kept longer could and would break code.

There is an accepted RFC for this (RFC 66), though it is extremely old, somewhat vague, and was never implemented.

It wouldn't be a breaking change to relax only cases which currently lead to a compilation error.

Off topic, but isn't that symptomatic of a breakdown somewhere in the RFC process?

Once upon a time there was a proposal to add eager drop to Rust. If this was generalized to an opt-in "side-effect-free drop", then it would give freedom to the compiler to shorten — or extend — lifetime of such objects without risk of interfering with mutexes or other objects relying on the stable drop order.

This gets into complicated edge cases, so I don't think it's that simple. I think it's actually more like

do_something(match expression_that_creates_string { x => x.as_str() })

Though I'm bad at remembering the exact rules for how that's different.

This is not true, by the way. There are special rules to extend temporary lifetimes. They currently apply to the first expression but not the second. RFC 66, when implemented, will extend them to both.

This compiles today:

fn new_string() -> String {
    "hello".into()
}

let x: &str = &new_string();
dbg!(x);

This is an error today, but RFC 66 makes it valid code:

let x: &str = new_string().as_ref();
dbg!(x);


Thank you for correcting me there! I didn't realize it did actually have special handling.

I feel like I'm misunderstanding something else here, though. Wouldn't RFC 66 change the meaning of the following code?

struct NoisyDrop(u32);
impl Drop for NoisyDrop {
    fn drop(&mut self) {
        println!("dropped {}!", self.0);
    }
}
impl NoisyDrop {
    fn no_op_ref(&self) -> &NoisyDrop {
        self
    }
}

fn uses_noisy_drop(v: &NoisyDrop) {
    println!("used {}", v.0);
}

fn main() {
    let x = NoisyDrop(4).no_op_ref();
    println!("middle of function");
    // uses_noisy_drop(x)
}

Right now, this outputs

dropped 4!
middle of function

If the temporary that x borrows from were extended to live through the function, as it would be if x were bound to &NoisyDrop(4), I would expect it to instead output

middle of function
dropped 4!
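
For comparison, here is a sketch that runs both forms under today's rules and records the order of events. It assumes a variant of NoisyDrop that pushes to a shared log instead of printing; the log field and the via_method/via_borrow names are made up for the example:

```rust
use std::cell::RefCell;

// Variant of NoisyDrop that records its drop in a log instead of printing.
struct NoisyDrop<'a>(u32, &'a RefCell<Vec<String>>);

impl Drop for NoisyDrop<'_> {
    fn drop(&mut self) {
        self.1.borrow_mut().push(format!("dropped {}!", self.0));
    }
}

impl NoisyDrop<'_> {
    fn no_op_ref(&self) -> &Self {
        self
    }
}

// Method call on a temporary: no lifetime extension, so the value is
// dropped at the end of the let statement.
fn via_method(log: &RefCell<Vec<String>>) {
    let _x = NoisyDrop(4, log).no_op_ref();
    log.borrow_mut().push("middle of function".to_string());
}

// Direct borrow in a let initializer: the temporary is extended to the end
// of the enclosing block.
fn via_borrow(log: &RefCell<Vec<String>>) {
    let _x = &NoisyDrop(4, log);
    log.borrow_mut().push("middle of function".to_string());
}

fn main() {
    let log = RefCell::new(Vec::new());
    via_method(&log);
    assert_eq!(log.into_inner(), vec!["dropped 4!", "middle of function"]);

    let log = RefCell::new(Vec::new());
    via_borrow(&log);
    assert_eq!(log.into_inner(), vec!["middle of function", "dropped 4!"]);
}
```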