Request for a smarter lifetime: format!(...).as_str() should be valid

KSXGitHub · May 25, 2020, 4:21am

Problem

I often find myself writing this pattern of code:

// verbose code
let text: String = expression_that_creates_string;
do_something(text.as_str());

The above code could have been shortened (see below) if not for lifetime restrictions.

// shortened code
do_something(expression_that_creates_string.as_str());
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ error: does not live long enough

Request

"shortened code" above should be transformed into "verbose code" by the compiler.

Aloso · May 25, 2020, 4:54am

The shortened code is equivalent to

do_something({
    let _1 = expression_that_creates_string;
    _1.as_str()
})

which doesn't work because _1 is dropped at the end of the block. To make it work, the compiler could implicitly change it to

let _1;
do_something({
    _1 = expression_that_creates_string;
    _1.as_str()
})

(this would actually happen in the MIR)
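The hand-written form of that transformation can be tried today. Here is a minimal sketch, assuming hypothetical `make_string` and `do_something` helpers that stand in for the placeholders above; the name `slot` plays the role of `_1`.

```rust
// Hypothetical stand-in for `expression_that_creates_string`.
fn make_string() -> String {
    format!("a {} string", "new")
}

// Hypothetical consumer; it returns the length so the result is observable.
fn do_something(s: &str) -> usize {
    s.len()
}

fn hoisted() -> usize {
    // Declare the slot first so that it outlives the block below.
    let slot;
    do_something({
        // Deferred initialization: the String now lives in the outer scope,
        // so the borrow returned from the block stays valid for the call.
        slot = make_string();
        slot.as_str()
    })
}

fn main() {
    println!("{}", hoisted());
}
```

The String here is dropped at the end of `hoisted`, not at the end of the inner block, which is exactly the change in drop location being discussed.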

For backwards compatibility, the compiler can only do this in situations that would otherwise fail to compile.

I'm concerned that this might add more complexity to the language, which makes it more ergonomic but also more difficult to comprehend. With this proposed feature, it's no longer obvious where a value is dropped.

KSXGitHub · May 25, 2020, 5:10am

If _1 is only dropped at the end of a function or a closure, then the behavior would be consistent. I don't think there is any gain in dropping at expression/block level.

H2CO3 · May 25, 2020, 6:24am

Side note: if you expect that you'll operate with the result of formatting frequently, consider making your functions accept any T: Display or T: ToString instead of &str. In this manner you could pass the result of format_args!(…) directly, without having to allocate an intermediate string.
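As a rough illustration of that suggestion, the sketch below assumes a hypothetical `append` function that writes any `Display` value into an existing buffer. The caller passes `format_args!(...)` directly, so no intermediate String is built on the caller's side.

```rust
use std::fmt::{Display, Write};

// Hypothetical consumer: accepts anything that implements `Display`
// and writes it into a caller-provided buffer.
fn append<T: Display>(buf: &mut String, value: T) {
    write!(buf, "{}", value).expect("writing to a String should not fail");
}

fn demo() -> String {
    let mut buf = String::new();
    // `format_args!` yields a `fmt::Arguments`, which implements `Display`.
    append(&mut buf, format_args!("a {} string", "new"));
    // Plain string slices and numbers work through the same bound.
    append(&mut buf, ", ");
    append(&mut buf, 42);
    buf
}

fn main() {
    println!("{}", demo());
}
```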

josh · May 25, 2020, 6:31am

You don't want to take something like ToString, because then you have to convert a &str to a String before you can use it, which allocates and copies.

Instead, take an AsRef<str>, which lets you take a String or &str and use it directly.

Alternatively, even if you take a &str, the caller can pass &format!(...) or format!(...).as_str(), as long as the lifetime doesn't need to last longer than the call to do_something. The compiler already extends the lifetime of temporaries to the end of the statement.
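Both options can be sketched as follows. The function names `shout` and `count` are made up for illustration; only `AsRef<str>`, `format!` and `as_str` come from the discussion.

```rust
// Hypothetical consumer that accepts a `String`, a `&str`, or anything
// else that can be viewed as a string slice.
fn shout<S: AsRef<str>>(text: S) -> String {
    text.as_ref().to_uppercase()
}

// Hypothetical consumer that takes a plain `&str`.
fn count(text: &str) -> usize {
    text.len()
}

fn main() {
    // `AsRef<str>`: borrowed and owned strings are both accepted as-is.
    println!("{}", shout("borrowed"));
    println!("{}", shout(String::from("owned")));

    // `&str`: the temporary String lives until the end of each statement,
    // which is long enough for the call.
    println!("{}", count(&format!("a {} string", "new")));
    println!("{}", count(format!("a {} string", "new").as_str()));
}
```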

Aloso · May 25, 2020, 6:40am

There are several reasons to drop values as early as possible, and not just at the end of a function, for example:

freeing heap memory as early as possible reduces the total memory usage
not dropping a MutexGuard or RefMut can cause deadlocks or panics.

KSXGitHub · May 25, 2020, 6:46am

This can be achieved by looking up the last reference to a variable (inside the scope of a function), which would allow dropping even sooner than end-of-block.

The above block is only executed once (unlike functions, closures, and loop bodies).

CAD97 · May 25, 2020, 6:53am

Drop order is stable, and rustc has explicitly committed to not changing drop order.

Dropping (nontrivial drop) items early like it looks like you suggest would be a destructively huge breaking change.

Aloso · May 25, 2020, 7:02am

The problem becomes evident in this example:

fn deadlock(mutex: Mutex<Foo>) {
    do_something(mutex.lock().unwrap().foo());
    do_something(mutex.lock().unwrap().bar());
}

If the call to foo() borrows the MutexGuard, and the compiler therefore drops it only at the end of the function, it will result in a deadlock (I didn't test it, please correct me if I'm wrong).
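For comparison, under today's rules each guard is a temporary that is dropped at the end of its own statement, so code of this shape runs to completion. The sketch below assumes a hypothetical `Foo` with two fields and a `do_something` that takes a borrowed number; it takes the mutex by reference only so the caller can reuse it.

```rust
use std::sync::Mutex;

// Hypothetical payload type; the fields and accessors are illustrative.
struct Foo {
    foo: u32,
    bar: u32,
}

impl Foo {
    // Both accessors return a borrow that is tied to the MutexGuard.
    fn foo(&self) -> &u32 {
        &self.foo
    }
    fn bar(&self) -> &u32 {
        &self.bar
    }
}

// Hypothetical consumer of the borrowed value.
fn do_something(v: &u32) -> u32 {
    *v
}

fn no_deadlock(mutex: &Mutex<Foo>) -> u32 {
    // The guard created here is released at the end of this statement...
    let a = do_something(mutex.lock().unwrap().foo());
    // ...so the second lock can be acquired.
    let b = do_something(mutex.lock().unwrap().bar());
    a + b
}

fn main() {
    let mutex = Mutex::new(Foo { foo: 1, bar: 2 });
    println!("{}", no_deadlock(&mutex));
}
```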

I'm not sure which comment you are referring to. I don't think the drop order should be changed.

H2CO3 · May 25, 2020, 8:32am

Yes, thinking about it, I should have written Display only (and your comment about lifetimes also applies indeed, fair enough).

kornel · May 25, 2020, 12:34pm

This works already:

do_something(&expression_that_creates_string());

KSXGitHub · May 25, 2020, 1:08pm

For this particular example, yes.

What about a.as_ref().foo().as_ref().bar()?

daboross · May 25, 2020, 2:54pm

&new_string() should be valid in the exact same situations where new_string().as_ref() is - neither has special rules. If I recall correctly, the temporary will stay alive for the entire statement, but not longer than that.

By that I mean, as long as do_something has no particular lifetime requirements on the &str it consumes, even this is valid:

do_something(format!("a {} string", "new").as_str());

Concretely, this compiles in the playground:

fn do_something(v: &str) {}

fn main() {
    do_something(format!("a {} string", "new").as_str());
}

So if there were extra restrictions on the lifetime do_something can accept that would cause the example in the OP to fail, then do_something(&format!("")) would fail as well.

Responding to the OP, I think something similar to what you're suggesting already happens - it just isn't as broad. Rather than transforming your code into

let text: String = expression_that_creates_string;
do_something(text.as_str());

Rust transforms it into

{
    let text: String = expression_that_creates_string;
    do_something(text.as_str())
};

The temporary is guaranteed to live as long as the statement, but it lives only that long, not as long as the entire block the statement is in.

As others have stated, this is preferable in general, and allows much nicer usage with mutexes and such. For instance, how long would you expect this code to hold the mutex for?

let x = calc_a;
mutex.lock().insert(x);
let y = calc_b;

Under Rust's current rules, mutex is unlocked directly before the let y = ... line. If we went with your suggestion instead, then mutex would remain locked through calc_b - I'd argue the confusion in this case wouldn't be worth the extra simplicity when dealing with things like strings.
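The difference can be observed with `try_lock`. This is a small sketch, assuming a `Mutex<HashSet<u32>>` and two hypothetical helper functions; a named guard stands in for what a longer-lived temporary would do.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

// Temporary guard: released at the end of the `insert` statement.
fn temporary_guard(mutex: &Mutex<HashSet<u32>>) -> bool {
    mutex.lock().unwrap().insert(1);
    // The lock is free again, so `try_lock` succeeds.
    let free = mutex.try_lock().is_ok();
    free
}

// Named guard: stays alive until it goes out of scope or is dropped,
// which models a temporary kept alive past its statement.
fn named_guard(mutex: &Mutex<HashSet<u32>>) -> bool {
    let mut guard = mutex.lock().unwrap();
    guard.insert(2);
    // The lock is still held, so `try_lock` reports failure.
    let free = mutex.try_lock().is_ok();
    drop(guard);
    free
}

fn main() {
    let mutex = Mutex::new(HashSet::new());
    println!("free after temporary guard: {}", temporary_guard(&mutex));
    println!("free while named guard is alive: {}", named_guard(&mutex));
}
```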

There could be a way around that - only extending the drop if the lifetime required it. I think this would probably also be bad, though, since it'd become even harder to analyze the behavior of any particular code segment. In addition, all valid Rust code can currently be compiled without understanding lifetime bounds - breaking that for ergonomics probably wouldn't be a good idea?

Hope that wasn't overly harsh! I think looking for ways to improve the way Rust handles things is good, but there are a lot of pitfalls when dealing with drop rules. Plus, as CAD97 pointed out, these rules are stable. Changing the drop order so that intermediate values are kept longer could and would break code.

mbrubeck · May 25, 2020, 3:45pm

There is an accepted RFC for this, though it is extremely old, somewhat vague, and never implemented.

kornel · May 27, 2020, 1:29pm

It wouldn't be a breaking change to relax only cases which currently lead to a compilation error.

jjpe · May 27, 2020, 1:38pm

Off topic, but isn't that symptomatic of a breakdown somewhere in the RFC process?

kornel · May 27, 2020, 1:39pm

Once upon a time there was a proposal to add eager drop to Rust. If this was generalized to an opt-in "side-effect-free drop", then it would give freedom to the compiler to shorten — or extend — lifetime of such objects without risk of interfering with mutexes or other objects relying on the stable drop order.

scottmcm · May 27, 2020, 4:27pm

This gets into complicated edge cases, so I don't think it's that simple. I think it's actually more like

do_something(match expression_that_creates_string { x => x.as_str() })

Though I'm bad at remembering the exact rules for how that's different.
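A sketch of the match form that compiles is shown below, with hypothetical `make_string` and `do_something` helpers. It uses a `ref` binding, which is an assumption about the intent: with a by-value binding the String would be moved into the arm and, as far as I can tell, dropped at the end of it.

```rust
// Hypothetical stand-in for `expression_that_creates_string`.
fn make_string() -> String {
    format!("a {} string", "new")
}

// Hypothetical consumer; it returns the length so the result is observable.
fn do_something(s: &str) -> usize {
    s.len()
}

fn via_match() -> usize {
    // The scrutinee is a temporary that lives until the end of the enclosing
    // statement; `ref x` borrows it instead of moving it into the arm.
    let len = do_something(match make_string() {
        ref x => x.as_str(),
    });
    len
}

fn main() {
    println!("{}", via_match());
}
```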

mbrubeck · May 27, 2020, 4:51pm

This is not true, by the way. There are special rules to extend temporary lifetimes. They currently apply to the first expression but not the second. RFC 66, when implemented, will extend them to both.

This compiles today:

fn new_string() -> String {
    "hello".into()
}

let x: &str = &new_string();
dbg!(x);

This is an error today, but RFC 66 makes it valid code:

let x: &str = new_string().as_ref();
dbg!(x);

Playground

daboross · May 27, 2020, 11:28pm

Thank you for correcting me there! I didn't realize it did actually have special handling.

I feel like I'm misunderstanding something else here, though. Wouldn't RFC 66 change the meaning of the following code?

struct NoisyDrop(u32);
impl Drop for NoisyDrop {
    fn drop(&mut self) {
        println!("dropped {}!", self.0);
    }
}
impl NoisyDrop {
    fn no_op_ref(&self) -> &NoisyDrop {
        self
    }
}

fn uses_noisy_drop(v: &NoisyDrop) {
    println!("used {}", v.0);
}

fn main() {
    let x = NoisyDrop(4).no_op_ref();
    println!("middle of function");
    // uses_noisy_drop(x)
}

Right now, this outputs

dropped 4!
middle of function

If the temporary that x borrows from were extended to live through the function, as it would be if it were &NoisyDrop(4), I would expect it to change to instead output

middle of function
dropped 4!
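Both orders can be reproduced today by comparing a method call on the temporary with a direct borrow of it. The sketch below is a variant of the example above that records events in a log instead of printing them; the log field and the `value` method are additions made for illustration.

```rust
use std::cell::RefCell;

// Variant of NoisyDrop that records its drop in a shared log.
struct NoisyDrop<'a>(u32, &'a RefCell<Vec<String>>);

impl Drop for NoisyDrop<'_> {
    fn drop(&mut self) {
        self.1.borrow_mut().push(format!("dropped {}", self.0));
    }
}

impl NoisyDrop<'_> {
    fn value(&self) -> u32 {
        self.0
    }
}

// Method call on a temporary: it is dropped at the end of the `let` statement.
fn method_call() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _v = NoisyDrop(4, &log).value();
        log.borrow_mut().push("middle of function".to_string());
    }
    log.into_inner()
}

// Direct borrow in a `let`: the temporary is extended to the end of the block.
fn borrow_expr() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _x = &NoisyDrop(4, &log);
        log.borrow_mut().push("middle of function".to_string());
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", method_call());
    println!("{:?}", borrow_expr());
}
```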
