Mandatory inlined functions for unsized types and 'super lifetime?

I guess the most inconvenient thing about not_broken_2_but_in_current_rust is that we have to make an explicit binding for cow even though we don't care about it for anything other than the fact that foo needs to reference it. That problem could use some attention, possibly through a feature for promoting temporaries (which has been discussed before, and would have benefits beyond the inline-function issue). IIRC the main challenge was that temporaries are currently guaranteed to be dropped at the end of the enclosing statement, so we would need an ergonomic, explicit way to say that you want to extend a temporary's lifetime. Probably not like this, but just for example:

fn not_broken_2_with_promoting_temporaries(input: &str) {
  let foo = if some_condition {
    input
  } else {
    // Hypothetical attribute: promote the temporary returned by
    // upper_if_odd so it lives as long as `foo` needs it to.
    &*#[promote_this_temporary_as_needed] upper_if_odd(input)
  };
}

I was thinking of the smallest containing scope originally. But I hadn't considered how that would interact with narrower scopes like if statements and the like, and you make a good point re: ergonomics.

The simplest way to salvage the concept would be if, from the caller's perspective, you could pick the scope the `'super` lifetime applies to by binding the return value to a name, with the default being the smallest enclosing scope:

fn upper_if_odd(s: &str) -> &'super str {
    if s.len() % 2 == 1 {
        let super upper: String = s.to_uppercase();
        &upper
    } else {
        s
    }
}

fn fixed_2(input: &str) {
    let upper;

    let foo = if some_condition {
        input
    } else {
        upper = upper_if_odd(input);
        upper
    };
    /* ... */

    // the hidden String backing `upper` is dropped here
}

However, the more I think about it, the less I like the hidden drop. That particular example isn't a good one either: once you need the rebinding, why not just return an impl Borrow<str> and use Either internally? We're dealing with a heap allocation anyway, so there isn't much need for return value optimization.
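For reference, here's a rough sketch of that alternative in today's Rust using Cow<str>, which is essentially Either<&str, String> and already implements Borrow<str> (some_condition is passed as a parameter here just to keep the sketch self-contained):

use std::borrow::Cow;

// Current-Rust alternative: the return type owns the String when needed.
fn upper_if_odd(s: &str) -> Cow<'_, str> {
    if s.len() % 2 == 1 {
        Cow::Owned(s.to_uppercase())
    } else {
        Cow::Borrowed(s)
    }
}

fn caller(input: &str, some_condition: bool) {
    let foo: Cow<'_, str> = if some_condition {
        Cow::Borrowed(input)
    } else {
        upper_if_odd(input)
    };
    // `&*foo` is a plain &str regardless of which branch was taken.
    let _ = &*foo;
}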

let super and self-referential structs

If `let super` allows us to name stack space in our caller, `'super` lets us reference that stack space, and `&own T` has ownership of T, then we can return both an owned value and a reference into it at the same time:

fn to_upper(s: &'super str) -> (&'super own Option<String>, &'super str) {
    let mut super r: Option<String> = None;
    if s.len() % 2 == 1 {
        (&own r, s)
    } else {
        r = Some(s.to_uppercase());
        let r = &own r;
        (r, r.as_ref().unwrap())
    }
}

fn use_to_upper(s: &str) {
    let maybe_string;
    let s = if some_condition {
        s
    } else {
        let (opt, r) = to_upper(s);
        maybe_string = opt;
        r
    };
    /* do stuff */
    // the Option<String> is dropped here when the &own Option<String> goes out of scope
}

This needs more thought about what exactly the semantics should be, and it's kinda verbose compared to the original. Also, I'm glossing over the exact lifetimes a bit. But it's explicit about what is happening, and plausible.

Secondly, I'll point out that return value optimization in general has issues when you want to return part of a larger object: let super could let you be explicit about what you want to put in the caller's stack frame, even in cases where you're only going to logically return a subset of the data. E.g. if I return the latter half of a ([u8; 1_000_000], [u8; 1_000_000]), I may still want both parts allocated once, at the top of the call stack, so as to avoid copying them over and over.

Hmm.. would unsized_locals actually help here? The RFC says


That is an interesting point. So, memory-wise, that scenario is basically equivalent to returning the entire ([u8; 1_000_000], [u8; 1_000_000]), but for the programmer it's equivalent to returning just the second half. The fact that the whole object goes in the parent stack frame is just an implementation detail. It decouples the logical return value from how the return value is laid out in memory.

I'm still not sure what semantics would be able to accomplish that in a good way, but it seems worth thinking about.

It really just looks like you want what is already provided by so-called return value optimization (RVO).

I'm strongly against changing the language in any direction that complicates reasoning with lifetimes – lifetimes are already hard enough, and I don't even want to imagine how much unsound unsafe code would result from functions that are allowed to violate normal lifetime rules.

As almost always, if you need to violate the memory management principles of Rust, it's overwhelmingly likely that you are doing something wrong, and you should re-design your algorithms and refactor your code instead of trying to bend the language so that it allows for more sloppy code.

It would be great if you could provide a concrete, practical use case where your proposal allows something that isn't possible with well-structured, safe code along with rustc's and LLVM's existing optimizations.


I think it's enough if let super simply means that the value is put in caller-provided memory, and that returning a value (or part of a value) bound with let super composes recursively.

Consider this example:

fn make_pair() -> ([Foo; 1000], [Bar; 1000]) {
    /* ... */
}

fn first_half() -> [Foo; 1000] {
    let super pair: ([Foo; 1000], [Bar; 1000]) = make_pair();
    pair.0
}

fn call_half_pair() -> Foo {
    let mut half_pair: [Foo; 1000] = first_half();
    mem::replace(&mut half_pair[0], Foo::default())
}

Assuming that Foo and Bar have drop glue, and using &own to indicate returned ownership (the actual asm doesn't need to return a pointer, as it has a known offset), that would desugar to:

fn make_pair(r: &own MaybeUninit<([Foo; 1000], [Bar; 1000])>) -> &own ([Foo; 1000], [Bar; 1000]) {
    /* ... */
}

fn first_half(r: &own MaybeUninit<([Foo; 1000], [Bar; 1000])>) -> &own [Foo; 1000] {
    let pair = make_pair(r);
    &own pair.0
    // [Bar; 1000] dropped here
}

fn call_half_pair(r: &own MaybeUninit<Foo>) -> &own Foo {
    let scratch = MaybeUninit::uninit();
    let half_pair: &own [Foo; 1000] = first_half(&own scratch);

    let returned_foo = mem::replace(&mut half_pair[0], Foo::default());

    /* copy returned_foo into r, etc. */

    // half_pair dropped here
}

Critically, note how the last function, call_half_pair, did not use let super, which means it has to copy the one Foo that it does return into the caller-provided return slot, as usual.

Now, this may look like "spooky action at a distance", because suddenly callers have to provide some unknown amount of stack space that isn't visible in the function definition. But remember, that's how it works already! There's no way to know, up front from a function's declaration, the total amount of stack space a call to it will use; asking the caller to provide it from their stack frame just expedites the process. You could even provide some low-level unsafe intrinsics to determine the layout of the required space (remember that the maximum possible size is known up front if let super is restricted to sized types) and provide it from somewhere other than the stack. That might actually be worthwhile for things similar to Box::new_with, even if you have to shrink the actual allocation later.

I'm not. There are a few ideas I'm talking about in this thread, an important category of them being the ability to return unsized types. Currently, Rust functions can only return sized values, and it's not at all clear what the best way to support unsized return values is. RVO can't help with that use case.

Secondly, RVO can't decide what is the best trade-off when you're logically returning part of a large value. Read the two comments above yours.

My intent here is to explore adding new forms of lifetime and memory management to the Rust language, in part to avoid the need for unsafe code.

Just for the sake of argument, how would this work? You have the caller's frame on top and the callee's frame below it (the stack grows down on x86-64, if my memory serves). Where will this Option<String> go? Is it an alloca below the callee's frame?

At the ABI level, let super means the caller has to provide a pointer to some number of bytes of memory from its own stack frame (or potentially somewhere else). It's a similar idea to how large values are already returned by writing to a pointer provided by the caller.

So no alloca: rather the opposite actually, as the let super requirements for the entire call tree would be known in advance at compile time.
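For intuition, here's a hand-written sketch in current Rust of that same ABI shape (the names are made up for the example): the caller reserves the space in its own frame and passes a pointer, and the callee writes into it, much like the compiler already does for large return values.

use std::mem::MaybeUninit;

// The callee fills in caller-provided space instead of returning by value.
// (A real compiler would also avoid materializing the temporary array on the
// callee's stack; this only illustrates who owns the storage.)
fn make_big(out: &mut MaybeUninit<[u8; 1_000_000]>) -> &mut [u8; 1_000_000] {
    out.write([0u8; 1_000_000])
}

fn caller() {
    // The storage lives in *this* frame; make_big never owns a copy of it.
    let mut slot = MaybeUninit::uninit();
    let big = make_big(&mut slot);
    big[0] = 1;
}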

Ahh, that can be done because the function is inlined, I see. I've been a slow thinker today.

Actually no! My first writeup at the very top of this thread was to do this via inlining. But I then realized I didn't actually need that, as calls to non-inlined functions can just supply the necessary stack space in the frame of the caller via a pointer.

Now, if you do need alloca, I don't see a way to avoid inlining (or something entirely different like returning a closure, CPS, etc.). But many things don't need it.

Looks like your latest idea is to have multiple implicit "return references" – like &mut, but they start uninitialized and the only way to interact with them is to initialize them (which turns them into a regular &mut).

Why not approach this by explicitly adding such references, maybe called &return or &out? Or, actually, maybe this can be implemented in user code? I feel like I've read previous discussions of this idea, but unfortunately I can't remember them, and I'm not sure what search terms to use to find them again. But making it explicit would avoid the current issues where you need a special type of function that can only be inlined.
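Something roughly along those lines can be sketched in user code today on top of MaybeUninit. The Out type and its methods below are invented for illustration, not an existing API, and they gloss over the hard part (nothing ever drops the written value):

use std::mem::MaybeUninit;

// A user-space "out reference": starts uninitialized, and the only safe thing
// you can do is initialize it, which hands back a regular &mut.
struct Out<'a, T>(&'a mut MaybeUninit<T>);

impl<'a, T> Out<'a, T> {
    fn new(slot: &'a mut MaybeUninit<T>) -> Self {
        Out(slot)
    }

    fn write(self, value: T) -> &'a mut T {
        let Out(slot) = self;
        slot.write(value)
    }
}

// A callee that must initialize the caller-provided slot to produce a &mut.
fn fill(out: Out<'_, String>) -> &mut String {
    out.write("hello".to_owned())
}

fn main() {
    let mut slot = MaybeUninit::uninit();
    let s = fill(Out::new(&mut slot));
    s.push_str(", world");
    // Caveat: the String is leaked when `slot` goes out of scope, since
    // MaybeUninit never runs drop glue. Handling that is where a real
    // &out/&own design gets interesting.
}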


But how do they know how much to supply?

I've been running in circles around a similar design in my mind countless times and never arrived at a solution that looked feasible. The desire to return objects of unknown size on the stack is not such a rare thing! Many people would love the ability.

Sorry, that's out of date: I realized after writing my initial post that inline isn't actually necessary, and both inlined and non-inlined functions can use these techniques.

Explicit is of course possible too. But that's quite a bit less ergonomic, and exposes implementation details unnecessarily.

Even if they don't need to be inlined, they're still special because they can't be used as function pointers or Fn trait objects with their as-written signatures.


I understand, and I'm worried about how it will interact with unsafe, since it's conceptually very different from the rules of today's Rust. IOW, this is a problem that involves the human factor.

See my message above: Mandatory inlined functions for unsized types and 'super lifetime? - #8 by petertodd

You can statically analyse the total size of all let super requirements, in the same way that you can statically analyse the size of each stack frame.

Of course, this isn't magic: if you try to use this in a recursive function, compilation will obviously fail as you're effectively creating a recursive type of infinite size. But that exact same problem already exists in recursive functions returning impl Trait.
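For comparison, here's a minimal current-Rust example of the impl Trait limitation being referred to; it intentionally fails to compile, because the returned opaque Future would have to contain itself:

// error[E0733]: recursion in an `async fn` requires boxing
//
// An `async fn` returns an `impl Future` whose state machine embeds the state
// of everything it awaits, so awaiting yourself would give the type infinite
// size. That's the same wall a recursive `let super` call tree would hit.
async fn countdown(n: u32) {
    if n > 0 {
        countdown(n - 1).await;
    }
}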

It's interesting that you mention impl Trait.
There does seem to be a degree of similarity here.

Okay, what if the result is needed not one but two stack frames up? Three? Forty two?

As I mentioned above, you can in fact get the size and alignment of the scratchpad space from the vtable, and allocate it with alloca prior to calling the function.
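As a point of reference, size and alignment are already retrievable from a trait object's vtable at runtime in today's Rust; this is the same kind of query a scratchpad allocation would rely on (the function name here is just for the example):

use std::alloc::Layout;
use std::fmt::Display;

fn layout_behind_vtable(x: &dyn Display) -> Layout {
    // For a trait object, Layout::for_value reads the size and alignment
    // fields out of the vtable at runtime.
    Layout::for_value(x)
}

fn main() {
    println!("{:?}", layout_behind_vtable(&5u8));
    println!("{:?}", layout_behind_vtable(&String::new()));
}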

That said, impl Trait isn't compatible with either function pointers or Fn trait objects, and it's still quite useful. I'm ok with that trade-off.

That's my intent! If you actually have a forty-two-level-deep call graph doing something useful, this could be very helpful in ensuring that you really are returning values directly to the top-level caller, rather than repeatedly copying them from frame to frame along the way. And let super would let you be explicit about when to copy vs. when to return into the original caller's frame.

It's similar to how a very complex impl Future tree can result in one large structure, allocated in advance.
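A small illustration of that last point in current Rust (function names here are just for the example): nesting async fns produces one flat state machine whose total size is known before anything runs.

async fn leaf() -> u8 {
    1
}

async fn branch() -> u8 {
    leaf().await + leaf().await
}

fn main() {
    // The whole tree of nested futures is a single value with a statically
    // known size; polling it never allocates additional future state.
    let fut = branch();
    println!("{} bytes", std::mem::size_of_val(&fut));
}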