IDEA: relative 'static lifetime

The 'static lifetime plays a very special role in today’s Rust. Some APIs (Any, for example) depend on it to work, and it is regarded as the lifetime that leaked memory ends up with, since we don’t currently support leak-free memory management.

This proposal introduces a keyword, initially named leakproof (suggest a better name!), with the following properties:

  1. It appears in front of blocks or function bodies, like the unsafe keyword. I am not sure about the semantics in traits yet.
  2. Inside the leakproof block/function it decorates, any value created before the block or in its top scope has its lifetime promoted to 'static, but is still dropped after the block/function body, as usual.
  3. Any reference to 'static inside a leakproof scope (directly or indirectly) refers to the innermost leakproof scope.
  4. std::mem::forget will only delay the drop glue until the exit of the leakproof block.
  5. Box::leak also only delays the drop glue, unless Box::from_raw is used on its result.
  6. Any threads created inside a leakproof block will be killed on exit, to resolve leaks caused by deadlock.
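
For reference, here is a minimal sketch of what std::mem::forget and Box::leak do today, which is the behaviour items 4 and 5 would change. The names Noisy and drop_counts are made up for this illustration; only the std calls are real.

```rust
use std::cell::Cell;
use std::mem;

/// Bumps a shared counter when dropped, so we can observe whether the
/// destructor ran. (Hypothetical helper type for this sketch.)
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns how many destructors ran for
/// (a normal drop, mem::forget, Box::leak followed by Box::from_raw).
fn drop_counts() -> (u32, u32, u32) {
    // Normal scope exit: the destructor runs once.
    let normal = Cell::new(0);
    {
        let _n = Noisy(&normal);
    }

    // mem::forget: today the destructor never runs.
    let forgotten = Cell::new(0);
    mem::forget(Noisy(&forgotten));

    // Box::leak gives up ownership; Box::from_raw takes it back.
    let reclaimed = Cell::new(0);
    let leaked = Box::leak(Box::new(Noisy(&reclaimed)));
    let raw: *mut Noisy<'_> = leaked;
    // SAFETY: `raw` came from Box::leak and is not used again afterwards.
    unsafe {
        drop(Box::from_raw(raw));
    }

    (normal.get(), forgotten.get(), reclaimed.get())
}
```

Calling drop_counts() from main shows that the forgotten value is never dropped today; under the proposal its destructor would presumably run when the enclosing leakproof block exits.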

Motivations

Dynamic code loading

Technically, 'static only makes sense if all code is loaded statically. If we are going to load/unload code dynamically, the 'static lifetime in the loaded code is only relative to the load/unload operation. This is not a problem today because we only support dynamic loading through FFI, and foreign functions do not support lifetimes. But my dream is rdylib: loading Rust-only code dynamically!

Needless to say, 'static is in fact relative to the process!

Lifetime extension of FFI

Introducing lifetimes in FFI will make it possible to interact with foreign languages that also support lifetimes. This is not too common at the moment, but it would be future-proof.

Lifetime in RPC APIs

When designing RPC APIs, an object’s lifetime can naturally outlive 'static (relative to the process): a server may use 'session lifetimes that outlive the client’s 'static, as they can remain valid after the client restarts.

Memory leaking control

For the same reason, in reality memory-leaking programs are controlled by manually killing their processes. If we had relative 'static lifetimes, we could do this programmatically.

Making APIs more usable

Allow APIs like Any to be available to types that are not strictly 'static.
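
To make the restriction concrete, here is a small sketch of how Any behaves today. The helper names as_string and any_demo are invented for the example; std::any::Any and downcast_ref are the real std API.

```rust
use std::any::Any;

/// Tries to view a type-erased value as a `String`.
fn as_string(value: &dyn Any) -> Option<&String> {
    value.downcast_ref::<String>()
}

/// Returns (did a String downcast succeed, did an i32 downcast to String succeed).
fn any_demo() -> (bool, bool) {
    let owned = String::from("hello");

    // `String` is a 'static type, so `&owned` can be coerced to `&dyn Any`.
    let is_string = as_string(&owned).is_some();

    // `i32` is 'static too, but it is not a `String`, so the downcast fails.
    let is_int = as_string(&42_i32).is_some();

    // A type that contains a non-'static borrow cannot be used as `dyn Any`.
    // The next two lines are rejected by the compiler if uncommented:
    // let borrowed: &str = &owned;
    // let erased: &dyn Any = &borrowed;

    (is_string, is_int)
}
```

The commented-out lines are the case the proposal is aiming at: a value that borrows from something long-lived, but not strictly 'static.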

Simplifying code that uses a fixed set of long-lived lifetimes

rustc itself uses a few lifetimes that live long enough to benefit from being made relatively 'static.

Implementation thoughts

I guess the best implementation strategy is to introduce a new region for each leakproof block, and have the lifetime erasure procedure simply skip these regions, so that monomorphization generates different versions for them.

Drawbacks

  1. One more keyword
  2. Code size explosion, if we use the implementation strategy I described above

Alternatives

  1. Do nothing, if we never want to support native dynamic code loading
  2. Instead of promoting everything to 'static, use a lifetime 'persist, let every lifetime longer than 'persist survive lifetime erasure, and relax the 'static restriction on APIs like Any to 'persist (more to discuss)

Unresolved issues

  1. The exact behavior for panic/unwinding
  2. Interaction between variables inside and outside a leakproof block: shall we forbid storing a value created inside into an outside reference if it is not Copy?

That definitely isn't true. Consider the following trivial counterexample:

/// In one crate/module
pub fn loaded_dynamically(arg: &'static str) -> &'static str {
    arg
}

/// In another, loading/"host" crate/module
fn another_function() {
    let fptr: fn(&'static str) -> &'static str = dynamic_load("othercrate::loaded_dynamically"); // or something
    let s = fptr("hello world");
    dynamic_unload(fptr); // or something, could/should probably be scope/RAII-based
    // here, `s` should still be valid because it refers to a string literal inside this crate
}

Isn’t this exactly how non-static lexical lifetimes already work? What code would be enabled by static-but-not-really-static lifetimes that isn’t possible with regular non-static lifetimes, yet would still be sound and not require unsafe?

This seems like a complete non-starter because—unless I’ve wildly misunderstood many things—threads are simply not part of the core Rust language. Auto traits are, but Send and Sync are “merely” part of the standard library, and threads “merely” an implementation detail of many standard library APIs using those traits.

———

I think you’re trying to propose a general mechanism for what’s called “linear types”, i.e. types that must be “used”/“destroyed”/etc and cannot be leaked/mem::forgotten/etc. If so, see https://gankro.github.io/blah/linear-rust/

I do think we need a mechanism to handle the interaction between lifetimes and dynamic code loading and unloading. However, I don’t know that I’d call that 'static.

If you load Rust code dynamically and you treat it as an opaque block, then that Rust code can assume 'static works as normal; it’d have to take care when returning a reference that would outlive the loaded code, but if loaded and run opaquely then it already has to take the same care it’d take when returning something to C.

If you load Rust code dynamically and you treat it as part of the same type-system universe as the code that loads and unloads it, then you need something better than 'static; you need a lifetime limited to the loaded code, and you need ways to return memory to the code calling it that has an appropriate lifetime preventing the unloading of that code. Effectively, any borrow of the loaded module’s memory must occur with a lifetime contained within the lifetime of the module. If you want to overcome that, then either the caller must clone your object, you must return it by value, you need to write it into the caller’s memory in the first place, you need a memory allocator that returns memory that’ll outlive your module, etc.
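
A rough sketch of the second case in today's Rust, assuming a made-up Module handle (nothing is actually loaded here, and load/get_foo/unload are hypothetical names): every borrow handed out is tied to the handle, so the borrow checker refuses to "unload" while a borrow is still in use, and cloning into caller-owned memory is one way out.

```rust
/// Hypothetical stand-in for a dynamically loaded module.
struct Module {
    /// Pretend this is data living in the module's own static memory.
    data: String,
}

impl Module {
    /// "Loads" the module. A real loader would map code into the process.
    fn load(name: &str) -> Module {
        Module { data: format!("static data of {}", name) }
    }

    /// The returned borrow is tied to `&self`, not to 'static.
    fn get_foo(&self) -> &str {
        &self.data
    }

    /// Consuming `self` models unloading: no borrow may be live at this point.
    fn unload(self) {}
}

fn use_module() -> String {
    let module = Module::load("my_module");
    let borrowed = module.get_foo();

    // Copying into caller-owned memory lets the data outlive the module.
    let owned = borrowed.to_owned();

    module.unload();
    // Using `borrowed` here would be rejected, because `module` was moved
    // while still borrowed:
    // println!("{}", borrowed);

    owned
}
```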

I think even your example shows the point I mentioned. The code you gave is not safe: the function loaded_dynamically can return some local string in the first crate (although your implementation is safe), and unloading the crate will make it invalid memory. Unless the dynamic loader is allowed to look at the function body, there is no way to guarantee safety.

Not quite. I hoped to be able to use Any but it is only available for 'static types.

Linear types are interesting, but I am not talking about them. I am talking about "linear code blocks", which can be a mechanism for native dynamic code loading.

Yes, I am not sure what the best way would be either, which is why I am only posting an "IDEA", not even a "Pre-RFC". We need to think about it and decide which is best.

Yes, I see all that complexity. But I am just proposing some first steps.


My random thoughts

If we attempt to use Rust to model the world, we can see that there would be lifetimes that strictly outlive 'static: we can model serialise as writing to a reference with lifetime 'persistent, and write as writing to a reference with lifetime 'machine; an HTTP RPC can access something with lifetime 'internet, etc. So 'static is not the longest lifetime at all.

I think I understand the distinction, but I don't think it's actually possible to implement true "linear code blocks" without effectively also providing linear types or an effect system or some other similarly complex language extension, because otherwise there's simply no way to detect whether anything transitively invoked inside the block is going to spawn a thread without joining it or create a reference cycle or... create a 'static value by deliberately mem::forgetting something. That last case in particular makes me doubtful whether any sort of "linear"/"leakproof"/"static" feature would be part of the correct solution to dynamic code loading.
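
To illustrate the reference-cycle case with something concrete, here is a small sketch in safe Rust. Node and drops_observed are names invented for the example; the point is only that the destructor never runs once an Rc points at itself.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// A node that counts its own drops and can point at another node.
struct Node {
    drops: Rc<Cell<u32>>,
    next: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Creates one node, optionally makes it point at itself, lets it go out of
/// scope, and reports how many destructors ran.
fn drops_observed(make_cycle: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let node = Rc::new(Node {
            drops: Rc::clone(&drops),
            next: RefCell::new(None),
        });
        if make_cycle {
            // The node now holds a strong reference to itself.
            *node.next.borrow_mut() = Some(Rc::clone(&node));
        }
    } // `node` goes out of scope; with a cycle the strong count stays at 1.
    drops.get()
}
```

Nothing in the signature of drops_observed tells a caller that it may leak, which is why detecting this from outside a block seems to need more than a keyword.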

Interesting, but that feels like an XY problem to me. I'm still not entirely sure what Any is for, but if you need an Any<&non-static T> for some reason, maybe we should be more interested in questions like whether Any<Pin<&T>> could be viable.


More generally, it sounds like the problem we're trying to address is "I want to dynamically load code that operates on a &'static T (or some other 'static-bounded type) only using safe Rust", which obviously can't work today because dynamically loaded code is not "alive" for the whole process. Is that correct?

To me the really interesting question there is can allowing that possibly be sound? Because it's not obvious to me that it can. For instance, what if the dynamically loaded code passes a &'static T through some unsafe FFI API that actually relies on it being 'static?

For "dynamic typing".

It corresponds roughly to Data.Dynamic in Haskell. Here are some more resources for consideration.

It's only unsafe if 'static is not really 'static. I would not want to call a lifetime tied to dynamically-loaded code 'static (exactly to avoid these kinds of confusion), so in my example I only called the true static lifetime 'static.

Most of them were addressed in my top post. I don't have answers to all the relevant problems though, especially how to handle threads and panics/unwinding.

Actually, I am looking at a bigger picture. Right now FFI only supports foreign functions without lifetimes because Rust is the only mainstream language that supports them. But as Rust becomes popular, there will be foreign languages that adopt the concept of lifetimes, and to interact with them better we need to allow lifetimes in FFI.

Other good scenarios include RPC as a superset of FFI. We may need to call a local stub to access an RPC server, and have lifetimes in the API. This is a natural extension of the Rust type system (especially with NLL). So, for example, the RPC API can create a 'session lifetime after login, and then a 'file lifetime after opening a file on the remote server. All those lifetimes may or may not outlive the process's 'static, and need proper handling of resource leaks.
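
As a rough sketch of the 'session / 'file idea with ordinary lifetimes, assuming a purely local stub (Session, RemoteFile and their methods are hypothetical, and no network traffic happens here):

```rust
/// Hypothetical client-side stub for a logged-in session.
struct Session {
    user: String,
}

/// A remote file handle that cannot outlive the session it was opened in.
struct RemoteFile<'session> {
    session: &'session Session,
    path: String,
}

impl Session {
    fn login(user: &str) -> Session {
        Session { user: user.to_owned() }
    }

    /// The handle borrows the session, which plays the role of 'session.
    fn open<'session>(&'session self, path: &str) -> RemoteFile<'session> {
        RemoteFile { session: self, path: path.to_owned() }
    }

    /// Consuming `self` models ending the session.
    fn logout(self) {}
}

impl<'session> RemoteFile<'session> {
    fn describe(&self) -> String {
        format!("{}:{}", self.session.user, self.path)
    }
}

fn rpc_demo() -> String {
    let session = Session::login("alice");
    let file = session.open("/etc/hosts");
    let description = file.describe();

    // The last use of `file` is above, so the session may end here.
    session.logout();
    // Using `file` after logout would be rejected by the borrow checker:
    // file.describe();

    description
}
```

What this sketch cannot express is the part discussed above: a session that stays valid after the client process restarts, i.e. a lifetime that outlives the client's 'static.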

I don't have a formal proof of that, but manual resource management has been working well in safe Rust, and safe Rust is Turing complete, so nothing is stopping us from making it safe.

The point here is that nothing is 'static globally. Even the universe itself has a lifetime. So this kind of unsafety is just a symptom of unsound modeling. We should be able to express the exact lifetime of the FFI API.

This is exactly the point. When writing the first crate, you declare the local string as 'static, and so rustc is happy to return it as 'static. But when the code is loaded dynamically, rustc needs to know that this 'static is only relative in order to ensure type safety.

What you are suggesting is another solution for dynamic code loading: mapping the dynamically loaded code's 'static to a lifetime different from the host program's 'static, without using any keywords. Still, we have to deal with most of the issues I mentioned in the top post, and in the end we again have relative 'static lifetimes.

Having one less keyword sounds good, but I think people will soon ask: if we can have relative 'static for dynamically loaded code, why can't I make use of the same relative 'static in my statically compiled code (e.g. 'tcx in rustc)?

That is exactly the opposite of what I am trying to say. I'm arguing that a lifetime which is not actually static should not be called as such.

Then what is your suggestion for calling a dynamically loaded function that returns local static data? What lifetime would you expect? Or is that forbidden (a big restriction to me)?

Lifetimes which are allowed to be shorter than 'static and permit dynamic loading should be called something else, e.g. 'dyn. Because they mess around with lifetimes, functions that you want to load dynamically should explicitly be designed for this purpose.

Then you are suggesting that dynamically loadable code has to be rewritten? This also rules out the possibility of dynamic code using Any or other APIs that require 'static.

The key argument, however, is that 'static is always relative. Right now it is only relative to the process, but in theory there is a much richer set of lifetimes that may or may not outlive 'static and yet should have the properties that 'static has. Therefore, the special role that 'static plays should be thought of as something else, and some language feature should be used to specify 'static-like lifetimes.

To be clear, are you suggesting there be something like a lifetime that basically means “lives as long as this dynamic module is loaded, but no longer”?

If so, that isn’t 'static, is it? 'static is defined as “the lifetime that lives as long as the process”. If a loadable module has a function that is called to produce an “instance” of the loadable module, then that instance has a lifetime. Any method/function in the module should be called through that instance, and anything that is returned should, lifetime-wise, be tied to the lifetime of the instance, no?

Or am I misunderstanding the dilemma?

Your understanding is what I called "relative 'static": the lifetime tied to the dynamic code is not 'static in the host program/process, although it is defined as 'static in the dynamic code.

However, @H2CO3's argument is that this should not be the case: he suggests we should not allow calling functions in dynamically loaded code that return 'static data, as that data might not actually be 'static in the host program.

To me, this is really a big restriction on dynamic code loading, as we would have to rewrite code originally designed for static use to enable dynamic loading.

I don't see how that is a big impediment (of course, perhaps I'm misunderstanding your concern or use-case). It seems to me that if something is intended to be dynamically loaded, it should be written with that in mind.

I guess your concern is things like the Any type that require 'static? So the issue is not being allowed to have a dynamic module return something with a 'static lifetime and have it be treated as a 'dynamic_static lifetime?

Hmm.....perhaps I'm understanding your concern? So, ideally, you'd like to be able to say, "Within this block, anything that returns a 'static lifetime is really returning a 'dynamic_static lifetime tied to this other lifetime associated with a particular instance"? Something along these lines:

let my_module = DynamicLoad::load( "my_module" );

dynamic_static ( my_module ) {

   let  some_any = my_module.get_foo();

   // my_module.get_foo() returns an Any<T> where T : 'static, but the 'static
   // is interpreted as being a lifetime corresponding to the lifetime of
   // my_module, because we're in this "dynamic_static" block referencing
   // the my_module object

}

Is this about the right understanding of what you'd like to have happen?

Yes, this is what I said in the top post, except that I use the keyword leakproof instead of dynamic_static (and no module names; the module variable binding is moved into the block):


leakproof {

   let my_module = DynamicLoad::load( "my_module" );
   let  some_any = my_module.get_foo();

   // my_module.get_foo() returns an Any<T> where T : 'static, 
   //but, the 'static is interpreted as being a lifetime corresponding 
   //to the lifetime of the current scope because we're in this "leakproof" 
   //block

}

or

// in crate dynamic_load
leakproof fn load(module_name: impl AsRef<str>) -> Self;

// at the use site, the call has to be inside a leakproof block,
// because load was declared with the leakproof keyword
leakproof {
   let my_module = DynamicLoad::load( "my_module" );
   let  some_any = my_module.get_foo();
}
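
For comparison, something in the spirit of such a block can be approximated in today's Rust with a closure-scoped API, where borrows of the module cannot escape the closure. This is only a sketch under stated assumptions: Module and with_module are made-up names, nothing is actually loaded, and it does not reinterpret 'static the way the proposal asks for.

```rust
/// Hypothetical stand-in for a loaded module; nothing is actually loaded.
struct Module {
    data: String,
}

impl Module {
    fn get_foo(&self) -> &str {
        &self.data
    }
}

/// "Loads" the module, runs `body` with a borrow of it, then "unloads" it.
/// Because `body` must work for every lifetime 'm, its result type `R`
/// cannot contain a borrow of the module.
fn with_module<R, F>(name: &str, body: F) -> R
where
    F: for<'m> FnOnce(&'m Module) -> R,
{
    let module = Module { data: format!("foo from {}", name) }; // "load"
    let result = body(&module);
    drop(module); // "unload"
    result
}

fn scoped_demo() -> String {
    // Owned data may leave the scope.
    with_module("my_module", |m| m.get_foo().to_owned())
    // Returning the borrow itself would be rejected by the compiler:
    // with_module("my_module", |m| m.get_foo())
}
```

The gap that remains is exactly the one discussed in this thread: inside the closure, get_foo still cannot hand out anything usable where a 'static bound (such as Any) is required.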
