A puzzle and why generics are not generic enough right now


#1

Let’s look at the following example:

fn consume(t: Atype) {
    //whatever
}

Now suppose you see the following code:

let v:Atype = get_value();
consume(v);
consume(v);

What can you conclude about the type Atype?

Your first intuition is probably Atype: Copy. That is correct, but it is not the full answer.

Other types can also be used this way:

let v:&mut i32 = get_value();
consume(v);
consume(v);

Playground link

But you know, &mut i32: !Copy.

Technically the above makes

fn consume_generic<T>(t: T) {
    //whatever
}

not generic enough: if you call consume_generic instead of consume in the &mut example, it does not compile. In other words, the “generic” function does not behave the same way as the concrete one once you substitute certain specific types.
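A minimal sketch of the difference, assuming a hypothetical helper bump_generic (not from the original post) that requires DerefMut only so that it has something observable to do. Passing v directly to a generic parameter moves it, while writing the reborrow by hand keeps v usable for the second call:

```rust
use std::ops::DerefMut;

// Hypothetical generic consumer: takes `t` by value and increments its target.
fn bump_generic<T: DerefMut<Target = i32>>(mut t: T) {
    *t += 1;
}

fn demo_generic() -> i32 {
    let mut n = 0;
    let v: &mut i32 = &mut n;
    // bump_generic(v);
    // bump_generic(v); // rejected: `v` was moved by the first call
    bump_generic(&mut *v); // explicit reborrow: `v` itself is not moved
    bump_generic(&mut *v);
    *v
}
```

Calling demo_generic() runs both increments through the explicit reborrows and then reads the result back through v.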

It looks a little inconsistent. Any ideas?


#2

My proposal is like this:

// Current behaviour: never revives, allow all types
fn consume_move<T>(t: T) {
}

//Revive when possible, allow all types
fn consume_revive_when_possible<T>(t: T)
where T: ?Revive {
}

//Revivable types only, and revives
fn revive_only<T>(t:T)
where T: Revive {
}

//Revivable types only, and moves it
// I am not sure what the right grammar is here.
// !Revive would not be an option, as it is used for
// negative trait bounds
fn consume_revivable<T>(t: T)
where T: !Revive {
}

//Copy implies Revive
revive_only(10);

//Mutable references are Revive, and Revive can be derived 
//the same way as Copy
revive_only(&mut 10);

#3

The problem isn’t that consume_generic isn’t generic enough; it is that built-in references have an implicit reborrowing operation applied to them. Your second example is the same as:

let v: &mut i32 = get_value();
consume(&mut *v);
consume(&mut *v);

Therefore the values passed into consume are indeed not Copy, but they are different values from each other, and each one is consumed by move.
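The equivalence can be checked with a small sketch. The body of consume here is an assumption (the original leaves it as "whatever"); it increments its target so the two calls have a visible effect:

```rust
// Concrete parameter type: the compiler knows a `&mut i32` is expected.
fn consume(t: &mut i32) {
    *t += 1;
}

// Relies on the implicit reborrow inserted at each call.
fn implicit_reborrow() -> i32 {
    let mut n = 0;
    let v: &mut i32 = &mut n;
    consume(v);
    consume(v);
    *v
}

// Spells out the reborrow by hand, as in the post above.
fn explicit_reborrow() -> i32 {
    let mut n = 0;
    let v: &mut i32 = &mut n;
    consume(&mut *v);
    consume(&mut *v);
    *v
}
```

Both functions compile and produce the same value, and in both v is still usable after the calls because only the temporary reborrows were moved.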

So the semantics of your Revive trait don’t match the current language semantics.


#4

@GolDDranks

I know what the compiler actually does. But this is exactly what Copy does in this example: every value passed to consume is also a new value, just like the “reborrow” in your case.

The issue here is that the “reborrow” does not happen automatically for consume_generic, whereas it should at least be an option.

Also, you can name it whatever you like, but in practice v is temporarily gone when a mutable reference is sent to consume (you cannot use v while its mutable reference is still in scope, for example), and it comes back afterwards. That is why I call it Revive.

Or think of it this way:

Behaviour                    Trait name   Enabled code
Reborrow shared reference    Copy         consume_double(v,v)
Reborrow mutable reference   Revive       consume(v);consume(v);

#5

I also want to add my POC crate here, which exhibits similar behavior.

https://docs.rs/try_transform_mut/0.1.0/try_transform_mut/


#6

Can you give an example of how your crate is supposed to be used? And how is it related to my example?

If I am right, you can extend your implementation

impl<'a, T> TryTransform for &'a T {...}

to

impl<T> TryTransform for T where T: Copy {}

But you cannot do the same thing for &'a mut T. Am I right?


#7

No, they are not. Reborrow doesn’t mean “copy the mutable reference”. (Because it’s not possible to copy a mutable reference). Basically, this is not the “fault” of generics at all. (I wouldn’t call it a fault either. It’s a convenience shorthand for the often-used reborrowing idiom, and it’s 100% intentional.)


#8

I’m having a hard time imagining Revive as a trait bound. Reborrowing (and copying) is something that happens at the call site, not inside the called function. On the other hand, generics with trait bounds are all about what can be done inside that function.

Copy is a compiler-blessed marker trait and types that implement it have copy semantics everywhere. For a marker trait to signify automatic reborrowing, that would mean that it would be reborrowed everywhere, but then it should also have a method for that, as the compiler doesn’t know how to do that for types other than built-in references.

I don’t think specifying a trait bound on a function that triggers an implicit operation at the call site has precedent in Rust. I don’t mean to say that it should never happen, but any proposal suggesting such functionality should explore the design space and justify the functionality quite thoroughly.

Rather, I think a “straighter” way to get your desired functionality would be to build on the already-present auto-reborrow (some people, I think, use the term “reborrow coercion”) by exposing the reborrowing functionality as a trait.

Here, the function would expect some target of reborrowing, and the current system could be extended so that the automatic reborrow happens at the call site in such a case:

fn revive_only<T>(t: <T as Reborrow>::Target) {
    ...
}
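To make the idea concrete, here is one way such a trait could be written in user code today. It is only a sketch: the Reborrow trait, its Target associated type, the Counter type, and the helper functions are all hypothetical, nothing like them exists in the standard library, and the reborrow has to be requested explicitly because the compiler does not insert it for user-defined traits. It assumes a toolchain with generic associated types (Rust 1.65 or later).

```rust
// Hypothetical trait; not part of the standard library.
trait Reborrow {
    // The reborrowed form, valid for a lifetime no longer than `Self`.
    type Target<'short>
    where
        Self: 'short;

    fn reborrow<'short>(&'short mut self) -> Self::Target<'short>;
}

// Built-in mutable references: the usual `&mut *r` idiom.
impl<'long, T> Reborrow for &'long mut T {
    type Target<'short> = &'short mut T
    where
        Self: 'short;

    fn reborrow<'short>(&'short mut self) -> &'short mut T {
        &mut **self
    }
}

// A user-defined type wrapping a mutable reference can opt in as well.
struct Counter<'a> {
    slot: &'a mut i32,
}

impl<'a> Reborrow for Counter<'a> {
    type Target<'short> = Counter<'short>
    where
        Self: 'short;

    fn reborrow<'short>(&'short mut self) -> Counter<'short> {
        Counter { slot: &mut *self.slot }
    }
}

fn add_one(x: &mut i32) {
    *x += 1;
}

fn bump(mut c: Counter<'_>) {
    *c.slot += 1;
}

fn demo_reference() -> i32 {
    let mut n = 0;
    let mut v: &mut i32 = &mut n;
    add_one(Reborrow::reborrow(&mut v));
    add_one(Reborrow::reborrow(&mut v));
    *v
}

fn demo_counter() -> i32 {
    let mut n = 0;
    let mut c = Counter { slot: &mut n };
    bump(c.reborrow()); // `c` is only borrowed for the call, not moved
    bump(c.reborrow());
    *c.slot
}
```

The trait gives the compiler a method to call for types other than built-in references, which is the missing piece mentioned above; making the call implicit would still need language support.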

#9

I don’t know what the real difference is. Looking at the bit representation, they are exactly the same (because the lifetime and the type of the pointer are not part of it). So it looks just the same as a Copy: you just memcpy it to the stack, without assuming anything.

Semantically, there is also no real difference. The restriction is that once you have a mutable reference to a value, you lose access to the value until you drop the reference. But this is exactly what I call Revive.

I know, I know. But then tell me a way to define consume_revive_when_possible; this is what I want to do in my own crate but am not able to.

This is not new to Rust anyway. The Copy trait I mentioned before, many things in std::ops (Deref, DerefMut…), and Sized (I am not 100% sure) are examples of this. They all have an effect at the call site.

So I am looking for compiler support.

Now I think it is time to show my real motivation. I am working on a crate, namable_closures. I have defined 5 different types there, and then I figured out that one of them is redundant, but a very similar one is not. That is how I found this inconsistency. If your idea can help me address this issue, I would appreciate your help. More details are in this post.

Again, the most important thing is not revive_only, it is consume_revive_when_possible. This is what I need to do in my crate.


#10

The bit representation is the same; however, they are different at type level, because the reborrowed new value always has a lifetime restricted by the original value.


#11

The same logic applies to the copied &T value, right? So it IS a copy; it just happens to have the effect of shadowing the original value.


#12

I fail to see how this is a problem. Trying to tell that a type is Copy because it appears to be moved multiple times is… not something you should be doing in the first place.


#13

I know I’ve said this a lot lately, but once again: What is the use case for this?

The closest thing I see to a goal is two posts saying that consume_revive_when_possible is the key thing you can’t define and want to define, but I have no idea why you want that, or what you would try to do with it if it was possible. I’m also not sure what you even mean by it, but I think that’s just a side effect of having no stated purpose/use case.


#14

Actually, I appreciate OP’s thinking a lot. I don’t think the usual “motivation needed” style of response is appropriate here. This discussion results from the lack of a formal specification of the Rust language, specifically of its move semantics.

I used to think the move semantics include two cases (I think many people might think the same way). The value is copied if the type is Copy, and moved if it’s not. The lifetime parameters can be changed covariantly if they’re covariant.

But according to the discussion in this thread, that is not the whole picture. There are actually at least three cases:

1. The Copy case, where the old value is kept alive.
2. An imaginary LockingCopy<'a> case for &'a mut T types, where the original value is kept alive but “locked” until the compiler-inferred 'a finishes its {lexical | non-lexical} scope and the 'a parameter is eliminated. This check is done by borrowck.
3. Other Move cases, where the old place is put to death.

This is nothing new, and it should be in the specification, in the part describing the move semantics. (And more questions immediately come up: What types are in the second category? Can users define types that behave this way? Are there even more categories?)
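The three categories can be observed side by side in a short sketch. The helper functions are hypothetical and exist only to give each category a by-value call:

```rust
fn take_copy(x: i32) -> i32 {
    x + 1
}

fn take_mut(x: &mut i32) {
    *x += 1;
}

fn take_owned(s: String) -> usize {
    s.len()
}

fn three_cases() -> (i32, i32, usize) {
    // 1. Copy: the old value stays alive and independent.
    let a = 1;
    let copied = take_copy(a) + take_copy(a);

    // 2. &mut: the original is unusable during each call, then usable again.
    let mut n = 0;
    let r = &mut n;
    take_mut(r);
    take_mut(r);
    let locked = *r;

    // 3. Move: the old place is dead after the first use.
    let s = String::from("abc");
    let moved = take_owned(s);
    // take_owned(s); // rejected: use of moved value `s`

    (copied, locked, moved)
}
```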

For use cases: I think this knowledge is a prerequisite for any serious implementation of program transformation for Rust, including but not limited to refactoring tools.

Whether there will be language constructs describing these categories, so that programs can use them as generic bounds as the OP is hoping, is a separate question. I do hope that the Rust team can produce good documentation describing the exact rules. That would make people more confident in the soundness of the language.


#15

You have made good points.

Let me visualize the lifetimes in different cases.

Copy:

  ________
_/_______

At the point where the value is copied, two independent lifetimes continue. It does not matter which one is which.

&mut - The current implementation


  ________
_/_ _ _  _\___ 

At the “reborrow” point, a branch instance is created and shadows the original one. It joins back into the original one once the branch finishes.

&mut - My proposal


[_____][______]

At the “use” or “move” point, the old value is moved and its lifetime ends. But a new “borrow” of the referred value is created at the same position, with its own lifetime.

In other words, the code

consume(v);
consume(v);

desugars to

let tmp = v as *mut T;
consume(v); //v's lifetime ends here
let v = unsafe {&mut *tmp}; //new v with new lifetime starts here
consume(v);

This is actually working code: playground. However, of course this can only be performed by the compiler; otherwise it is really unsafe: we can no longer guarantee that there is only one mutable reference to a value!

I don’t know whether the above would break existing code, but it seems to enable some code that would not compile before. For me this is a simpler model anyway.


#16

I think that crate is unsafe.

The problem is already kind of described in this comment in the sources:

    /// Try to consume a reference and transform it to `Some(B)` with a closure.
    /// If failed and returned `None`, return the original reference value.

You cannot do this. Just because the closure did not return anything, does not mean it did not put the reference somewhere. This is very similar to how an early version of RefMut::map was unsafe.

I think a possible “exploit” of this bug looks something like this

fn duplicate_ref<'a>(x: &'a mut i32) -> (&'a mut i32, &'a mut i32) {
  let mut dup = None;
  x.try_transform(|x2| {
    dup = Some(x2);
    None
  });
  (x, dup.unwrap())
}

This duplicates a mutable reference, which clearly should not be possible in safe code.


#17

Thanks so much for pointing this out. Yes, it is indeed problematic. I somehow missed it. (I thought users would have to use unsafe to save it to global state, but I missed the local variable case.)

One more question: can the API become sound if I add a 'static bound to the closure?


#18

A minor fix of the code duplicate_ref:


let mut dup = None;


#19

I don’t think 'static will help. It is actually safe to write to a global static Mutex, for example. So even an fn (not closing over anything and hence trivially 'static) can leak data.

There is no way, in the Rust type system, to express that a function does not leak a reference.
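A small sketch of that point, restricted to the case where the reference itself is 'static (an assumption made so the example fits in safe code). The names STASH, stash_and_decline, and demo_leak are hypothetical, and the const Mutex::new used for the static requires Rust 1.63 or later:

```rust
use std::sync::Mutex;

// Global slot that safe code is allowed to write to.
static STASH: Mutex<Option<&'static mut i32>> = Mutex::new(None);

// A plain `fn` item captures nothing, so it trivially satisfies a `'static` bound.
// It reports "no result" while keeping the reference it was given.
fn stash_and_decline(r: &'static mut i32) -> Option<()> {
    *STASH.lock().unwrap() = Some(r);
    None
}

fn demo_leak() -> i32 {
    let r: &'static mut i32 = Box::leak(Box::new(7));
    let outcome = stash_and_decline(r);
    assert!(outcome.is_none());
    // The reference is still reachable even though the callee returned `None`.
    let kept = STASH.lock().unwrap().take().expect("reference was stashed");
    *kept
}
```

A caller that treated the None return as proof that the reference was not retained, and handed the original back out, would end up with two live mutable references to the same value.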


#20

I think your new &mut proposal is a restatement of the Ref2Φ type discussed as a possibility for some of NLL work to accept nested &mut self method calls.