Deferred references (`&Transparent { inner: ref *foo }`)

Idea

Currently, you can do this:

let x: &str = &*"foo";

Currently, you can also do this:

#[repr(transparent)]
struct Name { name: str }

The idea is that you should be able to do this:

let x: &Name = &Name { name: ref *"foo" };

The ref here acts somewhat like &, except that it marks the enclosing struct literal as part of a reborrow: the actual conversion to a reference is "deferred" to the outer &. Note that the struct must be #[repr(transparent)].

Benefits

While it is sound to transmute an &str into an &Name, it's kinda error-prone - and that's without even getting into &mut. The currently recommended approach is to instead use pointer casts. However, that's still unsafe, even if sound (and easier to get right).
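For reference, the pointer-cast approach mentioned above looks roughly like this today. This is a minimal sketch: the helper names new and as_str are illustrative and not part of the proposal.

```rust
/// Transparent wrapper around `str`, as in the example above.
#[repr(transparent)]
pub struct Name {
    name: str,
}

impl Name {
    /// Illustrative helper: reinterprets a `&str` as a `&Name`.
    pub fn new(s: &str) -> &Name {
        // SAFETY: `Name` is `#[repr(transparent)]` over `str`, so both
        // types share layout and pointer metadata (the byte length).
        unsafe { &*(s as *const str as *const Name) }
    }

    /// Illustrative helper: views the wrapped string again.
    pub fn as_str(&self) -> &str {
        &self.name
    }
}
```

The cast compiles only because both pointers carry the same kind of metadata, but nothing checks the repr(transparent) assumption, which is why the unsafe block is still required.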

This approach would also let Cell::from_mut be redefined with no unsafe {}, altho it would require the addition of UnsafeCell::from_mut:

fn from_mut(value: &mut T) -> &Cell<T> {
  &Cell { value: ref *UnsafeCell::from_mut(value) }
}
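For comparison, this is roughly what such a conversion has to look like today. The sketch below uses a hypothetical MyCell type (not the real std::cell::Cell) restricted to Sized T, and assumes only that UnsafeCell<T> has the same in-memory representation as T, which its documentation states.

```rust
use std::cell::UnsafeCell;

/// Hypothetical stand-in for `Cell`, for illustration only.
#[repr(transparent)]
pub struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T> MyCell<T> {
    /// Views a uniquely borrowed `T` as a shared `MyCell<T>`.
    pub fn from_mut(t: &mut T) -> &MyCell<T> {
        // SAFETY: `&mut T` guarantees unique access for the returned
        // lifetime, and `MyCell<T>` is transparent over `UnsafeCell<T>`,
        // which has the same in-memory representation as `T`.
        unsafe { &*(t as *mut T as *const MyCell<T>) }
    }
}

impl<T: Copy> MyCell<T> {
    pub fn get(&self) -> T {
        // SAFETY: `MyCell` is not `Sync` and never hands out references
        // to its contents, so no conflicting access can exist.
        unsafe { *self.value.get() }
    }

    pub fn set(&self, v: T) {
        // SAFETY: same reasoning as in `get`.
        unsafe { *self.value.get() = v }
    }
}
```

Writes through the MyCell view are visible in the original variable once the borrow ends, since both refer to the same memory.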

Drawbacks

None.

Alternatives

Continuing to use unsafe is always an option. It's not always a good option tho.

(Previously this post talked about making &Foo { inner: *bar } for transparent Foo always reborrow but turns out that would've broken existing code.)


The #[repr(transparent)] isn't doing anything here - struct Name { name: str } compiles on stable.

Allowing *"foo" to work requires support for unsized rvalues, which is tracked at Tracking issue for RFC #1909: Unsized Rvalues · Issue #48055 · rust-lang/rust · GitHub.

&*"foo" works, and &(*"foo") also works. But you should be able to shove stuff between the & and the * if said stuff is transparent.

We want limited DST support without actually supporting unsized Rvalues, by using the same hack that allows &*"foo" to work. This should probably also allow &{*"foo"} to work, altho that fails for an unrelated reason.

#[repr(transparent)] is about the ABI and layout of a struct - it shouldn't affect this kind of behavior.

"Shove stuff between the & and the *" doesn't make sense on the MIR level. When you write &SomeStruct { some_field: foo(), some_other_field: bar() }, the outer & is handled separately from the SomeStruct { ... }. You are first constructing a SomeStruct rvalue, and then constructing a reference to the temporary where it's stored.

That's not a hack - you can use &* on a pointer to re-borrow that pointer, regardless of whether the pointee is Sized
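A small sketch of that point: *s is a place expression, so &*s borrows the existing place without copying anything, whether or not the pointee is Sized. The function names here are made up for the example.

```rust
/// Returns true when the reborrow `&*s` refers to the same bytes as `s`.
pub fn reborrow_keeps_address(s: &str) -> bool {
    let r: &str = &*s; // borrows the place `*s`; no copy of the string
    std::ptr::eq(s, r)
}

/// The reborrowed reference can be returned with the caller's lifetime.
pub fn reborrow(s: &str) -> &str {
    &*s
}
```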

& is a type initializer (for making &'a Ts), whereas * is an operator. If it can work there, it can work here, right?

FWIW, the ref-cast crate provides a safe derive-based wrapper around the transmute that removes the risk of getting it wrong. (So long as external users are allowed to also do the transmute, anyway, as it's enabled through a trait. If you want to make the construction private, ref-cast won't work.)

I would actually also like something like this. Currently, things like CStr are annoying to implement, because doing so requires transmuting between &[u8] and &MyCustomStrType. The case I worked on was a ModifiedUtf8Str for a JVM implementation, and I find the transmute I do at the end of the validation for from_modified_utf8 (or the _unchecked variant) incredibly sus, and I doubt I'd ever be entirely comfortable with transmuting between pointers to DSTs. ref-cast would not work, as the public conversion would be unsound (since I assume the string is in Java's Modified UTF-8).
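The shape of such a validated wrapper can be sketched as follows. AsciiStr and its methods are hypothetical names, and the ASCII check merely stands in for real Modified UTF-8 validation, which is not shown here.

```rust
/// Hypothetical validated byte-string type; the invariant is that
/// `bytes` only ever contains ASCII.
#[repr(transparent)]
pub struct AsciiStr {
    bytes: [u8],
}

impl AsciiStr {
    /// Checked constructor: validates first, then converts.
    pub fn from_bytes(bytes: &[u8]) -> Option<&AsciiStr> {
        if bytes.is_ascii() {
            // SAFETY: the invariant was just checked.
            Some(unsafe { Self::from_bytes_unchecked(bytes) })
        } else {
            None
        }
    }

    /// Unchecked constructor.
    ///
    /// # Safety
    /// `bytes` must be valid ASCII.
    pub unsafe fn from_bytes_unchecked(bytes: &[u8]) -> &AsciiStr {
        // SAFETY: `AsciiStr` is `#[repr(transparent)]` over `[u8]`; the
        // caller upholds the content invariant.
        unsafe { &*(bytes as *const [u8] as *const AsciiStr) }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.bytes
    }
}
```

Because the cast lives in a private or unsafe constructor, the invariant stays under the type's control, which a publicly usable cast trait would not allow.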


Also this would allow one to change the implementation of Cell::from_mut to

fn from_mut(t: &mut T) -> &Cell<T> {
  &Cell { value: *&UnsafeCell { value: *t } }
}

(or indeed having an UnsafeCell::from_mut to go with it, but same difference. Altho why doesn't from_mut return an &mut Cell<T>? Ah well guess we're stuck with making UnsafeCell::from_mut also return an &UnsafeCell<T> for consistency, even tho it'd be better for it to return an &mut UnsafeCell<T> :/)


As often, you could be more clear about what you're actually suggesting.

It sounds like you want to make Name { name: *foo } into an lvalue, with the same address as *foo. In other words, &Name { name: *foo } would be equivalent to a pointer cast, &*(foo as *const str as *const Name) – which is sound (thanks to the repr(transparent)) but unsafe.

Unsized rvalues, on the other hand, are a proposed (and technically RFC-accepted, and partly implemented) feature, wherein &Name { name: *"foo" } would be legal syntax, but it wouldn't do what you want. Instead of being a pointer cast, it would copy the entire string to a new stack variable and produce a reference to that.

After all, the general syntax &Struct { field: *x } already has that effect when x is a copyable reference to (or Box of) a Sized type, and unsized rvalues would extend the same behavior to unsized types.

Making struct literals sometimes be lvalues would be a breaking change that would require an edition boundary. IMO, it has some elegance to it, but would be excessively confusing, since you wouldn't be able to tell whether something is an lvalue or not without non-local information (namely, whether the struct was defined as repr(transparent)). There should be a safe way to do this kind of cast, but IMO it shouldn't be this syntax.


This is the only syntax we think makes any sort of sense (since it's conceptually reborrowing) and honestly we can't imagine this being a breaking change since its semantics are strictly broader (namely being able to return the thing).

Counterexample:

#[derive(Debug)]
#[repr(transparent)]
struct Transparent<T> {
    inner: T,
}

fn main() {
    let some_value = &mut 530;
    let transparent = &Transparent { inner: *some_value };
    
    // some_value is a &mut, so use it as such
    *some_value = 1024;
    
    // transparent is a copy, so it's still valid
    dbg!(transparent); // prints 530
    
    // as is some_value, of course
    dbg!(some_value); // prints 1024
}

We don't have #![feature(unsized_locals)], yet, so for unsized types it wouldn't strictly be breaking. However, the point of #![feature(unsized_locals)] is that unsized and sized values mostly behave the same, which means that unsized values should also make a copy here to keep the same behavior as sized types.

Imagine this being &mut [u8] rather than &mut i32 if that helps.


Ah...

What about &Transparent { ref inner: some_value }? Does that work today?

Why don't you try it, rather than asking someone else to do it for you? This is a forum, not IRC, so there's generally an expectation that you've done a small amount of experimentation before posting. The same way there's a different effort/formality level expected on email versus on, say, Slack.

That said, no, that syntax is currently a hard error, because inner: is a field label and not a pattern in this position. If you want to borrow pattern syntax, I believe what you want would actually be &Transparent { &inner: some_value }, as & in pattern position removes a layer of indirection. (This not obviously intuitive behavior is one of the main reasons match binding modes were introduced, by the way.)
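To illustrate how the two pattern forms behave in today's Rust, here is a short sketch. The function names are invented for the example.

```rust
/// `&v` in pattern position matches a reference and binds `v` to the
/// value behind it, removing one layer of indirection.
pub fn strip_reference(r: &i32) -> i32 {
    let &v = r;
    v
}

/// `ref s` in pattern position does the opposite: it borrows the
/// matched place instead of moving out of it.
pub fn first_len(pair: &(String, i32)) -> usize {
    let (ref s, _) = *pair; // s: &String, nothing is moved
    s.len()
}
```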

Taking binding modes to their logical conclusion in struct constructor position, you'd get just let transparent: &Transparent = Transparent { inner: some_value };. I like pattern binding modes, but I don't like this. (Question: why? I'm not sure what the difference is.) Plus, the people who weren't super fond of pattern binding modes in the first place would absolutely oppose this behavior. I'm also not super fond of &inner: as an explicit binding mode to get your desired behavior, either.
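For readers unfamiliar with the term, this is what default binding modes do in patterns today. It is a minimal sketch with an invented function name.

```rust
/// Matching a `&Option<String>` against `Some(s)` binds `s` as `&String`
/// automatically; no explicit `&` or `ref` is written in the pattern.
pub fn len_or_zero(opt: &Option<String>) -> usize {
    match opt {
        Some(s) => s.len(), // s: &String via the default binding mode
        None => 0,
    }
}
```

The constructor form discussed above would be the mirror image of this: the reference type on the left would silently change what the struct literal on the right means.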

Either way, if pattern-like binding modes are extended to #[repr(transparent)] wrapper structs, it's only consistent for pattern binding modes to apply equivalently in that position as well. And it really makes me uncomfortable in a way that pattern matching binding modes didn't, for a reason I can't quite express. (It's almost certainly why people who don't like pattern binding modes don't like them. I just can't articulate what makes struct constructors different for me.)


Eh, ref is a keyword, not a pattern. It's fine to throw it here.

So something like this would be good yeah? &Transparent { inner: ref *some_value }

It seems really weird to have the ref keyword used in two very different ways (e.g. your proposal, and with patterns).

If I understand correctly, you want to be able to safely go from a &T to a &Transparent<T>. This is something that could be done by an attribute macro:

#[add_safe_reborrow]
struct Name {
    name: str
}

expands to:

#[repr(transparent)]
struct Name {
    name: str
}

impl Name {
    fn safe_reborrow(val: &str) -> &Name {
        // SAFETY: Our macro added `#[repr(transparent)]`,
        // which guarantees that `Name` and `str` have the same
        // layout.
        unsafe {
            std::mem::transmute::<&str, &Name>(val)
        }
    }
}

Note that the ref-cast crate has open issues about supporting private refcasts.

As usual, an interesting idea for a proc-macro has already been implemented by dtolnay :laughing:


Rust developers don't like pulling in random crates tho. Yes cargo makes it easy but there are good reasons for not doing it, like backdoors and malware. And this makes more sense being in the language.

That is a sweeping generalization. I am a Rust developer and I don't mind pulling in random crates. Yes it might be a good idea to verify the crate's internals but that shouldn't stop people from using it.

Furthermore, it is definitely a tenet that the stdlib shouldn't be too big(tm) (FYI I am on no teams).


Eh, we've heard too many horror stories from npm... We only trust a few packages, especially those maintained by the Rust developers.

That's alright, this proposal wouldn't go in the stdlib.