Need for -> operator for Unsafe Code Guidelines

RalfJung · May 8, 2019, 12:06pm

That makes it impossible to write correct code in some cases currently. But I don't see how resolving this fixes the ergonomics issues around raw pointers that @Gankra and I mentioned.

That would implicitly dereference an unsafe raw pointer, which seems rather dangerous.

HeroicKatora · May 8, 2019, 12:11pm

Adding &raw or even an &_ that would use the same pointer type would allow using &_ ptr.field or &raw ptr.field to get a pointer, which avoids having to introduce a new -> operator and allows the “.” operator to have the same behavior for references and pointers.

This works for all builtin examples of references. However, I do not see any way to generalize it to user types in the manner of other operators, and with that we have already lost the opportunity to also provide it for Ref and Pin where appropriate. The problem is that any unary operator here seems to have to choose some representation for the self argument, but none of them is simultaneously safe (if we chose a pointer type) and applicable to all possible representations (we would already be missing lifetimes if we didn't choose pointers). Thus a by-value self seems the only possibility for such an operator, but it cannot possibly take the targeted field by value, and as @RalfJung notes this is implicitly unsafe.

hanna-kruppe · May 8, 2019, 12:28pm

There is no real ambiguity for the compiler or backwards incompatible change, sure. But humans' interpretation of x.y now has to vary greatly in novel and confusing ways depending on subtle (and possibly quite far-removed) type differences. The case without dereferencing (foo(x.y) passing a projection of x instead of copying or moving a value out of its field, as the syntax suggests) is equally confusing, IMO. I do not want to sound dismissive but I feel like the only way this can be "intuitive" is if one's intuition does not incorporate the distinctions between values and places, or pointers and pointed-at values.

References to raw pointers can definitely occur (when generic code that handles &Ts is instantiated with T being a raw pointer). But in any case, that such a completely reasonable looking expression is so misleading that it needs to be linted against IMO illustrates how this proposal goes against the grain of the language.

And quite frankly, it seems to be trying to solve a non-issue. There's already multiple perfectly serviceable proposals for solving projections through raw pointers that don't have this confusion, and they're more general and orthogonal as they aren't tailored to field accesses exclusively (e.g., you could combine &raw mut with hypothetical indexing and slicing of *mut [T], if we ever add it).

It's perfectly clear how to define evaluation of place expressions such that intermediate derefs don't imply anything wrong that causes UB (e.g., a claim of validity for field a when the place expression actually navigates into the disjoint field b). In fact, I am pretty sure we already have that. The only potential UB that proposals like &raw ... dissolve is the temporary reference being created at the end, after the place expression is evaluated.
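A minimal sketch of that distinction, using std::ptr::addr_of!, the macro form under which the &raw const idea later became available on stable Rust (after this thread). The Packed type and read_value function are hypothetical names chosen for illustration. The place expression (*ptr).value is evaluated, but no &u32 is ever created:

```rust
use std::ptr::addr_of;

// Hypothetical example type: `value` sits at offset 1, so a `&u32` to it
// would generally be misaligned.
#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u32,
}

/// Reads `value` through a raw pointer without creating a reference.
///
/// Safety: `ptr` must point to a live, initialized `Packed`.
unsafe fn read_value(ptr: *const Packed) -> u32 {
    // `addr_of!` turns the place `(*ptr).value` directly into a raw pointer,
    // skipping the temporary reference that `&(*ptr).value` would create.
    let field: *const u32 = unsafe { addr_of!((*ptr).value) };
    // The field may be misaligned, so use an unaligned read.
    unsafe { field.read_unaligned() }
}

fn main() {
    let p = Packed { tag: 1, value: 0xDEAD_BEEF };
    let v = unsafe { read_value(&p) };
    let tag = p.tag; // copy the byte out instead of borrowing a packed field
    assert_eq!(v, 0xDEAD_BEEF);
    assert_eq!(tag, 1);
}
```

Only the final step differs from &(*ptr).value: the path through the struct is the same place expression in both cases.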

withoutboats · May 8, 2019, 12:41pm

On its own, I’m sort of unenthusiastic about the idea of adding a new operator for this for the obvious reasons (niche use case, uses syntax, new thing for people to learn, etc), and so I was sort of more inclined to just let you use the . operator on raw pointers.

However, I think this is an area that deserves a properly holistic examination. Unsafe code is just kind of a PITA right now in a number of ways. For me, the most annoying thing is that NonNull’s APIs often make me feel like I should just use raw pointers, even though my pointers are nonnull. I wonder if this proposal would make sense as part of a more complete look at how to make it easier to effectively deal with all potentially dangling pointers?
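One possible illustration of that friction, as a sketch only: Node and sum are hypothetical names, and this is not meant as a claim about which NonNull APIs were the annoying ones. Since NonNull has no field projection, every field access goes through as_ptr() and an explicit dereference:

```rust
use std::ptr::NonNull;

// Hypothetical singly linked list node with non-null links.
struct Node {
    value: i32,
    next: Option<NonNull<Node>>,
}

/// Sums the values of a list linked through `NonNull` pointers.
///
/// Safety: every node reachable from `head` must be live and initialized.
unsafe fn sum(head: Option<NonNull<Node>>) -> i32 {
    let mut total = 0;
    let mut cur = head;
    while let Some(node) = cur {
        unsafe {
            // Each access converts back to a raw pointer first.
            total += (*node.as_ptr()).value;
            cur = (*node.as_ptr()).next;
        }
    }
    total
}

fn main() {
    let mut tail = Node { value: 2, next: None };
    let mut head = Node { value: 1, next: Some(NonNull::from(&mut tail)) };
    let total = unsafe { sum(Some(NonNull::from(&mut head))) };
    assert_eq!(total, 3);
}
```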

bill_myers · May 8, 2019, 1:02pm

We could special case &_ foo.field to use a special trait and also have "foo.field" work by behaving as * &_ foo.field

The alternative is to introduce a new operator like "->" and have them be foo->field and make foo.field behave as *foo->field respectively.

phil_opp · May 8, 2019, 1:08pm

To put my thoughts a bit more into context:

In the OP, @Gankra showed two examples that both used references to access a field of a raw pointer. My ptr::drop_in_place example was a variant of their second example. I don't think that the primary problem is writing (*ptr).field because you don't need to use &mut or & for that anyway. The problem is getting pointers to those fields. For that reason I'm advocating for the "Field Access on Raw Pointers as Sugar for Offset" solution proposed in the OP.
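To make the "pointers to fields" problem concrete, here is a hedged sketch of a situation where a field pointer is needed but a reference would be invalid: initializing a struct field by field. It is not the offset sugar from the OP; it uses std::ptr::addr_of_mut!, which was stabilized after this thread, and Config and build are hypothetical names:

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

// Hypothetical example type.
struct Config {
    name: String,
    retries: u32,
}

/// Builds a `Config` by writing each field through a raw field pointer.
fn build() -> Config {
    let mut slot = MaybeUninit::<Config>::uninit();
    let ptr: *mut Config = slot.as_mut_ptr();
    unsafe {
        // `&mut (*ptr).name` would create a reference to uninitialized
        // memory; `addr_of_mut!` yields a raw pointer to the field instead.
        addr_of_mut!((*ptr).name).write(String::from("demo"));
        addr_of_mut!((*ptr).retries).write(3);
        // Every field has been written, so the value is now initialized.
        slot.assume_init()
    }
}

fn main() {
    let config = build();
    assert_eq!(config.name, "demo");
    assert_eq!(config.retries, 3);
}
```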

HeroicKatora · May 8, 2019, 2:10pm

Both in MIR and in my head, both &_ and * _ count as a form of pointer, and it would be surprising to me if the operator treated them in different ways (one resolving to an lvalue/place and one to an rvalue/value).

It seems unclear to me how this may be backwards compatible and how it is supposed to interact with auto-deref? As a middle ground of not adding new operators for accessing members themselves, but also adding pointer traversal while also not colliding with currently in-use place expressions, maybe this is possible?

&.foo.field

I can't say that I find it intuitively optimal but an exploration of the possible design for syntax can't hurt.

nikomatsakis · May 8, 2019, 2:11pm

This is the high order bit for me as well. I think unsafe code is unnecessarily difficult to write. I used to be opposed to adding "special syntax" around unsafe code, but I was persuaded as part of the union discussion that, indeed, it sometimes makes sense to extend the language with support for "unsafe abstractions" in direct ways. I am not 100% sure if -> is such a case, but it seems plausible.

Gankra · May 8, 2019, 2:39pm

If I had my druthers I would make NonZero a proper lang item, *T, but I haven't seen any bugs that result from NonZero being unergonomic, and I am slightly concerned with encouraging people to use the covariant internal mutability type more.

RustyYato · May 8, 2019, 3:11pm

I don't think it is sound to use NonNull for internal mutability on its own.

mcy · May 8, 2019, 3:14pm

Something like this makes me want to repeat what Josh said upthread, which is that we really, really need

A ptr-to-member type (and an equivalent of C++'s .* operator).
A trait to overload .*, which, in effect, lets us overload foo.bar too. One could imagine implementing it for *?mut T so that a T::*U ptr projects to a *?mut U, instead of a &?mut U (something something associated type ctors...).

HeroicKatora · May 8, 2019, 3:49pm

I think it is insofar confusing to cite C++ here as the operator-> semantics work as follows:

We are presented with t->u
We start with some value that is, either pointer or reference type, of type T
As long as T is not of pointer type:
- a. If T has an overload for auto operator->([const] T&) -> S, check the constness and follow it
  - Note: There are no value semantics for overloadable operators
- b. If T does not have such an overload, fail.
- We now have a value of type S
T is some pointer type Foo const*, check if Foo has the wanted member
- a. If so, dereference the pointer to the member (perform (*t).u)
- b. If not, fail

This is similar for operator->* and .*, to connect this more closely to the topic of pointer to member. That is, C++ ptr-to-member does not project T* to U* but rather T* to U&. Seems benign in terms of C++ but inconsistent for Rust.

To be perfectly honest, I find these semantics rather confusing (drastically put, insane). In my eyes this fails for Rust not only in the implicit dereferencing of the pointer at the end, which defeats the purpose we want it for. The return type of the overload itself must also be an actual pointer in the end, meaning we can never use it for NonNull<T>. Also, value semantics could provide clearer self types, but that is likely orthogonal.

The concept of ptr-to-member from C++ suffers from similar failures in my eyes. They were/are defined to work solely on pointers (which is why .* can't be overloaded separately) and also do an implicit dereferencing on access, i.e. are part of an lvalue expression and not an rvalue one... If we are to provide a concept similar to this, I would very much regret having it specialized to built-in pointers again.

In conclusion, . in Rust works on places, not on references (modulo auto-ref-deref). It seems incomplete to port ptr-to-member but present it in terms of raw pointers only, skipping both embedding to reference pointers and custom pointers. And enforcing the result type to one of the options does not seem complete as well.

And something else that I had mentioned last time this surfaced, a holistic solution for Rust should in my eyes consider enum and union as first-class citizens. Mostly for enum, the C++ syntax and semantics for creating a ptr-to-member lose most of their meaning.

mcy · May 8, 2019, 5:18pm

I think I expressed myself incorrectly. If this is Too Intense of a derail, let me know.

When I say I want Rust to add ptr-to-member, what I suggest is adding the following (re-using C++ syntax, even though we can do better, since syntax is a silly bikeshed):

A pointer to member type T::*U for all T, U which is just a typed offset.
A .* operator, such that given a place t: T, t.*field simply offsets that place.

This is roughly equivalent to C++'s T::*U, .*, respectively.

There is, of course, one very unpleasant detail. Deref requires that you return a reference, which is kind of the whole reason we’re discussing this. Similarly, we can’t have a DerefField in any interesting way, since DerefField wants to take Ptr<'a, T> and spit out Ptr<'a, U> (think of Pin).

Unfortunately, associated type ctors (which are sort-of required to be able to play this game) have a host of problems, so I imagine that we’d just either special-case raw pointers (like we already do with *ptr), or just add

impl<T> *const T {
  // Hypothetical syntax: T::*U is the proposed pointer-to-member type.
  fn field<U>(self, f: T::*U) -> *const U;
}

and accept having to write ptr.field(&_::my_field).
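As a rough approximation in today's Rust, the "typed offset" idea can be sketched as a library type. Everything here is an assumption for illustration: Field<T, U> stands in for the proposed T::*U, project stands in for the field method, Pair is an example type, and std::mem::offset_of! (stable since Rust 1.77, long after this thread) supplies the offset:

```rust
use std::marker::PhantomData;
use std::mem::offset_of;

/// Hypothetical stand-in for `T::*U`: a byte offset tagged with the
/// containing type `T` and the field type `U`.
struct Field<T, U> {
    offset: usize,
    _marker: PhantomData<fn(T) -> U>,
}

impl<T, U> Field<T, U> {
    /// Safety: `offset` must be the offset of a field of type `U` in `T`.
    unsafe fn new(offset: usize) -> Self {
        Field { offset, _marker: PhantomData }
    }

    /// Projects `*const T` to `*const U` by offsetting, without creating
    /// a reference.
    ///
    /// Safety: `ptr` must point into an allocation that holds a whole `T`.
    unsafe fn project(&self, ptr: *const T) -> *const U {
        unsafe { ptr.cast::<u8>().add(self.offset).cast::<U>() }
    }
}

#[repr(C)]
struct Pair {
    a: u8,
    b: u32,
}

fn main() {
    // Roughly what `&_::b` would denote in the syntax above.
    let b_field: Field<Pair, u32> = unsafe { Field::new(offset_of!(Pair, b)) };
    let pair = Pair { a: 1, b: 42 };
    let pb: *const u32 = unsafe { b_field.project(&pair) };
    assert_eq!(unsafe { pb.read() }, 42);
    assert_eq!(pair.a, 1);
}
```

The sketch only covers *const T. Covering other pointer-like types such as Pin would need the kind of abstraction over pointer types discussed above.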

HeroicKatora · May 8, 2019, 6:07pm

Using the placeholder as part of a syntax to derive strongly typed offsets is interesting. The intuitive meaning would neatly avoid any other associations with specific pointer types, or value semantics, at least in my head.

I'm not sure what you mean here? Is this in reference to the fact that not every T::*U should be unconditionally constructible (e.g. the one to unaligned packed members requires some care)? Special casing pointers definitely requires unsafe everywhere though.

In terms of derailing, it depends on what your means are and what your goal is. After a few years of C++ myself, ptr-to-member often appears to me as an XY-problem, especially with closures. The underlying problem is a way to abstract type structure (fields, variants, ..) over pointer types, that is, to be able to synthesize fn(Ptr<T>) -> Ptr<U> when T contains a U structurally, on all Ptr<_>. Both offsetof in C and ptr-to-member in C++ present solutions to this in some/many cases of the pointer types of their language. But it remains an open question if they are generic enough to solve the problem of abstraction over all pointer types. For offsetof I'm confident that the answer is no, at least for safe code.

At the same time, my above formulation and what I'd want from the feature bring it dangerously close to higher kinded traits, which would likely be far in the future, so maybe a specific but ready solution is preferable. However, a quick solution needs to be evaluated more carefully so as not to restrict future design space.

RustyYato · May 8, 2019, 6:59pm

I started a thread for the pointer to field discussion, so that it doesn't knock this thread too far off course

RalfJung · May 9, 2019, 10:52am

Also on the topic of raw ptr ergonomics in general: the proposal to add methods for working with raw slices.

MSxDOS · May 23, 2019, 2:52pm

I think the -> operator deserves an RFC at least. It is nothing but pure pain for me to write (*(*ptr).inner_ptr).field in Rust compared to ptr->inner_ptr->field in C, and it gets even more ugly and error-prone when I have to do casting, so I’m all for adding this operator to Rust, if it behaves the same way it does in C.

RustyYato · May 23, 2019, 2:55pm

When do you need to do that? I haven't run into that problem before.

mbrubeck · May 24, 2019, 10:30pm

When ptr and inner_ptr are both raw pointers, you need to explicitly dereference them:

struct Foo { inner: *const Bar }
struct Bar { field: u8 }

unsafe fn foo(ptr: *const Foo) -> u8 {
    (*(*ptr).inner).field
}


RustyYato · May 24, 2019, 10:31pm

Ok, hadn’t thought of that. Does this come up often in unsafe code or FFI?
