PhantomData without field syntax noise

PhantomData is super useful from a type-system perspective, but PhantomData values themselves are usually uninteresting, and explicitly initializing phantom fields in struct literals is just boilerplate:

StructWithPhantom {
    value,
    _phantom_field: PhantomData, // this uses type inference anyway
}

The field has to be explicitly initialized, but this explicitness doesn't add new information, because the generic types are usually specified elsewhere. This can even be seen as leaking an implementation detail, because other ways of defining generic types don't need a faux value.

A struct with public fields can't add a generic type requiring PhantomData without a semver break, even if the type argument has a default (Foo -> Foo<T = ()>).

A single-field tuple struct works nicely with callbacks like map: value.map(Wrapper). Adding a PhantomData field makes this clunky: value.map(|v| Wrapper(v, PhantomData)). IMHO this just takes away convenience without adding clarity (selection of the type is delegated to type inference anyway).
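
For what it's worth, the usual workaround today is a constructor function, which can be passed to map by path just like the bare tuple constructor could. A minimal sketch; Wrapper's second parameter, the Meters tag and wrap_reading are made-up names for illustration:

```rust
use std::marker::PhantomData;

/// Hypothetical marker type, used only as a compile-time tag.
struct Meters;

/// Tuple struct carrying a value plus a phantom unit tag.
struct Wrapper<T, Unit>(T, PhantomData<Unit>);

impl<T, Unit> Wrapper<T, Unit> {
    /// Constructor that hides the phantom field from callers.
    fn new(value: T) -> Self {
        Wrapper(value, PhantomData)
    }
}

/// `Wrapper::new` can be passed by path; `Unit` is inferred from the return type.
fn wrap_reading(reading: Option<f64>) -> Option<Wrapper<f64, Meters>> {
    reading.map(Wrapper::new)
}
```

This gets the call site back to value.map(Wrapper::new), but it is still an extra function per type, and it does nothing for patterns.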

Rust hasn't figured out delegation yet, or fields in traits, so a phantom type argument could be an easy way of creating "newtypes" of POD structs like Point or RGB:

struct Point<Domain> {
    x: i32,
    y: i32,
    _domain: PhantomData<Domain>,
}

so you could have type-safe Point<WorldSpace> and Point<ScreenSpace>, and implement methods on Point<T> without jumping through the hoops of traits without fields, or the enormous boilerplate of properly delegating all impls from some WorldSpacePoint(Point) to Point, and without a type-safety-defeating Deref cop-out.

But it doesn't work with the simple Point { x, y } syntax, and explicit management of the extra field gets tiring quickly.

The extra field also gets in the way when destructuring and matching against the type.
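
To make the idea concrete, here is a sketch of how the domain-tagged Point looks in today's Rust. The new, offset and to_screen functions are my own illustrative additions, not part of any existing API:

```rust
use std::marker::PhantomData;

// Marker types; they are never instantiated.
struct WorldSpace;
struct ScreenSpace;

struct Point<Domain> {
    x: i32,
    y: i32,
    _domain: PhantomData<Domain>,
}

impl<Domain> Point<Domain> {
    /// The only place that has to mention the phantom field.
    fn new(x: i32, y: i32) -> Self {
        Point { x, y, _domain: PhantomData }
    }

    /// Shared method: written once, available in every domain.
    fn offset(&self, dx: i32, dy: i32) -> Self {
        Self::new(self.x + dx, self.y + dy)
    }
}

/// Hypothetical conversion; passing a `Point<ScreenSpace>` here is a type error.
fn to_screen(p: &Point<WorldSpace>, camera: &Point<WorldSpace>) -> Point<ScreenSpace> {
    Point::new(p.x - camera.x, p.y - camera.y)
}
```

The methods are shared without any delegation, but every literal and every exhaustive pattern outside new still has to deal with _domain.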


So it makes me wonder what Rust could do to make PhantomData less burdensome?

Can type arguments get an attribute that makes them used in structs, without this being an explicit field?

struct Struct<#[phantom(&mut T)] T> {}

Or maybe there could be a feature that is not phantom-data specific, but helps initialize such fields in general?

struct Struct<T> {
    #[default]
    // gets initialized even without `{ ..Default::default() }`
    _phantom: PhantomData<&mut T>,
}
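
For comparison, this is roughly what the existing `..Default::default()` route looks like. It is a sketch under a few assumptions: the value and label fields and the build function are invented, and I use PhantomData<T> instead of PhantomData<&mut T> to keep the example free of lifetimes:

```rust
use std::marker::PhantomData;

// Note: `#[derive(Default)]` adds a `T: Default` bound, even though the
// phantom field itself would not need it.
#[derive(Default)]
struct Struct<T> {
    value: u32,
    label: String,
    _phantom: PhantomData<T>,
}

/// Struct-update syntax fills in everything not named explicitly,
/// including the phantom field.
fn build() -> Struct<u8> {
    Struct { value: 3, ..Default::default() }
}
```

The catch is that this requires a Default impl for the whole struct, so every other field needs a sensible default too, and it is not usable in const contexts.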

This would make me happier if each Point<T> didn't generate a new monomorphization, though, at least for the cases where the methods don't call T::something() or similar.

(or can rustc reliably deduplicate them already?)
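
One manual mitigation, independent of whatever deduplication the compiler does or doesn't perform, is to keep the real work in a non-generic function and make the generic method a thin wrapper. A sketch, with manhattan and manhattan_raw as invented names:

```rust
use std::marker::PhantomData;

struct WorldSpace;

struct Point<Domain> {
    x: i32,
    y: i32,
    _domain: PhantomData<Domain>,
}

/// Non-generic core: compiled once, no matter how many domains exist.
fn manhattan_raw(x: i32, y: i32) -> i32 {
    x.abs() + y.abs()
}

impl<Domain> Point<Domain> {
    /// Thin generic shim; each instantiation only forwards to the shared core.
    #[inline]
    fn manhattan(&self) -> i32 {
        manhattan_raw(self.x, self.y)
    }
}
```

This only pays off for larger method bodies, and it is exactly the kind of boilerplate one would rather not write by hand.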

Any solution to this should also address destructuring. I have a vector type containing a PD for type-tagging, and while initialization using initializer syntax is ugly as hell, I’ve worked around that by simply providing free functions such as vec3(f32, f32, f32). But there’s no such workaround for destructuring. Luckily handling the PD field doesn’t require as much ceremony as construction does, but still, it’s a small but annoying papercut.
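
Roughly what that looks like, as a sketch: the Vec3 layout, the Position tag and length_squared are assumptions for illustration, only the vec3 free function is from the description above.

```rust
use std::marker::PhantomData;

struct Vec3<Tag> {
    x: f32,
    y: f32,
    z: f32,
    _tag: PhantomData<Tag>,
}

/// Free-function constructor that hides the phantom field.
fn vec3<Tag>(x: f32, y: f32, z: f32) -> Vec3<Tag> {
    Vec3 { x, y, z, _tag: PhantomData }
}

/// Hypothetical tag type.
struct Position;

fn length_squared(v: &Vec3<Position>) -> f32 {
    // `..` skips the phantom field, but it would also silently skip any real
    // field added later. Writing `_tag: _` instead keeps the pattern exhaustive.
    let Vec3 { x, y, z, .. } = v;
    x * x + y * y + z * z
}
```

Construction is solved by the helper, but the pattern has to choose between mentioning _tag and giving up exhaustiveness.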

How about instead of baking in Default, providing default values (that must be constants)? That way, it's still usable in const eval, and usable in non-ZST situations where a singular default isn't the right choice.

struct Struct<T> {
    _phantom: PhantomData<&mut T> = const { PhantomData },
}

(The const block is to mandate and clarify that no side effects will occur from constructing the value.)

That said, this won't help with destructuring at all, which IMO is the worse half of the problem, since you can't have helper functions for patterns (though you could write a helper macro). In order to help with destructuring, you'd need some way to express “this field may be ignored, even in an exhaustive struct pattern”. I haven't got any ideas for that.
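
For reference, the helper-macro route could look something like this. It is only a sketch: the point! macro and quadrant are hypothetical names, and the macro works in pattern position only.

```rust
use std::marker::PhantomData;

struct Point<Domain> {
    x: i32,
    y: i32,
    _domain: PhantomData<Domain>,
}

/// Hypothetical helper: expands to an exhaustive `Point` pattern, so callers
/// never spell out the phantom field.
macro_rules! point {
    ($x:pat, $y:pat) => {
        Point { x: $x, y: $y, _domain: PhantomData }
    };
}

fn quadrant<Domain>(p: &Point<Domain>) -> &'static str {
    match p {
        point!(0, 0) => "origin",
        point!(x, y) if *x >= 0 && *y >= 0 => "first quadrant",
        point!(_, _) => "elsewhere",
    }
}
```

Because the expansion names every field, adding a real field to Point later still breaks the macro in one place instead of being silently ignored, which is the property `..` gives up.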

I would love to see something along these lines. I don't know exactly what it should look like, and I don't know if it should take a type like the &mut T here or not, but I think we should have something better than the current solution.

Rationale for potentially not needing to include the specific type containing the phantom type parameter: AFAICT that's primarily used to get properties that would be gained by having a field of that type, such as "I don't want to be Sync" or "I want T: 'a" or similar, but such properties seem better expressed with things like impl !Sync for MyType or T: 'a. Given the ability to do something like that, perhaps we just need struct Struct<phantom T> or similar.
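
To illustrate how the "as-if" type currently drives auto traits, here is a small sketch. Tagged, Owning and the assert_send_sync helper are invented for the example:

```rust
use std::marker::PhantomData;
use std::rc::Rc;

/// Mentions `T` only through a function pointer type, so the struct is
/// `Send + Sync` regardless of `T`.
struct Tagged<T> {
    id: u32,
    _tag: PhantomData<fn() -> T>,
}

/// Behaves "as if" it owned a `T`, so it inherits `T`'s auto traits.
struct Owning<T> {
    id: u32,
    _owned: PhantomData<T>,
}

/// Compile-time check: only instantiable for `Send + Sync` types.
fn assert_send_sync<S: Send + Sync>() {}

fn check() {
    assert_send_sync::<Tagged<Rc<()>>>(); // fine even though `Rc` is not `Send`
    assert_send_sync::<Owning<u8>>();
    // assert_send_sync::<Owning<Rc<()>>>(); // would not compile: `Rc<()>` is not `Send`
}
```

So the choice of phantom type is doing real work today; a bare phantom T would need some other way to say which of these behaviours is wanted.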

Half-joking, half-serious: allow fields to be marked as non_exhaustive, with sort of the opposite meaning as for structs. This definitely won’t be abused by anyone ever, or result in an arms race of exhaustivity checks and opt-outs. (Maybe limit it to zero-sized types.)

Obligatory mention of this, which I still want as-written:

(So it'd allow exactly what you wrote, minus the const block, since all field defaults would be const.)

That already doesn't work. Defaults are ignored by type inference. That is also why HashMap::new() is forced to only work with the default hasher. Otherwise type inference wouldn't know which hasher to pick.
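
A small sketch of that behaviour, using the Foo<T = ()> shape from earlier in the thread; the field names and the new/with_marker constructors are assumptions for illustration:

```rust
use std::marker::PhantomData;

struct Foo<T = ()> {
    value: u32,
    _marker: PhantomData<T>,
}

// Same trick as `HashMap::new`: the constructor exists only for the default
// parameter, so there is nothing left to infer.
impl Foo {
    fn new(value: u32) -> Self {
        Foo { value, _marker: PhantomData }
    }
}

impl<T> Foo<T> {
    fn with_marker(value: u32) -> Self {
        Foo { value, _marker: PhantomData }
    }
}

fn demo() -> (u32, u32) {
    let a = Foo::new(1);
    // In type position the default applies, so this annotation is enough.
    let b: Foo = Foo::with_marker(2);
    // let c = Foo::with_marker(3); // rejected: type annotations needed
    (a.value, b.value)
}
```

The default is honoured when Foo is written as a type, but not when inference has to pick T on its own.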

+1. It'd be nice if we had working default type parameter fallback.

That's cool.

Monomorphisation bloat is a problem in general, even for non-phantom types (e.g. vec.len()), so it may get tackled eventually. LLVM can dedupe identical functions, although that doesn't help compile times.

impl !Sync would be nice, but for other things PhantomData is IMHO fine. I remember early Rust had types like PhantomCovariant and PhantomContravariant, and I needed to re-read the whole Wikipedia article about variance twice to figure out which one is which. Having an "as-if" type is a clever solution to this.

See this IRLO thread where I propose exactly this. I never got around to fully rewriting the RFC, but it is something I still want to do eventually. Essentially you'd have to explicitly declare the default in the field definition and then use .. to include the field in the struct expression.

This is similar to how Haskell does things (but with even less syntax - just struct Struct<T> {}), I would love to be able to do it in Rust as well, at least as long as I'm not trying to do fancy variance.

Why not use the same mechanism used for {integer}, rather than ignoring defaults? (An integer literal whose type fails to be inferred defaults to i32)
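
For anyone unfamiliar with the literal fallback being referred to, a tiny sketch (fallback_sizes is just an illustrative name):

```rust
/// Returns the sizes the two literals end up with after fallback.
fn fallback_sizes() -> (usize, usize) {
    let n = 1; // nothing else constrains it: falls back to i32
    let x = 1.0; // nothing else constrains it: falls back to f64
    (std::mem::size_of_val(&n), std::mem::size_of_val(&x))
}
```

Here the fallback only kicks in after inference has found no other constraint, which is the behaviour being asked about for defaulted type parameters.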

Could this perhaps be handled via associated types on structs? (FWIW coming from C++ I wanted to have those a couple of times regardless.) For the purpose of phantom data they'd likely need additional annotation, e.g.:

struct Struct<'a, T> {
    #[phantom_data]                // associated type used like PhantomData
    type ReferenceType = &'a T;    // access pattern

    index: usize,
    ...
}

Prior art:

The RFC mentions features similar to the ones proposed in the OP as alternatives.

The biggest downside I see is that everybody who wants to implement a derive proc macro will have to deal with those extra annotations in some way. The derive macro situation is already miserable as is.

Something like struct MyStruct<T: Covariant>(); would be nice. Having to add actual fields just to specify variance is very inconvenient and can require significant changes to the source only for the purpose of being able to implement some trait for your type.

I agree with the RFC that variance by example is easier to understand and more flexible.
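
As a reminder of what "variance by example" buys today, here is a sketch with three invented wrapper types and two functions that only compile because of the variance each phantom type implies:

```rust
use std::marker::PhantomData;

struct Covariant<T>(PhantomData<T>); // like owning a T
struct Contravariant<T>(PhantomData<fn(T)>); // like consuming a T
struct Invariant<T>(PhantomData<fn(T) -> T>); // both at once

/// Covariance: a longer lifetime can be used where a shorter one is expected.
fn shorten<'a>(c: Covariant<&'static str>) -> Covariant<&'a str> {
    c
}

/// Contravariance: the direction flips.
fn lengthen<'a>(c: Contravariant<&'a str>) -> Contravariant<&'static str> {
    c
}

// Neither conversion compiles for `Invariant<&str>`.
```

Nobody has to remember which keyword means what; the phantom type is simply an example of a field with the desired behaviour.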

I've been thinking about this a little bit. I think it would be better to represent these as "ignored" fields, like:

struct MyStruct<T> {
    pub real_field: usize,
    _: &mut T, // Phantom Field
    _: T, // Another phantom field!
}

I use marker types to represent different transfer functions for an RGB struct. Right now I have type aliases for transfer functions, but they cannot be constructed without specifying the transfer function field.

An alternative that'll work just as well is field defaults. e.g.

trait MyTrait {
  const DEFAULT: Self;
}

struct MyStruct<T: MyTrait> {
  pub required_field: f32,
  pub defaulted_field: T = <T as MyTrait>::DEFAULT,
}

// ...
let whatever = MyStruct::<MyTraitImpl> { required_field: 100.0 };
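
Until something like that exists, the closest stable approximation I know of is the same associated const plus a constructor. A sketch for the RGB case; TransferFunction, Srgb, Linear, Rgb and SrgbColor are all made-up names:

```rust
/// Hypothetical trait giving each marker type a canonical value.
trait TransferFunction {
    const DEFAULT: Self;
}

struct Srgb;
struct Linear;

impl TransferFunction for Srgb {
    const DEFAULT: Self = Srgb;
}

impl TransferFunction for Linear {
    const DEFAULT: Self = Linear;
}

struct Rgb<Tf: TransferFunction> {
    r: f32,
    g: f32,
    b: f32,
    transfer: Tf,
}

impl<Tf: TransferFunction> Rgb<Tf> {
    /// Fills in the marker field so callers only pass the channels.
    fn new(r: f32, g: f32, b: f32) -> Self {
        Rgb { r, g, b, transfer: Tf::DEFAULT }
    }
}

/// Type alias per transfer function.
type SrgbColor = Rgb<Srgb>;
```

With this, SrgbColor::new(1.0, 0.5, 0.0) works through the alias, but struct literal syntax still needs the transfer field spelled out.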

This looks like a struct containing fields of type &mut T and T which you cannot name, not PhantomData fields. Though this led me to a different idea: how about adding something like an UnnameableField trait, like the following:

use core::marker::PhantomData;
trait UnnameableField {
    const VALUE: Self;
}

impl<T> UnnameableField for PhantomData<T> {
    const VALUE: Self = PhantomData;
}

impl UnnameableField for u8 {
    const VALUE: Self = 0;
}

Then make it so that

  1. Adding _: Type to an enum or struct requires Type: UnnameableField.

  2. Whenever an enum or struct is constructed, every unnameable field is implicitly initialized with VALUE. I.e. the following two definitions are equivalent:

    struct Foo {
        _: u32
    }
    const FOO: Foo = Foo {};
    

    and

    struct Foo {
        v: u32
    }
    const FOO: Foo = Foo { v: 0 };
    
  3. It is undefined behaviour for an unnameable field of a struct to hold anything other than VALUE. (This is what creates niches.)

This way UnnameableField would solve three problems: people sometimes need padding fields, which just add noise to the code; people want to use the space wasted as padding for niches; and your problem of not wanting to write phantom fields.
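
The trait half of this can already be emulated in stable Rust, minus the implicit initialization and the niche guarantee. A sketch: the Header struct, its fields and the array impl are assumptions added for illustration.

```rust
use std::marker::PhantomData;

/// Stable stand-in for the proposed trait: one canonical value per type.
trait UnnameableField {
    const VALUE: Self;
}

impl<T> UnnameableField for PhantomData<T> {
    const VALUE: Self = PhantomData;
}

// Padding bytes of any length default to zero (assumed here, not in the proposal).
impl<const N: usize> UnnameableField for [u8; N] {
    const VALUE: Self = [0; N];
}

/// Hypothetical header with explicit padding and a phantom payload tag.
#[repr(C)]
struct Header<Payload> {
    kind: u8,
    _pad: [u8; 3],
    len: u32,
    _payload: PhantomData<Payload>,
}

impl<Payload> Header<Payload> {
    /// What the compiler would do implicitly under the proposal.
    fn new(kind: u8, len: u32) -> Self {
        Header {
            kind,
            _pad: <[u8; 3] as UnnameableField>::VALUE,
            len,
            _payload: <PhantomData<Payload> as UnnameableField>::VALUE,
        }
    }
}
```

What the emulation cannot express is the third point: nothing stops other code from writing a different value into _pad, so the compiler cannot treat it as a niche.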

This also does not introduce inconsistencies with the rest of the language: _: T meaning _: PhantomData<T> is really unexpected.

(If you feel this is a good idea deserving of an RFC, you are free to write one. I do not feel strongly enough about either padding or having to write PhantomData, so I probably will not write it myself. There is also some more thinking needed: I am not sure how easy this would be for common unsafe use cases, as you cannot take a chunk of initialized memory and put data for unnameable fields there without the help of some compiler magic.)