Pre-RFC: Runtime reflection

I don't think the problems you listed in the RFC are really problems, at least from my side.

For the first one, using a macro is beneficial: it indicates that the type may be written to the database, which makes the code much more readable. Keep in mind that even in a database client, most of the types in the program aren't the data itself, so why do we need the ability to write any type into the database? That only makes the code less readable. AFAIK, even C#, which uses reflection for serialization, has an attribute that marks a type as handled by the serializer, both to improve readability and to detect misuse of serialization functions at run time. serde already does this at compile time, which is much better than C#, so why would we want to get rid of that?

For the second one, I would rather have a procedural macro that generates DB-independent metadata, so that any database crate compatible with the metadata protocol works with it. Or, conceptually, we just introduce a macro that gives a type a kind of "reflection" ability.

I would say that none of these things really makes real reflection necessary, and I don't think it's beneficial to integrate all of them into the language core.

I actually think a meta-programming crate that simulates reflection seems like a reasonable approach. For example, we could have things like:

#[reflectable]
struct Entry {
     name: String,
     id: u32,
}

Then we can use the reflection interface like:

fn print_field_name<T: Reflectable>(data: &T) {
    for field_name in data.field_names() {
        println!("{}", field_name);
    }
}

But I don't think this kind of thing is worth a change to the language itself, and I think this is what @gbutler means. In any case, introducing run-time reflection into a systems language always seems harmful. Plus, Rust is already in a better situation than C and C++, since we already have much better meta-programming mechanisms. So my question is: are we seriously making Rust another Java?
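
To make the #[reflectable] idea a bit more concrete, here is a minimal sketch of what such a macro might expand to. The Reflectable trait, its field_names signature, and the hand-written impl are all assumptions for illustration; no existing crate is implied.

```rust
/// Hypothetical trait that a `#[reflectable]` attribute macro could implement.
trait Reflectable {
    /// Field names in declaration order.
    fn field_names(&self) -> &'static [&'static str];
}

#[allow(dead_code)]
struct Entry {
    name: String,
    id: u32,
}

// Hand-written stand-in for the code the (hypothetical) macro would generate.
impl Reflectable for Entry {
    fn field_names(&self) -> &'static [&'static str] {
        &["name", "id"]
    }
}

/// Works for any type that opted in; types without the impl are rejected
/// at compile time rather than at run time.
fn print_field_names<T: Reflectable>(data: &T) {
    for field_name in data.field_names() {
        println!("{}", field_name);
    }
}
```

Calling print_field_names on an Entry would print the two field names; calling it on a type that never opted in is a compile error.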

12 Likes

Could you describe what those are and why they're not achievable at compile time?

4 Likes

I think there's one core requirement for anything in this area:

  • It cannot add binary size or runtime cost by default, and thus must be opt-in.

As such, it feels like it'll always be some sort of trait+derive or similar mechanism, and thus I think it should probably be prototyped outside the language/stdlib for now.
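
As a rough sketch of what "opt-in, no cost by default" could look like, the metadata can live in associated constants of a trait, so only types that implement it carry any data. FieldInfo, TYPE_NAME, FIELDS, and describe are hypothetical names, and the impl is written by hand where a derive would normally generate it.

```rust
/// Hypothetical opt-in trait: metadata lives in associated constants,
/// so types that do not implement it add nothing to the binary.
trait FieldInfo {
    const TYPE_NAME: &'static str;
    /// Pairs of (field name, type name).
    const FIELDS: &'static [(&'static str, &'static str)];
}

#[allow(dead_code)]
struct Point {
    x: f64,
    y: f64,
}

// Stand-in for what a derive macro might emit.
impl FieldInfo for Point {
    const TYPE_NAME: &'static str = "Point";
    const FIELDS: &'static [(&'static str, &'static str)] = &[("x", "f64"), ("y", "f64")];
}

/// Builds a textual description purely from the static metadata.
fn describe<T: FieldInfo>() -> String {
    let fields: Vec<String> = T::FIELDS
        .iter()
        .map(|(name, ty)| format!("{}: {}", name, ty))
        .collect();
    format!("{} {{ {} }}", T::TYPE_NAME, fields.join(", "))
}
```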

4 Likes

As someone who programs heavily in Java, I believe most of the issues caused by runtime reflection came from the ability to reach into the private internals of classes (mostly private fields).

2 Likes

Two other data points:

  1. Dagger is a compile time dependency injection library for Java.
  2. Serde basically provides a visitor interface for class values - it's an example of something that would traditionally be implemented (in languages with runtime reflection) using runtime reflection
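
For readers who haven't seen the visitor style, here is a heavily simplified sketch of the idea. It is not serde's actual API: FieldVisitor, Visit, KeyValue, and to_key_value are made-up names, and the impl for Entry stands in for what a derive would generate.

```rust
/// Simplified visitor: one callback per supported field type.
trait FieldVisitor {
    fn visit_str(&mut self, name: &'static str, value: &str);
    fn visit_u32(&mut self, name: &'static str, value: u32);
}

/// Implemented (typically via derive) by types that expose their fields.
trait Visit {
    fn accept<V: FieldVisitor>(&self, visitor: &mut V);
}

struct Entry {
    name: String,
    id: u32,
}

// Stand-in for derive output: walks the fields in declaration order.
impl Visit for Entry {
    fn accept<V: FieldVisitor>(&self, visitor: &mut V) {
        visitor.visit_str("name", &self.name);
        visitor.visit_u32("id", self.id);
    }
}

/// One possible visitor: renders fields as `key=value` pairs.
struct KeyValue(Vec<String>);

impl FieldVisitor for KeyValue {
    fn visit_str(&mut self, name: &'static str, value: &str) {
        self.0.push(format!("{}={}", name, value));
    }
    fn visit_u32(&mut self, name: &'static str, value: u32) {
        self.0.push(format!("{}={}", name, value));
    }
}

fn to_key_value<T: Visit>(value: &T) -> String {
    let mut visitor = KeyValue(Vec::new());
    value.accept(&mut visitor);
    visitor.0.join(",")
}
```

The output format is decided entirely by the visitor, and the type only describes its own shape, all resolved at compile time.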
1 Like

I agree. On the other hand, many classic use cases for run-time reflection (such as serialization and ORMs) actually do rely on access to private data fields.

In Rust, those use cases are generally handled by having the crate's author opt-in to such features, which I think is a saner approach (since having these features adds extra library backcompat constraints, the library author should be aware of them).

2 Likes

While writing more examples, I noticed that all of them can be covered using the #[reflectable] approach noted by @haohou, with more or less additional work. It's hard for me to tell which approach would be better when we factor in productivity, efficiency and design in general. I think both have their pros and cons, and until I can actually use attribute-based reflection in practice and gain experience, it's impossible to decide. I'll try to explore such compile-time reflection, so the current proposal can be ignored (for now at least). Thank you for all your comments.

8 Likes

For compile-time reflection, something like Julia’s generated functions could come in handy.

For other code, they behave as generic functions. They are, however, proc macros that are expanded at type specialization time (with concrete type info on which we could reflect).

Edit: They must return some AST that will become the body of the specialized function.

I don't see this mentioned yet, so: the coupling problem here is real and we all encounter it all the time, in the form of orphan rules. Every type that you want to use with serde must implement its traits in its defining crate (or serde's, but that's usually irrelevant). A third crate can't come along and add that impl itself, and it can't even use a newtype wrapper, because that would prevent the use of #[derive].
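
A small sketch of the orphan rule and the newtype workaround, using only std so it stays self-contained. HumanDuration is a made-up wrapper, with fmt::Display standing in for a serde-style foreign trait.

```rust
use std::fmt;
use std::time::Duration;

// `impl fmt::Display for Duration { ... }` would be rejected here (E0117):
// both the trait and the type are defined in another crate.

/// Local newtype wrapper; implementing a foreign trait for it is allowed.
struct HumanDuration(Duration);

impl fmt::Display for HumanDuration {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let secs = self.0.as_secs();
        write!(f, "{}m{}s", secs / 60, secs % 60)
    }
}
```

The catch is that HumanDuration is a distinct type: a struct with a plain Duration field does not pick up this impl, so a derive on that struct still fails unless the field's type is changed to the wrapper.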

So it's certainly possible to write a serde-like derive to add support for reflection to individual types, but it would not solve this problem. The pre-RFC proposes to solve it by, essentially, auto-deriving a universal serde-like trait on every type, avoiding the need for crate authors to implement all "glue" traits any of their consumers will ever need.

There is an alternative approach that actually provides the same decoupling as reflection while also remaining opt-in: first-class modules. If you're unfamiliar, they're basically "traits without the orphan rule." This obviously has its downsides (you can no longer automatically pick a single canonical impl; some forms of interop become harder), but it also has upsides (third-party crates could implement the "reflection protocol" for types they don't own; you can have more than one impl per type).
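
Rust has no first-class modules, but the flavor can be approximated today by passing the implementation explicitly as a value. The sketch below is only an emulation under that assumption; Encode, AsSeconds, AsMillis, and render are hypothetical names.

```rust
use std::time::Duration;

/// A "protocol" parameterized by the target type. The implementing type
/// (the encoder) is local, so coherence is satisfied even when `T` is foreign.
trait Encode<T> {
    fn encode(&self, value: &T) -> String;
}

/// Two competing implementations for the same foreign type.
struct AsSeconds;
struct AsMillis;

impl Encode<Duration> for AsSeconds {
    fn encode(&self, value: &Duration) -> String {
        format!("{}s", value.as_secs())
    }
}

impl Encode<Duration> for AsMillis {
    fn encode(&self, value: &Duration) -> String {
        format!("{}ms", value.as_millis())
    }
}

/// The caller chooses the implementation; there is no canonical one.
fn render<T, E: Encode<T>>(encoder: &E, value: &T) -> String {
    encoder.encode(value)
}
```

This shows both sides of the trade-off: more than one impl per type is possible, but the caller has to name the one it wants at every call site.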

We probably don't want to tack on an entirely new module system alongside Rust's trait system, but it is something to look to for inspiration. Perhaps we could get derive-like access to, or compile-time reflection data for, external type definitions? Perhaps we could generate serde-like code from this information without tying it to a trait, or implement a trait for a newtype?

5 Likes

Although it is often used, I think serde is really not a great example for orphan rule discussions.

A third-party crate could not derive serde's traits for an external type even if the orphan rules were relaxed, because it doesn't have access to private data fields, which is a prerequisite for lossless round-trip serialization. At the end of the day, serde's derive macros are just code generators; they don't let you write code that the language's privacy rules would not allow you to write in the first place.

Exposing the private fields that serde needs to look at is, in turn, a choice with profound API compatibility implications that should definitely be opt-in. The opt-in could be some variant of the proposed compile-time reflection concept, but in this particular case what you would need is "just" a convenience shortcut for marking all data fields pub.

That would still not be enough, however. You would need one more language extension in order to be able to derive serde's trait for external types without duplicating the work done on serde_derive. Since derive macros are code generators that take a type definition as input and produce trait implementations as output, one thing which would be missing in order to use them on external types, is a way to tell the compiler "please take the definition of this external type and feed it to this derive macro".

Overall, allowing crates to derive serde's traits for external types would require significantly more than relaxing orphan rules (which itself is a bit of a minefield). It would require at least...

  1. An opt-in way to exhaustively leak all data fields from a type, so that private fields can be accessed by external serde trait impls.
  2. A way to feed an external type definition into a derive macro, so that the logic of serde's derives doesn't have to be completely duplicated.
  3. And, only after that, relaxed orphan rules so that the trait implementations produced by the derive macro are accepted by the compiler.
6 Likes

As far as I understand, doing that would extend the amount of code that would need to uphold invariants of a module to all code that could access/modify the private fields. That would make understanding whether unsafe blocks were sound next to impossible I believe and would seriously compromise the safety guarantees of Rust.

Is that not correct?

3 Likes

Those requirements sound reasonable to me (and I think #2 could be skipped: it's a big convenience, but not a necessity).

It's a hard problem, but I think it's a real issue worth solving. Rust currently can't have "adapter crates", e.g. a 3rd party can't make chrono implement the ToSql trait (how to do that safely and reliably is an open question, but I think it's clear that having the ability to do so is useful).

2 Likes

If it's opt-in, it's safe. It's merely another way of making fields public, so it is as safe as the existence of pub. Of course, if the syntax for it is obscure it could confuse someone and lead to errors, but it doesn't have to be like that. In the worst case it could be done by adding pub(but only via reflection) to all fields.

And I don't think it's even necessary. Lots of types could be serialized and deserialized using only their public interface (e.g. Serde serializes std::time::Duration just fine without having access to its fields).
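
For instance, a lossless round trip for std::time::Duration needs nothing beyond its public API. This is a sketch of the idea rather than serde's actual code; duration_to_parts and duration_from_parts are made-up helper names.

```rust
use std::time::Duration;

/// "Serializes" a Duration into plain data using only public accessors.
fn duration_to_parts(d: &Duration) -> (u64, u32) {
    (d.as_secs(), d.subsec_nanos())
}

/// Rebuilds the Duration through its public constructor.
fn duration_from_parts(parts: (u64, u32)) -> Duration {
    Duration::new(parts.0, parts.1)
}
```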

1 Like

I'm having a difficult time squaring that circle. If something outside the module can modify the private fields of a module, then all unsafe blocks in that module would be undefined behavior because you could never know what other things are not upholding the necessary invariants. Is that not so?

If that's the case, then why not just make them public? Are you thinking it would be something like "protected" or more likely "friend" access where a module can declare "friend" modules that can have access to private members?

2 Likes

The reasoning is that if something else could cause undefined behavior, you would not opt-in to reflection. The same way you don't opt in to fields being public.

Another way to solve it is to make fields public for reading only. AFAIK you can't cause UB just by exposing a field's value.

Writing to a private field could be forbidden (so you solve serialization, debugging, magic DI, but not deserialization in general).

Another solution would be to mandate writing to a field, or construction of a struct, only via a function provided by the type. So the function would have an opportunity to recheck and enforce invariants before accepting fields' values.
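
A sketch of that last option with today's Rust: the field stays private and every write goes through a function that rechecks the invariant. Even, try_new, and half are hypothetical names chosen for the example.

```rust
mod even {
    /// Invariant: the wrapped value is always even.
    pub struct Even(u32);

    impl Even {
        /// The only way to construct an `Even` from outside the module.
        pub fn try_new(value: u32) -> Result<Even, String> {
            if value % 2 == 0 {
                Ok(Even(value))
            } else {
                Err(format!("{} is not even", value))
            }
        }

        /// Read-only access to the private field.
        pub fn get(&self) -> u32 {
            self.0
        }

        /// Code inside the module may rely on the invariant.
        pub fn half(&self) -> u32 {
            self.0 / 2
        }
    }
}

use even::Even;
```

A deserializer that is forced to go through try_new can produce an error for bad input, but it cannot produce an Even that breaks the invariant.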

3 Likes

So, then, something like "protected" or "friend" access, but, with read-only to fields directly and write only through a provided "protected/friend" interface that had specific "setters"?

What about interior mutability of private fields? If I could read such a field, I could then modify it, no? Wouldn't that also potentially violate invariants?

1 Like

Hm, interior mutability is indeed a problem here. I don't think Rust currently has a way to enforce look-but-don't-touch for such types.
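
A minimal illustration of the problem, assuming "read-only reflection" hands out a shared reference to the private field. reflect_hits is a hypothetical stand-in for that access, not a real feature.

```rust
mod counter {
    use std::cell::Cell;

    pub struct Counter {
        hits: Cell<u32>,
    }

    impl Counter {
        pub fn new() -> Self {
            Counter { hits: Cell::new(0) }
        }

        pub fn hits(&self) -> u32 {
            self.hits.get()
        }

        /// Stand-in for hypothetical read-only reflection on the private field.
        pub fn reflect_hits(&self) -> &Cell<u32> {
            &self.hits
        }
    }
}

use counter::Counter;

/// Holds only a shared reference, yet still changes the private state.
fn tamper(c: &Counter) {
    c.reflect_hits().set(999);
}
```

With a plain u32 field the shared reference would be harmless, which is why the read-only idea works for most fields but not for Cell, RefCell, atomics and the like.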

Yet another way I see is to allow a type to expose a projection of itself. A type Foo could say you can inspect FooView<'_> that contains only fields it wants to expose, in a way it can expose them safely. For deserialization it'd have to provide something like From<FooView>.
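
Roughly what I have in mind, written with today's Rust. Foo's fields, the cache field, and the view method are made up for the example; only FooView and the From impl come from the description above.

```rust
mod foo {
    pub struct Foo {
        name: String,
        id: u32,
        // Internal state that the type chooses not to expose.
        #[allow(dead_code)]
        cache: Vec<u8>,
    }

    /// Projection containing only what `Foo` is willing to show.
    pub struct FooView<'a> {
        pub name: &'a str,
        pub id: u32,
    }

    impl Foo {
        pub fn new(name: &str, id: u32) -> Foo {
            Foo { name: name.to_string(), id, cache: Vec::new() }
        }

        /// Borrowing, read-only projection of the exposed fields.
        pub fn view(&self) -> FooView<'_> {
            FooView { name: &self.name, id: self.id }
        }
    }

    /// Deserialization path: `Foo` rebuilds its private state itself.
    impl<'a> From<FooView<'a>> for Foo {
        fn from(view: FooView<'a>) -> Foo {
            Foo { name: view.name.to_string(), id: view.id, cache: Vec::new() }
        }
    }
}

use foo::Foo;
```

Generic code would only ever see FooView, so Foo keeps full control over what leaks out and over how its hidden state is reconstructed.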

1 Like

Cannot any type do such a thing today?

They can, but it's not tied to a universal trait, and it's not relaxing orphan rules. I was thinking about it more as an implementation detail for a hypothetical safe-reflection feature, rather than as a feature itself.

OK, so then, what you're saying might be useful is a projection that a type can define, which permits 3rd-party implementations of traits against the projection but not against the actual type. Does that sound about right?

So type Foo could define FooProjection, and 3rd parties could implement traits against FooProjection? Could that better solve these problems? There would need to be a way for #[derive] to get fed the definition of FooProjection though, wouldn't there?