Pre-RFC: Runtime reflection

Hi, I would like to ask for your opinions on an idea about introducing a runtime reflection mechanism:

Summary

The proposed feature is an opt-in runtime reflection mechanism allowing inspection of type information along with its properties.

Motivation

Rust currently doesn't support any runtime reflection mechanism, apart from the TypeId unique identifier. This makes it impossible to implement patterns like Dependency Injection which rely solely on runtime type information. The only way to work around such problems is compile-time use of attributes, which is not always possible and results in tight coupling between the affected types and attribute crates.

By allowing developers to opt in to runtime reflection, we enable a whole class of high-level patterns and ease development by removing the burden of adding compile-time workarounds, with the possible benefit of reducing build times.

Guide-level explanation

By enabling feature-gated runtime reflection, the programmer can inspect type information without the hassle of using attributes/macros. For example, let's consider a model entity which we would like to map to a database:

pub struct Entity {
    pub id: i32,
    pub name: String,
}

One way of making the struct persistable in the DB is to manually write functions which know how to map each field to a DB column. This approach has some disadvantages:

  • Requires manual work.
  • Requires manually keeping the logic up-to-date with the model.
  • Doesn't scale - the amount of work required gets multiplied by each type we want to support.
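As a rough illustration of the manual approach, here is a minimal sketch. The `entity_to_columns` helper and the `i32` type chosen for `id` are assumptions made for this example, not part of the proposal:

```rust
// Hypothetical domain type, modelled on the `Entity` struct above
// (the `i32` type for `id` is an assumption).
pub struct Entity {
    pub id: i32,
    pub name: String,
}

// Hand-written mapping: one (column name, value as text) pair per field.
// Every new field, and every new entity type, needs another edit like this.
pub fn entity_to_columns(entity: &Entity) -> Vec<(&'static str, String)> {
    vec![
        ("id", entity.id.to_string()),
        ("name", entity.name.clone()),
    ]
}
```

Adding a field to `Entity` without touching `entity_to_columns` still compiles, which is exactly the "keeping the logic up-to-date" problem listed above.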

Another possibility is to use attributes/macros - this is the way the diesel crate currently operates. The code needed to persist the entity is generated at compile time by the use of macros representing the DB structure, and attributes on the entity types themselves. This approach also has problems:

  • Necessitates the use of macros, which have to be written either by hand or with a dedicated CLI.
  • Attributes add coupling between the model and the specific DB crate.

Finally, we can leverage runtime reflection to extract all the necessary information from the type itself to dynamically create the required database mapping. This solves the problems of manual work (none needed), keeping the code up to date (happens automatically at runtime) and coupling with external crates. Also, we get automatic scaling with an increasing number of types involved without scaling up the build time. Consider a simple example:

fn persist<T>(entity: &T) -> Result<(), String> {
    // get the information about the type
    // the actual API will be discussed later in the reference documentation
    let reflection = std::reflect::get_type_info(entity).expect("Cannot get type info!");
    match *reflection {
        std::reflect::Type::Struct(info) => {
            // get field information
            match info {
                // iterate over struct fields and save them
                std::reflect::StructType::Named(struct_info) => for field in struct_info.fields {
                    match field.type.expect("Cannot get field type!") {
                        std::reflect::Type::I8 => {
                            let value: &i8 = field.as_field_ref(entity).expect("Cannot get field value!");

                            // call imaginary persist_field() method with our concrete field value and its name to a table having the name of the struct
                            persist_field(value, field.name, reflection.get_name())?;
                        },
                        // handle rest of the possible types
                        ...
                    }
                },
                _ => return Err("Only named fields are supported!".to_string()),
            }
            // all fields were persisted successfully
            Ok(())
        },
        _ => Err("You can only persist structs!".to_string()),
    }
}

By using runtime reflection, we instantly add support for persisting our entities without any changes in the crate itself, thus achieving clear separation of concerns. We no longer need to use macros or do any maintenance work relating to changes in data.

Reference-level explanation

Proposed API (everything is assumed to live in the std::reflect namespace):

enum Type {
    Bool,
    I8,
    U8,
    I16,
    U16,
    I32,
    U32,
    I64,
    U64,
    I128,
    U128,
    ISize,
    USize,
    F32,
    F64,
    Char,
    Struct(StructType),
    Enum(EnumType),
    Tuple(TupleType),
    Array(ArrayType),
    Union(UnionType),
    Fn(FnType),
    Closure(ClosureType),
    Slice(SliceType),
    Ref(RefType),
    ConstPtr(PtrType),
    MutPtr(PtrType),
    Never,
    Unit,
}

impl Type {
    /// Returns the name of the type. The naming rules are the same as for `std::any::type_name`.
    const fn get_name(&self) -> &'static str;
}

The Type enum represents the basic block of information which can be further inspected to get type details. This information should be statically available and provided by the compiler. Retrieval is done by using:

const fn get_type_info<T>(subject: &T) -> Option<&'static Type>;

The function accepts a reference to a subject of any possible type, returning optional type information. Because reflection is an opt-in feature, the return type is an Option: when the feature is disabled, no type information is emitted, which avoids adding code and data that are not needed. Also, the returned information is assumed to be an immutable reference to static memory stored in an unspecified (from the user perspective) location. Concrete type-level details can be examined further. Note: the function should retrieve information for the concrete type behind dyn and impl Trait values, thus making it generic at runtime.
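To make the note about dyn values more concrete, here is a minimal sketch of how such a lookup can be emulated today with an opt-in trait. The `Reflect` trait, the pared-down `Type` enum and the static descriptors are all hypothetical stand-ins; the proposal would have the compiler provide this data instead:

```rust
// Pared-down, hypothetical stand-in for the proposed `Type` enum.
#[derive(Debug, PartialEq)]
pub enum Type {
    I32,
    Struct {
        name: &'static str,
        fields: &'static [&'static str],
    },
}

// Hypothetical opt-in trait: each participating type points at its descriptor.
pub trait Reflect {
    fn type_info(&self) -> &'static Type;
}

pub struct Entity {
    pub id: i32,
}

// Descriptors live in static memory, as the proposal assumes.
pub static ENTITY_TYPE: Type = Type::Struct {
    name: "Entity",
    fields: &["id"],
};
pub static I32_TYPE: Type = Type::I32;

impl Reflect for Entity {
    fn type_info(&self) -> &'static Type {
        &ENTITY_TYPE
    }
}

impl Reflect for i32 {
    fn type_info(&self) -> &'static Type {
        &I32_TYPE
    }
}

// Goes through `dyn Reflect`, so the concrete type is discovered at runtime
// via dynamic dispatch rather than through a generic parameter.
pub fn get_type_info(subject: &dyn Reflect) -> &'static Type {
    subject.type_info()
}
```

Unlike the proposed function, this sketch only works for types that implement the trait, so it cannot return `None` for arbitrary types.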

enum StructType {
    Named(NamedStruct),
    Unnamed(UnnamedStruct),
    Unit,
}

struct NamedStruct {
    fields: &'static [FieldNamed],
}

struct UnnamedStruct {
    fields: &'static [FieldUnnamed],
}

Basic information about a structure and its contents.

struct FieldNamed {
    name: &'static str,
    type: Option<&'static Type>,
}

impl FieldNamed {
    /// Returns an immutable reference to the field, given the containing struct reference, or None if the field is inaccessible, nonexistent or has a wrong type.
    const fn as_field_ref<'a, T, S>(&self, parent: &'a S) -> Option<&'a T>;

    /// Returns a mutable reference to the field, given the containing struct reference, or None if the field is inaccessible, nonexistent or has a wrong type.
    const fn as_field_mut<'a, T, S>(&self, parent: &'a mut S) -> Option<&'a mut T>;
}

struct FieldUnnamed {
    type: Option<&'static Type>,
}

impl FieldUnnamed {
    /// Returns an immutable reference to the field, given the containing struct reference, or None if the field is inaccessible, nonexistent or has a wrong type.
    const fn as_field_ref<'a, T, S>(&self, parent: &'a S) -> Option<&'a T>;

    /// Returns a mutable reference to the field, given the containing struct reference, or None if the field is inaccessible, nonexistent or has a wrong type.
    const fn as_field_mut<'a, T, S>(&self, parent: &'a mut S) -> Option<&'a mut T>;
}

struct FieldNamedUnsafe {
    name: &'static str,
    type: Option<&'static Type>,
}

impl FieldNamedUnsafe {
    /// Returns an immutable reference to the field, given the containing union reference, or None if the field is inaccessible, nonexistent or has a wrong type.
    const unsafe fn as_field_ref<'a, T, S>(&self, parent: &'a S) -> Option<&'a T>;

    /// Returns a mutable reference to the field, given the containing union reference, or None if the field is inaccessible, nonexistent or has a wrong type.
    const unsafe fn as_field_mut<'a, T, S>(&self, parent: &'a mut S) -> Option<&'a mut T>;
}

Information about struct fields for named and unnamed variants. Note: the structs might contain hidden members enabling the necessary functionality.

struct EnumType {
    type: Option<&'static Type>,
    variants: &'static [EnumVariant],
}

struct EnumVariant {
    name: &'static str,
    type: Option<&'static Type>,
}

struct TupleType {
    fields: &'static [FieldUnnamed],
}

struct ArrayType {
    element_type: Option<&'static Type>,
    length: usize,
}

struct UnionType {
    fields: &'static [FieldNamedUnsafe],
}

struct FnType {
    parameters: &'static [ParamType],
    result: Option<&'static Type>,
    is_const: bool,
    is_unsafe: bool,
}

struct ParamType {
    type: Option<&'static Type>,
}

struct ClosureType {
    parameters: &'static [ParamType],
    result: Option<&'static Type>,
}

struct SliceType {
    element_type: Option<&'static Type>,
}

struct RefType {
    underlying_type: Option<&'static Type>,
    is_mut: bool,
}

struct PtrType {
    underlying_type: Option<&'static Type>,
}

Information about the other corresponding types.

All of the above types should implement Debug and may implement Display.

Drawbacks

Generating reflection information might take a non-negligible amount of time and result in large binaries. That's why the feature is gated and needs to be explicitly opted in to. Potential security concerns should also be taken into account.

Rationale and alternatives

Runtime reflection has proven to be instrumental in making high-level patterns possible without introducing tight coupling between components. While heavy in itself, it can provide a cleaner alternative to manually generated code or extensive use of macros and attributes. This proposal lays the foundation of the reflection mechanism, which can be extended further in the future.

Prior art

We can see similar mechanisms present in non-native languages like Java or C#, where reflection is used extensively to provide various features - from simple serialization, to advanced patterns like Inversion of Control, Dependency Injection or Data Mapper. We can see example frameworks like Hibernate, Spring or ASP.NET successfully using reflection as a tool for providing an environment where large projects can focus on domain problems rather than lower-level concerns (serialization, dependency management, domain/persistence layer separation, etc.), reducing the need for manual work.

Unresolved questions

  • How to handle lifetimes? Should they be exposed?
  • How to expose attributes?
  • Should async fn/closure types be handled differently?
  • Currently the API exposes non-public fields in structs but makes them inaccessible - it might be a better idea to skip such fields entirely. Maybe add a visibility enum?
  • Const functions are currently implemented as a "minimal" version, which doesn't support all necessary features needed by this proposal. We need to wait for the full implementation to make everything const.

Future possibilities

Crates like serde, rocket or diesel can use reflection to separate their responsibilities from user code as much as possible. By making all functions const, we also enable reflection to work in constant contexts, thus enabling the possibility to shift the work to the compiler.

6 Likes

Since your key motivator is dependency injection, an example of reflection helping for such would be very beneficial to the RFC.

As this could be implemented entirely with attribute proc macros, a proof of concept implementation would go a long way toward allowing people to understand the utility by playing with it.

(Explicitly no opinion given on desirability of feature)

9 Likes

My explicit opinion on desirability is a definitive "no", and I fail to see how it is "required" for DI.

Yes, there are DI libraries for object-oriented languages (most of them lacking a strong and powerful type system comparable to that of Rust) that support or use reflection, but it by no means is an inherent need that couldn't be substituted for by other techniques.
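As one illustration of such an alternative technique, here is a minimal sketch of a service container that needs nothing beyond `std::any::TypeId` and `Any`. The `Container`, `Config` and `Repository` names are made up for this example and do not come from any existing crate:

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// Hypothetical minimal service container keyed by `TypeId`.
#[derive(Default)]
pub struct Container {
    services: HashMap<TypeId, Box<dyn Any>>,
}

impl Container {
    // Stores one instance per concrete type, replacing any previous one.
    pub fn register<T: Any>(&mut self, service: T) {
        self.services.insert(TypeId::of::<T>(), Box::new(service));
    }

    // Looks the service up by type; `None` if it was never registered.
    pub fn resolve<T: Any>(&self) -> Option<&T> {
        self.services
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

// Illustrative services.
pub struct Config {
    pub url: String,
}

pub struct Repository {
    pub url: String,
}

impl Repository {
    // Dependencies are pulled from the container explicitly; no field
    // inspection is involved.
    pub fn from_container(container: &Container) -> Option<Repository> {
        let config = container.resolve::<Config>()?;
        Some(Repository {
            url: config.url.clone(),
        })
    }
}
```

The wiring in `from_container` is written by hand here; whether that wiring should instead come from a macro or from reflection is the question this thread is debating.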

Compare this with serialization. Almost all popular OO languages have serialization libraries that work via reflection (Java, Objective-C, Python). Not Rust, though, because in Serde it's got something better: an elegant interplay between derive macros and traits, moving most of the churn from runtime to compile-time.
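For readers unfamiliar with the pattern, here is a heavily simplified sketch of the derive-plus-trait idea. It is not Serde's actual API: the `Describe` trait and `render` function are hypothetical, and the impl is written by hand where a derive macro would normally generate it:

```rust
// Hypothetical trait that a derive macro could implement for each type.
pub trait Describe {
    const TYPE_NAME: &'static str;

    // Calls `visitor` once per field with (field name, value rendered as text).
    fn visit_fields(&self, visitor: &mut dyn FnMut(&'static str, String));
}

pub struct Entity {
    pub id: i32,
    pub name: String,
}

// Written by hand here; a derive macro would emit an equivalent impl.
impl Describe for Entity {
    const TYPE_NAME: &'static str = "Entity";

    fn visit_fields(&self, visitor: &mut dyn FnMut(&'static str, String)) {
        visitor("id", self.id.to_string());
        visitor("name", self.name.clone());
    }
}

// Generic consumer: knows nothing about `Entity`, only about the trait.
pub fn render<T: Describe>(value: &T) -> String {
    let mut parts = Vec::new();
    value.visit_fields(&mut |name, text| parts.push(format!("{name}={text}")));
    format!("{}({})", T::TYPE_NAME, parts.join(", "))
}
```

The field list is fixed at compile time, so a consumer such as `render` pays no runtime cost for discovering the structure of the type.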

30 Likes

I would definitely agree with @H2CO3. Without clear use cases demonstrating that this would solve a problem that is not better solved with existing type-system features and macros, this shouldn't even be considered.

This seems like something that should be implemented as a crate with macros and then, if usage is shown to be pervasive, considered for more first-class support in the compiler, but I doubt that would be the case.

5 Likes

@dtolnay has a PoC compile time reflection crate

2 Likes

I agree that DI, and many of the other cases where runtime reflection is traditionally used, can be done using macros and/or attributes. Yet the point of this RFC is to introduce a system where such coupling is not required. Macros/attributes create a very tight coupling between domain entities and the concrete crate which provides them. Such mixing of concerns is not good from a design standpoint, with various levels of resulting problems. Runtime reflection, while admittedly heavy, solves this problem by acting as an intermediary between the domain and the user. Of course, it is not a blanket statement - some might be fine with such coupling and never experience any problems, but I would argue that complex projects need separated layers. In other words, reflection allows you to add functionality without affecting its subjects.

Taking a look at the provided example, we can see how the persistence layer is separated from the domain model, which is currently not easily achievable in Rust (if at all, once we generalize the use cases). Serde is a nice example of this - it's a great crate providing much needed functionality, but it makes the use of attributes necessary. Adding a serde-specific derive to the domain model is conceptually no different than adding functions like serialize_with_serde(&self). While this may be fine in many cases, it might not be fine in others. Things get worse with more high-level patterns/paradigms, e.g. aspect-oriented programming.

My argument is not to replace existing solutions, but to provide an alternative which is used with success in other languages. Having the possibility to use advanced concepts with the safety guarantees and performance of Rust looks very appealing in my opinion. If the community disagrees, I respect that and will not pursue the matter further, yet before dismissing the proposal, I would suggest taking a look at https://autofac.readthedocs.io/en/latest/lifetime/index.html and thinking about how such features could be implemented today in Rust. Can it be done at all? What would the effort be if so?

4 Likes

This kinda looks like it can be solved in a similar way to scoped threads, see crossbeam::scope.

Lifetime management is basically what Rust was made to do.
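For reference, here is a minimal sketch of the scoped pattern being alluded to. It uses `std::thread::scope` from the standard library rather than `crossbeam::scope`, and `parallel_sum` is just an illustrative function:

```rust
use std::thread;

// Sums the two halves of a slice on scoped threads. The threads borrow
// `data` directly, and the borrow checker guarantees they finish before
// the scope (and therefore the borrow) ends.
pub fn parallel_sum(data: &[i32]) -> i32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|scope| {
        let left_handle = scope.spawn(|| left.iter().sum::<i32>());
        let right_handle = scope.spawn(|| right.iter().sum::<i32>());
        left_handle.join().unwrap() + right_handle.join().unwrap()
    })
}
```

The same shape (a closure that receives a scope object, with everything created inside tied to that scope's lifetime) is presumably what is being suggested for scoping dependencies.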

This is a perfect example of why digging into specific, concrete examples is absolutely vital for a proposal like this: overly abstract arguments can be plausibly pointed in any direction. To me, the proc macro API (plus the de facto standard syn and quote crates) already provides the decoupling intermediary we need for compile-time reflection. So without any further details, the fact that diesel has its own proc macro doesn't seem any more problematic than the fact that diesel would have its own reflection function you'd have to invoke if it used runtime reflection instead.

Bringing up aspect-oriented programming was probably the most concrete thing I did see. Although I have only a vague awareness of AoP, the people I've talked to who've used it mostly provide stories of incomprehensible configuration regularly leading to indecipherable startup errors. In contrast, your posts appear to be suggesting that not supporting AoP is a serious design flaw of the language. I don't know who's right, but if the utility of this feature proposal rests on AoP use cases, there definitely needs to be some effort spent fleshing out the AoP use cases that you want Rust to support, for those of us who aren't already familiar with and supportive of them.

(you mention a "provided example" in your post, but unfortunately I have no idea what this is referring to; I only see a reflection API sketch in the OP)

It's probably worth noting that "reflection" plus "safety guarantees and performance" is almost certainly going to end up meaning compile-time reflection instead of runtime reflection. After all, most of Rust's magic is compile-time magic; at runtime it's doing the same stuff as every other AoT-compiled language (except the "pull-based" futures I guess?).

AFAIK, the major "other languages" that use runtime reflection today do so not because there are any use cases where runtime is better, but either because they don't have a compile time at all, or because they prioritize flexibility over performance, or because they're managed languages which already have other reasons for paying the price of keeping type metadata around at runtime, so "just" exposing it to the language isn't as big of a deal. Is that not true for whatever language you're thinking of?

13 Likes

I would like to challenge the opinion expressed here that relying on runtime reflection alleviates the coupling problem of compile-time approaches.

As far as I know, runtime reflection actually enables whole new categories of software component coupling, to the point where the Java community was contemplating allowing classes to opt-out of reflection at some point because it completely broke class privacy and led to a bacchanal of implementation detail leakage and reliance on internal interfaces. As a result, what looked like harmless library implementation changes with no API impact recurringly ended up breaking client code which (ab)used reflection to stick its nose where it shouldn't have.

Therefore, can you demonstrate that adding runtime reflection actually removes more coupling than it introduces?

14 Likes

That's not really the case - my linked example doesn't deal with scope in the language sense, but with a logical scope in the application sense, bundled with dependency management at runtime. You can have e.g. various dependencies being dynamically controlled at runtime, with their lifetimes tied to each other depending on the situation at a given point in time, without them knowing about each other or about such a system operating at all. This is one example of runtime introspection at work.

I see the problem with being too vague with examples, since my motivation is not directly tied to any single one, rather a class of use cases. I am not arguing specifically for aop, nor am I suggesting repeating the mistakes of other languages. I'm sorry for causing misunderstandings. I will try to clarify the proposal.

2 Likes

I think this would be accomplished in a much better and simpler way by providing a way for any crate to "inject" a #[derive] attribute on types defined in another crate, as long as the crate defining the proc macro opts in to that; the generated code would become part of a crate identified by the proc macro crate (it would have to be the crate defining the traits implemented to respect the orphan rules).

That allows implementing reflection or any similar mechanism as a derive macro and enabling it for any or all types without requiring the crate defining the type to add that derive.

Note that this doesn't work for accessing private fields/methods, but in that case the "coupling" is essential and unavoidable if preserving compatibility guarantees is desired (one could add a way to override this, but it seems inappropriate).

In any case, accessing private members should not be allowed by something like this as "unsafe" relies upon invariants being upheld at the module level on private members. If something like "reflection" could muck with that (like it can in Java) all bets with respect to safety would be off.
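To illustrate the point about module-level invariants, here is a small sketch. The `non_empty` module is invented for this example:

```rust
pub mod non_empty {
    // Invariant: `items` is never empty. The field is private, so only
    // code inside this module can break that invariant.
    pub struct NonEmpty {
        items: Vec<i32>,
    }

    impl NonEmpty {
        pub fn new(first: i32) -> Self {
            NonEmpty { items: vec![first] }
        }

        pub fn push(&mut self, item: i32) {
            self.items.push(item);
        }

        pub fn first(&self) -> i32 {
            // SAFETY: sound only because no code outside this module can
            // empty `items`; the module never removes elements.
            unsafe { *self.items.get_unchecked(0) }
        }
    }
}
```

If a reflection API handed out `&mut Vec<i32>` for the private `items` field, safe code elsewhere could clear the vector, and `first` would then read out of bounds.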

11 Likes

Overall, I think the Pre-RFC would benefit from a discussion on why this has to happen at runtime instead of compile time. Beyond current limitations in const fn, I think it's very plausible that a compile time reflection API could support the use cases enumerated such as building a DI framework or an auto-routing web framework.

I agree private members should not be accessible, and the API already provides for that. Granted, I might not have been clear enough by simply stating that a field might be "inaccessible". The thing I am not sure about is whether such fields should be visible to reflection at all. Nevertheless, I shall update the draft tomorrow with clearer examples.

Agreed, I don't see any advantage in using runtime reflection instead of macros.

In addition, I worry that once we have the reflection ability, we will have introduced an unnecessary burden which makes Rust no longer a zero-cost abstraction language. And I don't see any DI or ORM motivation worth losing zero-cost abstractions for.

Actually, for a native language, a lot needs to be done to support this, such as keeping the layout of each struct around as runtime data. For my personal usage, all of these would be reasons to drop Rust and go back to C.

3 Likes

Your points are covered by having this feature explicitly opt-in, as stated in the proposal. It's understandable that not everyone would need, or even could use this. I'm not advocating forcing features on people; rather giving more tools to those who would like to use them.

2 Likes

Even so, I don't see any need for reflection, since DI can be done with procedural macros, at least for the most part.

I would rather put more effort into better meta-programming infrastructure than introduce a runtime burden. I don't think this fits Rust's design guidelines. We should be careful about adding anything that has a cost. As you can see, even the async feature doesn't introduce a built-in thread pool. Right?

One thing is that I personally don't like putting everything that looks cool into a language, especially things that don't really fit the language's guidelines. I think that's one of the reasons I haven't used C++ for a long time.

2 Likes

I won't discuss the motivation here (others are doing so already) but I would like to note that your proposal is not implementable.

The FieldNamed structure you've proposed here does not have enough information to actually know where to offset into the struct value to get hold of the field's value. Remember that Rust does not have a Uniform Value Representation.
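To show what the missing information could look like, here is a minimal sketch in which each field descriptor carries a byte offset. `FieldInfo`, `ENTITY_FIELDS` and `field_ref` are hypothetical names, the `offset_of!` macro requires Rust 1.77 or newer, and the sketch deliberately leaves type checking to the caller:

```rust
use std::mem::offset_of;

pub struct Entity {
    pub id: i32,
    pub name: String,
}

// Hypothetical field descriptor that also records where the field lives
// inside its parent struct.
pub struct FieldInfo {
    pub name: &'static str,
    pub offset: usize,
}

// Offsets are computed by the compiler, since Rust does not fix the field
// order of a default-layout struct.
pub const ENTITY_FIELDS: [FieldInfo; 2] = [
    FieldInfo { name: "id", offset: offset_of!(Entity, id) },
    FieldInfo { name: "name", offset: offset_of!(Entity, name) },
];

/// Reads the field described by `field` out of `parent`.
///
/// # Safety
/// The caller must guarantee that `field` describes a field of `S` and that
/// the field's type is exactly `T`.
pub unsafe fn field_ref<'a, S, T>(parent: &'a S, field: &FieldInfo) -> &'a T {
    let base = parent as *const S as *const u8;
    // SAFETY: per the contract above, `offset` points at a valid, properly
    // aligned `T` inside `*parent`.
    unsafe { &*(base.add(field.offset) as *const T) }
}
```

A safe version along the lines of the proposed `as_field_ref` would additionally need to store something like a `TypeId` per field so that a mismatched `T` can be rejected with `None`.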

1 Like

I understand your points, but I must disagree with some of them. I see use cases which are not possible to achieve at compile time. I also see problems with the coupling I've already mentioned, which is introduced by macro usage, and which I will try to elaborate on. I do agree about the inherent cost this introduces, hence the opt-in feature. I think we need to agree to disagree here on whether the benefits outweigh the costs.

I assumed hidden fields would allow such a use case. I'll be honest - this is my first Rust RFC and I don't know whether such assumptions are valid and the information can be omitted, or whether I should be explicit about hidden data and intrinsic usage. I apologize if I've been incorrect.