[Pre-RFC] Add fn path qualifier

Summary

Add a new fn path qualifier to the Rust language. When added as a prefix to the path of a use expression, it opts in to 'fn-respecting' import behavior, which allows importing items from parent function scopes.

Motivation

When writing macros, it's often useful to generate new modules. This can be used to prevent users from accessing auto-generated code, and also prevents cluttering up an existing namespace with new names. If code in a macro-generated module needs to reference existing user-defined types, it will need to import them into the module via use super::SomeUserType.

For example, an attribute macro my_attribute might be used like this:

#[my_attribute]
struct Foo;

and expand to the following code:

struct Foo;
mod _my_mod_Foo {
    use super::Foo;
    struct FooWrapper(Foo);
}
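
As a rough sketch of this expansion, a declarative macro can stand in for the attribute macro (a real attribute macro would need a separate proc-macro crate). The macro name wrap_in_module!, its explicit module-name argument, the Wrapper type and the user_code module are all hypothetical and not part of the proposal. The macro is invoked at module level, which is the case where super:: works today:

```rust
// Hypothetical stand-in for `#[my_attribute]`: a declarative macro that
// emits the struct plus a sibling module containing a wrapper type.
macro_rules! wrap_in_module {
    ($name:ident, $module:ident) => {
        pub struct $name;

        #[allow(non_snake_case)]
        pub mod $module {
            // This import works because the macro is invoked at module
            // level, so `super` names the module that contains `$name`.
            use super::$name;

            pub struct Wrapper(pub $name);

            impl Wrapper {
                pub fn new() -> Self {
                    Wrapper($name)
                }
            }
        }
    };
}

// The macro is invoked inside a module so that `super` is well defined.
pub mod user_code {
    wrap_in_module!(Foo, _my_mod_Foo);

    pub fn make_wrapper() -> _my_mod_Foo::Wrapper {
        _my_mod_Foo::Wrapper::new()
    }
}
```

Moving the wrap_in_module! invocation into a function body is expected to fail on the use super::$name line, which is the problem described next.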

Unfortunately, this attribute macro cannot be used on every struct. If the struct is declared within a module (either at the top level of a file, or in an inline mod block), everything will work correctly. However, if the struct is declared within a function:

fn my_fn() {
    #[my_attribute]
    struct Foo;
}

the expanded code will not compile:

fn my_fn() {
    struct Foo;
    mod _my_mod_foo {
        use super::Foo; // ERROR
        struct FooWrapper(Foo);
    }
}

While my_fn will hide Foo from other code, it is ignored by super qualifiers in use statements. That is, it is impossible for code in _my_mod_foo to refer to Foo: paths starting with use self:: will resolve to items inside _my_mod_foo, while paths starting with use super:: will refer to items outside my_fn.

This prevents #[my_attribute] from being applied to all structs. The macro author must make a decision:

  • Require that the macro only be used on structs declared within a module
  • Avoid generating nested modules.

The first option makes the macro much less useful. The second option requires the macro author to generate sub-optimal code, when they might have had perfectly valid reasons for choosing to generate a nested module.

Guide-level explanation

When writing a path (either in a use statement, or when directly referring to a type), you can prefix the path with the special fn qualifier:

use fn::super::FirstType;
use fn::crate::SecondType;
use fn::self::ThirdType;

The fn qualifier acts as a modifier, causing functions to be treated like modules when resolving paths:

struct TopLevelStruct;
fn outer() {
    struct OuterStruct;
    fn inner() {
        enum MyEnum { MyVariant }


        use fn::self::MyEnum::MyVariant;

        use fn::super::OuterStruct;
        use fn::super::super::TopLevelStruct;
    }
}

This snippet will import OuterStruct from its declaration inside fn outer(), and TopLevelStruct from its declaration adjacent to fn outer().

The above snippet behaves like this code when resolving paths:

struct TopLevelStruct;
mod outer {
    struct OuterStruct;
    mod inner {
        use super::OuterStruct;
        use super::super::TopLevelStruct;
    }
}

Without the fn qualifier, function scopes will be ignored by super and self qualifiers. For example, the following code does not compile:

struct TopLevelModType;
mod outer_mod {
    struct OuterModType;
    mod inner_mod {
        struct TopLevelStruct;
        fn outer() {
            struct OuterStruct;
            fn inner() {
                use super::OuterStruct; // ERROR
                use super::OuterModType; // OK
                use super::super::TopLevelStruct; // ERROR
                use super::super::TopLevelModType; // OK
            }
        }
    }
}

While it is still possible to import types declared within modules, any types declared directly within a function will not be seen by the use statement.

Here, fn outer() and fn inner() are invisible to the use statement.

The fn qualifier can also be used when referring to types.

fn foo() {
    struct OuterType;
    mod foo {
        struct InnerType(fn::super::OuterType);
    }
}

You normally won't need to use this feature when writing Rust. You will usually declare types outside of functions, and will almost never declare modules inside of a function.

This feature is primarily useful for macro authors. When writing a macro, it's often useful to generate a module, and then refer to some type that's in scope outside of the module. By using the fn qualifier, your macro can be used either within a module, or within a function.

Reference-level explanation

A new fn path qualifier is added to the language. It can be used as a prefix for any of the existing path qualifiers (crate, self, super, or $crate), or as a prefix for an unqualified path.

The fn qualifier can appear at most once, at the start of a path. It is an error to use the fn qualifier more than once in a path, or to use it in a position other than the start of the path:

use fn::fn::Foo; // ERROR - cannot use 'fn' qualifier more than once in a path
use super::fn::Foo; // ERROR - 'fn' qualifier may only appear at the start of a path

use fn::super::Foo; // OK
use fn::Foo; // OK
struct MyType(fn::crate::Foo); // OK

When applied to a path, the fn qualifier modifies the resolution behavior of that path. Within that path, self and super statements now treat function scopes in the same way as modules.

Specifically, for the purposes of import resolution, the item fn foo() { ... } is considered equivalent to the item mod _unnameable { ... }, where _unnameable represents an identifier that cannot be named by the user. Effectively, functions behave like anonymous modules when resolving self and super in paths.

For example, the following code compiles:

fn myfn() {
    struct Foo;
    mod inner {
        use fn::super::Foo;
    }
}

In this snippet, the super qualifier now resolves to the (anonymous) myfn scope. Items occurring after super will be resolved within the context of myfn, which allows Foo to be referenced and imported.
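
To make the rule concrete, here is a toy model of how super would pick its target scope with and without the fn qualifier. It is only a sketch: it is not how rustc's resolver is implemented, it ignores bare { } blocks, and every name in it (ScopeKind, Scope, nearest_module, resolve_super, example_scopes) is hypothetical.

```rust
// Toy model only; not rustc's resolver. All names are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ScopeKind {
    Module,
    Function,
}

pub struct Scope {
    pub name: &'static str,
    pub kind: ScopeKind,
    // Index of the enclosing scope in the table; `None` for the crate root.
    pub parent: Option<usize>,
}

// Nearest enclosing scope (starting at `from` itself) that counts as a
// module. With `fn_qualified`, function scopes count as anonymous modules.
fn nearest_module(scopes: &[Scope], from: usize, fn_qualified: bool) -> usize {
    let mut current = from;
    loop {
        let scope = &scopes[current];
        if scope.kind == ScopeKind::Module || fn_qualified {
            return current;
        }
        match scope.parent {
            Some(parent) => current = parent,
            None => return current,
        }
    }
}

// Resolve a single `super` starting from scope `from`.
pub fn resolve_super(scopes: &[Scope], from: usize, fn_qualified: bool) -> Option<usize> {
    let this = nearest_module(scopes, from, fn_qualified);
    let parent = scopes[this].parent?;
    Some(nearest_module(scopes, parent, fn_qualified))
}

// Scope table for the `myfn` / `inner` snippet above.
pub fn example_scopes() -> Vec<Scope> {
    vec![
        Scope { name: "crate", kind: ScopeKind::Module, parent: None },
        Scope { name: "myfn", kind: ScopeKind::Function, parent: Some(0) },
        Scope { name: "inner", kind: ScopeKind::Module, parent: Some(1) },
    ]
}
```

In this model, resolving super from inner without the qualifier skips myfn and lands on the crate root, while the fn-qualified lookup stops at myfn.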

The fn qualifier can be applied to paths that do not involve any function scopes. In this case, it does nothing:

struct MyStruct;
mod foo {
    use fn::super::MyStruct; // Equivalent to 'use super::MyStruct'
}

Note that the namespaces introduced by functions are still anonymous, and cannot be directly named. This code does not compile:

fn myfn() {
    struct Hidden;
}
use myfn::Hidden; // ERROR

That is, it is still impossible to refer to types declared inside a function from outside that function.

Currently, fn is a keyword, and cannot be used in paths. This means that adding the fn qualifier to the language is guaranteed to not affect any existing compiling code. Before this RFC, this code does not compile:

use fn::SomeType;

There is no possibility of referring to a module named 'fn', since the path expression itself does not compile. Therefore, this RFC will not change the meaning of any existing legal paths in Rust code.

Drawbacks

This adds an additional niche feature to the Rust language, slightly increasing the complexity of the language spec and implementation.

It may be unclear to users when they should use this new qualifier. However, users are free to ignore this qualifier unless they need to use it - there is no reason to use it unless the user encounters a compiler error when trying to import a type.

This represents another concept that users must keep in mind when reading Rust code. However, since it only affects import resolution, it should usually be clear from context what type is being imported, even for users who do not know how the fn qualifier works.

Rationale and alternatives

  • We could choose a name other than fn for this qualifier. Whatever name we end up picking must not currently be allowed in paths. This ensures that we do not break any existing Rust code by changing the meaning of existing paths.

  • We could do nothing. This leaves macro authors without a guaranteed way to refer to types declared in the parent scope of a module.

  • We could modify the behavior of all paths, without requiring an explicit qualifier. This would almost certainly be a breaking change, as code like this currently compiles:

struct OuterType;
fn foo() {
    struct OuterType; // Never used
    mod inner {
        use super::OuterType;
    }
}

This code will import the top-level OuterType into inner, ignoring the OuterType declared directly inside foo. Any change to path resolution would need to ensure that this code continues to compile, and to refer to the same OuterType.
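
The current behavior can be checked with a small runnable variant of this example. In this sketch the whole thing is wrapped in a hypothetical demo module, and the ORIGIN constants and the origin function are added purely to observe which OuterType the import picks up:

```rust
pub mod demo {
    pub struct OuterType;

    impl OuterType {
        pub const ORIGIN: &'static str = "module level";
    }

    pub fn foo() -> &'static str {
        // Function-local type with the same name. `super::` never sees it.
        #[allow(dead_code)]
        struct OuterType;

        #[allow(dead_code)]
        impl OuterType {
            const ORIGIN: &'static str = "inside foo";
        }

        mod inner {
            // `super` skips the function scope and names `demo`.
            use super::OuterType;

            pub fn origin() -> &'static str {
                OuterType::ORIGIN
            }
        }

        inner::origin()
    }
}
```

Calling demo::foo() yields the module-level constant, so making every path fn-respecting by default would silently change which type this import refers to.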

Prior art

The author is not aware of any prior art.

Unresolved questions

  • What should the name of the qualifier be? fn may not be the best choice, since there's no requirement for the path to actually involve a function scope.

  • Will the parser require any changes in order to distinguish a fn that is part of a path from a fn that starts a function definition?

Future possibilities

In a future edition, it would be possible to change the default behavior of paths. Every path could be treated as though it had an implicit fn qualifier at the start, and explicit fn qualifiers could be deprecated.

Since this is a breaking change, it would have to be done in a new edition. However, rustfix could likely convert existing paths, by adding in super qualifiers whenever a function scope would have been previously skipped over. Unfortunately, such a transformation would not be idempotent, so it would be necessary for rustfix to mark files in some way after transforming them (possibly by adding a comment?).


Why don't you just put the actual struct definition inside the module, make it public, then create a type alias outside the module?

type Foo = _my_mod_Foo::Foo;
mod _my_mod_Foo {
    pub struct Foo;
    struct FooWrapper(Foo);
}

The proper solution here is working hygiene I think?

Macros still have no way of accessing any other items declared in the same scope. Also, this is only possible for attribute macros, which have access to the definition of the type. If a bang macro is used, e.g.

fn foo() {
    struct MyType;
    my_macro!(MyType);
}

it's impossible for the macro to move the definition of MyType.

Moving the struct can also cause existing code to break, due to privacy rules. For example, this code compiles:

fn bar() {
    struct Bar {
        field: u8
    }
    let bar = Bar { field: 25 };
}

but when Bar is replaced with a type alias, it does not:

fn bar() {
    use inner::Bar; // ERROR struct `Bar` is private
    mod inner {
        struct Bar {
            field: u8
        }
    }
    let bar = Bar { field: 25 };
}

The same issue applies to any private fields of the struct being moved. In general, a macro cannot simply change the visibility, as this could result in an unsound API (e.g. if a raw pointer is exposed).


Def-site hygiene does solve part of this. However, it's still useful for macros to be able to generate modules. Without this feature, any module-generating macros face unnecessary restrictions on imports when used within a function.

As far as I can see, here are the three possible ways that fn:: could be used. The RFC shows examples of each of these.

  • use fn::super::path::T — discussed below.
  • use fn::crate::path::T — redundant. This is equivalent to use crate::path::T.
  • use fn::self::path::T or use fn::path::T — redundant. As of "uniform paths" in rust 1.32, these are equivalent to use path::T (or use self::path::T if path collides with a crate name).

Regarding fn::super, in my experience macros that generate modules only ever want:

  • nothing from the current crate in scope in the module, selectively importing individual items from external crates; or
  • everything that is in scope immediately outside of the module to be in scope within the module i.e. the equivalent of use fn::super::*.

With that in mind, here is a loosely considered counterproposal that I believe addresses all of the real use cases that this RFC aims to address but is simpler.

mod the_module {
    use; // everything from immediately outside the module.
}

If you are sticking with fn::super, there needs to be an example of what happens when the mod isn't at the top level of nesting of the function:

fn f() {
    struct S;
    {
        struct S;
        mod m {
            use fn::super::S; // which S and why?
        }
    }
}

That looks very similar to my #[transparent] mod m { ... } counter-proposal to @Aaron1011 - https://github.com/rust-lang/rust/issues/64079#issuecomment-526963015 and below.


EDIT: Transparent modules were also suggested in another context during the module reform discussions. AFAIR the proposal was to make inline modules transparent-by-default in the 2018 edition, but it was too late (I can't find the link right now, cc @Centril ).

  1. First of all, I greatly like the idea of this RFC; or in other words, I greatly dislike the way Rust currently handles the scope and hygiene of items defined in a function's body.

  2. I personally think that macros should not be the only motivation, although in practice it is an important one. To me the limitations of the current design also apply to code refactoring or some code patterns that are blatantly impossible to do. For instance:

    fn main ()
    {
        const FO0: i32 = 42;
    
        fn main () -> Result<(), &'static str>
        {
            match 27 {
                // use `self::`-qualified constants to avoid catch-all match
                | self::FOO => Err("unreachable"),
                | _ => Ok(()),
            }
        }
        
        if let Err(msg) = main() {
            panic!("{}", msg);
        }
    }
    
    error[E0531]: cannot find unit struct/variant or constant `FOO` in module `self`
     --> src/main.rs:8:21
      |
    8 |             | self::FOO => Err("unreachable"),
      |                     ^^^ not found in `self`
    

    If the inner main was defined outside a function definition, the code would "work" (i.e., catch the typo).

  3. That being said, I agree with @dtolnay that fn::super does not convey an intuitive meaning. If we had to have a fn:: path element, I would expect it to behave as some sort of crate:: path modifier, but starting at the (innermost / outermost?) function body. This raises its own questions though:

    • What about nested function definitions?

    • What about fn:: usage outside of a function's body? The most logical solution would be to forbid it, but then macros would be unable to generate "position-independent" code, so the logical choice would then be to make it equivalent to super::?

  4. So we go back to @dtolnay's comment about expanded code needing elements from the scope whence the macro was called. They suggest having a use ; statement. That doesn't feel super readable to me either, although it would be indeed quite useful for macros.

    • But use; would be the only use statement required to be able to use things from the outer scope. To me, use ... is just a namespacing shortcut. But with use; it is not, it becomes a required import.
  5. That's why I think two other paths should be considered. I think that both solutions require an edition change, though:

    • Regarding macros, in the same vein as $crate, a special metavariable expanding to crate with the hygiene of the macro definition site, I would love to have another special metavariable, say $call_site. This way use ; behavior could be achieved with use $call_site::*;

      It would also have other applications, such as, when using a macro like this:

      mk_foo! { enum Enum { Variant } } // generates a `foo!` macro
      

      What if the generated foo! macro wants to expand to code unambiguously referring to Enum::Variant? With $call_site (the one from mk_foo!), it would simply use $call_site::Enum::Variant (if the nested macro example feels contrived, trust me, it is not; I have stumbled upon this limitation a few times already).

    • Finally here comes the cleanest solution in my opinion (it is, at least, the most consistent one).

      In a future edition, make fn foo ... define a scope similar to mod foo, but where all the internal definitions are private (with a warning lint against using pub on such declarations).

      This way, use ; behavior could be replicated with use super::*;


That sounds similar to my 'Future possibilities' proposal.

As far as I know, there's no timeline for when the next edition would occur. This would make this feature completely unusable for an indefinite amount of time. This seems like an easy way for bugs to creep in, since it will be impossible for anyone to actually use this (even on nightly).

I think it would be useful to have a solution that works on the current edition, which could possibly have a better syntax on a new edition.

That's a good point - it does seem a little strange to have a path qualifier that just modifies other path qualifiers.

Here's another idea: we could mimic the pub(modifier) syntax, and allow something like

use(new_behavior) super::foo;

where new_behavior is some bikesheddable identifier that has the same effect as the fn qualifier would have.

This has the disadvantage of being unusable outside of a use statement. That is, both of these are possible with the fn qualifier:

use fn::super::Foo;
struct MyStruct(fn::super::Foo);

but with the use modifier, only the first is possible:

use(new_behavior) super::Foo;
struct MyStruct((new_behavior)::super::Foo); // Syntax error

However, it gives us the freedom to pick a more meaningful identifier than fn. It also emphasises the fact that we're modifying the behavior of path resolution, rather than simply resolving a normal path relative to the current function.


That's a good point. I think it should resolve to the inner S - that is, the one declared directly above mod m. That preserves the current 'shadowing' behavior - there's currently no way of naming the outer S type from within the { } block.

More generally, when an fn-qualified path is resolving an item inside a function, it will use the 'shadowing' behavior of the closest lexical scope. That is, writing fn::super::MyType should resolve to the same type as when the user writes MyType immediately above the module.

However, it's still possible to access types declared within the same function, but in a higher 'pure-lexical' scope (e.g. a { .. } block, if expression, etc). This preserves the current behavior, where introducing a new 'pure-lexical' scope does not hide any existing items (except for any explicitly shadowed inside the new scope).

For example:

fn outer() {
    struct OuterStruct;
    struct Shadowed(bool); // Unused
    {
        struct Shadowed(u8);
        struct InnerStruct;
        mod inner {
            use fn::super::InnerStruct; // OK
            use fn::super::OuterStruct; // OK
            use fn::super::Shadowed; // Resolves to `Shadowed(u8)`
        }
    }
}

This means that writing an fn-qualified path within a module mymod will not give you access to a type that you could not have referenced from the parent of module mymod - which I think is exactly what we want.
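
The 'same as writing the name immediately above the module' rule can be observed in Rust as it exists today. In this sketch, the tuple expression sits where mod inner would be declared; the function name visible_above_module and the ORIGIN constants are hypothetical and only there to make the resolution visible:

```rust
// `dead_code` is allowed because the types are never constructed and the
// outer `Shadowed` is intentionally unused.
#[allow(dead_code)]
pub fn visible_above_module() -> (&'static str, &'static str, &'static str) {
    struct OuterStruct;
    impl OuterStruct {
        const ORIGIN: &'static str = "function body";
    }

    struct Shadowed;
    impl Shadowed {
        const ORIGIN: &'static str = "function body";
    }

    {
        struct Shadowed;
        impl Shadowed {
            const ORIGIN: &'static str = "inner block";
        }

        struct InnerStruct;
        impl InnerStruct {
            const ORIGIN: &'static str = "inner block";
        }

        // This is the position "immediately above the module": the names
        // resolved here are the ones an fn-qualified `super` would see.
        (InnerStruct::ORIGIN, OuterStruct::ORIGIN, Shadowed::ORIGIN)
    }
}
```

Here Shadowed resolves to the declaration in the inner block, while OuterStruct from the enclosing function body stays reachable.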

How would this deal with overlapping fn and mod items?

mod foo {
  struct Foo;
}

fn foo() {
  struct Foo;
  crate::foo::Foo; // <- what does this resolve to?
}

I think @dhm meant that fn foo would only be a module for the purposes of resolving super in paths. That is, it would still be impossible to explicitly name it in a path (crate::foo::Foo) in your example. That's why using pub on types inside a function would emit a warning - there's no way to reference them from outside the function, so pub won't have any effect.


This?


I thought that functions and modules shared the same namespace, but since the function definition defines a constant they don't actually collide :sweat_smile:
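
A small check of this in current Rust: a module and a function with the same name can coexist, because modules live in the type namespace and functions in the value namespace. The ORIGIN constant is a hypothetical addition used only to observe the result, and a relative path is used instead of crate::foo::Foo so that the snippet does not depend on where it is placed:

```rust
pub mod foo {
    pub struct Foo;

    impl Foo {
        pub const ORIGIN: &'static str = "mod foo";
    }
}

pub fn foo() -> &'static str {
    // Function-local type with the same name as the one in `mod foo`.
    #[allow(dead_code)]
    struct Foo;

    // In a path, `foo::` is looked up in the type namespace, so it names
    // the module even from inside `fn foo`.
    foo::Foo::ORIGIN
}
```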

  • the solution, as @Aaron1011 said, is to imagine that the function defines an anonymous module with an implicit use super::*;, and the inner items thus behave as if being in a classic module (and its being anonymous automagically grants the expected privacy w.r.t. the inner items);

That's not strictly true currently. Although I can't come up with an actual useful example of when you would need pub on a function local type (the main thing I thought of was returning some form of impl Trait, but then the returned type also does not need pub so it's not necessary on types it references).


One drawback that is not mentioned is that it's one more thing to know when writing macros. If you want a macro to be usable generically, you would have to use fn:: to refer to things from modules the macro adds, like you may also have to use $crate in other cases.

That's definitely true. In a future edition, we could make the fn-qualified behavior the default. However, I don't see any way to enable this behavior by default in the current edition without breaking backwards compatibility.

It would only be 'required' in the sense that failing to use the fn qualifier would result in the current suboptimal behavior. If this RFC is accepted, existing macros would continue to work unchanged - using the fn qualifier would just let them work in more places than they would without this RFC.

I'm starting to think that the use(new_behavior) syntax would be preferable to the use fn:: syntax. While it's not directly usable in paths, I don't think it's that big of a deal. You can simply write a use(new_behavior) statement instead of using a fully-qualified path (renaming the import if it would conflict).

The biggest downside that I can see is that this would make use statements 'more powerful' than regular qualified paths. Currently, I believe it's possible to delete every use statement from a project, and only use qualified paths (though I don't know why you'd want to). With this RFC, this would no longer be possible.
