[Pre-RFC] Warn about inaccessible types in public function signatures


#1

This is my first RFC attempt, corrections are welcome.

Problem

Public structs / enums can currently be declared inside private modules and be used in public interfaces (as function return types or arguments). This is currently legal:

pub mod foo {

    mod private_bar {
        pub struct PublicBar;
    }

    pub fn public_to_foo() -> private_bar::PublicBar { private_bar::PublicBar }
}

This is also legal:

pub mod foo {

    mod private_bar {
        pub struct PublicBar;
    }

    pub fn public_to_foo(arg: private_bar::PublicBar) { }
}

This, in itself is not a problem. The problem comes when using public_to_foo() outside of foo:

fn main() { 
   let a = foo::public_to_foo();         // this works, but if the return
                                         // is an enum, you can't match it

   let b = foo::private_bar::PublicBar;  // this doesn't work at all
   foo::public_to_foo(b);  // so no way to pass the argument to a public function
}

Case A is problematic when a is of an Error type. Because there is no way to get the underlying type of the error, since the error (plus the error does not get documented using rustdoc). Playground example

Case B is problematic in any way, making a public function “useless”.

Both cases are very real in library design right now. It is up to the library maintainer to “correctly re-export modules”.The problem is that it is a pitfall when designing libraries: You want the struct to be public, but the struct / enum isn’t actually “public”.

What led to this

In the topic [lang-team-minutes] private-in-public rules @nikomatsakis said:

I support the “wrongly so” and this (pre-)RFC is about changing the behaviour to be a warning like #[warn(inaccessible_struct)].

I noticed this behaviour here (v.0.5.0): https://docs.rs/osm-xml/0.5.0/osm_xml/struct.OSM.html#method.parse

Compare this against v.0.5.1 (fixed): https://docs.rs/osm-xml/0.5.1/osm_xml/struct.OSM.html#method.parse

All because of a missed pub mod errors. The library designer clearly intended the Error struct to be public. But the compiler didn’t warn that the returned error is effectively useless. You can’t implement std::convert::From on it, you can’t match it, the only two things possible are calling unwrap!() or printing / logging the error and continuing with a half-parsed file. It isn’t possible to get the underlying cause and handle a case-by-case “soft fail”.

Solution

The rule I would like to introduce, is:

If a struct / enum is used in a public function (such as a argument or return type), the full module path (e.g. library::container::struct) must be public OR the full module path / the path to the struct must be publicly exported (via pub use).

This also solves the problem that the returned struct isn’t documented (via rustdoc). For structs, this rule is already enforced (you can’t have a nested struct where the nested value is a private struct). For traits, this is also enforced. It would be logical to enforce this in functions, too.

Risks

This should be introduced as a (on-by-default) warning, not a breaking change - after pub(restricted) is stable. So it would not break anything.

Library maintainers who mistakenly did this in their libraries will correct it. Library maintainers who did this as a design decision, can either turn the warning off or use pub(restricted) to limit the scope of their public types.

Thank you for reading.


#2

We use this on purpose in Rayon to “close” traits, so we can extend them with new methods without incurring a breaking change to outside implementations, knowing outsiders can’t exist.

For cases like ParallelString with only one implementation, we could probably do this differently, like requiring AsRef<str> with a blanket impl, and then implementing all methods directly in the trait definition. Now that I’ve thought of that, I might even do it! But I don’t know how to handle Pattern differently.


#3

Actually, this breaking change would be very large, affecting almost everyone. Because pub(restricted) isn’t on stable yet, any time I want to re-export a type upward (even if not all the way pub) I have to use pub right now, so a huge number of types are annotated pub in private modules.

Put another way, not everything marked pub is actually intended to be a part of the public API. For example:

pub struct PublicStruct {
    field: PrivateImplementationDetail
}

mod detail {
    // Public so it can be used as an implementation detail of something in the parent module
    pub struct Private ImplementationDetail;
}

All of my crates do this, so I’m sure this is very common.

I agree that our current system is confusing & error prone, and this is a good example of the problem. I’d be interested in finding a way to migrate to a system in which publicity is determined at the item level instead of being a kind of union of the item and module publicity.


#4

I understand the issue - it seems that making this a breaking change would not be a good idea. A warning would, however, be appropriate. Most library maintainers will at least notice errors in their API design. It could be turned off - instead of the compiler just doing nothing.

I do support the pub(restricted) approach. Projects like Rayon can then use pub(crate) instead of the public / private hack. And the warning should tell you that if you want to restrict the usage of pub types, you should migrate to pub(crate) , etc. I didn’t know this was even a feature in nightly, sorry.


#5

I think Rayon is actually doing this intentionally - they want the trait to be public but not implementable. There may be a better way to express this, but basically crates like rayon are exploiting this gap right now.

There is another use case for the current system: you want to hide were the item is actually defined, but re-export it publicly at a different point. A lot of crates do this to make their public API surface smaller than their internal module structure.


#6

The title of the RFC is unfortunately misleading – the goal is not to eliminate all uses of public structs in private modules, but only [quote] If a struct / enum is used in a public function (such as a argument or return type) [/quote]

where I assume “public function” means one that is accessible to external clients.


#8

(FWIW the better way would be to allow private associated items (methods etc) in a trait, which would make it impossible for anyone without access to the private item to implement the trait, in much the same way as you can’t construct an instance of a struct without having access to its private fields. This is an awkward question syntactically because Rust already made the choice that the only visibility qualifier is pub (there is no priv) and that it’s not required on trait methods.)


#9

Maybe an attribute on the trait could flip it to opt-in pub?


#10

I edited the last section of my first post. First of all this should be introduced after pub(restricted) is stable - pub(restricted) also works for traits. It should be a WARNING, not an error (changed title). Rayon should use pub(crate) to limit the scope of their public types to the rayon crate.

This discussion was already done here: https://github.com/rust-lang/rust/issues/35005 and it ended with “won’t change because current system is working”. Yes, it’s working, but it’s not working nicely.

If the community still says “it’s ok”, we can end this discussion because it would lead nowhere. Exposing non-accessible items in public APIs is not a good design because it confuses outside library users (my opinion). Meanwhile you still want flexibility like exposing structs / traits within your crate, but not expose them in a public API. This could be solved by using pub(restricted).

Rayon, etc. did it this way because they had no other choice. There is simply no way to do it in stable yet. The logical decision would be to postpone this until after pub(restricted), otherwise it would warn needlessly in crates like this.

Again, there are several things I’ve changed in correction to my original post:

  1. It’s a warning, not an error, it can be turned off.
  2. Do this after pub(restricted) is stable and advise people to either check their libraries API or use pub(restricted)

pub should be handled like pub(absolutely_public).


#11

Rayon doesn’t want pub(crate) for those though. We want ParallelString to be globally usable, yet we want to control implementations so we can freely extend it.


#12

I have several questions:

  1. What is the reason against putting ParallelString in a public module?
  2. For people not too familiar with this, could you give an example of how you intend ParallelString to be used?
  3. Does this only apply to implementations of traits? My original post was about this stuff in function signatures, not trait implementations.

#13

It isn’t ParallelString that is private. It is a public trait in a public module. But it has a method that returns an unnameable type. Here is the implementation. PrivateMarker is a publicly unnameable type that is returned from a method of ParallelString. That means that means that the trait can’t be implemented outside of rayon, but is usable anywhere.

This means that new methods can be added to ParallelString, without needing to worry that dependent crates have implemented that trait on their own types (and thus be broken when new trait methods are added). The point of ParallelString is to add new methods to str, rather than to define a new type of thing, so it makes sense to be able to add stuff, as new uses occur.

Regarding (3), the type occurs in the method signature of a trait method (and incidentally in the trait implementation).


#14

Right - it’s a fully public trait apart from that one hidden method, which uses the pub-in-private trick to let us control implementations.

That said, I opened rayon#327 to make it so we can always implement the methods directly in the trait, with a blanket impl for anything meeting the constraints. So we should get the desired ability to add more methods without incurring a breaking change.

I still don’t have a good way to deal with rayon::str::Pattern. That’s not an API I really want users to call at all – they just need to know what types implement it to use with par_split. If we had stability attributes, I would mark it #[unstable] like std::str::pattern::Pattern. Or if we had the means to define private trait items, I would mark all of its methods private, so only its basic existence is known outside the crate.


#15

I think rayon’s PrivateMarker sounds like a standard enough technique, so if this warning were added then we’d need #[private_allowed_in_public] for traits and types like PrivateMarker or whatever.

One could however reduce its necessity by supporting inherent impls for traits : https://github.com/rust-lang/rfcs/issues/1971 And conceivably a trait could be private while its inherent impls created public methods.


#16

Just asking, but what if you would do: pub(crate) trait ParallelString // pub(crate) trait Pattern? This way nobody outside of the crate can implement the trait, but the functions, etc can be public. The struct PrivateMarker should then also be pub(crate) to disallow other crates to use it.

Or am I thinking the wrong way?


#17

I think the real problem with this proposal has to do with re-exports. Essentially what you want is to actually fully prove that every item used in the public API is actually publicly accessible, whereas right now we take a rather weak best effort approach. The text says that it should be accessible at its canonical path, but that’s pretty problematic - many items are only accessible through a pub use self::whatever::Whatever; type of re-export, and this is intentional.

But a check that accepts those re-exports would have to walk the entire module graph (which could be cyclic) looking for re-exports (which could be transitive). I think solving this perfectly is NP hard or something like that, which is why this incomplete solution was accepted right now.

A solution that I think would satisfy @sharazam’s intent and also be feasible to implement is this: if an item is marked pub, but is not accessible at its canonical name, there will be a warning. You can supress this warning by tagging the item allow(something_or_other), which you have to do if you’ve used a re-export to expose it and don’t want to expose it at its true name.

While that would make it less likely that errors like this occur, I’m not sure that most users would be happy having to add these annotations. I think a large portion of public items are only public through re-export. There’s a trade off here.


#18

Rayon does both:

  • We have traits that we want to be public, but not implementable.
  • We also have types that have to be public because of the “public type in private API” rules, but where what we really want is pub(restricted).

#19

Yes, I am roughly in favor of such a lint, now that pub(restricted) is an option. Or at least I’d like to give it a try and see how it feels.

Another interesting thought experiment that @wycats proposed to me at one point: what if pub(export) meant public outside the crate, and pub meant pub(crate)? I feel like I would (in Rayon) very much appreciate the explicitness of that. Obviously not backwards compatible, but this intuition – that it’s worth calling out the “potentially semver breaking” sort of publicity – might inform other designs (or perhaps if we decided to proceed with some more revolutionary change with an epoch).


#20

In the case of pub fn split<P: Pattern>, I don’t think pub(restricted) will let us restrict just Pattern, and we do want the fn itself to be fully public. So I think here we’d need a public trait with private or pub(restricted) methods.


#21

Right, that falls in the first category (traits that “closed” to impls).