[lang-team-minutes] private-in-public rules

In our most recent meeting, and actually a few meetings, we discussed the Private Type in Public API rules. We’re in a bit of a frustrating place with these rules:

  • Many people are confused as to their intention:
    • Their intention is to ensure that if you have a private struct, then nobody outside of the module can get their hands on an instance of that struct, and nor can they name the type. This implies that methods defined on that private struct (including methods from public traits like Clone which don’t have privacy protections) cannot be invoked except for within the module that defined it. This is a useful guarnatee for unsafe code.
    • The purpose of the rules is not to guarantee that you have properly exported all the things in your API that end-users might want. The initial RFC had that goal, but we scaled it back (perhaps wrongly so).
  • The initial implementation however was riddled with bugs.
  • @petrochenkov did excellent work fixing those bugs, and we now give future compatibility lint warnings for many add’l cases.
  • The expectation was to change those warnings to errors. But when we tried it, we found that a lot of extant code would break.
  • There are also some remaining holes, notably around traits and inference.

So basically we have to choose between:

  • Abandon the rules altogether; I think they play a useful role, though, and am reluctant to do that.
  • Make a minimal set of rules that leave to hard errors and enforce those. We can supply lint warnings for other cases. This would have minimal regressions for existing sources.
  • Enforce the rules as designed. This would regress a fair number of crates.

I’d particularly like to get @petrochenkov’s opinion here. @eddyb can probably supply facts as to the measurements we did about regressions.

What are the guarantees?

Before we do anything, I want to talk a bit about what kinds of properties we are trying to guarantee with these rules. Ideally, we would say that if you have a private struct X, then you know that the only place that can call methods on X or access the fields of X is in your module. But that’s actually not quite true (and you don’t really want that). Consider, for example, that you might want to create a Vec<X>, and then that this Vec will drop values of type X, which is potentially invoking a public API (the drop() method). Similarly, an Arc<X> might invoke clone() on X.

So the real property is something more like this: if you have a private struct X, then you control who can access values of type X. Moreover, when you let a value of type X out of your control (i.e., by passing it to another fn as argument), you know that code outside your module can only access X through members of public traits.

Here are some examples of what I mean. With the originally intended rules, you will find that you can’t just naively return a value of type X:

mod foo {
    struct P;
    pub fn f() -> P { } // illegal: return private type from public fn
}

But you could pass a value to a generic function outside the current module, and this function might use trait members:

fn clone<T: Clone>(t: &T) -> T {
    // this call to clone is not in the module `foo`, but it still accesses
    // a value of type `foo::P` (since T = foo::P)
    t.clone()
}

mod foo {
    #[derive(Clone)]
    struct P;
    pub fn f() { clone(&P); }
}

And you can package up these values as trait objects and they can escape arbitrary far:

trait Object { }
fn to_object<T: Object>(b: Box<T>) -> Box<Object> { b }


mod foo {
    struct P;
    impl Object for P { }
    pub fn f() -> Box<Object> { to_object(Box::new(P)) }
}

But in both of these examples of leaking, access was always mediated by a public trait.

The rules we intended

The current privacy rules are enforced at the definition site, primarily. This means that, for example, it is intended to be an error to define a public method that returns a private type. This way, we know that callers of that public method (which may, in general, be in other modules or crates) can’t get their hands on the private type. However, it could well happen that in practice all callers of the public method are within the current crate. In that case, the fact that it returns a private type isn’t really a problem.

This scenario actually arises fairly often in multi-module setups similar to those I described in my comment on the pub(restricted) RFC. For example, consider this case:

mod foo {
    struct PrivateToFoo;
    mod bar {
        pub fn private_to_foo() -> PrivateToFoo { ... }
    }
}

Here, the function foo::bar::private_to_foo() is public and returns a private type. But because that fn is defined in a private module, it will never actually be used outside of foo. Now, using pub-restricted, private_to_foo() would more properly be declared as pub(foo) fn private_to_foo(), but that is a new feature and there is a lot of existing code that is not using it (and which is currently getting future-compatibility warnings). Under the “error at use site” rules, this code would be accepted, because, while the definition could be used wrong, it isn’t actually being caller from outside of foo.

An alternative approach

@eddyb has been contemplating a more limited approach. After all, if what we want is to guarantee that values of private types are not used outside the module, we can make that an error by making it an error to have an expression whose type winds up being a private struct. That is, we don’t make it illegal to have a public function that might return a private value, but rather we make it an error to call that function from outside the module. So that means the previous example would be legal, since foo::bar::private_to_foo() is never called from outside of foo (and, in fact, is inaccessible outside of foo).

Note that we could still give a lint warning on foo::bar::private_to_foo() – but just a normal lint, warning you that you may have intended for the PrivateToFoo to be public, or else that you may have wanted to declare the function to pub(foo) and not pub.

Here is a variation of the previous example that would get an error under the use site rules. In this version, the module bar is declared as pub, and private_to_foo() is used from main(). This gets an error both today and in the proposed rules, but the site of the error is different:

mod foo {
    struct PrivateToFoo;
    pub mod bar {
        pub fn private_to_foo() -> PrivateToFoo { ... }
        impl PrivateToFoo { pub fn foo(&self) { } }
    }
}

fn main() {
    let x = foo::private_to_foo(); // <-- Error here! Inferred type of `x` is illegal.
}

The problem here is that you may have intended PrivateToFoo to actually be a public struct (ok, you probably would have picked a different name in that case, but bear with me). But you won’t know that there is a problem (at least based on the errors alone) until you try to use your library. That is why we’d still prefer to give a lint in this case suggesting that the public/private declarations are misaligned.

The catch: associated traits

It’s not enough to check the types of expressions for errors. We also want to prevent people from naming a private type from outside the module (and hence keep them from invoking associated functions on that type, like PrivateToFoo::new, and so forth). These names can result from various sources, but one of them is by normalizing associated types. Consider a case like this:

mod foo {
    pub struct Public;
    struct Private;
    impl Iterator for Public {
        type Item = Private;
    }
}

Now any code could write <foo::Public as Iterator>::Item to name the type Private. This is problematic. So we would still want to keep the existing rule that a private type can only appear in an associated item if at least one of the input types (i.e., Public here) is private.

(Similarly we have to close the existing hole that allows trait inference to sometimes infer a use of a private type the user could not have written manually.)

Now that we have to enforce this error in the impl, because otherwise generic code might name the associated type without knowing that it normalizes into a private type:

fn f<T: Iterator>() {
    do_something::<T::Item>();
}

Here I could call f::<foo::Public> and it would wind up using Public::Item which is Private.

Connection between pub use and associated type paths

One can view an impl with an associated type (as in the previous section) as defining a new “name” for a type. The type foo::Private can now also be named as Public::Item – but we make it illegal to define such a path. But there is another way to define an “alias” for an existing path: by using pub use. For consistency, then, we might consider making it illegal to pub use a non-public item:

mod foo {
    pub struct Public;
    struct Private;
    pub use self::Private; // Error here at def'n site?
}

Conclusion

Sorry, I’m finding it hard to organize my thoughts cleanly on this topic. But let me conclude by saying this. The way I see it, the first choice is if we will enforce “strong privacy” guarantees of the kind I described here. I still think we should – though there might be an argument saying it’s not worth the effort. But that would also allow oddities like the ability to infer a type that you would have been illegal for you to write explicitly (i.e., when a public function has a private return type).

Presuming we want to keep that general property, then we have to make some practical choices. I think all things being equal the definition-side enforcement is better, but if we wish to avoid regressions, then it seems pretty acceptable to me do the strict enforcement at use-site, but supply lints at the definition site.

In fact, this may be the more ergonomic option: it means that you can get away with slightly inconsistent public/private declarations (such as public functions that return private types) so long as you don’t use private types from the wrong places (that is the hard limit). Then you can come back and fix the lints once you’ve got your stuff setup.

6 Likes

Thanks for the writeup, @nikomatsakis!

There was one point I found a bit confusing. If we enforce at use site, why are associated types a problem? AFAICT, just being able to name a type by itself is not problematic in that world, because the only things you’d be allowed to do with that type would be mediated by public traits – anything else would be rules out by the hard error at use site. But maybe I’m missing something here?

I actually think that providing a hard error at use site and lint at definition site is a great approach. On the one hand, I believe it provides the strong privacy guarantee that “private types are accessed externally only through impls of public traits”. On the other hand, as you say, it’s the most flexible and ergonomic approach that retains that property.

Regardless of what exactly the compiler should object to, when it comes to patterns that used to work, what about giving those objections the status in between a hard error and a warning, namely a permanent deny-by-default lint? This of course causes breakage but not as much, since Cargo should disable it for dependencies and an end crate can manually disable it as a temporary workaround. After all, there doesn’t seem to be a fundamental soundness or safety hole here. If a safe module gets its hands on a private type accidentally exposed by an unsafe module, unsafety could result, but that’s a bug in the unsafe module, for which a lint is sufficient; there are certainly many other ways for unsafe modules to have bugs. A deny-by-default lint is a bit hackier than an error, to the extent that violations have to be properly supported forever, but sometimes hacks are the cost of backward compatibility.

How does this interact withimpl Trait? Is returning a private typed value as impl of a public trait OK?

@llogiq Yes, “leaking” private types in anonymized form (trait objects or impl Trait) is allowed (this way you can access only something public - the Trait's interface). Otherwise you would never be able to return closures, for example.

2 Likes

Good question, and something that @pnkfelix and I discussed for quite some time. =) It's a question of consistency and guarantees. Let's start with this setup:

trait Factory {
    type Element;
    fn make() -> Self::Element;
}

mod factory {
    struct Priv;
    impl Drop for Priv { ... }

    pub struct Pub;

    impl Factory for Pub {
        type Element = Priv; // illegal now, but we're talking about making it legal
        fn make() -> Priv { Priv }
    }
}

First, consistency. Imagine I write this code (outside of factory):

use factory::Pub;
fn foo() {
    let x: Pub::Element = Pub::make_element();
}

If we are checking the post-inference types, this will be an error. It will be an error because we normalized it to factory::Priv and you are not supposed to have access to factory::Priv. (If you imagine that we're going to distinguish between a type that you wrote in normalized vs unnormalized forms, then the whole idea of checking types post-inference starts to fall apart, since normalization is no longer something we can do at all in the type checker etc, and I'm not willing to go there.) But if I were to write this function differently, it becomes legal:

use factory::Pub;
fn foo() {
    bar::<Pub>();
}

fn bar<T: Factory>() {
    let x: T::Elem = T::make_elem();
}

Now it is legal because bar doesn't know what T is, so we can't normalize the type. I find this inconsistency surprising, but moreover, it means that now bar() has gained access to a factory::Priv. That in and of itself is not so surprising -- I mentioned in the beginning that factory could leak an instance of its private type outside of the dynamic extent of one of its fns, but hidden in an object type. But this is different: we leaked the raw type.

In other words, it's consistency again. If I wrote this function in factory, which is the same as Pub::Factory::make_elem, but not going through a trait:

mod factory {
    fn make_priv() -> Priv { Priv }
}

Then I wouldn't be able to call it from outside factory, right?

I contend that if the rules are this inconsistent, they will be easily misunderstood. If you have a public function that returns a private type -- whether it be declared on a trait or as a free fn -- I feel like the guarantees, such as they are, should be the same.

I'm still struggling to state those guarantees in a precise way. My first stab was that private things could only be manipulated from inside factory. But that's not right (and couldn't be) because factory can invoke generic functions like fn bar<T: Factory>(). Then I thought that such a fn can only be invoked from within the dynamic extent of code in factory, but that's not quite right, because factory could launch a thread, and it could return an object. But what both of those have in common is that factory started out in control of the private value (i.e., it created it) and it leaked it through an object or by calling a generic function from another thread. Those seem "intentional" to me -- i.e., if I am tracing through the code in factory, I will see (or could see) that it happens. In contrast, with the code snippets above, where foo calls bar::<factory::Pub>(), if I just read the code in factory, I cannot know that a value of Priv will get created and dropped from other code -- it's outside of my purview entirely.

Now, you might argue that factory may have created the foreign alias intentionally by defining the impl of Factory:

mod factory {
    ...
    impl Factory for Pub {
        type Elem = Priv; // <-- here
        ...
    }
}

In other words, that's like defining a publicly accessible alias. But that violates the principle we've been striving for, that I should only have to look at where Priv is declared (at the struct Priv declaration, in other words) to know definitively it is a "private" or "public" type. I shouldn't have to look at the transitive closure of pub use and impl items to figure it out.

2 Likes

I view that as the same as returning an object. This fits into the view of "I can leak values of my private types, but foreign code will always use them mediated by a trait". But really the right way to think about it is that if you leak a value to the outside, its entire public API surface becomes accessible. If you want to tightly control that, you should use a newtype with a narrower surface. I say this because even if a private type is leaked as an impl Trait or object, using specialization it could potentially be downcast to other traits. While it is true that, because the type is private, foreign code couldn't "downcast" to the true precise type, they can downcast to other traits and those other traits may easily invoke public inherents methods of your type that you didn't expect.

It's possible, but what does it really buy us? If we adopt the "hard error at use site, lint at definition site", then the lint is really there not to protect you but to warn you that you have an internal inconsistency. i.e., you may have forgotten to declare something as pub, or you may have declared something as pub that you didn't intend to. This view makes returning a private type from a public fn seem analogous to e.g. having a let mut x where x is never mutated. It's not wrong, but it's...suspicious.

Actually, I didn't fully process what you were suggesting. I think you were saying that we could make it a deny-by-default lint at the "use site" as well? I am reluctant to do that, because I think it sends a weird message. First, it suggests to unsafe code authors that this is a guarantee they can rely on, when it is not, since other code might disable the lint. Second, those warnings come at the use site, where unsafe code authors might never see them. I think the lints only make sense at the definition site.

This is an underdocumented detail, thank you for bringing it up.

What is the actual effect of pub vs private? It seems real world effects are derived from other things, like “is publically reachable” (one real world effect would be: “This code is compiled into the library even if not used internally”). When organizing a large crate, I have to mark a lot of things pub to be used across the crate, but these things are never intended to be exposed.

One question is, do I have to answer for these internal pubs somehow? Are they part of a public front of the crate in some way.

Quoting from the "alternative approach" section, the idea is:

making it an error to have an expression whose type winds up being a private struct.

How does the following fit in? All of my types and all of my functions are "pub" but the user can write an expression whose type is not "public" in a different sense. Is this working as intended as is, or is this something we would aim to lint against or prohibit either on my_crate side or the caller's side?

I see @petrochenkov's comment about it being fine to "leak" private types in anonymized form (trait objects, impl trait, closures). This doesn't feel like the same situation but maybe it is?

pub mod my_crate {
    pub use self::structs::Pub;

    mod structs {
        pub struct Pub(pub Priv);
        // pub because it is used elsewhere within my_crate
        // but is not public API
        pub struct Priv;
    }

    pub fn get() -> Pub {
        Pub(structs::Priv)
    }
}

// Users of my crate
fn main() {
    // Cannot name this type
    let _ = my_crate::get().0;
}

That is what this thread is about, isn't it. The goal of my rules is to have a (relatively) succint answer: intuitively, if a module X defines a private struct/enum, then X "controls" all instances of that struct/enum, meaning that code outside of X cannot construct an instance nor can they directly manipulate one, except when (a) intermediated by traits and (b) given access to the instance by X.

This is precisely a scenario I would like to avoid. I want to be able to quickly tell how visible something is from its definition. This has been a motivation for things like pub(restricted) as well: rather than declaring everything that is not private to be public and then having pub use define how public something is, we can use up-front declarations to state our intentions (and then we can check that pub use doesn't expose the item too broadly). This is also advantageous for things like inherent methods, of course, which don't have a mechanism like pub use to limit their use.

It is. The authoritative thing is not the pub use, it is the declaration on the struct (in this case, pub struct Priv). This is precisely the scenario that pub(restricted) was meant to address: you should be able to write pub(my_crate) struct Priv, which then declares that Priv is public to the path my_crate but not outside of that. (Alternatively, pub(super) would have worked here.) This makes the default ("private") state expressable as pub(self).

That said, I think it'd be advantageous to have a lint that warns you about this scenario, because it does represent an inconsistency. Once pub(restricted) is stabilized (hopefully soon), the lint could suggest that you alter the pub declaration on struct Priv (and the pub Priv field) to be pub(super), or that you add an additional pub use, since there is an inconsistency.

3 Likes

I would like to give an example in which I want “provide” private type in public API. Say we have a generic private type Foo<T>, I write generic impl's for it, but do not want to expose this type, because it’s essentially an implementation detail. What matters is derived types which I make public pub type FooBar = Foo<Bar>and pub type FooBaz = Foo<Baz>. Currently compiler forces me to unnecessary move Foo<T> into separate module and I am afraid more strict rules will forbid such kind of public type aliasing altogether forcing me to write more unnecessary code.

@nikomatsakis

what did you mean by "reexport checking in resolve"

Checks preventing pub use Private;. They are performed early and work on "name level"/"path level" (as opposed to "type level") and prevent suddenly turning private items into public items by reexporting them. By checking types alone we cannot prevent leaking "frontend" entities like modules through reexports. (The exact rules implemented as part of RFC 1560 are slightly more complex than "pub use Private; is an error", because they need to work with pub(restricted), globs and imports in several namespaces).


I think that (as I explained in the internals thread) we do need some checking for associated types in impls, at least.

I don't think associated types need any special treatment (modulo linking, see below) because they are equivalent to impl Trait from privacy and stability point of view when used in generic context. Suppose we have a private associated type:

trait PubTrait {
    type A: Bounds;
}
impl PubTrait for PubType {
    type A = PrivType;
}

When this type is used in concrete, non-generic context

let a: PubType::A = ...;

we leak everything about PrivType - we can call its inherent methods, copy it several times if it's Copy, cast it to other implemented non-Bounds traits, transmute it to something, etc, etc, etc. The author of PrivType can't change anything about PrivType backward compatibly, because all the details of PrivType can be used by third parties. This is wrong, you can't name a type private if it has to adhere to such severe restrictions. That's why concrete, "normalizable" associated types need to be checked for privacy on use.

When the associated type A is used in generic context, like TypeParam::A it's equivalent to public anonymized type impl Bounds for users. impl Bounds is the only knowledge we have about PrivType when TypeParam is suddenly equal to PubType. The author of PrivType can change it in arbitrary ways and even completely purge it and provide some other implementation of his public contract impl PubTrait for PubType.


Link time visibility is very useful, but under-appreciated perspective on privacy issues. Here are some points.

  1. We want to minimize link time visibility of all crate symbols.
  2. Language privacy is the primary fuel for this minimization.
  3. Impls of public traits for private types is the tricky case.
  4. We want all the numerous #[derive(Clone, PartialEq, Debug, Hash)] impls for private types to be "internalized" and not to stick out from our crates.
  5. We still want impl Trait to work for private types at link time, so impls participating in impl Trait need to be "externalized".

The last point is relevant to private associated types in generic contexts as well, because they are so similar to impl Trait. If impls participating in Bounds for these associated types are "externalized" as if they participate in impl Bounds, then we are good and don't need any additional interface checks for associated types.

Regardless of precise details of definition site checking, I think the immediate way forward is:

  • Disable private-in-public checks in bounds/where-clauses to enable "perfect derive" (RFC 1671 talks about this)
  • Implement late type privacy checking in addition to the existing system (this closes some type inference related privacy holes and improves forward compatibility).

I found the warning at definition site useful when writing Rust wrappers for C libraries. It does catch cases where I forget to wrap or export all structs. I wouldn’t catch it otherwise, because I’m not calling any of the functions myself.

1 Like

I would expect pub use of a private type to be OK. I’d expect that to override type’s privacy, making it public under the new name (but still not allow it to be named using the old name).

I’d use it for:

struct ImplementationSpecificName;
pub use ImplementationSpecificName as GenericName;

To e.g. rename linked list to a container, since user doesn’t need to know how data is stored.

Note that "perfect derive" is a chimera. It is not actually possible in the general case, because it will overflow for recursive types. Fixing this would imply coinductive matching, which is unsound in general. Given all the things we are trying to achieve with the trait system, adding general purpose coinductive matching feels out of the question to me. (There are interesting papers on combining inductive/coinductive reasoning, but those same papers indicate that it is easy to get surprising, unsound results from such systems, and hence that they must be used with caution. I confess I don't fully understand the bounds here.) (Sorry this is kind of a jargon-y response, I can try to elaborate more later.)

I think for the purposes of derive, a more fruitful alternative is to allow people to annotate their type parameters to indicate what the requirements ought to be. For example, the derive for this type is too strict, because it requires that T: Copy in order for Foo<T>: Copy:

#[derive(Copy)]
struct Foo<'a, T: 'a> {
    x: &'a T
}

We might therefore have some kind of annotation on T to indicate that this is not needed. It's not ideal, but to be fair it also does seem like part of the public API for the type:

#[derive(Copy)]
struct Foo<'a, #[ignore(Copy)] T: 'a> {
    x: &'a T
}

Shame it's so verbose though. =(

1 Like