miniRFC: RPIT auto-trait "inference" instead of current "leakage"

(Here, “mRFC” means mini RFC. I’m curious about community opinions, so I wrote this idea up in a rough RFC format. I don’t necessarily support fully adopting this RFC’s ideas, but I think they’re worth discussing.)


Summary

Currently, return position impl trait (that is, -> impl Trait) “leaks” auto-trait implementation based on the concrete type returned. This mRFC proposes migrating towards inferring the auto-trait implementation solely from the function declaration, with early and specific lint warnings to guide the transition.

Guide-level explanation

Though impl Trait is very useful for restricting the client usage of a function, one tricky point comes up around auto traits, such as the Send and Sync markers. Consider the following function definition:

impl<Complex, Type> My<Complex, Type> {
    fn my_complex_future(&self) -> impl Future<Output=()> + '_ {
        /* implementation */
    }
}

For futures especially, but also for almost any type, it is useful to be able to send the returned value between threads, which requires the return type to be Send.

For this use case, under this proposal, Rust infers the auto-trait implementation status from the parameters to the function. That is, if all parameters to the function implement some auto-trait, then the return type is required to implement the auto-trait. If any parameter doesn’t implement the auto-trait, then the return type does not promise to implement it.

If the return type is inferred to implement the auto-trait but doesn’t, you will receive a warning suggesting that you add a ?AutoTrait bound to the return type, indicating that it may not implement the auto-trait. Similarly, if the return type is inferred to not implement the auto-trait but does, you will receive a warning suggesting that you add either AutoTrait or ?AutoTrait to the return type, depending on the semantics you want.

In editions 2015 and 2018, these are just warnings, and the actual auto-trait implementation of return position impl trait is inherited from the actual concrete type returned. In a future edition, these warnings may be upgraded to errors to support more deliberate auto-trait behavior for RPIT by default.
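For reference, here is a minimal sketch of the leakage as it exists today. Both functions declare the same return type, yet only one result is Send, and nothing in the signature says which. The function names and the is_send helper are made up for illustration.

```rust
use std::rc::Rc;

// Both functions have the same declared return type, `impl Fn() -> u32`.
fn make_send() -> impl Fn() -> u32 {
    let n = 7u32;
    move || n // captures a `u32`, so the hidden type is `Send`
}

fn make_not_send() -> impl Fn() -> u32 {
    let n = Rc::new(7u32);
    move || *n // captures an `Rc<u32>`, so the hidden type is not `Send`
}

// Hypothetical helper: a call only compiles when `T: Send`.
fn is_send<T: Send>(_: &T) -> bool {
    true
}

fn main() {
    assert!(is_send(&make_send()));
    // The next line is rejected by the compiler, because the `Rc` in the
    // body leaks through the opaque return type:
    // is_send(&make_not_send());
    assert_eq!(make_not_send()(), 7);
}
```

Under the proposed inference, both functions take no parameters, so both would be expected to return a Send type, and make_not_send would get the warning asking for + ?Send.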

Reference-level explanation

A lint is added to RPIT functions to enforce the above guidelines for auto-trait behavior in RPIT. Here are the combinations and sketched versions of concrete warnings that could be produced to fulfill this RFC’s intent:

fn a(_: impl  Send) -> impl Trait { /* impl  Send */ }
fn b(_: impl !Send) -> impl Trait { /* impl  Send */ }
fn c(_: impl  Send) -> impl Trait { /* impl !Send */ }
fn d(_: impl !Send) -> impl Trait { /* impl !Send */ }

fn e(_: impl  Send) -> impl Trait +  Send { /* impl  Send */ }
fn f(_: impl !Send) -> impl Trait +  Send { /* impl  Send */ }
fn g(_: impl  Send) -> impl Trait + ?Send { /* impl  Send */ }
fn h(_: impl !Send) -> impl Trait + ?Send { /* impl  Send */ }
fn j(_: impl  Send) -> impl Trait +  Send { /* impl !Send */ }
fn k(_: impl !Send) -> impl Trait +  Send { /* impl !Send */ }
fn m(_: impl  Send) -> impl Trait + ?Send { /* impl !Send */ }
fn o(_: impl !Send) -> impl Trait + ?Send { /* impl !Send */ }

// Expressed in an ad-hoc syntax
fn one<T: ?Send>(_: T) -> impl Trait + «if (T: Send) { Send } else { ?Send }»;
fn two<T: ?Send, U: ?Send>(_: T, _: U) -> impl Trait + «if (T: Send) && (U: Send) { Send } else { ?Send }»;

warning: fn b should have a `!Send` return type, but `ReturnType: Send`
suggestion: Add a `?Send` bound to not provide the auto-trait
suggestion: Add a `Send` bound to promise the auto-trait

warning: fn c should have a `Send` return type, but `ReturnType` is not `Send`
help: (trace as to why)
suggestion: Add a `?Send` bound to not provide the auto-trait

error: fn j promises a `Send` return type, but `ReturnType` is not `Send`

error: fn k promises a `Send` return type, but `ReturnType` is not `Send`
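
The cases a through o can be condensed into one small decision function. This is only a model of the sketched lint, not compiler code; Bound, Outcome and check are names invented here, and the three inputs are reduced to booleans for the single auto trait Send.

```rust
/// What the signature says about `Send` on the return type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Bound {
    Unspecified, // -> impl Trait
    Send,        // -> impl Trait + Send
    MaybeSend,   // -> impl Trait + ?Send (syntax proposed by this mRFC)
}

/// Result of the modelled lint for one function.
#[derive(Debug, PartialEq)]
enum Outcome {
    Ok,
    WarnUnexpectedlySend,    // inferred `!Send`, but the body is `Send`
    WarnUnexpectedlyNotSend, // inferred `Send`, but the body is not
    Error,                   // explicit `Send` promise is broken
}

/// `params_send`: do all parameters implement `Send`?
/// `body_send`: does the concrete returned type implement `Send`?
fn check(params_send: bool, bound: Bound, body_send: bool) -> Outcome {
    match (bound, params_send, body_send) {
        (Bound::Send, _, false) => Outcome::Error,         // j, k
        (Bound::Send, _, true) => Outcome::Ok,             // e, f
        (Bound::MaybeSend, _, _) => Outcome::Ok,           // g, h, m, o
        (Bound::Unspecified, true, true) => Outcome::Ok,   // a
        (Bound::Unspecified, false, false) => Outcome::Ok, // d
        (Bound::Unspecified, false, true) => Outcome::WarnUnexpectedlySend, // b
        (Bound::Unspecified, true, false) => Outcome::WarnUnexpectedlyNotSend, // c
    }
}

fn main() {
    println!("b: {:?}", check(false, Bound::Unspecified, true));
    println!("c: {:?}", check(true, Bound::Unspecified, false));
    println!("j: {:?}", check(true, Bound::Send, false));
}
```

An explicit bound always silences the warnings; only a broken Send promise is an error, which matches what the compiler already does today for j and k.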

Drawbacks

  • We change stable behavior (RPIT auto-trait leakage) for a clarity benefit, introducing an edition deprecation warning
  • This puts a strong requirement on -> impl Future authors to include + ?Unpin for self-referential futures
  • async fn desugaring needs to add a + ?Unpin bound
  • “Initialization pattern” for -> impl Future where initialization includes adding an auto trait to a parameter now would require a named return type to return a future implementing that auto trait
  • Cases such as the above where the return type does not implement auto traits conditional on the sum of all parameters, but rather just some, require the more explicit named type syntax (the author suggests that the case where this RFC’s inference applies is the common case)

Rationale

This idea came up when discussing async and Send bounds. async fn in traits would not be able to leak the Send implementation the way non-trait async fn does in the current unstable implementation. By inferring RPIT auto-traits from the function declaration rather than the body, this problem goes away.

The warnings are structured such that actual stable behavior is not broken, and most cases should “just work”. In the cases where this model and the actual behavior do not line up, a warning is given, and could potentially be upgraded to an error in the future.

Alternatives

  • Just don’t do this and stick with “leakage” as it exists today, and find some reasonable behavior for RPIT in traits other than this
  • Just do this for RPIT in traits, leave non-trait RPIT alone
  • Just do this for async fn, leave RPIT alone
  • Come up with an alternative inference scheme
  • Fully descriptive inline auto trait implementation description

Prior art

This inference behaves similarly to lifetime inference, specifically in async fn, as well as the regular auto-trait inference in structures. The inference is the same as if the function just returned a structure holding all of the arguments.
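
The “structure holding all of the arguments” comparison can be checked on stable Rust today. The sketch below uses a hypothetical Args struct as the stand-in and a small compile-time probe (Probe and NotSend are illustrative names, not a standard API) to report whether a concrete type is Send.

```rust
use std::marker::PhantomData;
use std::rc::Rc;

// Hypothetical stand-in for "a structure holding all of the arguments".
#[allow(dead_code)]
struct Args<A, B>(A, B);

// Compile-time probe for `T: Send`, usable on concrete types only.
// The inherent constant applies only when `T: Send`; otherwise the
// lookup falls back to the default provided by the `NotSend` trait.
#[allow(dead_code)]
struct Probe<T: ?Sized>(PhantomData<T>);

trait NotSend {
    const IS_SEND: bool = false;
}
impl<T: ?Sized> NotSend for Probe<T> {}

impl<T: ?Sized + Send> Probe<T> {
    const IS_SEND: bool = true;
}

fn main() {
    // Every field is `Send`, so the struct is `Send`.
    assert!(<Probe<Args<u8, String>>>::IS_SEND);
    // A single non-`Send` field is enough to lose `Send`.
    assert!(!<Probe<Args<Rc<u8>, String>>>::IS_SEND);
    assert!(!<Probe<Args<u8, Rc<u8>>>>::IS_SEND);
}
```

This is the same “all parameters or nothing” rule that the proposal applies to the return type.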

Unresolved questions

  • When should this be a hard error, and when should it be a warning?
  • How practical is the lint in cases where the parameters conditionally implement auto-traits? Can this have a reasonable warning?

Future possibilities

  • Upgrading warnings to a hard error in a future edition (explicitly not part of this RFC)

Addendum:

I have no idea what behavior type Existential = impl Trait should have in this case. This basically will require some form of impl Trait for struct members to make wrappers around unnamed types with complex auto-trait implementation feasible, but I don’t know which direction they should go on auto trait inference.

It seems to me that internally, auto trait “leakage” is good and error messages can be clear enough. It’s where it crosses a semver boundary that greater control really is wanted.

That makes me feel that (in this system) type Existential = impl Trait should unconditionally not promise any auto trait, and syntax somewhere around the form

pub struct S {
    inner: impl Trait,
}

should not promise auto traits on S either, but allow for safe impl AutoTrait even when unsafe trait AutoTrait if and only if the auto trait could be inferred for the generic parameterization provided on the impl. This way, it serves as a documentation and a commitment to the auto trait presence, though it weakens the “everything infers auto traits” benefit of auto traits to non-core auto traits.

That’s the main downside of any manual auto trait manipulation (and even 3rd party auto trait definition): auto traits you don’t know about may have the wrong behavior.

This doesn’t appear to give any consideration to generic type parameters which are ?Send:

#[derive(Clone)]
struct Wrapper<T>(T);

fn foo<T: Clone>(x: T) -> impl Clone { Wrapper(x) }

The output of foo here currently implements Send if and only if T implements Send. My understanding is that this was the major motivation for auto-trait leakage.
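
A runnable version of that observation, as a sketch: is_send is a made-up helper whose call only compiles when the argument type is Send.

```rust
use std::rc::Rc;

#[allow(dead_code)]
#[derive(Clone)]
struct Wrapper<T>(T);

fn foo<T: Clone>(x: T) -> impl Clone {
    Wrapper(x)
}

// Hypothetical helper: a call only compiles when `T: Send`.
fn is_send<T: Send>(_: &T) -> bool {
    true
}

fn main() {
    // `u8: Send`, so the opaque output is `Send` as well.
    assert!(is_send(&foo(1u8)));

    // `Rc<u8>` is not `Send`; the output is still usable as `impl Clone`,
    // but the commented-out check below would fail to compile.
    let not_send = foo(Rc::new(1u8));
    let _copy = not_send.clone();
    // is_send(&not_send);
}
```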

The intent of the description is for it to behave that way:

The intent of this description is that T: Send <-> RPIT: Send. (It’s when this mapping doesn’t line up that you would have to specify an unconditional Send or ?Send bound, or fall back to manual wrapper types.) I can’t write it down formally like @Centril probably could, but the auto-trait inference is intended to behave as if you were returning a struct holding each parameter, with conditional auto-trait implementation flowing through that struct.

Okay, it was the functions a-o that threw me off. These should have ?Send arguments as well.

…wait, no, I don’t think you can get quite the same behavior as what exists even with these rules. If a function takes <A, B>(a: A, b: B) and the output only contains A, how could you express the current behavior?

That’s where this scheme falls short and requires falling back to manual wrappers (unfortunately), as we don’t have a more expressive way to state conditional trait adherence inline. (In this way it works similarly to async fn lifetime inference: make the simple case simple and allow control otherwise.) What you gain in exchange is clarity of trait adherence.
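
As a rough sketch of what that manual wrapper fallback could look like for the <A, B> case: the function returns a named type that mentions only A, so its Send-ness depends on A alone. OnlyA, keep_first, into_inner and is_send are invented names for illustration.

```rust
use std::rc::Rc;

// Named wrapper that holds only `A`; it is `Send` exactly when `A: Send`.
#[derive(Clone)]
pub struct OnlyA<A>(A);

impl<A> OnlyA<A> {
    pub fn into_inner(self) -> A {
        self.0
    }
}

// `B` is consumed by the function but never stored in the output.
pub fn keep_first<A: Clone, B>(a: A, _b: B) -> OnlyA<A> {
    OnlyA(a)
}

// Hypothetical helper: a call only compiles when `T: Send`.
fn is_send<T: Send>(_: &T) -> bool {
    true
}

fn main() {
    // `B = Rc<u8>` is not `Send`, but the output holds only `A = u8`.
    let out = keep_first(1u8, Rc::new(2u8));
    assert!(is_send(&out));
    assert_eq!(out.into_inner(), 1);
}
```

The cost is that the return type is now a public, named type rather than an opaque impl Trait, which is exactly the trade-off discussed above.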

I added two new examples in an ad-hoc more expressive syntax to hopefully make the behavior a bit clearer.