[Idea] About soundness in specialization

I am sure most people are aware of Maximally minimal specialization: always applicable impls. One rule it describes is that "each type parameter can only be used once". I argue that such a rule is not necessary for soundness. I explain more in this gist.

With all the above said, it might surprise some people that the nightly compiler (as of 2/3/2022) already appears to behave in the manner described above.

(quoted from here)

Not in a way that helps with soundness:

#![feature(specialization)]
#![allow(incomplete_features)]
#![allow(unused_variables)]

trait Is<T> {
    fn is(&self);
}

impl<A, B> Is<A> for B {
    default fn is(&self) {
        println!("no");
    }
}

impl<T> Is<T> for T {
    fn is(&self) {
        println!("yes");
    }
}

fn is_static_str<N: Is<&'static str>>(t: N) {
    t.is();
}

fn fully_generic_wrapper_function_without_trait_bound<N>(t: N) {
    is_static_str::<N>(t);
}

fn main() {
    let static_str: &'static str = "hey";

    let deallocated_at_end_of_main: String = "hello".to_string();
    let not_static_str: &'_ str = deallocated_at_end_of_main.as_str();

    is_static_str(static_str); // Ok, prints "yes"
    is_static_str(0u32); // Ok, prints "no"
    is_static_str(&0u32); // Ok, prints "no"
    // is_static_str(not_static_str); // Error, borrowed value does not live long enough

    fully_generic_wrapper_function_without_trait_bound(static_str); // yes
    fully_generic_wrapper_function_without_trait_bound(0u32); // no
    fully_generic_wrapper_function_without_trait_bound(&0u32); // no
    fully_generic_wrapper_function_without_trait_bound(not_static_str); // yes (no Error!)
}

And on a quick read of your proposal I don’t see how your rules can address the soundness problems from trait-bound-less wrapping functions, since the information about the specialized instances is completely lost/hidden to type-checking the call to such a wrapper function. The only (somewhat sane) sound approach I can come up with is to forbid a function like the fully_generic_wrapper_function_without_trait_bound above entirely.

We'd expect the fully generic wrapper function to always print no.

I admit I was surprised it is even possible to call a function like that with no trait bounds. Although I have been casually using rust since just before 1.0, I only learned about it recently while desperately trying to get my use case working. So I had not considered it for my proposal.

In considering it now, I agree with @Soni that I would expect the fully generic wrapper function to always print no.

I think this would be consistent with the closest current behavior shown in the Playground. I was trying to use the "trick" explained here and here. I thought that monomorphization would mean that the call to get would know the types involved enough to return Hnode<&u8, End>, but unfortunately it returns End.

I think calling functions without bounds like that is very rare and not actually very helpful. I think trait bounds should always be required to make things clear (but that is a different issue).

Having considered fully_generic_wrapper_function_without_trait_bound, I would add to my proposal that, inside such a function, the compiler should not be able to see enough to select the specialized version, so it should always print no.

EDIT: I had more time to think about this, and I now think the right way to handle the situation would be to prevent users from ever writing fn is_static_str<N: Is<&'static str>>(t: N) in the first place (fully_generic_wrapper_function_without_trait_bound should not be restricted). If the Is trait is marked with #[specialization_predicate], then users of the trait should be prevented from passing non-trivial lifetimes as input parameters, as is_static_str does (fn is_str<'a, N: Is<&'a str>>(t: N) should be fine). I believe this would avoid all lifetime-based dispatch issues.

I'd also like to point out that while I said in my gist that the compiler already appears to behave in the way I described, it seems it actually doesn't, in view of @steffahn's post above. This is because inside of fully_generic_wrapper_function_without_trait_bound, the compiler does not seem to be aware of impl<T> Is<T> for T at all, and is only considering impl<A, B> Is<A> for B. Otherwise, fully_generic_wrapper_function_without_trait_bound(not_static_str) should still fail. But if it did fail, this would also be very surprising, since the failure would be caused entirely by internal implementation details, and it is not obvious from the function signature that a non-static str would cause a compile error.

Further to my post above: while I was trying to use the "trick" explained here and here at first, I think the code I ended up with did not actually use it at all, but that should not matter. After monomorphization, I am surprised it returns End. I actually expected it to complain about multiple applicable items in scope.

Having code generation sometimes pick a less-specialized impl when a more-specialized one would apply is surprising, but not inherently unsound. Where it gets unsound is when associated types come into play. Read this post:

http://aturon.github.io/blog/2017/07/08/lifetime-dispatch/

I'd like to restate my proposal more concisely, and point out how the compiler would need to be modified based on my proposal in view of the fn fully_generic_wrapper_function_without_trait_bound<N>(t: N) example above.

Basically, my proposal is to make the type checker and trans agree on which implementation to choose, by making them both choose without considering lifetimes at all (pretend they are not there). The type checker still confirms that the lifetimes check out, but only after the initial decision of which implementation to use. In this way, it should be sound, even when associated types come into play.
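As a rough analogy for "choose the impl first, check lifetimes afterwards", here is a minimal stable-Rust sketch without specialization. It is not an implementation of the proposal; the names Describe, describe_it and demo are hypothetical and only serve the illustration. The only candidate impl is the one for &'static str, and the lifetime requirement is enforced afterwards by the borrow checker.

```rust
// Hypothetical trait, used only for this illustration.
trait Describe {
    fn describe(&self) -> &'static str;
}

// The only impl: string slices that live for 'static.
impl Describe for &'static str {
    fn describe(&self) -> &'static str {
        "impl for &'static str"
    }
}

fn describe_it<T: Describe>(value: T) -> &'static str {
    value.describe()
}

fn demo() -> &'static str {
    let static_str: &'static str = "hey";

    let owned = String::from("hello");
    let not_static_str: &str = owned.as_str();
    let _ = not_static_str;

    // The impl above is the only candidate for a `&str` argument, so the
    // call below is rejected by the borrow checker rather than reported as
    // a missing impl: `owned` would have to live for 'static.
    // describe_it(not_static_str);

    describe_it(static_str)
}
```

Calling demo() returns the string from the &'static str impl. Uncommenting the second call should produce a lifetime error, which is the kind of error the proposal would keep even in the presence of a more general default impl.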

In view that the fn fully_generic_wrapper_function_without_trait_bound<N>(t: N) example above currently prints "yes" for both static_str and not_static_str, it appears to me that within the function body, the type checker is only considering the more generic implementation for the call to is_static_str::<N>(t). So the type checker thinks "no" will be printed for both static_str and not_static_str. However, trans picks the more specific implementation and we see "yes" for both. Thus, we see the type checker and trans disagreeing, mainly because the type checker is not considering the more specific implementation.

In order to be in line with my proposal, the compiler would need to be modified so that the type checker also considers the more specific implementation inside of fn fully_generic_wrapper_function_without_trait_bound<N>(t: N), such that the error "borrowed value does not live long enough" will still be generated when not_static_str is passed.

I do note it may be surprising if the type checker fails with "borrowed value does not live long enough" when a user tries to call fully_generic_wrapper_function_without_trait_bound(not_static_str), since the function signature does not suggest any lifetime requirements. I wonder if it is possible to implement warnings to mitigate the issue. For example, a warning could be generated for the author of fn fully_generic_wrapper_function_without_trait_bound<N>(t: N) when it is detected that specialization is being used in the body. The warning could say that users of the function may be surprised by some behavior, and suggest that explicit trait bounds be provided. Or maybe using specialization in fully generic functions should not be allowed (at least in some cases).

I also just want to say I appreciate the time spent considering my above discussion. I know everyone does not have all the time in the world to spend on this, and I feel like I am far behind everyone in my understanding here. So thank you.

Note that this goes against two principles that Rust (mostly) tries to follow:

  • generic functions should never fail at instantiation-time, no C++-style template instantiation errors
  • function signatures fully reflect the interface of the function (in terms of whether or not calling the function results in a type error); the function body is irrelevant

Rust does already violate these principles to some degree, e.g.

  • associated constants in (generic) trait impls are only evaluated if they’re actually used, for each instantiation of generic arguments, and this evaluation can panic (causing compilation errors)
  • impl Trait return types and async fn infer auto-trait bounds on the return type based on the body of the function, without those being reflected in the signature
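Both existing exceptions can be reproduced on stable Rust. The sketch below is only an illustration; NonZeroSized, size_of_non_zst, make_constant and require_send are hypothetical names, not anything from the standard library or from this thread.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical helper: the associated constant is only evaluated for the
// instantiations that actually mention it.
struct NonZeroSized<T>(PhantomData<T>);

impl<T> NonZeroSized<T> {
    const CHECK: () = assert!(size_of::<T>() != 0, "T must not be zero-sized");
}

// The signature places no restriction on T, yet the body can reject
// particular choices of T when the function is instantiated.
fn size_of_non_zst<T>(_value: T) -> usize {
    let () = NonZeroSized::<T>::CHECK;
    size_of::<T>()
}

// The signature does not mention Send, but callers can still rely on it,
// because the auto trait is inferred from the function body.
fn make_constant() -> impl Fn() -> u32 {
    let captured = 5u32;
    move || captured
}

fn require_send<T: Send>(value: T) -> T {
    value
}
```

With these definitions, size_of_non_zst(0u32) builds and returns 4, while size_of_non_zst(()) is expected to fail only when that instantiation is compiled, with the assertion message from CHECK. Likewise, require_send(make_constant()) is accepted even though Send appears nowhere in the signature of make_constant.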

Still, I think we should try hard not to make it even worse.

As for specialization, can we make the type checker also recommend an impl for codegen to use? That is, an alternate scheme where the type checker picks the most specific version of the code for a specific context, and recommends it to codegen.

If codegen faces a situation where each context (read: crate) recommends a different impl, then it's an error, though we should probably avoid that somehow...

For what it's worth, I was wrong here: the current implementation is more unsound than I thought.

The fundamental design flaw in specialization, as originally designed and still implemented, is as follows: You can have a function foo<T>() where foo<&'static str>() and foo<&'not_static str>() should logically behave differently, because the function body references some impl that has a specialization for &'static str. Yet Rust erases lifetimes prior to monomorphization (and this decision is baked into the semantics of the language, not just an implementation limitation). So the compiler has to generate code for foo<&'??? str> without knowing whether the specialization should apply.
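One way to observe, on stable Rust, that a generic function cannot tell the two instantiations apart is to ask for the type name of T. This is only a small sketch of the symptom, not of how rustc erases lifetimes internally; name_of and type_names are hypothetical helpers, and std::any::type_name is documented as a best-effort diagnostic whose exact output is not guaranteed.

```rust
use std::any::type_name;

// Hypothetical helper: reports the name of T as seen by the generic body.
fn name_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

fn type_names() -> (&'static str, &'static str) {
    let static_str: &'static str = "hey";

    let owned = String::from("hello");
    let not_static_str: &str = owned.as_str();

    // T is `&'static str` in the first call and `&'a str` for some shorter
    // lifetime 'a in the second; the body of name_of has no way to
    // distinguish the two.
    (name_of(&static_str), name_of(&not_static_str))
}
```

Both elements of the tuple returned by type_names() are the same string, since lifetimes are not part of what the instantiated function sees.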

What does rustc currently do in this situation? I thought it conservatively assumed the specialization did not apply, which seemed to me like the 'most correct' behavior. As I mentioned before, this is unsound when combined with associated types, as described in the blog post I linked. But without associated types it's merely surprising.

In reality, though, rustc sometimes handles this situation by assuming the specialization does apply. (I haven't delved into rustc's code to know exactly when.) @steffahn's example in this thread is one demonstration of this. Using this, you can conjure lifetime bounds out of thin air without any use of associated types.
