Postmonomorphism Asserts and Bound Hiding

While having our bounds stated up front in the type system is almost always preferable, sometimes post-monomorphism errors occur. Some of these are ones that would be impractical or even impossible to avoid.

For example, up until recently the bounds that transmute wants could not even be stated in rust. This would be a case of 'impossible'. Even more annoyingly in my opinion, it does not even use post-monomorphism errors, but instead errors on any use of generics at all, even ones that could be trivially checked pre-monomorphism, such as transmuting a reference to a type to a reference to a transparent wrapper around the type.

But as we get more const generic features, I expect it will become increasingly often the case that one merely wants to 'hide' a bound. For example, if one wishes to call two functions which wind up having the const bounds that assert (A + B) + C = X + Y and A + (B + C) = X + Y, one probably does not want to have a signature which includes both bounds: after all, either implies the other!

However, in full generality figuring out what bounds are redundant with each other is turing complete, and we probably don't want the compiler to do a fully general proof search that your bounds are sufficient.

A case where limitations of the compiler's reasoning wind up actually creating the problem of 'impossible' occurs with traits. A trait may specify what the strictest bounds a method on it is permitted to have. If the trait says the bounds are (A + B) + C = X + Y, and we want to call a function that requires A + (B + C) = X + Y we are sunk (Or are we... more on that later.)

Even without const generics, the same problem can happen by simpler limitations on the compiler's reasoning. If a method has the bound Self: Sized, and I create a transparent wrapper, forwarding that method is impossible because the compiler will not infer T: Sized from Wrapper<T>: Sized. Though I would like to see a transparent newtype wrapper that is sized without its field being sized!

While that last one I think should probably actually be fixed, the const generics case shows that this is rather fundamental.

Therefore, it may be desirable to let the users freely access post-monomorphism asserts and errors. That is to say, make it so a block of code can add extra bounds which are not 'exposed', and are only checked post-monomorphism. I recognize that such things could be easily abused. Hiding a 'debug' bound only to give people unexpected errors when they pass in a non-debug type would be annoying library behavior, to say the least.

However, on nightly this whole system, admittedly with a lot of boilerplate, can be done using unstable features. Including the more dubious uses.

I have been experimenting with it, and wish to release the general form to the community. Both to potentially use on nightly, and to seriously consider if this is actually desired. I think it should be permitted, as many valuable patterns seem to require it, but the tool can easily be abused to basically have ones types simply lie and push us back towards the error situation of C++ templates.

While I kind of want to turn it into a crate with a macro setup, as especially demonstrated by the woggler example it can get complicated, and I am rather uncertain how to make it work in an automated fashion. It is rather similar to the general case of trying to write closures, but trying to reuse the existing closure machinery without any of the intermediate types being invalid requires some extra caution. So while I may work more at trying to wrap it in a nice interface, I encourage those more practiced at such macro and closure manipulations to take their own stabs at it.

trait GivenSizedCallback<R> {
    type Target: ?Sized;
    fn go(self) -> R where Self::Target: Sized;
}

trait SizedProof<T: ?Sized> {
    fn prove<R>(&self, func: impl GivenSizedCallback<R, Target = T>) -> R;
}

impl<T: ?Sized> SizedProof<T> for ! {
    fn prove<R>(&self, _: impl GivenSizedCallback<R, Target = T>) -> R {
        *self
    }
}

impl<T> SizedProof<T> for () {
    fn prove<R>(&self, func: impl GivenSizedCallback<R, Target = T>) -> R {
        func.go()
    }
}

trait GivenDebugCallback<R> {
    type Target: ?Sized;
    fn go(self) -> R where Self::Target: std::fmt::Debug;
}

trait DebugProof<T: ?Sized> {
    fn prove<R>(&self, func: impl GivenDebugCallback<R, Target = T>) -> R;
}

impl<T: ?Sized> DebugProof<T> for ! {
    fn prove<R>(&self, _: impl GivenDebugCallback<R, Target = T>) -> R {
        *self
    }
}

impl<T: ?Sized + std::fmt::Debug> DebugProof<T> for () {
    fn prove<R>(&self, func: impl GivenDebugCallback<R, Target = T>) -> R {
        func.go()
    }
}

trait MaybeConjureProof: Sized {
    const RESULT: Option<Self>;
}

impl MaybeConjureProof for ! {
    const RESULT: Option<Self> = None;
}

impl MaybeConjureProof for () {
    const RESULT: Option<Self> = Some(());
}

const fn try_sized_proof<T: ?Sized>() -> Option<impl SizedProof<T>> {
    trait TryProof {
        type Result: SizedProof<Self> + MaybeConjureProof + Copy;
    }

    impl<T: ?Sized> TryProof for T {
        default type Result = !;
    }
    impl<T> TryProof for T {
        type Result = ();
    }
    const fn go<T: ?Sized>() -> Option<<T as TryProof>::Result> {
        <<T as TryProof>::Result as MaybeConjureProof>::RESULT
    }
    const { go::<T>() }
}

const fn try_debug_proof<T: ?Sized>() -> Option<impl DebugProof<T>> {
    trait TryProof {
        type Result: DebugProof<Self> + MaybeConjureProof + Copy;
    }

    impl<T: ?Sized> TryProof for T {
        default type Result = !;
    }
    impl<T: ?Sized + std::fmt::Debug> TryProof for T {
        type Result = ();
    }
    const fn go<T: ?Sized>() -> Option<<T as TryProof>::Result> {
        <<T as TryProof>::Result as MaybeConjureProof>::RESULT
    }
    const { go::<T>() }
}

fn debug_format_given_proof<T: ?Sized>(proof: impl DebugProof<T>, value: &T) -> String {
    struct Worker<T>(T);
    impl<'a, T: ?Sized> GivenDebugCallback<String> for Worker<&'a T> {
        type Target = T;
        fn go(self) -> String where T: std::fmt::Debug
        {
            format!("{:?}", self.0)
        }
    }
    proof.prove(Worker(value))
}

// We can even branch on it in the current nightly
fn try_debug_format<T: ?Sized>(value: &T) -> Option<String> {
    try { debug_format_given_proof(try_debug_proof()?, value) }
}

// Enabling tricks like this
fn debug_with_fallback<T: ?Sized>(value: &T) -> std::borrow::Cow<'static, str> {
    match try_debug_format(value) {
        None => std::borrow::Cow::Borrowed("{NO DEBUG}"),
        Some(v) => std::borrow::Cow::Owned(v),
    }
}

fn call_only_with_debug<T: ?Sized>(value: &T) -> String {
    let debug_proof = const { try_debug_proof::<T>().expect("Only call with debug types") };
    debug_format_given_proof(debug_proof, value)
}

trait Wobbler {
    fn widdle(&self) -> usize;
    fn woggle(self) -> usize where Self: Sized; //Wants to be dyn safe!
}

struct Wrapper<T: ?Sized>(T);

impl<T: ?Sized + Wobbler> Wobbler for Wrapper<T> {
    fn widdle(&self) -> usize {
        self.0.widdle() + 1
    }
    // This doesn't work
    /*
    fn woggle(self) -> usize where Self:Sized {
        self.0.woggle() + 1
    }
    */
    // Nor this
    /*
    fn woggle(self) -> usize where T:Sized {
        self.0.woggle() + 1
    }
    */
    // This works, and I believe may be impossible without this trick
    fn woggle(self) -> usize where Self: Sized,
    {
        struct Woggler<T>(T);
        trait WogglerHelper {
            type Target: ?Sized;
            fn woggler_helper(value: Woggler<Self>) -> usize where Self: Sized, Self::Target: Sized;
        }
        impl<T: ?Sized + Wobbler> WogglerHelper for Wrapper<T> {
            type Target = T;
            fn woggler_helper(value: Woggler<Wrapper<T>>) -> usize where T: Sized
            {
                value.0.0.woggle()
            }
        }
        impl<T: WogglerHelper> GivenSizedCallback<usize> for Woggler<T> {
            type Target = T::Target;
            fn go(self) -> usize where T::Target: Sized
            {
                T::woggler_helper(self)
            }
        }
        let proof = const { try_sized_proof::<T>().expect("How is this type sized but its contents aren't???") };
        proof.prove(Woggler(self)) + 1
    }
}

fn main() {
    struct NotDebug;
    println!(
        "Look ma, no exposed type constraint: {}",
        &*debug_with_fallback(&5)
    );
    println!("Here we fallback: {}", &*debug_with_fallback(&NotDebug));
    //println!("If this weren't commented it would fail: {}",call_only_with_debug(&NotDebug));
    println!("This works: {}", call_only_with_debug(&6));
    struct Wob(usize);
    impl Wobbler for Wob {
        fn widdle(&self) -> usize {
            self.0
        }
        fn woggle(self) -> usize {
            self.0
        }
    }
    println!("Here we use the wobbling: {}", Wob(7).widdle());
    println!("Here we use the wrapper: {}", Wrapper(Wob(8)).widdle());
    println!("Woggling works too on the wob: {}", Wob(9).woggle());
    println!("And the wrapper: {}", Wrapper(Wob(10)).woggle());
    let dyn_demo: Box<dyn Wobbler> = Box::new(Wob(11));
    println!("And we can use dyn Wobblers: {}", dyn_demo.widdle());
}

This is not sound for the same reason as why specialization is not sound: lifetimes are erased post-monomorphization, but are needed to properly check bounds. In fact, the following use of the specialization feature is unsound.

Here's how this can be exploited Rust Playground

2 Likes

Yes, current specialization is unsoundly strong, and needs to be replaced with something weaker. However, rather than the blanket impl you can always do it soundly for a long list of types instead using options like const TypeId (though that is also still unstable I think). One can, though with even more boilerplate, keep it extendable I believe, by adding additional throwaway types to each time you need to use the pattern. Or one can simply reconstruct the entire thing but only impl it for your specific types which it is sound for.

The ergonomics of doing it without specialization are even more atrocious though, and it seemed like such an ergonomics atrocity already so I left it with using specialization.

Which leaves the question: How sanctioned should this be?

Some cases seem like they should obviously work, to the point that perhaps it is a bug that they don't, like the Sized bound issue.

Stating pure mathematical laws like associativity of addition (in the correct type) is, in full generality, beyond the compiler to automate, and so can't be written off in the same way. Still, having to have (A + B) + C = X + Y and A + (B + C) = X + Y both as where clauses seems absurd.

Both of these are I think completely sound, as I believe neither Sized nor const can itself soundly rely on anything erased.

Then we get into more extreme options: specialization on this (or other things). Should code detect if a const argument is a power of two and use a better algorithm that requires such things in that case? Still sound.

Right now we always have stable access to TypeId (though last I saw that itself might have had some soundness holes as well?). This means parametricity is out the window, unless we add things like Pre-RFC: A top of the opt out hierarchy and parametricity, and even then it seems very unlikely it will be made the default.

A sound "best guess" version of try_debug_proof remains on the table, and we already have things like core::mem::needs_drop as precedent for such. But needs_drop is meant to be purely an optimization hint, and can't meaningfully return a useful value anyway since there isn't anything to reasonably do with a drop type specifically.

So how far down this slide should the language go? Which patterns 'deserve' support? TypeId seems to have been decided to be one that deserves support, and when it has a soundness bug it is tracked as a bug to be fixed. I expect plenty of uses exist for debug_with_fallback, especially if combined with a derive macro (or adding it to debug itself if it is so blessed) so that outputs like Some({NO DEBUG}) can be generated. The strict requirement of Debug on Result::expect could be changed, but should it?

I myself feel like the reverse trait implications (such as Sized, though it would also apply to things with other where clauses), the math is less clear but still overwhelmingly the right answer, and beyond that things start to get dubious.

The const associativity case could be written in my mental sketch as something like:

fn left_group()
where
    const { (A + B) + C == X },
{
    static if A + (B + C) == X {
        right_group()
    } else {
        const { unreachable!() }
    }
}

fn right_group()
where
    const { A + (B + C) == X },
{
}

The idea behind static if is that the test condition is assumed true to typecheck the primary arm, and then during monomorphization the test condition is actually evaluated and only the selected branch is monomorphized.

For the type bound case, compare also the current const_eval_select, which has a requirement that both potential bodies to run have externally identical observable behavior. It's theoretically possible that Rust could support a kind of static where T: Debug test, but the user would need to tolerate spurious false results when impls are lifetime bound dependent.

1 Like

That would be an extremely massive reduction in boilerplate compared to the current horrorshow I demonstrated. I believe (modulo the question of if T: Trait is allowed and fixing the soundness) they are exactly equivalent in power, making it possibly not even an actual formal extension?

However, I do think the Sized case really highlights that at least statically forcing a compile failure if a trait isn't met is probably also important. While the Sized case could, as I mentioned, be fixed, the actual implications of traits can be arbitrarily hidden from the compiler.

Adding compiler checked "sealed" traits could help some, if it could infer (at least in the defining crate) that if all types that implement the "sealed" trait implement another trait, then it is implied by the sealed trait, but these can get arbitrarily complex.

As an absurd example, one could encode the collatz conjecture (an open conjecture in mathematics that says every sequence of a certain type eventually hits 1) in the type (not const!) system and then have a trait that is only satisfied by collatz sequences that hit 1.

If we unsafely assert it, we risk unsoundness, since then if someone calls our function with a starting point for a collatz sequence that doesn't hit 1, anything could happen. But also I'm very confident all of those static checks in practice will work out.

More realistically, unsafe traits or even just additional safe semantic implications mean the user is allowed to add arbitrary invariants that the compiler doesn't know about, so no system of inference can always check whether or not T: Debug is implied by the current context.

// This trait should only be implemented for 'numbers' (which have Add, Sub, etc) or Strings
trait NumberOrString: Debug {
}
struct MyWeirdDynamicSystemForMakingRustBeLikePythonOrSomething<T: NumberOrString>(Box<dyn Any>)

We can always work around that with horrible custom callback traits to turn arbitrary bits of code into awkward immediately called closures, but your static if is obviously vastly superior, avoiding the boilerplate, lifetime and scope issues (I'm not even sure how you could with any reasonableness write something that in both branches initializes a variable in the outer scope and then uses it after the paths unify, I think you might have to actually just duplicate potentially massive amounts of code or start writing in continuation passing style or some nonsense like that), and so on.

What about an option to provide a proof to the compiler (for example using lean)?

1 Like

In the case of Sized, this seems one of the "specialization predicate"s mentioned in this old blogpost on specialization Maximally minimal specialization: always applicable impls · baby steps

The same blog post also talks about reverse implied bounds which you might fine interesting.

That is interesting. If I follow correctly, that has some interesting implications.

That is to say, in the case of lifetime issues, it could be that

fn print_debug<T: Debug>(v : T) {
	print!("{:?}",v)
}
fn print_debug_hidden<T>(v : T) {
	static assert where T : Debug {
		print_debug(v)
	}
}
fn print_debug_hidden_alt<T>(v : T) {
	static if where T : Debug {
		print_debug(v)
	}
	else {
		const { panic!() }
	}
}
fn force_print<T>(v : T) {
	static if where T : Debug {
		print_debug(v)
	}
	else {
		print!("{NO DEBUG}")
	}
}

For consistency, it seems that force_print and print_debug_hidden_alt should behave the same in terms of which path they pick. Backtracking seems... dubious.

However, I think print_debug_hidden could soundly behave exactly like print_debug, just with a different compile error message.

This suggests that (unless static if has backtracking and some rather interesting codegen), a static assert could soundly be stronger than static if!

I'm not sure what you want to get at with this example. It seems to me that print_debug_hidden, print_debug_hidden_alt and force_print all need to figure out whether T: Debug at codegen time, when they are monomorphized, which is equivalent to specializing on Debug and thus unsound, so they should all be disallowed.

OK, static assert is perhaps not the best way to express it, since it can't actually go inside a static if. The point I was trying to express is that postmonomorphization errors and bound hiding are actually separate things.

Unless a method for typechecking to feed into codegen is added, which it sounds like people don't want to do, having postmonomorphization errors prevented by revealed bounds is... well, arbitrary symbols for constraints could actually be used I suppose, but by definition we are talking about things we don't have the setup to check earlier on so the usefulness of such would be somewhat limited, though perhaps still a good idea on principle.

However, the absence of visible bounds can actually in principle occur anywhere, without worrying about the soundness issues.

If we are going to allow print_debug_hidden_alt, it is more convenient to the end user allow a more general one that can avoid the spurious falses of the where T : Debug. It will do the same thing in all cases where the alt version compiles, and will result in fewer cases where it fails to compile.

Perhaps a better syntax example would be:

fn print_debug_hidden<T>(v : T) 
	where hidden!(T : Debug)
	print_debug(v)
}

Where hidden! magically makes the constraint deferred as long as soundly possible (perhaps erroring when known to be impossible). Up until you try to take a true polymorphic reference to it, where T : Debug depends on the instantiation of that polymorphic variable, it stays.

Proof of soundness:

  • Outside of traits, this is equivalent to just passing the constraint around at each function
  • Inside of traits, it is equivalent to creating a new supertrait where the method has an extra constraint on it. Until it becomes a dyn object this does nothing. When turning it into a dyn object we either are in a function with the constraint such that we can have a dyn object of the original constrait, or we are in a true polymorphic reference case. Both are sound

The error messages, admittedly, might be confusingly nonlocal due to the effective non-local inference implied by passing the bounds around implicitly. Same for the semvar hazard, suddenly temporarily turning a generic T into a dyn Trait for object safe T creates an implicit bound on T that its specific T : Trait impl doesn't have a hidden lifetime dependent bound.

In some ways this is similar to the 'trait transformers' and the 'Send bound problem'. Often we wind up with things that implicitly create families of very similar constraints, and we want them to be a trait that is only written a far more limited number of times. Though here, rather than it being "is the returned future Send?" it is "does the trait method exist unconditionally, or with some extra condition?".

This well might be enough reason to not permit it, but I do find it interesting that it is sound and in many ways barely an extension over something which is almost unavoidable from any sound implementation of fairly general specialization.

Further, the issues are substantially reduced if it is forbidden in trait impls (including ones like closures), though making this work nicely with things like closures both defined and used purely in a single function would potentially be annoying.


Code for clarity:

Expanding on the trait case, it is basically the same turning this:

trait Foo {
	fn exec(self);
}
impl<T> Foo for Wobbler<T> {
	fn exec(self) where hidden!(T: Debug) {
		wiggle_the_wobble(self)
	}
}

into this:

trait Foo {
	fn exec(self);
}
trait FooRequiringWithDebug {
	type RequiresDebug
	fn exec_with_debug(self) where Self::RequiresDebug : Debug;
}
impl<T: Foo> FooRequiringWithDebug for T {
	type RequiresDebug = /* magic coherence stuff, handled because it isn't actually exposed as a type, just used as a constraint */
	fn exec_with_debug(self) where Self::RequiresDebug : Debug {
		self.exec()
	}
}
impl<T: FooRequiringWithDebug> Foo for T where <T as FooRequiringWithDebug>::RequiresDebug : Debug {
	fn exec(self) {
		self.exec_with_debug()
	}
impl<T> FooRequiringWithDebug for Wobbler<T> {
	type RequiresDebug = T
	fn exec_with_debug(self) where T : Debug {
		wiggle_the_wobble(self)
	}
}

Automatically having all the impls that should exist hits coherence issues, but these are actually coherent (given the magic within the RequiresDebug type, which is fine because that only happens when the constraint [which is the only thing it is used for] is satisfied.)

If we had first class constraint types (or first class trait types which can be turned into the latter by hanging them all on ()), we could actually do this all with associated types. Though we would want them to be covariant, and have subtyping based on if they were satisfied and it would be a huge extension.