Half-baked idea: local deimplementations


#1

Jotting down an idea, apologies for brevity.

Motivation: floating-point equality is a massive footgun due to precision loss; you almost always want equality-with-epsilon. Thus, I’d like to prevent floats from being used where they might be tested for equality, e.g. HashMap<f64, V>.
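
To make the footgun concrete, here is a minimal sketch. The helper `approx_eq` and its `eps` parameter are hypothetical names for illustration, not an existing API, and an absolute epsilon is only one of several possible tolerance schemes:

```rust
/// Hypothetical helper: equality-with-epsilon using an absolute tolerance.
pub fn approx_eq(a: f64, b: f64, eps: f64) -> bool {
    (a - b).abs() < eps
}

/// The classic surprise: 0.1 + 0.2 is not bit-for-bit equal to 0.3.
pub fn naive_sum_matches() -> bool {
    0.1_f64 + 0.2 == 0.3
}
```

Here `naive_sum_matches()` is false, while `approx_eq(0.1 + 0.2, 0.3, 1e-9)` is true.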

Introduce $vis impl !$trait for $ty {}, where $vis != pub. For all contexts able to see the trait and this impl, the compiler pretends to be unable to prove $ty: $trait, and reports an error along the lines of “cannot prove trait implementation due to explicit local deimplementation”. Thus, I can write

crate impl !PartialEq for f32 {}
crate impl !PartialEq for f64 {}

in lib.rs and live happily ever after.

Open questions:

struct K<T: Trait> { .. }
crate impl !Trait for J {}
// downstream ...
let x: K<J> = ..; // should we allow the author of upstream
                  // to assume `K<J>` cannot exist?
// Answer: probably not, since local deimplementations mostly
// act as lints.

crate impl !Trait acts a bit like #[forbid], since child modules can’t undo it. Should there be a #[deny] + #[allow] version?

My intuition is that public deimplementations will, without a doubt, lead to exciting problems when one upstream crate depends on T: Trait, but another publicly deimplements it.


#2

This is awfully dogmatic… but moreover, PartialOrd: PartialEq, so you’ll probably regret this solution.

HashMap<f64, V> already can’t be used. That requires Eq. (though admittedly the reason f64 doesn’t implement Eq is entirely unrelated to the problem in your motivation, so another example could likely be found)


#3

Aren’t newtypes able to solve this problem?
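
Roughly what I have in mind, as a sketch only: `NoEqF64` and its methods are hypothetical names, and the wrapper simply does not implement PartialEq, so `==` on it fails to compile:

```rust
use std::ops::Add;

/// Hypothetical newtype around f64 that deliberately omits PartialEq.
#[derive(Debug, Clone, Copy)]
pub struct NoEqF64(f64);

impl NoEqF64 {
    pub fn new(x: f64) -> Self {
        NoEqF64(x)
    }

    /// Escape hatch back to the raw float.
    pub fn get(self) -> f64 {
        self.0
    }

    /// The only comparison offered: equality within an absolute tolerance.
    pub fn approx_eq(self, other: Self, eps: f64) -> bool {
        (self.0 - other.0).abs() < eps
    }
}

// Arithmetic has to be forwarded by hand, one operator at a time.
impl Add for NoEqF64 {
    type Output = NoEqF64;
    fn add(self, rhs: NoEqF64) -> NoEqF64 {
        NoEqF64(self.0 + rhs.0)
    }
}
```

With this, `NoEqF64::new(0.1) + NoEqF64::new(0.2)` can be compared to `NoEqF64::new(0.3)` only through `approx_eq`.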


#4

It seems like a much easier alternative to this is to define an optional lint in clippy, or your own custom tool-linter.

FWIW, I love the crate impl syntax from a totally personal “feels right” perspective; I’ve also occasionally wanted to be able to extend impls in a binary “leaf” crate, which is kind of related. Thinking specifically about negative crate assertions: I’ve had a case where I would have found a use for crate-local negative impls of a blanket trait impl defined by a library.

Lints also avoid this problem and can address other things if you want to be pretty dogmatic. I think dogma is worth it sometimes (like using a BigDecimal crate when correctness matters, even when f64 is “theoretically” high-precision enough for such and such a use case, hehe :sweat_smile:).

My experience with newtypes is that they are still pretty expensive ergonomically, but hopefully that’ll be resolved when we get closer to one of the delegate impl RFCs.

The combination of newtypes and lints does seem to poke holes in the motivation for crate-local impls (as described by OP), but I also suspect there are other good motivations which might pop up ^_^.


#5

FWIW, you can (ab)use the Deref<Target=Inner> trait and implement it for the newtype to get around reimplementing the delegate methods. So e.g.

struct Inner;

impl Inner { /* a bunch of methods */ }

struct Outer(Inner);

impl std::ops::Deref for Outer {
    type Target = Inner;
    fn deref(&self) -> &Self::Target { &self.0 }
}

The cost of this is that all methods of Inner are exposed, which may be rather undesirable.


#6

Well, yes, this is a dogmatic view. In some domains, such as cryptography or scientific computing, justified dogma can prevent costly mistakes. If I’m writing tests for APIs which produce floating point numbers, I’d like to disable floating point comparisons to avoid flaky tests. I will admit I forgot we already didn’t have HashMap<f64, V> due to NaN and subnormal hilarity. FWIW, it’s not clear to me that PartialOrd: PartialEq was the right choice, but that is out of the scope of what I’m bringing up.

One could imagine disabling Div for an integer type in a crypto library, since divs (which you might accidentally use in your bignum implementation) can leak key material (and as we all know, hard errors go a long way in avoiding mistakes in otherwise well-audited software), or maybe removing PartialEq for function pointers (which does not behave how you might hope in the presence of dynamic linking), or temporarily making a type uncloneable in an effort to use the compiler’s excellent error reporting to bring down unnecessary clones.

Sure, this is very lint-shaped, which is why I compared it to #[forbid]. Is Clippy smart enough to be able to tell that, if I want to forbid f64: PartialEq, I also don’t get [f64]: PartialEq?


#7

Clippy, or your own linter, is a rustc compiler plugin (see how Clippy works). That means you can write a lint that has access to both the AST and the types after type checking, which I believe is enough information to forbid implicit uses of f64: PartialEq such as [f64]: PartialEq.


#8

See, what I’m suspicious of is the claim that it is “justified.” There are all kinds of places where exact equality of floating point numbers is the only correct solution.


Right now, I am in the middle of implementing a rather complicated mathematical function full of tricubic and bicubic splines, i.e. the kind of place where you would expect to see exorbitant amounts of numerical roundoff. But even in a place like this, exact equality of floats can be important. The function is chock full of switching functions, which take the form

             0     if x <= xmin
f(x)  =  {  g(x)   if xmin < x < xmax
             1     if xmax <= x

where g(x) is a function that transitions smoothly from 0 to 1 as
x goes from xmin to xmax
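
In code, a switching function of this shape might look like the sketch below. The choice of g is purely an assumption for illustration (a cubic "smoothstep" polynomial); the real g is not shown here, and the name `switching` is hypothetical:

```rust
/// Sketch of a switching function: 0 at or below xmin, 1 at or above xmax,
/// and a smooth transition in between.
/// Assumption: g is the cubic smoothstep 3t^2 - 2t^3 on the rescaled variable t.
pub fn switching(x: f64, xmin: f64, xmax: f64) -> f64 {
    if x <= xmin {
        0.0
    } else if x >= xmax {
        1.0
    } else {
        // Rescale x to t in (0, 1) before applying g.
        let t = (x - xmin) / (xmax - xmin);
        t * t * (3.0 - 2.0 * t)
    }
}
```

Outside the switching region the result is exactly 0.0 or 1.0, with no roundoff at all, which is why so many downstream quantities end up as exact integers.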

For any stable molecule, these numbers are never in the switching region. And as a result, there’s an awful lot of things in the function that almost always take on the value of an integer… including the parameters to those tricubic splines I mentioned earlier.

Tricubic splines are produced by fitting curves to a set of known values at integer points, so when implementing this code, it is extremely useful to have a fast path in the splines for checking if the point is an integer (in which case we don’t need to evaluate a polynomial, and can simply return the known value).

The way that I do this:

impl TricubicSpline {
    fn evaluate(&self, point: V3<f64>) -> (EvalKind, (f64, V3<f64>)) {
        let indices = point.map(|x| x as usize);

        if point == indices.map(|x| x as f64) {
            // Fast path (i, j, and k are all integers)
        
            unimplemented!() // look up the known values
        } else {
            // Slow path (fractional point)

            unimplemented!() // evaluate a cubic polynomial in 3 variables
        }
    }
}

minimizes roundoff error, while the method used by people following dogma:

if (fabs(Nij-floor(Nij)) < TOL && fabs(Nji-floor(Nji)) < TOL
    && fabs(Nijconj-floor(Nijconj)) < TOL) {
    // fast path
} else {
    // slow path
}

needlessly introduces small discontinuities into the value and derivatives of the computed spline.
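
A stripped-down 1D illustration of the difference, as a sketch only: it uses linear interpolation as a stand-in for the cubic fit, assumes x is non-negative with x + 1 inside the table whenever the slow path is taken, and all names are hypothetical:

```rust
/// Stand-in for the spline polynomial: linear interpolation between the
/// known values at integer points i and i + 1.
fn interpolate(values: &[f64], i: usize, t: f64) -> f64 {
    values[i] * (1.0 - t) + values[i + 1] * t
}

/// Fast path taken only when x is exactly an integer.
pub fn eval_exact(values: &[f64], x: f64) -> f64 {
    let i = x as usize;
    if x == i as f64 {
        values[i]
    } else {
        interpolate(values, i, x - i as f64)
    }
}

/// Fast path taken whenever x is within `tol` above an integer.
pub fn eval_tol(values: &[f64], x: f64, tol: f64) -> f64 {
    let i = x.floor() as usize;
    if (x - x.floor()).abs() < tol {
        values[i]
    } else {
        interpolate(values, i, x - i as f64)
    }
}
```

With values `[0.0, 10.0, 20.0]`, `eval_tol` returns exactly 10.0 for every x in [1, 1 + tol) and then jumps to the interpolated value, whereas `eval_exact` agrees with the interpolant everywhere and only skips the arithmetic at x = 1.0 itself.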


#9

I’m not sure how you read what I said as some kind of universal truth that applies everywhere. See: “almost always”. Moreover, the problematic comparisons are often those involving a high-magnitude exponent or many repeated additions or multiplications with non-integral floats, performed across different processors (see: parallel computations or on opposite ends of a connection). Maybe you can make a case that you want to use float equality; that’s not for me to judge.

For what it’s worth, you can determine if x: f64 is integral via x.fract().to_bits() == 0, but I think that’s needless hair splitting.
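
Spelled out as a small sketch (the function name `is_integral` is just for illustration):

```rust
/// Returns true when x has no fractional part, without using float `==`.
/// `fract()` is x minus its truncation, which is +0.0 for any finite
/// integral x, and +0.0 has an all-zero bit pattern.
/// NaN and the infinities yield a NaN fractional part, so they report false.
pub fn is_integral(x: f64) -> bool {
    x.fract().to_bits() == 0
}
```

So `is_integral(3.0)` and `is_integral(-2.0)` hold, while `is_integral(2.5)` and `is_integral(f64::NAN)` do not.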


#10

Okay, I won’t argue this further.

Just an aside, I wrote it that way because the indices are needed by code I left out of the snippet.

Edit: Never mind, I see why you said that.