Implicit widening, polymorphic indexing, and similar ideas

Admittedly not having read the entire thread, the main “gotcha” for me is that it’s unergonomic to throw in casts everywhere and it clutters the code. It also makes it harder to distinguish the cases where a cast is actually losing information.

It would be much nicer to avoid writing either “as” or “into” if the conversion is possible without data loss. I’ll go read the thread now to see some arguments against this, but just wanted to throw in my impressions as someone who uses the language casually and mainly for numerics-heavy code.
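
For example, a quick sketch of what I mean (my own example, not from the thread): widening and narrowing are both spelled with as, so the lossy direction doesn’t stand out.

fn main() {
    let small: u8 = 200;
    let widened = small as u32;          // lossless, but spelled like any other cast
    let also_widened = u32::from(small); // the lossless spelling via From is wordier
    let big: u32 = 300;
    let truncated = big as u8;           // lossy: 300 silently becomes 44
    println!("{} {} {}", widened, also_widened, truncated);
}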

So I just wrote this code in a benchmark; those casts do hurt:

use bencher::Bencher;

fn u8_runner<F: Fn(u8) -> u8>(bench: &mut Bencher, f: F) {
    // The `as usize` casts are needed just to size the array and index into it.
    let mut vs: [u8; std::u8::MAX as usize] = [0; std::u8::MAX as usize];
    for i in 0..u8::max_value() {  //.. i know
        vs[i as usize] = i;
    }
    bench.iter(|| {
        for v in vs.iter_mut() {
            *v = bencher::black_box(f(bencher::black_box(*v)));
        }
    })
}

Doing this with any amount of usefulness requires a rather complicated and fragile analysis, probably with an abstract interpreter. I really don't think that would be a good idea for Rust's type system to have. Do you have any link to something about how D does this? I'm not finding anything on their website or in DMD that indicates that this is done.

D has various parts that are not easy to find unless you’ve been a D programmer for some time. D’s Value Range Analysis works only inside a single expression. A simple example (here 155 is OK, 156 is a compile-time error):

ubyte foo1(in uint x) {
    return x % 101 + 155; // OK
}
ubyte foo2(in uint x) {
    return x % 101 + 156; // Error
}
void main() {}

See the compiler error: https://dpaste.dzfl.pl/3767fcbe05e2

The kind of Value Range Analysis I’d like for Rust is similar, but I’d like the range value to be carried between different expressions. To simplify the analysis, the value range should not be carried outside a function (for that it’s better to use contracts or static-analysis tools external to the compiler), and it should be computed only for immutable bindings (let but not let mut). When the flow analysis becomes too long inside very large functions with very high complexity, I think it’s OK to stop computing a value range after a certain amount of time.

In Rust the value range should not be used, as in D, to perform implicit type conversions (the D example above contains an implicit but lossless u32 -> u8 conversion in the foo1 function), but it should allow code like:

fn foo1(x: u32) -> u8 {
    u8::from(x % 101 + 155)
}
fn main() {}

Here from performs a lossless explicit conversion, which the value range analysis can prove never truncates.
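
To sketch the cross-expression case as well (hypothetical code: u8::from(u32) does not exist today, and the range tracking is the proposed analysis, not something rustc does):

fn foo3(x: u32) -> u8 {
    let a = x % 100;  // immutable let: range 0..=99 is carried forward
    let b = a + 120;  // range 120..=219, still fits in a u8
    u8::from(b)       // accepted only if ranges are tracked across the lets
}
fn main() {}

Without that, this has to be written with try_from or a plain as cast.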

LLVM already has these optimizations; this is straightforward inlining plus constant folding plus range analysis, if those functions existed, and it would involve no type-system modifications…

Perhaps you're missing the point. This isn't about optimization. Value Range Analysis (VRA) has various uses; one of the main ones for Rust would be to replace some "try_from" calls or some unreliable "x as Type" casts with a proved-correct "Type::from()" (and similar things, like using functions similar to slice::get_unchecked in safe Rust code where the accesses are proved to be in bounds).
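
A rough sketch of the in-bounds case (my own example; today the bounds check only disappears if the optimizer happens to prove it, and the point is to make that proof part of the language):

fn lookup(table: &[u8; 256], x: u32) -> u8 {
    let i = (x % 256) as usize; // range 0..=255: always a valid index
    // Today this is a runtime bounds check unless LLVM eliminates it; with value
    // ranges in the language, the in-bounds proof would be guaranteed in safe code.
    table[i]
}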

It's never OK for the correctness of a program to depend on how clever the implementation is, where the cleverness might vary from implementation to implementation. With features like this, where the compiler has to do some nontrivial analysis to decide whether the program is valid, the exact power of the analysis needs to be part of the specification.

So, for instance, the specification could say that value ranges are considered for correctness only within a single expression (like D) or within a single basic block (for more power) or even through the entire flow graph with the caveat that if a phi operator consumes a back-edge the range becomes unconstrained (this is probably as powerful as anyone actually wants, and is still linear in the size of the CFG). But "after an amount of time" is not acceptable.
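
For example (my own sketch, reusing the hypothetical u8::from(u32) from above), the back-edge caveat would make the analysis give up on loop-carried values:

fn count(n: u32) -> u8 {
    let mut acc = 0u32;
    for _ in 0..n {
        acc = (acc + 1) % 100; // always < 100, but the value flows through a back-edge
    }
    // Under the back-edge rule the range of `acc` is unconstrained here, so a
    // lossless u8::from(acc) would still be rejected and the code falls back to
    // try_from or `as`, even though acc never exceeds 99.
    acc as u8
}
fn main() { assert_eq!(count(250), 50); }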

It may not be obvious why this is important with the status quo where there is only one complete implementation of Rust, but hopefully that won't remain the case. Programs that compile with one implementation and not another (and aren't deliberately using an extension) are a problem - look at C. Programs that might or might not compile depending on how powerful the host CPU is (where the failure mode blames the program and not, like, "sorry, you ran out of memory") are even more of a problem.

I see, and thank you for your answer.

Is the problem you discuss about the same problem I’ve hit with type inference? https://github.com/rust-lang/rust/issues/44973

The amount of type inference a Rust compiler is able to perform is not formally specified, and sometimes it regresses, like in this case where it broke a lot of my code.

That does look like a problem under the same umbrella, if type inference within the body of a function is as weakly specified as that discussion makes it sound. I don’t think you should have closed your issue; it may be a known regression but it’s still a regression and it sounds like there may be fixes in the works.

It does seem like trait-system improvements could potentially help these cases, since there’s definitely no Into::into for u32 -> &u64. I don’t know if this is one of the cases that the chalk trait-solving work would resolve.

I’d think epochs could eventually permit us to restrict as to being non-lossy: if we added refinement types, then as could do everything it does now, but the compiler would be required to verify losslessness.

Would not compile:

some_u64 as u32
some_i32 as u32

Would compile:

(some_u64 & <u32>::max_value().into()) as u32
some_i32.abs() as u32

It’s more readable to add custom methods like some_u64.truncate_to::<u32>() or whatever, and just leave as alone, though.
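
A rough, hypothetical sketch of how such a truncate_to helper could be spelled as a user-side extension trait (these names are made up; nothing like this is in std):

trait TruncateFrom<S> {
    fn truncate_from(src: S) -> Self;
}

trait Truncate: Sized {
    fn truncate_to<T: TruncateFrom<Self>>(self) -> T {
        T::truncate_from(self)
    }
}
impl<S> Truncate for S {}

impl TruncateFrom<u64> for u32 {
    fn truncate_from(src: u64) -> Self {
        src as u32 // the one audited place where the truncation actually happens
    }
}

fn main() {
    let some_u64: u64 = 0x1_2345_6789;
    let low = some_u64.truncate_to::<u32>();
    assert_eq!(low, 0x2345_6789);
}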

I think refinement types are still the ultimate solution whatever syntax you choose, though, because often you just want the compiler to verify that nothing bad happens and to float the error all the way back out to the beginning of the computation.
