Tackling Undefined Behaviour Casts

Gankra · June 18, 2015, 8:51pm

Currently the results of certain floating-point casts are Undefined (as in, they can cause Undefined Behaviour):

https://github.com/rust-lang/rust/issues/15536: f64 as f32 will produce an Undefined result if input cannot be represented by the output. From discussing on the #llvm irc, my understanding is that this generally means that the input is finite, but exceeds the minimum or maximum finite value of the output type. ex: 1e300f64 as f32
https://github.com/rust-lang/rust/issues/10184: f* as i/u* will produce an Undefined result if the input cannot be represented by the output when rounded to an integer (rounding towards 0, signed or unsigned as appropriate). ex: 1e10f32 as u8. Note that e.g. -10.0f32 as u8 is defined as 0.
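
To make the two conditions concrete, here is a minimal sketch of range predicates that decide whether a value is representable without performing the questionable cast at all. The helper names `fits_in_f32` and `fits_in_u8` are hypothetical, not an existing API. The u8 check is deliberately conservative: it rejects every value that truncates below 0, which is stricter than the note about -10.0f32 above.

```rust
/// Hypothetical helper: true if `x` is within the finite range of f32, or is
/// already infinite or NaN (both of which have f32 counterparts).
fn fits_in_f32(x: f64) -> bool {
    // Widening f32::MAX to f64 is exact, so the comparison is reliable.
    !x.is_finite() || x.abs() <= f32::MAX as f64
}

/// Hypothetical helper: true if `x`, truncated towards zero, lies in 0..=255.
/// NaN fails both comparisons, so it is reported as not fitting.
fn fits_in_u8(x: f32) -> bool {
    let t = x.trunc();
    t >= 0.0 && t <= 255.0
}
```

With these, `fits_in_f32(1e300)` and `fits_in_u8(1e10)` are both false, matching the two examples from the issues.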

This is an annoying wart on Rust’s current implementation, and we should fix it. Note that at least on x86_64 Linux the example f64 as f32 cast just produces inf (which is pretty reasonable IMHO), while the f32 to u8 example seems to produce completely random results (not sure if actual undefs are being made, but that seems believable).

I’m happy with these “nonsense” casts having unspecified behaviour so that we can e.g. inherit whatever the platform decides to do, as long as it doesn’t violate memory safety like the current design can. A solution that doesn’t add overhead seems ideal to me. Having to specify that e.g. 1000.0 as u8 == u8::MAX may be too cumbersome. Although note that this has a complex interaction with cross-compilation and const-evaluation.

I lack the requisite familiarity with LLVM to know what the best way forward is, though. I’d also be interested to hear if there are use cases for these casts having specified behaviour.

arielb1 · June 18, 2015, 10:08pm

This makes undefs.

huon · June 18, 2015, 10:34pm

Just to be clear, are you referring to the following?

arielb1 · June 18, 2015, 10:45pm

Yes

frankmcsherry · June 18, 2015, 11:16pm

My understanding of the Rust literature is that creating undefs is undefined behavior (it’s in the list), but that only some of the LLVM uses of undef actually lead to undefined behavior (e.g. floating point division by an undef). Is it possible to articulate which instances of undef actually lead to UB, for example “just the ones that lead LLVM to UB”, or is it more complicated than that?

kstep · June 19, 2015, 1:55am

My first thought on this is to make such things panic, as is already the case with wrapping arithmetic. I’m not a language designer, so I’m not sure I have a voice here, but as a language user it seems like reasonable and consistent behavior.

rkjnsn · June 19, 2015, 2:27am

I agree that it would make sense for them to panic in debug builds, but it is still necessary to figure out what should happen for builds without overflow checks.
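
A rough sketch of what "panic in debug builds" could look like if written by hand today. The function name is hypothetical and this is not how `as` is implemented. It only shows where the check would sit and why the question of unchecked builds remains.

```rust
/// Hypothetical helper, not a std API: checks the range only when debug
/// assertions are enabled, then performs the plain `as` cast.
fn cast_f32_to_u8_debug_checked(x: f32) -> u8 {
    // Compiled out when debug assertions are disabled (typical release builds).
    debug_assert!(
        x.trunc() >= 0.0 && x.trunc() <= 255.0,
        "{} does not fit in u8",
        x
    );
    // For out-of-range input in a build without the check, the result is
    // whatever `as` does in the compiler at hand: the open question here.
    x as u8
}
```

In-range inputs behave the same in both build modes, e.g. `cast_f32_to_u8_debug_checked(200.7)` gives 200.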

comex · June 19, 2015, 3:01am

Undefs, huh? Undefs are fun. They tend to propagate. After a few minutes of wrangling…

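#[inline(never)]
pub fn f(ary: &[u8; 5]) -> &[u8] {
    // Out-of-range float-to-int cast: 1e100 cannot be represented as a usize.
    let idx = 1e100f64 as usize;
    &ary[idx..]
}

fn main() {
    // Indexes far past the 5-element array; safe code should panic here, not segfault.
    println!("{}", f(&[1; 5])[0xdeadbeef]);
}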

segfaults on my system (latest nightly) with -O.
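
For comparison, a defensive pattern for user code is to validate the float before it ever reaches the cast. This is only a sketch: `f_checked` and its `raw` parameter are hypothetical names, and it works around the problem in one call site rather than fixing the cast itself.

```rust
/// Hypothetical guarded variant of `f`: validates the float before casting
/// and uses a non-panicking slice lookup.
pub fn f_checked(ary: &[u8; 5], raw: f64) -> Option<&[u8]> {
    // Reject NaN, negatives, and anything past the end of the array, so the
    // cast below only ever sees a small non-negative value.
    if !(raw >= 0.0 && raw <= ary.len() as f64) {
        return None;
    }
    let idx = raw as usize;
    // `get` returns None instead of panicking if the range is out of bounds.
    ary.get(idx..)
}
```

Here `f_checked(&[1; 5], 1e100)` returns `None`, while `f_checked(&[1; 5], 2.0)` returns the last three elements.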

eefriedman · June 19, 2015, 8:08pm

You can access platform-specific behavior through LLVM intrinsics; on x86, for example, you can use @llvm.x86.sse.cvttss2si and friends. A bit annoying, but workable.

There are essentially three behaviors Rust can provide: saturate, fail (either Option or panic), and platform-specific. No matter what as does, it’s probably a good idea to make all of these available as standard library functions. I would guess the right default for as is to panic in debug builds, and use platform-specific behavior in release builds. This parallels integer overflow: the performance cost of checking the conversion by default is probably too high.
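
The first two behaviors can be sketched portably as library functions. The names below are hypothetical, and mapping NaN to 0 in the saturating version is an arbitrary choice made for this sketch. The platform-specific variant is left out because it cannot be written portably.

```rust
/// Hypothetical: saturating conversion. NaN maps to 0 here, which is an
/// arbitrary choice for this sketch.
fn saturating_f64_to_u8(x: f64) -> u8 {
    if x.is_nan() || x <= 0.0 {
        0
    } else if x >= 255.0 {
        u8::MAX
    } else {
        x as u8 // strictly between 0 and 255, so the cast is well defined
    }
}

/// Hypothetical: failing conversion. Returns None instead of guessing.
fn checked_f64_to_u8(x: f64) -> Option<u8> {
    let t = x.trunc();
    if t >= 0.0 && t <= 255.0 {
        Some(t as u8)
    } else {
        None
    }
}
```

A panicking flavour falls out of the checked one, e.g. `checked_f64_to_u8(x).expect("out of range")`.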

Gankra · June 19, 2015, 9:11pm

We’ve previously established that as is an unchecked op regardless of build mode (1000u32 as u8 just truncates), so doing anything special in debug builds is almost certainly not going to happen. We do however have plans for “checked cast” variants somewhere in the std lib.
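
To illustrate the integer case: the sketch below contrasts the truncating `as` with a checked conversion. It uses the standard `TryFrom` trait, which was stabilized after this thread was written. Treating it as the “checked cast” meant here is an assumption. The function names are hypothetical.

```rust
use std::convert::TryFrom; // already in the prelude on the 2021 edition

/// `as` between integer types silently keeps the low bits.
fn truncating(x: u32) -> u8 {
    x as u8
}

/// Checked alternative: None when the value does not fit in a u8.
fn checked(x: u32) -> Option<u8> {
    u8::try_from(x).ok()
}
```

So `truncating(1000)` is 232 (1000 mod 256), while `checked(1000)` is `None`.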

