Walking to the very edge of UB (solved, this one definitely causes UB)

This might be better on user's forum, please let me know if I should go there instead.

I'm trying to understand the unsafe coding guidelines better, and towards that end I've started writing some tests that are definitely unsafe, but may or may not cause undefined behavior. One of them can be found on the playground. The code is copied below for those that would prefer to just read.

The torture test is fairly simple; I have a struct that has an identifier value embedded within it. I create a raw pointer to this that I copy and pass into multiple threads. Each thread overwrites the identifier field with the same new identifier, without synchronizing this write with any other thread. After all threads have completed and I've joined all of them, I test that the header and data fields are unchanged, and the identifier field is identical to the new identifier.

Now, this works on my machine in both debug and release modes, and it works on the playground, again in both debug and release modes. However, that just proves that the compiler currently has a particular behavior, it doesn't prove that this is defined behavior (opposite of UB). The real question is, is this a data race? On the one hand it definitely meets the requirements to be a data race per the definition given in the nomicon, but since every write has the same data, I don't know if I've landed in undefined behavior territory or not. Any insights would be much appreciated.

EDIT

Yeah, so, 5 more minutes of twiddling and I would have triggered the UB I was looking for... turns out what I was missing was #[repr(packed)] on my struct; as soon as I added that, all kinds of UB came out (threads terminating instantly, threads never terminating, and in one case, spawning an unlimited number of threads). So, yay, I have found that this is definitely off the edge of the cliff...

The code

use rand::{distributions::Standard, prelude::*, Error, Fill};
use std::thread::{spawn, yield_now};

const HEADER_LENGTH: usize = 0usize;
const TRAILER_LENGTH: usize = 1024usize;

#[derive(Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct Example {
    // Fake header to force unaligned access on packed structures.  The contents
    // are randomly generated, and should remain unchanged throughout the tests.
    header: [u8; HEADER_LENGTH],

    // Pseudo identifier
    identifier: u128,

    // Fake trailer to force unaligned access on packed structures. The contents
    // are randomly generated, and should remain unchanged throughout the tests.
    data: [u8; TRAILER_LENGTH],
}

impl Fill for Example {
    fn try_fill<R: Rng + ?Sized>(&mut self, rng: &mut R) -> Result<(), Error> {
        for i in 0..self.header.len() {
            self.header[i] = rng.gen();
        }
        self.identifier = rng.gen();
        for i in 0..self.data.len() {
            self.data[i] = rng.gen();
        }

        Ok(())
    }
}

impl Distribution<Example> for Standard {
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> Example {
        let mut ex = Example {
            header:     [0; HEADER_LENGTH],
            identifier: 0,
            data:       [0; TRAILER_LENGTH],
        };
        ex.try_fill(rng).expect("This should never fail");
        ex
    }
}

fn unsafe_test() {
    let mut rng = rand::thread_rng();

    // First, create our test subject, and clone it as-is so we can verify that
    // all and only the bits we expected to change actually did change.
    let mut example: Example = rng.gen();
    let pristine = example.clone();

    // We're going to use a pointer to mutate the example.
    let example_pointer: *mut Example = &mut example as *mut Example;

    // Now create a new, shared identifier.
    let new_identifier: u128 = rng.gen();

    // Now things get interesting.  We're going to create numerous threads and
    // have them each try to write 'new_identifier' to shared.identifier.  **WE
    // DO NOT DO ANY SYNCHRONIZATION WHATSOEVER!** They are all writing the
    // memory location as fast as they can, and if something goes south, so be
    // it!

    let mut join_handles = Vec::new();
    for _ in 0..20 {
        // Have to cast to a usize because pointers can't cross thread
        // boundaries (rust's safety guarantees are AWESOME!  Shooting yourself
        // in the foot takes great skill here)
        let p: usize = example_pointer as usize;
        let t = spawn(move || {
            let ep: *mut Example = p as *mut Example;
            for _ in 0..200000 {
                unsafe {
                    (*ep).identifier = new_identifier;
                }

                yield_now();
            }
        });
        join_handles.push(t);
    }

    for jh in join_handles.drain(..) {
        jh.join().expect("Why'd I crash on joining???");
    }

    // We've completed the threaded write tests.  We should be able to write
    // new_identifier into 'pristine', and it should be identical to 'example'.

    // This test may fail, but it is very, very unlikely to.  Good enough for
    // the tests I'm doing.
    assert!(pristine.identifier != new_identifier);

    let mut merged: Example = pristine.clone();
    merged.identifier = new_identifier;
    assert_eq!(example, merged);
}

fn main() {
    println!("This is a test of unsafe, but possibly not undefined, behavior.");
    println!("If there aren't any crashes, then the test has likely passed.");
    for _ in 0..1000 {
        unsafe_test();
    }
    println!("If we're here, then there weren't any crashes.");
}

This is a data race, and data races are undefined behavior in Rust.

I couldn't find a definition for "data race" in any Rust book, so I'll take C's interpretation (from N1548, 5.1.2.4 "Multi-threaded executions and data races"):

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.

[...]

The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.

edit: I should have clicked your link...

Rust defines it in the link I gave above (data races).

That said, while it wasn't causing UB before, by making each of my threads yield randomly I landed in UB territory. SO... thank you for your help, this topic can be closed now...

FWIW, it was causing UB before. Data races don't depend on the values being written to memory. All data races are UB. Whether or not that UB causes undesirable program behavior is... undefined.

3 Likes

Yeah, I understand that now. I was just really hoping I'd found an exception to the rule...

The nasty thing about UB is that it does not guarantee that things will go horribly wrong. UB likes to lure you into thinking that you are fine... until it suddenly pulls the floor out from underneath of you and you plummet into the pit where all your code has been arranged in a nice set of spikes.

17 Likes

I know, UB is nasty like that. What I was hoping was that this was unsafe but not UB, but that the usafe coding guidelines were missing this information.

Btw, Miri detects the UB here. :slight_smile: (Select "Tools - Miri" in the playground):

error: Undefined Behavior: Data race detected between Write on Thread(id = 2) and Write on Thread(id = 1), memory(alloc3272,offset=0,size=16)
(current vector clock = VClock([18, 0, 6]), conflicting timestamp = VClock([0, 6]))
  --> src/main.rs:77:21
   |
77 |                     (*ep).identifier = new_identifier;
   |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Data race detected between Write on Thread(id = 2) and Write on Thread(id = 1), memory(alloc3272,offset=0,size=16)
(current vector clock = VClock([18, 0, 6]), conflicting timestamp = VClock([0, 6]))
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information

Miri does not guarantee to detect all UB, but it detects a lot of it.

18 Likes

Not using unsafe code would have saved me as well! :rofl:

Honestly, I knew I was riding the ragged edge when I wrote the code, but I wanted to test the very specific case of writing the same value to the same location in memory. As weird as it sounds, I have a specific use case where this would be handy.

Yeah, I've seen other cases in the past where that would be handy. :slight_smile: And there certainly are optimizations that break such code...

Miri could still be a useful tool to keep in mind when exploring UB in Rust. :smiley:

1 Like