Idea: Safe blocks for inside unsafe blocks

AverseABFun · January 31, 2025, 12:01am

First off, I have no clue what impact this may have or if there would be any ramifications(especially since I don't know a lot of about the internals of rustc), but I had an idea while writing some code for an OS I'm making. My idea is to allow blocks that look something like safe { /* code */ } inside unsafe blocks(so something like unsafe { /* unsafe code */ safe { /* safe code */ } /* more unsafe code */ }) as, for example, I was writing a function like this(ignore the contents, it's not super important):

/// Parses a PC Screen Font into a [PCScreenFont].
pub fn parse_pc_screen_font(data: RawPCScreenFont) -> Result<PCScreenFont, crate::Error<'static>> {
    unsafe {
        let unitable: &[&str] = &[];
        let unistr = data.glyphs.byte_add(data.bytes_per_glyph as usize*data.num_glyphs as usize);
        /* Safe block start */
        for i in 0..(data.num_glyphs as usize) {
            let char = (*unistr)[i];
            /* Didn't finish writing code here lol */
        }
        /* Safe block end */
        
        let out = PCScreenFont {
            version: data.version,
            flags: data.flags,
            height: data.height,
            width: data.width,
            glyphs: &*(core::ptr::from_raw_parts(data.glyphs as *const Glyph, data.num_glyphs as usize) as *const [Glyph]),
            unitable
        };
        Ok(out)
    }
}

And I'd like to have the for loop be safe so that I can re-enable some of the compilers checks, however having something like:

/// Parses a PC Screen Font into a [PCScreenFont].
pub fn parse_pc_screen_font(data: RawPCScreenFont) -> Result<PCScreenFont, crate::Error<'static>> {
    unsafe {
        let unitable: &[&str] = &[];
        let unistr = data.glyphs.byte_add(data.bytes_per_glyph as usize*data.num_glyphs as usize);
    }
    for i in 0..(data.num_glyphs as usize) {
        let char = (*unistr)[i];
         /* Didn't finish writing code here lol */
    }
    unsafe {
        let out = PCScreenFont {
            version: data.version,
            flags: data.flags,
            height: data.height,
            width: data.width,
            glyphs: &*(core::ptr::from_raw_parts(data.glyphs as *const Glyph, data.num_glyphs as usize) as *const [Glyph]),
            unitable
        };
        Ok(out)
    }
}

seems too obtrusive. Please let me know if I'm missing something, but here's an example code sample with my same example:

/// Parses a PC Screen Font into a [PCScreenFont].
pub fn parse_pc_screen_font(data: RawPCScreenFont) -> Result<PCScreenFont, crate::Error<'static>> {
    unsafe {
        let unitable: &[&str] = &[];
        let unistr = data.glyphs.byte_add(data.bytes_per_glyph as usize*data.num_glyphs as usize);
        safe {
            for i in 0..(data.num_glyphs as usize) {
                let char = (*unistr)[i];
                /* Didn't finish writing code here lol */
            }
        }
        
        let out = PCScreenFont {
            version: data.version,
            flags: data.flags,
            height: data.height,
            width: data.width,
            glyphs: &*(core::ptr::from_raw_parts(data.glyphs as *const Glyph, data.num_glyphs as usize) as *const [Glyph]),
            unitable
        };
        Ok(out)
    }
}

Also, having a safe block like this makes it more obvious that this code is safe code when the surrounding isn't, although I suppose that the version that'd be used right now would make it more explicit that the code at the end of the "safe block" is unsafe again. I figure this would go through the RFC process, but I wanted to gauge interest first.

AverseABFun · January 31, 2025, 12:02am

Also, another thing I forgot, with what you have to use right now, you run into scope issues:

/// Parses a PC Screen Font into a [PCScreenFont].
pub fn parse_pc_screen_font(data: RawPCScreenFont) -> Result<PCScreenFont, crate::Error<'static>> {
    unsafe {
        let unitable: &[&str] = &[];
        let unistr = data.glyphs.byte_add(data.bytes_per_glyph as usize*data.num_glyphs as usize);
    }
    for i in 0..(data.num_glyphs as usize) {
        let char = (*unistr)[i]; // error[E0425]: cannot find value `unistr` in this scope
         /* Didn't finish writing code here lol */
    }
    unsafe {
        let out = PCScreenFont {
            version: data.version,
            flags: data.flags,
            height: data.height,
            width: data.width,
            glyphs: &*(core::ptr::from_raw_parts(data.glyphs as *const Glyph, data.num_glyphs as usize) as *const [Glyph]),
            unitable // error[E0425]: cannot find value `unitable` in this scope
        };
        Ok(out)
    }
}

AverseABFun · January 31, 2025, 12:11am

Just realized, probably not the best example, the let char line requires an unsafe block lol

programmerjake · January 31, 2025, 1:21am

note that unsafe doesn't disable any of the existing checks you'd usually rely on in safe code (e.g. lifetimes, indexing out-of-bounds, etc.), all it does is allow you to also do a few extra operations you wouldn't normally be allowed to: derefencing raw pointers, calling unsafe functions, and a few others.

That said, I think having some form of safe {} blocks is a great idea that I've been wanting for some time, mostly because it'll be useful in macros, e.g.:

macro_rules! inc {
    ($a:expr) => {
        unsafe {
            // not the best example...other macros much more
            // naturally have inputs in the middle of an `unsafe` block
            let mut a: u64 = safe { $a };
            asm!("inc {}", inlateout(reg) a);
            a
        }
    };
}

cuviper · January 31, 2025, 5:29am

But then someone who uses your macro inside their own unsafe block suddenly loses that ability in the macro arg.

CAD97 · January 31, 2025, 6:02am

I know you said not to focus on the specific example, but generally I find that code can be rewritten with more precise unsafe blocks in a clean manner, e.g.

/// Parses a PC Screen Font into a [PCScreenFont].
pub fn parse_pc_screen_font(data: RawPCScreenFont) -> Result<PCScreenFont, crate::Error<'static>> {
    let num_glyphs = data.num_glyphs as usize;
    let glyphs = unsafe { ptr::slice_from_raw_parts::<Glyph>(data.glyphs.cast(), num_glyphs) };
    let unistr = unsafe {
        let size_of_glyph = data.bytes_per_glyph as usize;
        let ptr = data.glyphs.byte_add(size_of_glyph * num_glyphs);
        slice::from_raw_parts(ptr, num_glyphs)
    };

    for &char in unistr {
        /* Didn't finish writing code here lol */
    }

    let out = PCScreenFont {
        version: data.version,
        flags: data.flags,
        height: data.height,
        width: data.width,
        glyphs,
        unitable: &[],
    };
    Ok(out)
}

But a way to cancel out one "layer" of unsafe would indeed be a nice to have.

programmerjake · January 31, 2025, 6:46am

true, though allowing extraneous unsafe operations in arguments of what's supposed to be a safe macro is waay worse than the slight ergonomic hit of having to use unsafe twice. this is basically what asm! did

I also commented on needing safe blocks there: Tracking Issue for asm_goto · Issue #119364 · rust-lang/rust · GitHub

robinm · January 31, 2025, 10:41am

Instead of introducing a safe block inside unsafe, wouldn’t it be better to have a prefix unsafe. operator (notice the required . to remove ambiguity) that applies only to the operation immediately on the right?

unsafe fn unsafe_function(x: i32) { ... }
fn get_ptr() -> *const i32;

fn foo() {
    let ptr = get_ptr();

    // this would not compile
    unsafe.unsafe_function(*ptr);

    // while this would
    unsafe.unsafe_function(unsafe.*ptr);

    // and obviously, this continues to work
    unsafe { unsafe_function(*ptr); }
}

tczajka · January 31, 2025, 11:43am

What ambiguity?

I have suggested the same before but without the ..

JarredAllen · January 31, 2025, 10:04pm

I generally agree about rewriting to have the unsafe blocks be more precise (I usually enable lints to force each unsafe block to only have one unsafe operation and and to have a safety comment), but I haven't found a good way to do it with a chain of method calls. I'd love something like:

// SAFETY: why `.unsafe_baz()` is safe
unsafe {
    safe {
        // SAFETY: why `unsafe_bar` is safe
        unsafe { foo.unsafe_bar() }
        .bat()
    }
    .unsafe_baz()
}

It's still not perfect to have a rightward drift from each unsafe level, but imo this is still an improvement over the status quo.

AverseABFun · February 1, 2025, 1:05am

That does work, although imo that feels cluttered(of course, that's more a matter of opinion, but that's just my feelings lol). Especially if your code is mostly unsafe but you still have a decent amount of safe code, then it doesn't feel worth it to add unsafe blocks all over.

AverseABFun · February 1, 2025, 1:07am

Anyway, I'm not 100% sure the process for this kind of stuff but since the general consensus seems to be for adding safe blocks then in a couple of days when there's been more feedback I'll probably create an RFC for this.

AverseABFun · February 1, 2025, 2:09am

Honestly, I'm impatient and submitting my RFC now. I'll definitely make changes along the way though.

AverseABFun · February 1, 2025, 2:34am

tczajka · February 1, 2025, 3:16am

JarredAllen:

I'd love something like:

// SAFETY: why `.unsafe_baz()` is safe
unsafe {
    safe {
        // SAFETY: why `unsafe_bar` is safe
        unsafe { foo.unsafe_bar() }
        .bat()
    }
    .unsafe_baz()
}

With single-operation unsafe this would be:

foo
    // SAFETY: why `unsafe_bar` is safe
    .unsafe unsafe_bar()
    .bat()
    // SAFETY: why `unsafe_baz` is safe
    .unsafe unsafe_baz()

AverseABFun · February 1, 2025, 3:19am

I feel like single operation unsafe would be a separate thing from safe blocks.

binarycat · February 1, 2025, 3:27am

surely no one dev would be forced to use (nested) unsafe twice, though? the macro author would use unsafe and safe, then some users of the macro would use unsafe?

i guess if someone is using the macro they wrote they might need to write unsafe twice, but in that case they could just choose not to include the safe block, and just define the macro as introducing an unsafe block.

programmerjake · February 1, 2025, 9:28am

what I was referring to could look like:

// crate a
// this macro is safe to use
macro_rules! m {
    ($v:expr) => {
        unsafe { some_unsafe_op(safe { $v }) }
    };
}

// crate b, depends on crate a
unsafe fn f(v: *mut i32) {
    unsafe { // unsafe needed to deref v
        *v += 1;
        m!(unsafe { *v }); // unsafe needed again because m! has safe {}
    }
}

tczajka · February 1, 2025, 8:06pm

The point is, every example so far in favor of safe is better solved by smaller unsafe scopes (including possibly a single-operation unsafe). So what's the use case for safe blocks?

zackw · February 1, 2025, 8:26pm

programmerjake:

// crate a
// this macro is safe to use
macro_rules! m {
    ($v:expr) => {
        unsafe { some_unsafe_op(safe { $v }) }
    };
}

// crate b, depends on crate a
unsafe fn f(v: *mut i32) {
    unsafe { // unsafe needed to deref v
        *v += 1;
        m!(unsafe { *v }); // unsafe needed again because m! has safe {}
    }
}

Shouldn't this be dealt with by macro hygiene? That is, shouldn't the expansion of $v be treated as within or without an unsafe block based on the original lexical context of those tokens, in the same way that we handle identifier resolution for a macro body differently than for substitutions? Like this

/// this macro is safe to call
macro_rules! m {
    ($v:expr) => {
        unsafe { some_unsafe_op($v) }
    };
}

unsafe fn f(v: *mut i32) {
    unsafe { // unsafe needed to deref v
        *v += 1;
        m!(*v);  // macro argument can use unsafe ops
    }
}

fn g(v: &mut i32) {
    *v += 1;
    m!(*v); // macro argument *cannot* use unsafe ops
}

Topic		Replies	Views
Explicitly marking unsafe macro expressions Unsafe Code Guidelines	17	5485	May 21, 2019
Hidden unsafe due to unintentionally abusable macros and include Unsafe Code Guidelines	58	4969	December 22, 2024
Unnecessary `unsafe` block… language design	22	3131	May 18, 2022
Unsafe categories	18	1458	December 14, 2021
Suggestion: Target-focused `unsafe` call language design	20	1185	December 30, 2020

Idea: Safe blocks for inside unsafe blocks

Related topics