Hidden unsafe due to unintentionally abusable macros and include

CTurt · February 23, 2021, 5:40pm

I wanted to re-raise the discussions around preventing 'hidden unsafe', meaning Rust code in the unsafe dialect that doesn’t require the direct use of the unsafe keyword nearby.

It may be tempting to dismiss these concerns because the patterns are so contrived that they would never organically appear in non-malicious code, but the angle I’m coming from is that they could be used to hide subtle backdoors in Rust code (underhanded Rust contest anyone?).

I'm also not talking about hiding vulnerabilities in safe Rust code; obviously in a security review it’s not sufficient to just grep for unsafe to find vulnerabilities since safe code can be vulnerable too, but at least you would expect to find all of the unsafe Rust by doing this... otherwise, what’s the point of the keyword?

I’ve talked about unsafe macros being the main technique in a previous thread, but it’s now closed, so I’m creating a new one so I can add new thoughts on an idea to combine this with the include built-in macro for more chaos.

Essentially, the simplest, shortest, ‘default’, way people use expr with unsafe in macros allows the unsafe block to spread, meaning that a malicious actor can use someone else's macro to hide their own unsafe statements without needing to introduce their own corresponding unsafe blocks.

A clearer example than originally shown (although there is some irony that some of the initial replies missed the hidden get_unchecked in the original example): if you naively create a mmap wrapper macro that takes a size, it allows any of the callers to hide their own unsafe code in the macro arguments without needing their own unsafe blocks:

Crate:

use libc::*;

#[macro_export]
macro_rules! alloc_pages {
    ($length:expr) => (
        unsafe {
            mmap(0 as *mut c_void, $length, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0)
        }
    );
}

Malicious code containing 'hidden unsafe’ (dereferencing an arbitrary pointer):

fn main() {
    // Intended use of library
    let p = alloc_pages!(0x4000);
    println!("{:p}", p);
    
    // Unsafely dereferencing raw pointer without using unsafe keyword!
    let p2 = alloc_pages!(*(0x41414141 as *mut usize));
    println!("{:p}", p2);
}

Again, this violates the idea that you can simply grep for unsafe to find all of the unsafe Rust in a project. Not only that, but it was pointed out that more advanced tools like Cargo-Geiger and even #![forbid(unsafe_code)] do not detect/forbid the hidden unsafe dereference. I can’t understand how anyone wouldn’t consider this a bug, but even assuming the possibility of the existence of ‘legitimate’ use-cases of this as a feature, should they really be prioritised over violating the ability to easily search for unsafe Rust? I think a breaking change is warranted.

I would like for that example to no longer compile, without an additional unsafe block:

alloc_pages!(unsafe { *(0x41414141 as *mut usize) } );

I believe there would be real appeal to hiding backdoors this way. If you are reviewing Rust code, it would be very easy to quickly assume certain lines are safe under the rationale that an unsafe keyword would be nearby if it were doing anything sketchy. For example, a variable assignment like x = y would look extremely innocuous if there were no unsafe keyword drawing attention to it, but if it happens to be against a global variable (disabling #[warn(non_upper_case_globals)] to disguise further) and this introduces an exploitable data race condition, this could be extremely easy for a review to miss.
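To make the x = y scenario concrete, here is a minimal, self-contained sketch. The names with_raw, identity, innocuous and read_global are hypothetical and exist only for illustration; identity stands in for whatever unsafe call a real library macro might wrap. It assumes the macro pastes its argument directly inside its own unsafe block, as in the examples above.

```rust
// A mutable global, deliberately named like an ordinary local variable.
#[allow(non_upper_case_globals)]
static mut x: u32 = 0;

// Hypothetical stand-in for an unsafe library call (for example an FFI function).
unsafe fn identity(v: u32) -> u32 {
    v
}

// Hypothetical library macro: the argument is expanded *inside* the unsafe block.
macro_rules! with_raw {
    ($value:expr) => {
        unsafe { identity($value) }
    };
}

fn innocuous(y: u32) -> u32 {
    // No `unsafe` keyword appears in this function, yet the block passed as the
    // argument writes to a mutable global once the macro is expanded.
    with_raw!({ x = y; y })
}

fn read_global() -> u32 {
    // Outside the macro, touching the global needs a visible unsafe block.
    unsafe { x }
}

fn main() {
    innocuous(7);
    println!("{}", read_global());
}
```

A grep for unsafe in the calling code would only find read_global, not the write in innocuous.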

It’s particularly annoying because Rust macros are supposed to be more sophisticated than in C where expressions are just ‘pasted’ and hope for the best. For example, Rust macros do solve the common C mistake where pasting an expression could lead to unintended order of operations; with an input like 1 + 2 a macro that does input * 2 will always get 6, as opposed to 5 like it may be in an equivalent C macro. Why can’t macros also be smarter against pasting code into unsafe blocks by default, essentially creating unintended implicit unsafe blocks everywhere?
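As a small illustration of the order-of-operations point, the sketch below uses a hypothetical macro named double. Because the argument is captured as an expr fragment, it is substituted as a single expression rather than as raw tokens.

```rust
// Hypothetical macro, named `double` for illustration only.
macro_rules! double {
    ($input:expr) => {
        $input * 2
    };
}

fn main() {
    // The argument is parsed as one expression before substitution, so this
    // behaves like (1 + 2) * 2 rather than 1 + 2 * 2.
    let n = double!(1 + 2);
    assert_eq!(n, 6);
    println!("{}", n);
}
```

The equivalent C macro, #define DOUBLE(x) x * 2, would expand DOUBLE(1 + 2) to 1 + 2 * 2, which is 5.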

I wanted to add a handful of real examples to back up that people do indeed write macros this way. I don’t blame these authors; I think the language is at fault here:

github.com/BurntSushi/byteorder/blob/0ead1057d4d1ea59ad9c8e5bc35514646ef8fb83/src/lib.rs#L1908


/// [`BigEndian`]: enum.BigEndian.html
#[cfg(target_endian = "big")]
pub type NativeEndian = BigEndian;

macro_rules! write_num_bytes {
    ($ty:ty, $size:expr, $n:expr, $dst:expr, $which:ident) => {{
        assert!($size <= $dst.len());
        unsafe {
            // N.B. https://github.com/rust-lang/rust/issues/22776
            let bytes = *(&$n.$which() as *const _ as *const [u8; $size]);
            copy_nonoverlapping((&bytes).as_ptr(), $dst.as_mut_ptr(), $size);
        }
    }};
}

macro_rules! read_slice {
    ($src:expr, $dst:expr, $size:expr, $which:ident) => {{
        assert_eq!($src.len(), $size * $dst.len());

        unsafe {
            copy_nonoverlapping(

github.com/cryptocorrosion/cryptocorrosion/blob/1eb8f4b6879cc41b130f1afe62c648b4ac1f3328/utils-simd/ppv-lite86/src/x86_64/sse2.rs#L783


        unimplemented!()
    }
}

macro_rules! swapi {
    ($x:expr, $i:expr, $k:expr) => {
        unsafe {
            const K: u8 = $k;
            let k = _mm_set1_epi8(K as i8);
            u128x1_sse2::new(_mm_or_si128(
                _mm_srli_epi16(_mm_and_si128($x.x, k), $i),
                _mm_and_si128(_mm_slli_epi16($x.x, $i), k),
            ))
        }
    };
}
#[inline(always)]
fn swap16_s2(x: __m128i) -> __m128i {
    unsafe { _mm_shufflehi_epi16(_mm_shufflelo_epi16(x, 0b1011_0001), 0b1011_0001) }
}
impl<S4, NI> Swap64 for u128x1_sse2<YesS3, S4, NI> {

github.com/rust-random/rand/blob/736a6e06ce4f17a4935f53bfc93c0da9b1336f79/rand_core/src/impls.rs#L77


}

macro_rules! impl_uint_from_fill {
    ($rng:expr, $ty:ty, $N:expr) => ({
        debug_assert!($N == size_of::<$ty>());

        let mut int: $ty = 0;
        unsafe {
            let ptr = &mut int as *mut $ty as *mut u8;
            let slice = slice::from_raw_parts_mut(ptr, $N);
            $rng.fill_bytes(slice);
        }
        int
    });
}

macro_rules! fill_via_chunks {
    ($src:expr, $dst:expr, $ty:ty, $size:expr) => ({
        let chunk_size_u8 = min($src.len() * $size, $dst.len());
        let chunk_size = (chunk_size_u8 + $size - 1) / $size;
        if cfg!(target_endian="little") {

These are all protected at least against external code introducing unsafe since they are not exposed via #[macro_export].

Anyway, what I wanted to add to the discussion today is that the include macro has the potential to copy these macros into a context where they can be abused. The idea is that an attacker could include some ‘utils’ file from another library, which is an unusual and bad practice but not one that would look overly suspicious in a security review. They would then be able to abuse these macros, which were originally thought to be safe from abuse because they are not exported, to introduce their own hidden unsafe. Again, if the unsafe code they introduce is as simple as a variable assignment that happens to be on a global, would you really be confident that a reviewer would spot that without an unsafe keyword nearby?

Fortunately(?) I couldn’t get the include macro to work against source files that happen to contain documentation, which covers the 3 examples I listed above - but this definitely shouldn’t be considered as any kind of durable mitigation, and I think the idea still has potential (and we should consider solutions). Playground:

use rand_core; // 0.6.2
include!("/playground/.cargo/registry/src/github.com-1ecc6299db9ec823/rand_core-0.6.2/src/impls.rs");

(In the context of a malicious library, Cargo thankfully standardizes library directory structure, and so we can just use a predictable relative path to another library "../../rand_core-0.6.2/src/impls.rs" instead).

error[E0753]: expected outer doc comment
 --> src/../.cargo/registry/src/github.com-1ecc6299db9ec823/rand_core-0.6.2/src/impls.rs:9:1
  |
9 | //! Helper functions for implementing `RngCore` functions.
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: inner doc comments like this (starting with `//!` or `/*!`) can only appear before items

There may be a bunch of other unscrupulous opportunities for include to activate potentially unreviewed code such as a non-code file like the LICENSE that obviously won’t have been reviewed as Rust code. What would be truly evil would be to have a documentation file “Unsafe guidelines” that contains examples of unsafe Rust, but is crafted to be a fully valid Rust source file and can be subtly included and abused deep in some ‘util’ macro in another crate.

UPDATE: I managed to find a crate that doesn't have documentation, so it isn't affected by the above compiler error. I was able to produce a working Playground demo for the include! idea: gaining access to unsound macros that are not supposed to be exposed, and then abusing them to introduce unsafe behavior (accessing a mutable global without our own unsafe block). This macro was just using ident instead of expr, so we can't introduce arbitrary code, but I think it's still interesting.

RustyYato · February 23, 2021, 5:56pm

~~Sticking the include into a module will subvert outer doc comments,~~

mod dummy {
    include!("sus file path");
}

edit: nvm, this doesn't work

mjbshaw · February 23, 2021, 6:04pm

If I'm understanding this correctly, is this basically "unsafe hygiene" for macros?

CTurt · February 23, 2021, 6:09pm

Unfortunately it seems to give the same error. Playground.

CTurt · February 23, 2021, 6:29pm

I would like the mmap example (playground) to not compile, because it contains unsafe code (dereferencing a raw pointer) without an unsafe block. This is an unintentional abuse of the macro (which could be someone else's code in a different crate).

toc · February 23, 2021, 6:53pm

Would it be enough (or at least a start) to lint/warn against macro expansion inside unsafe blocks? Assuming the macro is being written in good faith it should be written:

#[macro_export]
macro_rules! alloc_pages {
    ($length:expr) => ({
        let n = $length;
        unsafe {
            mmap(0 as *mut c_void, n, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0)
        }
    });
}

As far as I know there's no easy way (at macro expansion time) to detect that *(0x41414141 as *mut usize) is unsafe, because the macro could be interpreting that token stream to mean anything. I would probably most prefer to write the example macro as:

#[macro_export]
macro_rules! alloc_pages {
    ($length:expr) => ({
        mmap(0 as *mut c_void, $length, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0)
    });
}

To require invoking it as unsafe { alloc_pages!(*(0x41414141 as *mut usize)) };. But I understand that may not apply to more complex usage.
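A self-contained sketch of the first rewrite above (binding the argument to a local before entering the unsafe block) might look like the following. To avoid depending on libc, raw_alloc is a hypothetical stand-in for mmap that just returns its argument, and alloc_pages_hygienic is likewise an illustrative name, not part of any real crate.

```rust
// Hypothetical stand-in for an unsafe FFI call such as `mmap`.
unsafe fn raw_alloc(len: usize) -> usize {
    len
}

macro_rules! alloc_pages_hygienic {
    ($length:expr) => {{
        // The argument is evaluated here, outside the unsafe block, so any
        // unsafe operation inside it is checked in a safe context.
        let n: usize = $length;
        unsafe { raw_alloc(n) }
    }};
}

fn main() {
    // Intended use still works unchanged.
    let p = alloc_pages_hygienic!(0x4000);
    println!("{:#x}", p);

    let value = 0x2000usize;
    let ptr = &value as *const usize;

    // This line would be rejected with error[E0133], because the raw pointer
    // dereference is no longer swallowed by the macro's unsafe block:
    // let q = alloc_pages_hygienic!(*ptr);

    // The caller has to write the unsafe block themselves.
    let q = alloc_pages_hygienic!(unsafe { *ptr });
    println!("{:#x}", q);
}
```

This relies on every macro author remembering the pattern, so it is a convention rather than a guarantee from the language.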

CTurt · February 23, 2021, 7:09pm

Slight nit: invoking the macro itself isn't/shouldn't be the unsafe thing here; it's specific to the expression passed as the argument. So, to reduce the size of the unsafe block and be most explicit, this would be the optimal use in my opinion:

alloc_pages!(unsafe { *(0x41414141 as *mut usize) } );

atagunov · February 23, 2021, 7:28pm

Is it the right answer to

- color code blocks inside unsafe
- carry over code color when substituting values inside a macro
- not allow un-colored code inside unsafe after macro expansion

and to opt in to this check via a rustc flag initially?

bascule · February 23, 2021, 7:42pm

This is something that bothers me as well, particularly in the context of #![forbid(unsafe_code)], which still allows crate-local unsafe code via these "escape hatches", and therefore provides a misleading sense of what it actually does.

See also unsafe attributes which work in a #![forbid(unsafe_code)] context, e.g. #[no_mangle] (and there are many more than that).

Perhaps at an edition boundary #![forbid(unsafe_code)] could be changed to disallow all of these as well.

Or perhaps things could be taken a step further: at an edition boundary, #![allow(unsafe_code)] could be required to enable them, as was proposed before here:

scottmcm · February 23, 2021, 7:52pm

This specific example is fixed on nightly:

Which implies that updates to do the same for other such constructs would likely be accepted too.

CAD97 · February 23, 2021, 11:28pm

I think making macro_rules! respect hygiene with respect to unsafety is a reasonable change for an upcoming edition (or at the very least macro macros, "macros 2.0").

(And I argued against the plutonium advisory, in favor of the safe! macro (which is just aliased unsafe).)

Given that it's purely safe code to rm -rf / --no-preserve-root, though, I don't think "finding potentially malicious code with rg unsafe" is a good argument for it.

kornel · February 24, 2021, 3:21pm

You can't rely on the unsafe keyword to check safety of crates. There are many other "safe" ways to inject arbitrary code and evade the checks:

This is because unsafe is not a security boundary. It's a lint for double-checking the programmer's own assumptions, not a sandbox.

CTurt · February 24, 2021, 3:38pm

Agreed, I never claimed to rely on grepping for unsafe to be sufficient for a review. I specifically addressed that hiding vulnerabilities in safe Rust is obviously possible, but it's not relevant to this discussion.

The idea is that because Rust makes this useful distinction between safe and unsafe dialects, as a reviewer I would assume that Rust code isn't doing certain things if there is no unsafe block nearby, since the language is supposed to forbid this. For example, I would assume that a line such as x = y isn't assigning to a global variable if it's not inside an unsafe block, but using this trick it could. This concept of 'hidden unsafe' provides opportunities to create significantly harder to spot backdoors than with safe Rust.

kornel · February 24, 2021, 3:42pm

But there's lots of "hidden" unsafe everywhere. vec.push(x) contains hidden unsafe. Why shouldn't push!(vec, x) be allowed to?

CTurt · February 24, 2021, 3:44pm

The key point is that we're introducing completely new unsafe code without an unsafe block, not calling existing unsafe code that has been thoroughly reviewed. The difference between my example and yours should be clear - you can't use push! to introduce new unsafe:

alloc_pages!(*(0x41414141 as *mut usize));

push!(vec, x)

To introduce the dereference here a new unsafe block would be required, which is a good thing:

push!(vec, unsafe { *(0x41414141 as *mut usize) })

kornel · February 24, 2021, 3:47pm

Maybe Rust should support unsafe macro_rules! to let people write macros that aren't safe to call?

CTurt · February 24, 2021, 3:49pm

It absolutely is different. As a function, the code in alloc_pages is self-contained and can theoretically be reviewed for upholding safety for all inputs. As a macro, arbitrary new unsafe code can be inserted, which obviously won't be caught in a review of the crate since the code isn't there to be seen. And when reviewing the consumer of the macro, it would be very difficult for a reviewer to spot, as there's no unsafe block showing that it's using the unsafe dialect.

kornel · February 24, 2021, 3:50pm

edit: nevermind, I misunderstood the issue

mjbshaw · February 24, 2021, 3:58pm

This is about safety hygiene in macros. It seems like a reasonable request, IMO. I see it as more of an exercise in intentional unsafety. Of course there are a large number of opportunities for bad things to happen. Just because I can "safely" exec rm -rf doesn't mean that unsafe is now totally worthless. unsafe has its purposes. And I think it's worth being intentional when it comes to using unsafe.

But there's been a lot of rapid back-and-forth here. I think it's worth taking a little break to slow things down so responses can be more methodical.

burntsushi · February 24, 2021, 3:59pm

Look at the macro defined in the byteorder crate. It is not safe to call for all possible arguments. Merely, all such uses of it in the module are safe. If that macro could be defined as a function and it wasn't defined with unsafe, then we would call it unsound.

Perhaps the macro should not write the unsafe block itself and instead require the caller to do it. But this is a sub-optimal work-around.
