Hidden unsafe due to unintentionally abusable macros and include!

Disagree - the author intended the macro to be safe to use, since they didn't expect new unsafe code to be insertable. They wouldn't have marked the macro as unsafe even if that were a supported feature, because they never intended it to be used that way. If the language's default were, as I'm proposing, that a new unsafe block is required here, then the macro would have been safe:

alloc_pages!(unsafe { *(0x41414141 as *mut usize) } );

Agree with the general sentiment here. I've previously raised that using unsafe in a macro is difficult to reason about because of things like overloading basic traits like arithmetic and comparisons. Generics is an interesting point to consider too.

1 Like

I'm not disagreeing with the proposal to be able to mark macros as unsafe, like we can do with functions; I think this is a good idea too.

Specifically, I think it's worth recognizing that expression arguments are not just traditional function inputs that can potentially tweak the function's code to be unsound; they are standalone code, and as code they should require their own unsafe block if they want to use the unsafe dialect.

1 Like

I would have in byteorder. And if I didn't, I would consider it an oversight. It would be unsound and thus a bug to fix.

1 Like

Another workaround I use is to include unsafe_ in the name of the macro, so for example the byteorder macro would be called unsafe_write_num_bytes. I find this especially useful when I want to use macros in different modules.

1 Like

Let's take it back to the mmap example. The point is that this is only unsound because you can insert new unsafe code into the $length expression. The right fix is to evaluate the expression outside the unsafe block, so the macro behaves as the author intended - not to mark the whole macro as unsafe:

#[macro_export]
macro_rules! alloc_pages {
    ($length:expr) => ({
        // Evaluate the caller's expression outside the unsafe block.
        let x = $length;
        unsafe {
            mmap(0 as *mut c_void, x, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0)
        }
    });
}
12 Likes

Oh, I've missed that detail before. I see your point now.

1 Like

Yes, that's a good idea. Done: impl: prefix unsafe macros with 'unsafe_' · BurntSushi/byteorder@5d9d038 · GitHub

4 Likes

Ahhhh, I see now. Yes, I also missed your point. So the problem is the scope of unsafe itself being applied to the expansion of the macro. I agree that in such a case, it wouldn't make sense to mark the macro as unsafe. In byteorder's case, it's not as easy to make the macro safe for all inputs as it is with your alloc_pages macro.

I'm not quite sure what to do about it though. This might just be one of those things that you have to be careful about when opening an unsafe block inside of a macro.

2 Likes

There are ways safety hygiene could be implemented. Maybe it's only a macros 2.0 thing. Maybe editions could be used (though I think it would have to be based on the caller's edition, not the macro's crate edition). CAD97's response was a reasonable starting point, IMO.

Should this be forked into a new topic? We're at >30 replies now, most of which have been based on misunderstanding the OP. I think it would be nice to start with a fresh topic focused on macro safety hygiene and exploring possible ways it could be implemented without breaking the universe.

1 Like

Regardless of what other steps we take, implementing a lint (in Clippy?) as suggested by toc above would be useful. While a lint doesn't stop malicious crates from using this pattern deliberately, it would help non-malicious macro authors and code reviewers detect and work around the problem today, and would also smooth the migration to a new edition (or new syntax) where some form of hygiene is used to solve the problem.

2 Likes

Summarizing what was said above, there are two aspects to this problem that need to be solved separately:

  • For cases where the macro is supposed to be always-safe-to-use but makes use of an unsafe block, we need unsafety hygiene (or carefully pulling code outside of unsafe blocks in the macro, as with let x = $length).
  • For cases where the macro is not supposed to be always-safe-to-use, we need a notion of "unsafe macros", or to establish a convention that such macros should have a name starting with "unsafe" (or similar). This has come up before in the plutonium advisory discussion, where the question was "what does soundness mean for macros".
14 Likes

Sure, but this has nothing to do with what the OP is asking for. Whatever level of unsafety unsafe denotes, allowing it to "leak" is not great, because that defeats its purpose. This doesn't only apply if unsafe means "potential security hazard/boundary" (which it indeed doesn't); it works just fine with its actual meaning, "memory safety boundary".

Re #[link_section] is safe: it shouldn't be, because it can cause arbitrary memory corruption. The corresponding issue is officially l-unsound.

1 Like

Fantastic! If anyone else is interested in this sort of thing, I opened a tracking issue to try to put together a list of such attributes and track progress on adding similar lints:

1 Like

For the case of macros that are supposed to be unsafe to call, I don't think the unsafe_ prefix is good, because it isn't the unsafe keyword. I would prefer the macro to force the use of unsafe, for example:

macro_rules! write_num_bytes {
    (unsafe $ty:ty, $size:expr, $n:expr, $dst:expr, $which:ident) => {
        ...
    };
}

meant to be used as

write_num_bytes!(unsafe u16, 2, n, buf, to_be)

That way the unsafe keyword is present.

That's what I did in the macro of the cpp crate: cpp::cpp_class - Rust

2 Likes

Should there be a corresponding safe { ... }¹ construct that can be used inside an unsafe block to temporarily disallow unsafe operations? It would make it a lot easier to manually fix macros with this issue, at least.


¹ This is likely a bad keyword choice; that can be bikeshedded later if this idea goes anywhere.

1 Like

And a lot easier to automatically fix them, as well. (For example, “safe blocks” could be used to easily provide a suggested fix for macros that trigger the proposed lint.)

I've seen this proposed before, IIRC for the same reason... it would be a "low-tech" alternative to proper unsafety hygiene. (I could also imagine that some macro authors do not want unsafety hygiene, they might want their macro to act like an unsafe block.)

3 Likes

It seems difficult to make automatic unsafety hygiene perfect: how should tt munchers that build expressions be handled, for instance? Other forms of code generation, like proc macros and build scripts, might also benefit from an easy way to provide unsafety hygiene. One path forward could look something like this:

  1. Implement an explicitly-safe block, probably something like #[safe] { ... } to avoid a new keyword for something that'll only be used in generated code.
  2. Add a new annotation (#[hygienic_safety]?) for macro_rules that changes how argument parsing is performed. If the macro is expanded in a safe context, expr and block arguments get automatically wrapped in #[safe] { ... }. Add unsafe_expr and unsafe_block argument types that retain the old behavior.
  3. Let proc macros know whether they're being expanded in a safe or unsafe context, so they can implement similar hygiene behavior as macro_rules.
  4. At an edition boundary, apply #[hygienic_safety] to all macro definitions. Anything broken by this can be fixed by replacing expr and block arguments with their unsafe equivalents.

This means that any effects of the hygiene system show up explicitly in the macro's expansion, and #[safe] { ... } can be used to explicitly opt-in to hygiene for any code not covered by these simple rules.

I was able to produce a working Playground demo for the include! idea - gaining access to unsound macros that are not supposed to be exposed, and then abusing them to introduce unsafe behavior (accessing a global mutable static without our own unsafe block). The macro was only using ident instead of expr, so we couldn't introduce arbitrary code, but I think it's still interesting.

1 Like

Unsafety hygiene would work the same way that standard hygiene (name lookup and edition) works today.

Essentially, each token is tagged with whether it was lexically within an unsafe block when written. When an identifier that refers to an unsafe operation is used, the safety check is done not by checking whether we're currently in an unsafe block, but whether the token was tagged as lexically within one.

If there's a case where this doesn't work, I'd love to hear it. If you can break hygiene with that specification you can probably break the existing hygiene :grimacing:

5 Likes