Pre-RFC: unsafe attributes

RalfJung · September 28, 2022, 10:59am

Feature Name: unsafe_attributes
Start Date: 2022-09-08
RFC PR: rust-lang/rfcs#0000
Rust Issue: rust-lang/rust#0000

Summary

Consider some attributes 'unsafe', so that they must only be used like this:

#[unsafe(no_mangle)]

Motivation

Some of our attributes, such as no_mangle, can be used to cause Undefined Behavior without any unsafe block. If this was regular code we would require them to be placed in an unsafe { ... } block, but since they are attributes that makes less sense. Hence we need a concept of 'unsafe attributes' and accompanying syntax to declare that one is aware of the UB risks here (and it might be good to add a SAFETY comment explaining why this use of the attribute is fine).

Guide-level explanation

Example explanation for no_mangle; the other attributes need something similar.

When declaring a function like this

#[no_mangle]
pub fn write(...) { ... }

this will cause Rust to generate a globally visible function with the linker/export name write. As consequence of that, other code that wants to call the C write function might end up calling this other write instead. This can easily lead to Undefined Behavior

The other write might have the wrong signature, so arguments are passed incorrectly.
The other write might not have the expected behavior of write, causing code relying on this behavior to misbehave.

To avoid this, when declaring a function no_mangle, it is important that the name of the function does not clash with other globally named functions. Similar to how unsafe { ... } blocks are used to acknowledge that this code is dangerous and needs manual checking, unsafe(no_mangle) acknowledges that no_mangle is dangerous and needs to be manually checked for correctness:

// SAFETY: there is no other global function of this name
#[unsafe(no_mangle)]
pub fn my_own_write(...) { ... }

Reference-level explanation

Some attributes (e.g. no_mangle, export_name, link_section -- see here for a more complete list) are considered "unsafe" attributes. An unsafe attribute must only be used inside unsafe(...) in the attribute declaration, like

#[unsafe(no_mangle)]

For backwards compatibility reasons, using these attributes outside of unsafe(...) is just a warning, not a hard error. Unsafe attributes that are added in the future can hard-require unsafe from the start since the backwards compatibility concern does not apply to them.

Syntactically, for each unsafe attribute attr, we now also accept unsafe(attr) anywhere that attr can be used. unsafe cannot be nested, cannot contain cfg_attr, and cannot contain any other (non-unsafe) attributes.

The deny(unsafe_code) lint denies the use of unsafe attributes both inside and outside of unsafe(...) blocks. (That lint currently has special handling to deny these attributes. Once there is a general notion of 'unsafe attributes' as proposed by this RFC, that special handling will no longer be needed.)

Drawbacks

I think if we had thought of this around Rust 1.0, then this would be rather uncontroversial. As things stand now, this proposal will cause a lot of churn since all existing uses of these unsafe attributes need to be adjusted. The warning for using unsafe attributes outside unsafe(...) should probably have an auto-fix available to help ease the transition here.

Rationale and alternatives

Nothing. We could do nothing at all, and live with the status quo. However then we will not be able to fix issues like no_mangle being unsound, which is one of the oldest open soundness issues.
Rename. We could just rename the attributes to unsafe_no_mangle etc. However that is inconsistent with how we approach unsafe on expressions, and feels much less systematic and much more ad-hoc.
deny(unsafe_code). We already started the process of rejecting these attributes when deny(unsafe_code) is used. We could say that is enough. However the RFC authors thinks that is insufficient, since only few crates use that lint, and since it is the wrong default for Rust (users have to opt-in to a soundness-critical diagnostic -- that's totally against the "safety by default" goal of Rust). This RFC says that yes, deny(unsafe_code) should deny those attributes, but we should go further and require an explicit unsafe(...) attribute block for them to be used at all.
Item-level unsafe blocks. We could find some way to have 'unsafe blocks' around entire functions or modules. However, those would go against the usual goal of keeping unsafe blocks small. Big unsafe blocks risk accidentally calling an unsafe operation in there without even realizing it.
Other syntax. Obviously we could pick a different syntax for the same concept, but this seems like the most natural marriage of the idea of unsafe blocks from regular code, and the existing attributes syntax.

Prior art

We have unsafe blocks; this is basically the same thing for the "attributes DSL".

In the attribute DSL, we already have a "nesting" construct: cfg_attr. That allows terms like #[cfg_attr(debug_assertions, deny(unsafe_code), allow(unused))], so there is precedent for having a list of attributes inside a single attribute.

I don't know of other languages that would distinguish safe and unsafe attributes.

Unresolved questions

Different lint staging. The lint on using existing unsafe attributes like no_mangle outside unsafe(...) could be staged in various ways: it could be allow-by-default to start, it could be edition-dependent, it might eventually be deny-by-default or even a hard error on some editions -- there are lots of details here, which can be determined later during the process.

Future possibilities

Unsafe attribute proc macros. We could imagine something like
```
#[proc_macro_attribute(require_unsafe)]
fn spoopy(args: TokenStream, input: TokenStream) -> TokenStream {…}
```
to declare that an attribute proc macro is unsafe to use, and must only occur as an unsafe macro. Such an unsafe-to-use attribute proc macro must declare in a comment what its safety requirements are. (This is the unsafe from unsafe fn, whereas the rest of the RFC is using the unsafe from unsafe { ... }.)
Unsafe derive. We could use #[unsafe(derive(Trait))] to derive an unsafe impl where the deriving macro itself cannot check all required safety conditions.
Unsafe tool attributes. Same as above, but for tool attributes.
Unsafe attributes on statements. For now, the only unsafe attributes we have don't make sense on the statement level. Once we do have unsafe statement attributes, we need to figure out whether inside unsafe {} blocks one still needs to also write unsafe(...).
Lists and nesting. We could specify that unsafe(...) may contain a list of arbitrary attributes (including safe ones), may be nested, and may contain cfg_attr that gets expanded appropriately. However that could make it tricky to consistently support non-builtin unsafe attributes in the future, so the RFC proposes to not do that yet. The current approach is forward-compatible with allowing lists and nesting in the future.

josh · September 28, 2022, 11:11am

Very much in favor of this proposal. This will mean people can reliably search for unsafe and actually catch everything unsafe.

I would personally suggest, to avoid massive churn, that we handle this transition smoothly using an edition:

Make sure deny(unsafe_code) and warn(unsafe_code) catch all such attributes, in any edition.
Add support for unsafe(these_attributes) in every edition.
Add an edition migration lint, with rustfix, to all editions. (Allow people to opt into that lint even when not doing an edition migration.
In Rust 2024, remove the ability to use these attributes outside of unsafe.

At least initially, I don't think we should warn about the non-unsafe-wrapped versions in previous editions; that'll produce a lot of noise. Perhaps we could warn about those by default if we're already warning or denying unsafe code anyway?

RalfJung · September 28, 2022, 11:31am

I do not have strong preference on the migration strategy, but I think eventually it should be warn-by-default on all editions and a hard error on the latest edition. So your proposal works for me. I'd be even more happy if it would eventually be a hard error on all editions but that is probably too much to ask.

FWIW that is already the status quo: the unsafe_code lint will warn/deny these attributes (the ones that have been identified and implemented yet, anyway). See this issue for more details.

I must be misunderstanding since that sounds like your first bullet point?

josh · September 28, 2022, 12:01pm

I don't think we can make it a hard error, no. We might be able to get to warn-by-default eventually, but I'd want to see how much churn that would produce.

I assumed so, but I wanted to capture all the migration requirements in one place.

No, this is different. I'm suggesting that, if we're already warning about those constructs because they're unsafe code, we could also warn that they're the obsolescent form of those constructs. (Though not with a rustfix, because presumably people want to remove them, not correct them.)

afetisov · September 28, 2022, 12:01pm

If we go that way, then it would be better to have a proper concept of unsafety for attributes, rather than have an ad-hoc hack for a couple of specific ones. For example, let's say I write a proc macro which adds #[link_section] #[no_mangle] to the function. This would mean that the unsafety is now hidden, and it can be devilishly hard to find it in the proc macro code.

We need some way to track the context of an attribute and forbid unsafety within ostensibly safe attributes. It should also be possible to run cargo-expand on the code and inspect the expanded output for unsafe attributes. However, as usual unsafety must be possible to wrap in a safe interface, including a macro interface, so there must be some way to explicitly tag safety at the attribute level.

The syntax looks a bit weird to me. It looks not like a context, but like a separate attribute with arbitrary tokens inside (which is the current behaviour of function-like attributes). Is it expected that the possible inner attributes are hardcoded? Does it accept arbitrary known attribute calls inside? The current grammar for attributes just says that it's a delimited token tree, which isn't very encouraging.

What if we added some separate syntax for unsafe attributes, e.g. #unsafe [ attr ]? The latter syntax may be problematic, because #ident is currently used by macros, e.g. quote. Maybe #[unsafe][attr]? This "double attribute" syntax doesn't seem to conflict with anything at first sight (we can't attach attributes to arrays). Maybe #[[unsafe] attr]?

RalfJung · September 28, 2022, 12:37pm

That's a completely separate question -- it is the question of unsafe macros. Those are tricky both for code and attributes, nothing new here.

This RFC is proposing a proper concept of unsafety for attributes, rather than the current ad-hoc approach.

The syntax is similar to e.g. the 2nd argument of cfg_attr(..., attributes), so there is certainly precedent for "attribute-taking function-like attributes". For better or worse, attributes use parentheses exclusively, no braces or brackets; it would be strange to start diverging from that now.

I would expect unsafe(attributes) allows arbitrary inner attributes (just like we allow arbitrary code inside an unsafe block), but we might want to have an "unused unsafe" warning if none of them is unsafe.

josh · September 28, 2022, 12:47pm

We allow both kinds of functions: functions that are unsafe to call and call unsafe things, and functions that are safe to call and call unsafe things. We should do the same for attributes.

afetisov · September 28, 2022, 12:50pm

From the user's PoV there is no difference between a built-in attribute, an attribute macro and an attribute consumed by macros. Whatever the solution, it should be at least syntactically applicable to all of those cases (even if only built-ins are supported initially), because in the end we would want to support them, and the user shouldn't be required to know the implementation details of attributes to deal with basic issues of safety.

Yes, there is precedent in the form of cfg_attr, but I'd argue it's a bad precedent, because it generalizes poorly. Cfg expressions and gated attributes are hard to write, hard to read, and hard to compose.

RalfJung · September 28, 2022, 1:01pm

So are you saying we should allow unsafe(attribute_macro)? Sure, that makes sense to me.

Attributes consumed by macros are entirely up to that macro but this RFC would set the precedent for them to also require unsafe(...) if those attributes have unsafe effects, so I expect attribute-consuming macros would pick that up.

I disagree with most of these statements.^^ I would say they compose perfectly (cfg_attr demonstrates that) and generalize just fine (e.g. to unsafe). The syntax is a bit too lisp-like for my liking with all these parentheses, but that's about it. If there's one thing lisp is good at it's composition/generalization, so there is plenty of evidence that your claim doesn't hold up.

(Note that we are not arguing about the cfg part of the cfg_attr syntax, only about the attr part. The cfg part is entirely irrelevant for this discussion.)

Readability suffers somewhat, that is true, but I don't think unsafe(...) is complicated enough to warrant an entirely new syntax. There's significant cost associated with that, and unsafe attributes are fairly rare overall, so I don't think it is worth the cost. (I also don't think #[[unsafe] attr] is actually easier to read. That syntax is just too alien in Rust. If anything I would advocate for #[unsafe { attr }].)

Furthermore if we wanted new syntax we'd also want it for cfg_attr, so that would really be an entirely different RFC, one that is all about extending the grammar of our attribute language. Feel free to write such an RFC, but in this one I think it is just a distraction. The current syntax is certainly not so bad that we'd have to block unsafe attributes on figuring out a new syntax.

afetisov · September 28, 2022, 1:53pm

I see that even Rust could not escape the Greenspun's Iron Law.

kpreid · September 28, 2022, 1:54pm

Something to consider that the RFC does not mention: Would attribute unsafe() behave like unsafe blocks, in that it accepts any attribute whether or not it is unsafe, or would it only contain unsafe attributes? I suspect the former is a wise decision for extensibility.

RalfJung · September 28, 2022, 2:21pm

This has been discussed upthread but indeed I should update the RFC as well -- and I agree it should be allowed to contain arbitrary attributes.

dtolnay · September 28, 2022, 3:00pm

The proposal doesn't mention proc macro attributes. Presumably it'd just be a matter of declaring the proc macro entrypoint an unsafe fn to require the callers to write the call using the new unsafe attribute syntax.

#[proc_macro_attribute]
unsafe fn spoopy(args: TokenStream, input: TokenStream) -> TokenStream {…}

RalfJung · September 28, 2022, 3:15pm

No I don't think it makes sense to make the proc macro implementation be an unsafe fn. It's about the code generated by the implementation.

I am not very familiar with proc macro attributes. I could imagine something like this

#[proc_macro_attribute(unsafe)]
fn spoopy(args: TokenStream, input: TokenStream) -> TokenStream {…}

being an indication to the compiler that this proc macro attribute must only be accepted inside an unsafe attribute block.

Jules-Bertholet · September 28, 2022, 3:15pm

I don't think this is right. spoopy itself is just a function that takes a TokenStream and outputs a TokenStream. The unsafety comes from how the proc-macro infrastructure uses the function, so the unsafe keyword belongs on proc_macro_attribute or similar.

Jules-Bertholet · September 28, 2022, 3:52pm

Another bikesheddable question is unsafe derive macros. unsafe(derive(Trait)), derive(unsafe(Trait)), or derive(unsafe Trait)?

RalfJung · September 28, 2022, 4:00pm

I think we can bikeshed that when the need for such a derive arises

This RFC only allows unsafe(derive(Trait)), the other 2 are changing the contents of the derive attribute itself.

Jules-Bertholet · September 28, 2022, 4:03pm

bytemuck and similar could benefit from this.

dtolnay · September 28, 2022, 4:12pm

Relevant lang team discussion on how to spell #[derive(const Trait)]⸺

dhm · September 28, 2022, 7:49pm

This is true, and belongs to the Future possibilities section, imho. It is indeed the natural follow-up to this PR (and I have already in mind what I think could be a decent design for this).

The existence of unsafe proc-macro attributes requires that of unsafe attributes altogether, and this RFC enables just that. And providing a way for proc macro attributes to become unsafe attributes themselves can come later. Given that it requires a non-trivial design, it should come later.

So this RFC is completely welcome in that regard , it's been long due: thanks for submitting it @RalfJung!

Topic		Replies	Views
Reserved Names for `#[no_mangle]` and `#[link_name]` language design	10	2773	April 4, 2021
No_mangle crate-level attribute language design	9	1156	December 16, 2021
#[no_mangle] vs modules and pub language design	18	6046	March 25, 2019
Pre-RFC: making unsafe more safe to use language design	12	3262	March 25, 2019
Unsafe categories	18	1446	December 14, 2021