Pre-RFC: unsafe attributes

const is a modifier of different nature, but yeah unsafe trait derives could also use that syntax.

But I think that's out of scope for this RFC, which is about the general concept of unsafe attributes, not the specifics of derive. Using the general concept, unsafe(derive(...)) is a thing and indicates that the author of the attribute checked whatever safety requirements apply. EDIT: I added this as a future possibility.

I'll add them to the future possibilities section. EDIT: done.

2 Likes

The unstated alternative is #[no_mangle(unsafe)]

How could this generalize to arbitrary attributes?

Technically #[some_proc_macro_attribute(unsafe)] seems like it could work. It's giving an unsafe token to the proc-macro hygiened to the current crate, if the proc-macro then generates an quote!(#unsafe_token { ... }) it would correctly attribute that unsafety to the current crate rather than the proc-macro. That would potentially require the macro to emit some unused code in some cases, if somehow the macro is unsafe to invoke but doesn't need to produce any unsafe code (I don't know if this is really possible though).

But it doesn't feel like it's discharging the unsafety of the attribute. #[unsafe(some_proc_macro_attribute)] requires more work/knowledge to support the correct hygiene of the tokens (at a minimum Ident::new("unsafe", Span::call_site()), but @dhm has some more expansive ideas), but it is a closer analogue to how we wrap code in an unsafe block to discharge the safety requirements.

EDIT: I actually just tested a proc-macro passing a correctly Spanned unsafe token through, and it doesn't fail with #[forbid(unsafe_code)] in the crate using it, so at least currently this is wrong :cold_sweat: (hopefully that's a bug in the unsafe_code lint?).

tested proc-macro
#[proc_macro]
pub fn foo(tokens: proc_macro::TokenStream) -> proc_macro::TokenStream {
    let uns = tokens.into_iter().next().unwrap();
    assert!(uns.to_string() == "unsafe");
    std::iter::once(uns)
        .chain(
            "{ let foo: i32 = *std::ptr::null(); }"
                .parse::<proc_macro::TokenStream>()
                .unwrap(),
        )
        .collect()
}
3 Likes

Or #[no_mangle_unsafe] or unsafe { #[no_mangle] } or ... There are infinitely many unstated alternatives.

Is there any reason why this one is worth mentioning? I obviously can't list all of them. And precedent from regular code is that we wrap the unsafe operations in some form of unsafe block -- there's no 'postfix unsafe' or so.

2 Likes

Apologies for repeating @josh here, but I'm not sure I follow this logic? If the macro itself is generating the unsafe attributes, then by analogy to unsafe {} it's the macro's responsibility to ensure that it's upholding the safety requirements, either by it's own methods (in this example, perhaps it's mangling in a non-standard way itself for some interop reason), or by analogy to unsafe fn requiring the caller to provide the safety guarantee.

I assume you mean something like without the ability to mark the macro as unsafe as with the RFC as currently stated, the second option isn't available and such macros that currently just have to document the unsafety would be getting a new false sense of security from users expecting unsafe?

Or are you talking about deny(unsafe)? I don't know how the status quo here is affected by this RFC: it already has this problem.


On a related note: semantically, what's the difference between unsafe { foo() } and unsafe_foo()? It's easier for deny(unsafe)? Basically, is there a hard reason to prefer migration to #[unsafe(no_mangle)] over migration to #[unsafe_no_mangle]?

The 2nd is not machine-readable. We have a systematic way of marking functions unsafe, with associated tooling, so that is strictly preferred.

4 Likes

Macros don't have any such responsibility. They can expand to whatever code they like, whether sound or not.

macro_rules! totally_safe {
    ($($tt: tt)*) => { unsafe { $($tt)* } }
}

totally_safe!(::std::ptr::null::<u32>().read());

That's a problem. There is no way to say that the macro is unsafe to use. There is no safety hygiene, which could prevent laundering unsafe code through an unsuspecting macro. There is even no requirement to have an unsafe token.

They can, in the same sense that you can write

fn totally_safe() {
  unsafe { std::hint::unreachable_unchecked() }
}

But such a function is buggy, and the same is true for such a macro.

It is clear that we do not have a machine-ckeckable way of declaring a macro 'unsafe to use'. But that doesn't mean that it is correct to write an unsafe-to-use macro that does not expose this fact in its API through documentation and/or naming.

So, saying "it's the macro's responsibility to ensure that it's upholding the safety requirements" is absolutely correct. Any responsibility the macro offloads on its caller must be clearly documented. I'd love if we could agree on a standard for naming such macros unsafe_..., or find some way to force their use only inside unsafe blocks, but that is a different RFC.

In this RFC, you are getting off-topic. This RFC does not aim to solve the problem of unsafe macros. It aims to solve the problem of unsafe attributes, and provides a path for also using that approach in the future for unsafe attribute macros. Other macros are an entirely separate question.

1 Like

@dhm has some ideas for fixing this which is what I referred to earlier. As they mention it works just fine as a future extension of this RFC using the same syntax, (and I don't really see many ways in which you could solve it without being compatible with this syntax in some way, the syntax is very simple while making proc-macros safety-correct will require fundamental changes to their definitions).

2 Likes

Yes, but:

  1. I was asking for the purpose of comparison to unsafe attributes, and
  2. that's declaration site, not use site.

As there's no declaration site for unsafe attributes (ignoring proc macros for now), I was checking if there's any additional meaning at the use site for unsafe expressions appropriate to the equivalent attributes. Obviously it's grosser to check for unsafe_ prefixes vs an explicit syntax, but it's incredibly easy to implement and check, including for Rust users and macro writers, so it's worth justifying why it's better.

Others have covered this somewhat already, but that's a strange definition of "responsibility". It generally doesn't mean something you aren't capable of not doing, but something that if you don't do it, bad things happen - that's the meaning I was using. But that's definitely getting off topic!

I actually agree with your original post here in general! I just don't know what the specific issue is you were describing in the quoted part, because it sounds like you think an implementation for attributes alone would be worse than the status quo, at least potentially. Is that largely concerns about the syntax space for attributes, or is that an unrelated concern?

They are built-in, so the declaration site is the language/compiler itself.

I don't understand what you mean by this.

It's better because it's less gross? :wink:

No, because the macro never promised that it can be used in this way. Many macros, such as ones using tt matchers, have literally no way to check or enforce such safety rules. In the IRLO thread I linked, there are examples of actual real-world macros which allow safety laundering, not because of malicious intent or undocumented preconditions, but because they were never intended to be used with unsafe code.

Yes, but we still need to discuss the extended version, to make sure that this RFC really is forward-compatible with arbitrary unsafe attributes, and not just an ad-hoc solution.

Some questions:

  • Is #[unsafe(foo, bar)] and #[unsafe(foo)] #[unsafe(bar)] the same? It would be nice. But an attribute macro foo could likely be able to consume #[unsafe(bar)], but would not be applicable to bar in #[unsafe(foo, bar)]. Unlike derive macros, which cannot modify the item definition and are (usually) entirely orthogonal, the custom attributes could interact in arbitrary ways, which may become confusing with the #[unsafe(..)] attribute.

  • How can I consume an unsafe attribute via a proc macro? Currently the macros match on the name of the attribute, but with #[unsafe(..)] the name will be arbitrarily nested within the inner token tree (there may be nested unsafe blocks, and who knows what else in the future).

  • Active attributes remove themselves once they are processed. How would that work if an active attribute is within an (arbitrarily nested) #[unsafe(..)] attribute?

  • Can I have cfg_attr(..) inside of #[unsafe(..)]? How would it work?

One possible answer to the above questions would be to check unsafe correctness before macro expansion, and simplify #[unsafe(foo, bar)] into just #[foo] #[bar]. But that wouldn't work if the macro attribute or the attributes it uses are themselves unsafe, or if a macro expands to unsafe attributes, or if we want to preserve the unsafety hygiene.

  • Can we have unsafe tool attributes? It's unlikely that Clippy or rustfmt would require that, but we may have other tools in the future, and there is [register_tool].

  • The #[unsafe(..)] attribute would cause backwards compatibility issue with any user-defined #[unsafe] attribute, whether used separately or in derive macros. This may classified as acceptable breakage, and/or be gated over an edition boundary. However, it's hard to detect (the attribute may be emitted by a macro and consumed by another macro), impossible to fix automatically, and with the scope creep [1] [2] [3] [4] of stdlib attributes, this is not an acceptable long-term solution.

  • Attributes can exist on statements. What if the statement is inside of an unsafe block? Does it absolve us from using unsafe on attributes? I would expect not, but I think it's worth specifying.

The macro totally_safe is definitely buggy, there is no way this is an accident or unintentional.

I am not denying that we have a problem around unsafety hygiene. But this RFC is the wrong place to discuss that problem. So thanks for asking a bunch of concrete questions instead. :slight_smile:

How does this work with #[cfg_attr(test, foo, bar)] vs #[cfg_attr(test, foo)] #[cfg_attr(test, bar)]? As stated in the RFC, 'attribute lists' are already a thing in cfg_attr, and they should behave the same in unsafe.

But if it simplifies things, maybe it's better for now to say that unsafe can only contain a single attribute?

That sounds like a helper library to parse these attributes would be nice. That should be worked out in the 'unsafe proc macro attributes' RFC, but I don't see a blocking concern for this RFC here.

Same answer as previous question.

I guess for now we can only have cfg_attr at top-level? In terms of compositionality, it would make most sense to also make it work in arbitrary nested form, with the obvious semantics. How does cfg_attr work with proc macro attributes, in particular when there is a cfg_attr after the proc macro? Does cfg_attr expansion happen first? I would think it does, so I'd say the same should happen for cfg_attr nested inside unsafe.

If we say unsafe can only contain a single attribute, then it's probably not too bad to require cfg_attr(..., unsafe(...)) and disallow unsafe(cfg_attr(..., ...)).

That's clearly a future possibility.

What is the usual process here for adding new built-in attributes?

The attributes this RFC is concerned with don't even make sense for statements.

Here's the problem, derive attributes also allow lists, and behave very differently. Custom attributes are a total free for all. So there is no good precedent for lists inside attributes. #[cfg_attr] specifically avoids many issues because it is expanded into a list of attributes and removed, but I don't know how it's squared with macros.

Surprisingly, I couldn't find any description of cfg-expansion and its interaction with macros. Not in the Reference, not in the rustc dev guide. It would be nice if someone knowledgeable would explain how the compiler solves the issue.

I think that, in light of the problems listed in my previous post, nested unsafe attributes are a bit of a dead end, or at least something significantly more complicated. If we require that #[unsafe(..)] must have a single attribute inside, then many problems vanish.

  • The problem of multiple inner attributes just vanishes.
  • Since the syntactic form of an unsafe attribute is simply #[unsafe(attr)], it's very easy to parse with proc macros. There are no questions about the semantics of ordering, or nesting. It could be exposed in the macro api as an extra unsafety property of the attribute. The simple form means that even macro_rules! can deal with unsafe attributes.
  • We don't have the problem of dropping or adding attributes inside of deeply nested long lists, so if attr is active, we can just remove #[unsafe(attr)] as a whole. This also side-steps the issue of long attribute lists inside of unsafe blocks, or rather it punts it to the usual problem of many attributes.
  • Again, since there is a single attribute inside, it's easy to specify the interaction of unsafe and attr for any attr. For example, it's easy to forbid safe attributes with an unsafe annotation. Since cfg_attr can expand to many attributes (#[cfg_attr(cond, foo, bar, baz)]), it's best to forbid it.

Personally, I would prefer if unsafe attribute had a different syntax, which would clearly separate them from safe attributes (e.g. #[unsafe attr]), but #[unsafe(attr)] is almost as good, except for the extra parentheses, and doesn't cause issues with forwards-incompatible changes to the attribute syntax, which is valuable for tools and proc macros.

We can consider adding blocks, e.g. with the semantics of eager expansion into a list of individual unsafe attributes, later if annotating each specific attribute becomes too onerous (personally I don't expect them to be too common, unsafe blocks are already a rarity, unsafe attributes should be used even less).

1 Like

I don't understand what the issue is with having #[unsafe(attrs,...)]. It seems rather simple to specify:

  • #[unsafe(attr1, attr2)] acts the same as #[unsafe(attr1)] #[unsafe(attr2)] (I think this could even just be a macro expansion which is triggered in the usual order of attribute macros)
  • otherwise #[unsafe(attr)] just runs the #[attr] macro with a flag ticked somewhere saying that unsafe stuff is allowed (and this flag would be checked for builtin unsafe attrs as well as #[proc_macro_attribute(unsafe)] macros).

If we say that this is literal attribute macro expansion, then (I believe) that means the tokens in the expansion would be observable depending on where you place other proc macros, but this seems fine and consistent with other attribute macros.

1 Like

You are comparing apples and oranges. derive is, roughly, of type list DeriveTrait -> Attribute, which is very different from the list attributes we are discussing here, which are list Attribute -> Attribute.

Yeah that seems easiest for now. I'll leave supporting lists (for the cases where it makes sense) to someone else...

1 Like

That is an implementation detail, which shouldn't be a problem of the end user.

No, the public types of an API are pretty much the opposite of an implementation detail. Composing derives and composing arbitrary attributes are just fundamentally not the same thing.

Anyway, I removed lists from the RFC.

Thanks for the feedback, everyone! I have now submitted this as a proper RFC. Further discussion should happen on GitHub.

1 Like