Hey, I've been reading the Rust reference and found a distinction between active and inert attributes:
An attribute is either active or inert. During attribute processing, active attributes remove themselves from the thing they are on while inert attributes stay on.
The cfg and cfg_attr attributes are active. The test attribute is inert when compiling for tests and active otherwise. Attribute macros are active. All other attributes are inert.
This explains why it's not possible to check whether a branch or expression is guarded by a #[cfg] attribute. Does someone know why this distinction was introduced? Wouldn't it be more beneficial to handle them like any other attribute?
#[cfg] attributes modify the AST of the source. A disabled #[cfg] block is entirely removed from the parse tree so that it doesn't require analysis, symbol resolution or compiler passes. It wouldn't make sense to leave an attribute without the corresponding subtree, would it? But then there is also no point in leaving an enabled #[cfg] attribute: if you can observe the AST nodes, then they were conditionally (or unconditionally) enabled.
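A small sketch of that behaviour, assuming a plain stable toolchain. The function names are made up for illustration; `any()` and `all()` with no arguments are used only because they are always false and always true respectively.

```rust
// `any()` with no arguments is always false, so this item is stripped
// before name resolution. The unknown function below is never looked up,
// although the body still has to parse.
#[cfg(any())]
fn disabled() {
    this_function_does_not_exist();
}

// `all()` with no arguments is always true, so this item is kept and
// later compiler passes see it like any unconditional function.
#[cfg(all())]
fn enabled() -> &'static str {
    "enabled"
}

fn main() {
    println!("{}", enabled());
}
```

If `disabled` were kept in the tree, the call to the missing function would be a resolution error, which is the practical reason the whole subtree has to go.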
The proc macros are removed for the same reason: an attribute proc macro must be attached to a syntactically valid block of code, but otherwise that code has no semantics within the Rust language. A proc macro can totally rewrite the code into something entirely different, and the old code may very well be invalid (e.g. see the async-trait macro; current Rust has no such thing as async traits). Thus leaving the old AST nodes would make no sense, and neither would leaving the attribute macro present.
I don't know why the derive macros are active, probably for consistency with arbitrary attribute macros.
#[test] is also essentially a built-in proc macro which significantly rewrites the code.
Btw, since your question asks for an explanation of current Rust rather than proposing changes, I believe it is a better fit for users.rust-lang.org.
This question is also/mostly about the internal representation. It makes sense that disabled subtrees are removed; keeping them would make countless things very difficult. However, removing the enabled attribute without storing the information complicates the analysis. Clippy struggles with this a bit. We have some workarounds, but all of them are hacky at best.
You're right that macros are removed from the AST, but their existence is still stored in the span information of the expanded nodes. This is not the case for the #[cfg] or #[test] attributes, though.
I'm actually surprised that this behavior is documented in the reference, as I don't see the effect it has on users and only see the difficulty it creates for source code analysis.
The expanded macro may not produce any item the attribute could be attached to. Also, if it were to be re-attached, it would get expanded again on the next pass of the macro expander, which is likely incorrect. And what if a macro expanded into multiple items: would the attribute be attached to each item, or only some?
It's not an unsolvable problem. There could be dummy noop nodes just for macro attachment (essentially empty blocks). A macro that was expanded once could get a flag set which would prevent subsequent expansion. With several items, attaching the attribute to all items is one possibility. The other would be to add grouping nodes which would contain all expanded items.
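To make the grouping-node idea concrete, here is a toy model. Everything in it (`Node`, `toy_macro`, `expand`) is hypothetical and has nothing to do with rustc's real AST. The `Expanded` variant doubles as the "already expanded" flag, and an `Expanded` node with no items plays the role of the noop node.

```rust
// Toy model only; these types are hypothetical and are not rustc's AST.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    /// An ordinary item, identified by name.
    Item(String),
    /// An item that still carries an unexpanded attribute macro.
    Annotated { attr: String, item: Box<Node> },
    /// Grouping node: everything one attribute macro expanded into.
    /// With an empty `items` list this acts as the noop node.
    Expanded { attr: String, items: Vec<Node> },
}

/// Stand-in for running an attribute macro (made-up behaviours).
fn toy_macro(attr: &str, item: &Node) -> Vec<Node> {
    match attr {
        "erase" => Vec::new(),
        "duplicate" => vec![item.clone(), item.clone()],
        _ => vec![item.clone()],
    }
}

/// One expander pass over a single node. `Expanded` nodes are left
/// alone, so a second pass does not run the macro again. For brevity
/// this does not recurse into the produced items.
fn expand(node: Node) -> Node {
    match node {
        Node::Annotated { attr, item } => {
            let items = toy_macro(&attr, &item);
            Node::Expanded { attr, items }
        }
        other => other,
    }
}

fn main() {
    let input = Node::Annotated {
        attr: "duplicate".to_string(),
        item: Box::new(Node::Item("f".to_string())),
    };
    let once = expand(input);
    let twice = expand(once.clone());
    println!("{:?}", once);
    // The second pass is a no-op because the group is already expanded.
    assert_eq!(once, twice);
}
```

Whether the extra node kind is worth it for every consumer of the tree is exactly the trade-off being discussed here.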
But the real question is, are those complications really worth it? What would it give you that you cannot do today?
It would just provide some additional information for analysis. For example, it can be helpful to know that a certain branch is guarded by a feature flag that also enables std usage in the crate. The same goes for tests: in Clippy we've received several requests to allow some lints in test functions. Checking if an expression is inside a test is possible by searching for the expanded test code, but it is expensive. For #[cfg] attributes, there is no real alternative.
I'm not sure if it's worth it. Currently, I'm mainly trying to figure out why it behaves this way and why it's documented in the reference. But this is definitely a good point to keep in mind.
Are you referring to #[cfg] here? If the condition is false, then it could be removed from the AST. This currently works well enough, even if I like the suggestion of the noop node. Since only #[cfg] and #[test] are active and they both need to be attached to a node, I would expect that it's always possible to find an item that the attribute could be attached to. Or am I missing something?
In my view, if you need to deal with information that is lost after macro expansion, then you really need to run your analysis passes before the macro expansion. That's the only way to get the full information, though there is always the issue of how exactly you should treat it. As mentioned, pre-macro expansion code doesn't even need to be valid Rust. Although I guess specific macros can be hardcoded into the analysis passes in quite comprehensive detail. #[test] and #[cfg] seem to be especially easy to deal with.
Rustc has an option for this, but it tends to cause problems and was softly deprecated for that reason. Currently, the linting interface provides no other option for it, and there isn't one planned AFAIK. Maybe it would be possible to keep a list/map of all active attributes that have been removed. Then the analysis could check if the node was affected by such an attribute. But an implementation like that would also require more effort.
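A rough sketch of what such a list/map could look like. All names here (`NodeId`, `RemovedAttr`, `RemovedAttrs`) are invented for illustration and are not rustc or Clippy APIs. The idea is a side table filled in during attribute processing, which an analysis can then query by walking up the parents of a node.

```rust
use std::collections::HashMap;

// Hypothetical types for illustration; not rustc or Clippy APIs.
type NodeId = u32;

#[derive(Debug, Clone, PartialEq)]
enum RemovedAttr {
    /// The cfg predicate as written, e.g. `feature = "std"`.
    Cfg(String),
    Test,
}

#[derive(Default)]
struct RemovedAttrs {
    by_node: HashMap<NodeId, Vec<RemovedAttr>>,
    parent: HashMap<NodeId, NodeId>,
}

impl RemovedAttrs {
    /// Called by the (imaginary) expander when it strips an attribute.
    fn record(&mut self, node: NodeId, attr: RemovedAttr) {
        self.by_node.entry(node).or_default().push(attr);
    }

    fn set_parent(&mut self, child: NodeId, parent: NodeId) {
        self.parent.insert(child, parent);
    }

    /// Removed attributes on `node` and its ancestors, innermost first.
    fn affecting(&self, node: NodeId) -> Vec<&RemovedAttr> {
        let mut out = Vec::new();
        let mut current = Some(node);
        while let Some(id) = current {
            if let Some(attrs) = self.by_node.get(&id) {
                out.extend(attrs.iter());
            }
            current = self.parent.get(&id).copied();
        }
        out
    }

    fn in_test(&self, node: NodeId) -> bool {
        self.affecting(node).iter().any(|a| **a == RemovedAttr::Test)
    }
}

fn main() {
    let mut table = RemovedAttrs::default();
    // Node 1: a module that was behind `#[cfg(feature = "std")]`.
    table.record(1, RemovedAttr::Cfg("feature = \"std\"".to_string()));
    // Node 2: a function inside it that was marked `#[test]`.
    table.set_parent(2, 1);
    table.record(2, RemovedAttr::Test);
    // Node 3: an expression inside that function.
    table.set_parent(3, 2);

    println!("{:?}", table.affecting(3));
    assert!(table.in_test(3));
    assert!(!table.in_test(1));
}
```

The lookup cost is proportional to the nesting depth, which would be cheaper than searching for expanded test code, but the table has to be built and kept in sync by the expander, which is the extra effort mentioned above.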
One concrete example is #![feature(doc_auto_cfg)]. IIUC that does run pre-cfg-expansion in order to capture #[cfg]s, but some of the inconsistencies with doc_auto_cfg are caused by that.
For the specific case of cfg, it seems reasonable to tag the included trees with "this was gated on cfg(whatever)". That also seems like it would be useful/necessary for the portability lint (e.g. you only use cfgd symbols from code with a sufficient cfg).
It is relevant to proc-macro attributes which can "see" other inert attributes (whether they appear before or after the proc-macro attribute), but not active ones that have been removed in their input.