Design discussion: proper conditional compilation support for rustdoc


#1

(Somehow, i’m in an RFC-writing/-planning mood this week. :woman_shrugging:)

Rustdoc and conditional compilation… don’t get along. The current solution requires cheating the compiler, using ugly hacks to get multiple platforms’ worth of items into rustdoc at once. The issue requesting proper conditional compilation support is one of the oldest still-open issues on the rust repo, dating back to 2012. I’ve gone on record in that issue calling it “the holy grail of rustdoc”, because of all the current hurdles in the implementation of both rustdoc and the compiler itself that are in the way.

So why bring it up again? Because i have a terrible idea and i want people to tell me how bad it is. It involves re-architecting multiple areas of the compiler, just so rustdoc can hack its way to seeing all versions of a crate at once. The amount of work is probably infeasible, but i want to write this down outside of a private group chat, in case enough people in the right places are interested.

The problem

The core problem with the current implementation of rustdoc w.r.t. conditional compilation is that it has no chance of seeing any items other than those for the current target, because the compiler filters out conditionally-compiled items immediately when compiling - literally the first step after tokenizing the source is running macro expansion and #[cfg] pruning. (These run at the same time because one can affect the other, with conditional macros creating conditional items and vice versa.) Rustdoc lets the compiler do its own thing until name resolution and crate analysis have finished, and by then everything excluded by the current target is long gone.

The current solution is “allow crates to force items into rustdoc’s view, and force the compiler to not care by replacing item bodies with loop {} when rustdoc is running”. The everybody-loops component removes most references to platform-specific items from the crate, allowing the item to exist on a platform it’s not otherwise meant for. (Note the “most”. This is not perfect, mainly because references outside of item bodies - say, in function signatures - are still around, and can still cause problems.) As mentioned, this requires people to set up special configurations for their docs, and complicates every #[cfg] attribute that has to be changed so the item can show up in docs.
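For reference, the kind of special configuration this asks of crate authors looks roughly like the sketch below. It assumes the `doc` cfg flag that current rustdoc sets while documenting a crate; the module and function names are made up for illustration.

```rust
// Hypothetical crate contents; `windows_ext` and `raw_handle` are illustrative names.

/// Windows-only extensions. The extra `doc` condition forces the module into
/// rustdoc's view even when the docs are built on another platform.
#[cfg(any(windows, doc))]
pub mod windows_ext {
    /// Returns a placeholder handle value.
    pub fn raw_handle() -> usize {
        0
    }
}

/// Reports whether the Windows-only module was compiled into this build.
pub fn has_windows_ext() -> bool {
    cfg!(any(windows, doc))
}
```

Every platform-specific item needs the same `any(..., doc)` treatment, which is the per-attribute burden described above.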

Alternative solutions

If rustdoc were allowed to be implemented significantly differently, how could it approach the problem then? If it worked directly from the source, it could see all the code text before any items could be compiled out of it, right? Well… it would also see all the code text before any macros happened, before any build scripts happened, and possibly be more confused than ever, thanks to complicated macros like cfg_if!{} which is one of the greatest allies of people writing platform-specific items. You’d have to reimplement a lot of the compiler’s functionality to even get close to approaching this problem from that direction.
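A small sketch of why raw source text is not enough: the macro below is a hand-rolled stand-in for something like cfg_if! (the macro name and the generated function are hypothetical). The conditional items only exist after expansion, so a tool that reads the source directly sees one macro call and no functions at all.

```rust
// A stand-in for macros such as `cfg_if!`: the two conditional functions
// below only come into existence once the macro has been expanded.
macro_rules! per_platform {
    ($name:ident, unix => $unix:expr, other => $other:expr) => {
        #[cfg(unix)]
        pub fn $name() -> &'static str {
            $unix
        }

        #[cfg(not(unix))]
        pub fn $name() -> &'static str {
            $other
        }
    };
}

// In the source text this is a single line with no visible `#[cfg]` at all.
per_platform!(family_name, unix => "unix", other => "not unix");
```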

What about save-analysis? There have been a couple attempts to use save-analysis data to reimplement rustdoc, so what would that situation require? Well, it’s stuck in the same situation as current-rustdoc, sadly. To get analysis data, you need to run the compiler well past the point that conditional items are going to be compiled out. If you had a way to run analysis multiple times, you could see the crate from multiple “angles”, so you could find a way to combine them. However, you would still need to create that list of platforms, somehow. It still wouldn’t be a perfect solution (and current-rustdoc could do something similar, anyway), and i think we can do better.

The terrible idea

So, why bring it up again? So far, this is all just background information, a listing of the reasons i’ve kept this task out of mind for the last year or more. Let’s get into my terrible idea.

The core of the idea is this: When rustdoc is running, convince macro expansion to disable conditional compilation. Since conditional compilation in Rust is item- and block-based, we can simply leave in all the items that are tagged as platform-specific, and accommodate any ramifications that fall out of it. For example…

  • If items share the same name and namespace, but have mutually-exclusive compilation conditionals, then they don’t actually clash.
  • If something is trying to reference an item that now has a duplicate, you can try to prefer something that matches the conditional of the reference, or the current target, or (potentially) some preferred conditional, like #[cfg(rustdoc)]. This can be used for both standard name resolution and intra-doc links.
  • Rustdoc already has a means of signaling platform-specific items to the doc output - #[doc(cfg)] - so it can start using existing #[cfg] attributes for the same information.
  • When rustdoc encounters multiple items with the same name and namespace but mutually-exclusive conditionals, it can combine them into the same page, with an ordering based on the conditional (or some potential ordering provided by the user? up for design discussion)
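To make the first two points a bit more concrete, here is a toy model, not anything that exists in rustc or rustdoc. It assumes cfg flags are independent booleans (so `windows` and `unix` on their own are not considered exclusive) and checks mutual exclusion by brute force over all flag assignments. All type and function names are hypothetical.

```rust
use std::collections::BTreeSet;

/// A simplified stand-in for a `#[cfg(...)]` predicate.
#[derive(Clone, Debug, PartialEq)]
pub enum Cfg {
    Flag(&'static str),
    Not(Box<Cfg>),
    All(Vec<Cfg>),
    Any(Vec<Cfg>),
}

impl Cfg {
    /// Evaluates the predicate against a set of enabled flags.
    pub fn eval(&self, on: &BTreeSet<&'static str>) -> bool {
        match self {
            Cfg::Flag(f) => on.contains(f),
            Cfg::Not(c) => !c.eval(on),
            Cfg::All(cs) => cs.iter().all(|c| c.eval(on)),
            Cfg::Any(cs) => cs.iter().any(|c| c.eval(on)),
        }
    }

    /// Collects every flag name mentioned in the predicate.
    fn flags(&self, out: &mut BTreeSet<&'static str>) {
        match self {
            Cfg::Flag(f) => {
                out.insert(*f);
            }
            Cfg::Not(c) => c.flags(out),
            Cfg::All(cs) | Cfg::Any(cs) => cs.iter().for_each(|c| c.flags(out)),
        }
    }
}

/// True if no assignment of the mentioned flags satisfies both predicates.
/// Brute force, so only suitable for a handful of flags.
pub fn mutually_exclusive(a: &Cfg, b: &Cfg) -> bool {
    let mut names = BTreeSet::new();
    a.flags(&mut names);
    b.flags(&mut names);
    let names: Vec<&'static str> = names.into_iter().collect();
    (0u32..(1u32 << names.len())).all(|mask| {
        let on: BTreeSet<&'static str> = names
            .iter()
            .enumerate()
            .filter(|(i, _)| ((mask >> *i) & 1) == 1)
            .map(|(_, n)| *n)
            .collect();
        !(a.eval(&on) && b.eval(&on))
    })
}

/// One item: a name plus the condition it is compiled under
/// (`None` means unconditional).
pub struct Item {
    pub name: &'static str,
    pub cfg: Option<Cfg>,
}

/// Picks among same-named candidates, preferring one whose condition holds
/// for the `preferred` flag set and otherwise falling back to the first.
pub fn resolve<'a>(
    items: &'a [Item],
    name: &str,
    preferred: &BTreeSet<&'static str>,
) -> Option<&'a Item> {
    let candidates: Vec<&Item> = items.iter().filter(|i| i.name == name).collect();
    candidates
        .iter()
        .copied()
        .find(|i| i.cfg.as_ref().map_or(true, |c| c.eval(preferred)))
        .or_else(|| candidates.first().copied())
}
```

The real difficulty is that a compiler-side version would have to make this decision inside name resolution itself, with knowledge of which flags actually exclude each other.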

As i mentioned earlier, this feels like an enormous undertaking to ask for (which is why i didn’t write this into an RFC template, as i wouldn’t know enough details to write the reference-level explanation). However, it also allows rustdoc to finally get true, automatic conditional compilation support. The thought of grasping that holy grail is enough for me to consider this impossible task. (Or at least, to propose it to see whether it’s actually impossible instead of just impractical.)


#2

Note that this is exactly the problem that proper IDE support will face once the basic stuff works. Ideally, find usages on a symbol should find usages in both branches of cfg_if!(platform).

For both rustdoc & IDEs, the two most obvious solutions are:

  • work like the compiler: pick a specific cfg combination and stick to it. If you need to work with all cfgs, then loop through possible combinations and merge the end result.

  • add some kind of magical mode to the compiler, which disables conditional compilation and maintains all items. (“The terrible idea”)
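The "loop and merge" option can be sketched as below. This is only a model: it assumes each run has already produced a flat list of item names for one cfg combination, and `merge_views` is a hypothetical helper, not an existing rustdoc or IDE API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Merges per-configuration item lists into one index that records, for each
/// item name, the configurations in which it appeared.
pub fn merge_views<'a>(
    views: &[(&'a str, Vec<&'a str>)],
) -> BTreeMap<&'a str, BTreeSet<&'a str>> {
    let mut index: BTreeMap<&'a str, BTreeSet<&'a str>> = BTreeMap::new();
    for (cfg_name, items) in views {
        for item in items {
            // Record that this item exists under this configuration.
            index.entry(*item).or_default().insert(*cfg_name);
        }
    }
    index
}
```

An item that shows up under every configuration can then be presented as unconditional, and the rest can be labelled with the configurations they came from. The hard part this skips is deciding which combinations to loop over in the first place.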

I am not sure the second idea can fly reliably. Conditional compilation affects name resolution, and it is, imo, one of the least-understood (by me for sure) parts of the Rust language. So, it may be difficult to get a definite semantics of name resolution in “all the cfgs” mode.

OTOH, I wonder if such cfg processing can be formulated in terms of the existing hygiene infrastructure? Basically, #[cfg(windows)] and #[cfg(not(windows))] items would have different hygiene info, and items without cfg at all would have both?

Super long term, for IDE support, I’d love to pursue a language-level solution for more structured conditional compilation (so, a more terrible idea). Specifically, Kotlin’s expect/actual is a flavor of conditional compilation which is easy for tools to support. The main idea is that public modules have exactly the same API regardless of current cfg flags, so you can fully type-check code without knowing which cfg is active.
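In today's Rust the closest approximation is a convention rather than a language feature, roughly like the sketch below (all names are hypothetical). Note that nothing here makes the compiler check that the two `imp` modules agree on the signature, which is exactly what an expect/actual-style feature would add.

```rust
// Hypothetical example; `line_ending` and the `imp` modules are illustrative.

/// The "expect" side: one public signature, identical under every cfg.
pub fn line_ending() -> &'static str {
    imp::line_ending()
}

/// The "actual" side for Windows.
#[cfg(windows)]
mod imp {
    pub fn line_ending() -> &'static str {
        "\r\n"
    }
}

/// The "actual" side for everything else.
#[cfg(not(windows))]
mod imp {
    pub fn line_ending() -> &'static str {
        "\n"
    }
}
```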


#3

I like this concept too – I’ve thought about similar things for managing other sorts of global resources, like our allocators. It feels obviously highly related to the trait system, too. Makes me wonder if we can attempt a “grand unification”.

e.g., imagine if you could define a trait, and then somehow say “within this part of the program, here is the type that implements this trait” (lexically scoped, let’s say). Then you could invoke those trait methods like globals, and it would mean “use that type”.
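As a rough approximation with what exists today, the trait half of this can be written with an explicit type parameter, as in the sketch below (all names are hypothetical). The part that would need language design is making the choice of implementor implicit and lexically scoped instead of passed by hand.

```rust
// Hypothetical names throughout; this only approximates the idea with generics.

/// The "global resource" expressed as a trait.
pub trait Platform {
    fn name() -> &'static str;
}

pub struct Unix;
pub struct Windows;

impl Platform for Unix {
    fn name() -> &'static str {
        "unix"
    }
}

impl Platform for Windows {
    fn name() -> &'static str {
        "windows"
    }
}

/// Code written against the trait type-checks without knowing which
/// implementor is in effect; the caller picks one explicitly.
pub fn banner<P: Platform>() -> String {
    format!("running on {}", P::name())
}
```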

Anyway at this point we’re also verging on ML functor territory of course, and my desire for more support of generic modules, so… “language design needed”.