Summary
Migrate the syntax of rustdoc markdown footnotes to be compatible with the syntax used in GitHub.
I've opened pull request #654 against pulldown-cmark to implement GitHub-compatible footnote syntax, which should probably just be deployed to docs.rs if it's merged. If #544 is merged instead, it will probably be usable in docs.rs as well, but deploying it to rustdoc may be a bit more complicated.
Motivation
This change should reduce confusion with syntax not working the way people have come to expect. There are two major reasons to want rustdoc to be consistent with GitHub here:
- While doc comments themselves usually aren't rendered by GitHub, other tools like docs.rs and crates.io share README.md files with it. This means if those tools parse the file differently than GitHub does, it's almost always a mistake. It's also reasonably common for mdBook chapters to be read in GitHub, so mistakes happen when it diverges in its parsing (for example, where an RFC relies on GitHub's auto-linking behavior by accident and needs a fixup PR). It would be very silly for docs.rs README files to follow different markdown syntax from rustdoc.
- The current behavior of pulldown-cmark is often implicitly considered a bug by end-users. See issues on pulldown-cmark's issue tracker: #20 #530 #623
Rustdoc's Markdown footnote syntax was designed in a weekend in 2015 to make pulldown-cmark suitable for use with existing documentation. In particular, GitHub didn't add support for footnotes until 2021, and other markdown parsers of the time didn't seem to have converged on a common behavior for corner cases like this (showdown-footnotes does treat indentation as special, but unlike GitHub it doesn't necessarily require four spaces of indentation).
Guide-level explanation
Edition Guide: Footnote syntax in rustdoc
Summary
- Footnote syntax is now compatible with GitHub-Flavored Markdown:
- If a footnote definition (like [^this]: contents) is followed by text indented four spaces or one tab, that text will be treated as part of the footnote instead of being an indented code block.
- Footnote definitions no longer need to be separated by blank lines.
- If a footnote definition is immediately followed by a list, block quote, or table, it needs to be indented by four spaces to be considered part of the footnote.
- If a footnote reference has no corresponding footnote definition, it is rendered literally instead of creating a broken link.
- When rustdoc runs under Edition 2021, it will warn about any Markdown syntax that will be parsed differently in Edition 2024.
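The rule that a footnote reference without a definition is rendered literally can be checked ahead of time with a rough scan of the markdown source. The following is a minimal sketch, not how rustdoc or pulldown-cmark implement it: the function name `undefined_footnote_references` is hypothetical, labels are compared exactly as written, and code spans and code blocks are not skipped, so it can over-report.

```rust
use std::collections::BTreeSet;

/// Returns the labels of `[^label]` references that have no `[^label]:`
/// definition anywhere in `markdown`, in sorted order.
///
/// Assumption: a definition is `[^label]:` preceded only by whitespace on
/// its line; every other `[^label]` occurrence counts as a reference.
fn undefined_footnote_references(markdown: &str) -> Vec<String> {
    let mut defined = BTreeSet::new();
    let mut referenced = BTreeSet::new();

    for line in markdown.lines() {
        let mut rest = line;
        let mut at_line_start = true;
        while let Some(start) = rest.find("[^") {
            // Only whitespace may precede a definition on its line.
            let only_whitespace_before = at_line_start && rest[..start].trim().is_empty();
            let after = &rest[start + 2..];
            let Some(end) = after.find(']') else { break };
            let label = &after[..end];
            let tail = &after[end + 1..];

            if !label.is_empty() && !label.contains(char::is_whitespace) {
                if only_whitespace_before && tail.starts_with(':') {
                    defined.insert(label.to_string());
                } else {
                    referenced.insert(label.to_string());
                }
            }
            rest = tail;
            at_line_start = false;
        }
    }

    referenced.difference(&defined).cloned().collect()
}
```

Calling it on a document that references both `[^a]` and `[^b]` but only defines `[^a]` reports `b` as the label that would be shown literally.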
Details
If a footnote definition is followed by text indented four spaces or one tab, that text will be treated as part of the footnote instead of being an indented code block. To preserve the current behavior and write markdown that is interpreted the same way under both editions, separate the code block from the footnote using an un-indented HTML comment:
[^1]: footnote definition text
<!-- -->
    // indented code block
    fn main() {
        println!("hello world!");
    }
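To find places where the HTML comment separator might be needed, a heuristic along these lines could flag indented lines that follow a footnote definition. This is only a sketch under stated assumptions: the helper names `is_footnote_definition` and `divergent_indented_lines` are hypothetical, blank lines between the definition and the indented text are skipped, and the actual warning in rustdoc would presumably be driven by the parser rather than a line scan.

```rust
/// Returns true if `line` looks like the start of a footnote definition,
/// such as `[^1]: text`, with at most three leading spaces.
fn is_footnote_definition(line: &str) -> bool {
    let indent = line.len() - line.trim_start_matches(' ').len();
    if indent > 3 {
        return false;
    }
    let Some(after) = line[indent..].strip_prefix("[^") else {
        return false;
    };
    let Some(end) = after.find(']') else {
        return false;
    };
    let label = &after[..end];
    // An escaped colon (`[^1]\:`) does not start a definition.
    !label.is_empty()
        && !label.contains(char::is_whitespace)
        && after[end + 1..].starts_with(':')
}

/// Returns the 1-based numbers of lines indented by four spaces or a tab
/// whose nearest preceding non-blank line is a footnote definition.
fn divergent_indented_lines(markdown: &str) -> Vec<usize> {
    let mut flagged = Vec::new();
    let mut after_definition = false;

    for (index, line) in markdown.lines().enumerate() {
        if line.trim().is_empty() {
            continue; // blank lines do not end the pending definition
        }
        if is_footnote_definition(line) {
            after_definition = true;
            continue;
        }
        let indented = line.starts_with("    ") || line.starts_with('\t');
        if after_definition && indented {
            flagged.push(index + 1);
        }
        // Any other non-blank line (including `<!-- -->`) ends the pending definition.
        after_definition = false;
    }

    flagged
}
```

With this sketch, a definition followed by a blank line and an indented line is flagged, while the same text with an un-indented `<!-- -->` line in between is not.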
Footnote definitions no longer need to be separated by blank lines. To preserve the current behavior, if you intentionally want to write a footnote reference followed by a colon at the start of a line, use a backslash escape:
[^1]: footnote definition text
[^1]\: this is a reference, rather than a definition
If a footnote definition is immediately followed by a list, block quote, or table, it needs to be indented by four spaces to be considered part of the footnote. To preserve the current behavior and write markdown that is interpreted the same way in either edition, you'll need to fall back to HTML syntax, since there's no easy way to write markdown that the new syntax will accept as part of the footnote without the old syntax considering it a code block.
When migrating to the new Edition, a table within a footnote can be written like this (under the old Edition, the table is treated as source code):
[^1]:
    | column1 | column2 |
    |---------|---------|
    | row1a   | row1b   |
    | row2a   | row2b   |
Reference-level explanation
The detailed syntax for GitHub-compatible footnotes in pulldown-cmark is documented here, in specs/gfm_footnotes.txt, and will be copied here when that pull request is merged and the design declared final.
Drawbacks
This is a compatibility break in the way rustdoc parses markdown. These changes are particularly painful, because markdown accepts all text as valid. Rustdoc will warn on a few egregiously bad cases, but it still produces docs (they're warnings, not errors, unless someone uses #![deny(rustdoc::all)]), which means people can deploy broken docs without realizing it.
While this particular change is expected to actually affect very few people (how many even know about footnotes?), it's still a change in behavior guarded only with warnings.
Rationale and alternatives
The biggest problem with doing nothing is that we continue to live with rustdoc having a footnote syntax that looks like, but is subtly different from, the popular footnote syntax used on GitHub and GitLab, documented on the authoritative-looking Markdown Guide, and implemented in tools like VSCode and Pandoc. The best syntax is the one that everybody else uses, except when there's a compelling reason to be different, which there really isn't here. The only reason rustdoc isn't compatible with everybody else's markdown footnotes is that it was a (relatively) early adopter, and the rollout was rushed.
Some other possibilities include:
- Just pushing out the change without an Edition. When upgrading to a new version of pulldown-cmark that changes its parser to match the current version of the CommonMark spec, there's no need to wait for an Edition, because any changes are extremely minor and rustdoc is documented to implement CommonMark.
- Add #[doc(gfm)] and #[doc(not_gfm)] attributes to explicitly request the markdown syntax variant, and change the default over the Edition. This means having to mention different markdown flavors in the rustdoc book's list of doc attributes, when most rustdoc users should never need to know about any of this. It's also not the way syntax changes for Rust itself are done (flexibility for flexibility's sake is bad design).
- Instead of using an Edition, rustdoc could do what it did when switching to CommonMark: go through a warning period, then eventually remove the old syntax entirely. The second footnote syntax probably isn't as big of a burden as Hoedown was, so breaking compatibility like that may not be justified.
Prior art
- https://www.markdownguide.org/extended-syntax/#footnotes
- https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#footnotes
- https://michelf.ca/projects/php-markdown/extra/#footnotes
- https://pandoc.org/MANUAL.html#footnotes
- https://babelmark.github.io/?normalize=1&text=named[^id]+anonymous^[note] ++[^id]%3A+footnote
Unresolved questions
- Is this worth doing at all? I expect most people won't even notice it, since footnotes aren't that popular, and the change isn't that big (the most important result is that footnote definitions don't need to be separated by blank lines any more, which will probably fix more docs than it breaks).
- The deciding factor in whether to bother with an Edition and the "parser divergence warning", or to just make the change everywhere with no attempt at mitigation, is whether very many docs will be affected at all. This should be verified using crater, probably with a Draft PR containing the code to implement the warning, but patched to return a hard error instead.
- Will the old syntax ever be removed? Third-party tooling that consumes rustdoc JSON files will need to parse the markdown inside, and probably won't be able to parse the syntax unless it uses pulldown-cmark specifically, so keeping the old syntax around is a liability for more than just rustdoc devs. Again, a crater run will probably be the deciding factor.
Future possibilities
What happens when we want to support new Markdown extensions?
This has been a problem before. Technically, docs could have been broken by the addition of new features such as intra-doc links, and could be broken if things like $-math are added. Tactics to mitigate this include deploying the new syntax over an Edition (like how async became a keyword in 2018), but it might instead be a good idea to allow markdown extensions to be toggled with an attribute:
#![doc(enable_math)]
This is probably a bad idea for the GFM footnote syntax, since people who just want to write doc comments probably shouldn't be expected to deal with weird legacy syntax nonsense, but it seems reasonable to allow people to express whether they need math syntax or not.