"So you want to hack on the Rust compiler?", a plan for a book


#1

In my previous post, I talked about the idea of drafting a “Rust Compiler Book”. I wanted to talk about how we might go about this. I think there are a few interesting questions to discuss:

  1. Where to host the book?
  2. How to describe and delegate the work?

Where to host the book

Currently, we have documentation scattered about in various places, chiefly README files inside the source tree and the forge.

There are some pros/cons to each setup. I think that the README files and things are quite scattered and I have the feeling that many people do not find them. On the other hand, the forge appears to be even less well known.

I think what I want is, to some extent, neither of those things. I think I would prefer a “Rust Contributor’s Book” that is structured using rustbook – like the Rust book itself – so that there is a clear table of contents down the left-hand side. I anticipate that people will want to skim around and jump to sections of interest.

Same repo or a different repo?

One of the motivations for having README.md files was so that edits to those files could be done “atomically” in the same PR that changes the code. I also thought people would discover the comments more naturally.

I think that the discoverability of a “book form” is much higher than the READMEs. We can always have links in any given module that direct you to the appropriate chapter in the book:

//! Trait resolution. See the [corresponding chapter from the Rust Compiler Book][c].
//!
//! [c]: ...

But the question still arises whether to have the book in rust-lang/rust or in some other repo. The atomicity of changes is a plus, but I’m not sure how important that is. If we put things in a separate repo, then we don’t have to block on bors, which is also a plus. Ultimately, I am inclined towards a separate repo.

How to describe and delegate the work

I envision the process being roughly like this. We’ll lay out a rough table of contents. People can sign up to write chapters or sections. They can do this in one of three ways:

  • If you know how the code works, then you can just write it.
  • Of course you are welcome to try and read into the code as well.
  • Otherwise, you can schedule some time with me or other mentors, and we can discuss together how the code works, and then you can write it.

Rustdoc. This is independent, but I think that as part of this work we should also make an effort to write rustdoc comments, and we should figure out what it will take to host the compiler comments somewhere public. Writing comments is a good way to help learn the codebase, as well. =)

A rough table of contents

I am thinking something like this.

  • How to build the compiler and run what you built
    • Describes config.toml settings you want
    • Describes common x.py commands
      • e.g., ./x.py build --incremental src/libstd
    • Effective use of RUST_LOG
    • Handy options, like -Ztreat-err-as-bug or -Zdump-mir
  • Using the compiler testing framework
    • How to run tests
    • How to debug tests that are failing
      • e.g., run the test file by hand
  • The compiler source
    • The crate structure
    • The phases of the IR
      • AST vs HIR vs MIR
    • Representing types
      • the ty module, which needs a name btw
    • Queries and the tcx arena structure
    • The various forms of ids (NodeId, DefId, HirId, etc) and how they are used
      • e.g., the way we make hashtables alongside the main IR
  • The parser
  • Macro expansion
  • Name resolution
  • HIR lowering
  • Representing types (ty module in depth)
    • how interning works
  • Type inference
    • snapshots
    • type folding
    • resolution
  • Trait resolution
  • Type checking
    • FnCtxt
  • MIR construction
  • MIR borrowck
  • MIR optimizations
  • Trans

Thoughts?


#2

Generally speaking, i’m a fan of more docs. :D

In terms of precedent, there’s currently the “Unstable Book” that exists in-tree, and is published to doc.rust-lang.org. It’s not in a separate repo for very similar reasons: The things it’s talking about can move faster than it would take to get a PR into the Reference, merge that, then get the submodule updated on the main repo. (This is still a problem for things that break links in libstd API docs, that the reference/book/nomicon links to, since those links are checked.)

It’s also worth noting that rustdoc docs for the compiler libraries are generated and hosted. CONTRIBUTING.md links to a set by Manishearth, but that is manually updated and can get out of date. There’s another set available that’s updated automatically each day, but its sidebar crate listing is missing all the “unstable” crates, so you need to manually enter a crate name in the URL bar or search for it in the doc search. (There’s also an open PR to test the compiler docs in CI, which should help this out a lot, by letting the compiler-docs option in config.toml Just Work, if i understand it correctly.)


#3

I would like to volunteer to write some sections or chapters. :raising_hand_man:


#4

I would like to volunteer to be mentored and help write some docs


#5

I can share my experience on navigating the compiler code, modules, finding relevant lines, grepping, etc.


#6

I would like to help writing docs too, but my knowledge is pretty limited.


#7

Great idea! Would love to contribute, too.


#8

I love this idea.

I would like to recommend a hybrid approach: prototype the book outside of the repo, and then, once it’s mostly there, let’s move it into the repo.

I think the atomicity aspect is important, and since it’s describing the compiler, it should live with the compiler, like any other docs for any other project. But being in the main queue is kind of a bummer, and hurts more during the start of the project.

My two cents, anyway :slight_smile:


#9

Love the idea and I’m also interested in participating.

Here are my thoughts on the repo issue:

Maybe we are discussing two different things here. One observation was that the compiler could be better documented, but also that there is a need for a more comprehensive introduction to the rust compiler. Perhaps these things should be separated.

The compiler-book could provide a higher-level overview, similar to how the Rust book explains key concepts of the language. The contents of the book should also be stable over time; smaller changes in the rustc source code shouldn’t require changes to the book. In contrast, more technical details and solutions (which are more subject to change) should be documented close to the actual source code.

Maybe, I can put it this way: If I want to learn about the rust compiler I shouldn’t be required to look into the source code and search for README files. However, when I’m trying to understand some actual code, helpful documentation should be nearby.

I’m happy to hear your thoughts about this :slight_smile:


#10

100% agree… It took me a long time to find the README files. Is the forge even updated? I got the impression it was outdated.


#11

This was basically my intention, sorry if I did not make this clear. The idea is that the content should be high-level enough that it mostly stays up to date except when there are major revisions to how the compiler works.

But this does remind me: maintenance is an issue! I don’t know that we have to discuss it here, but I think we have to figure out some way to review the book and ensure that the content stays “in sync” when changes are going into the system.

If nothing else, just knowing that it exists and having reviewers – when a major refactoring is taking place – cross-reference is good, but it’d be nice if there was a tool that helped us do better.


#12

I should have mentioned that. However, I was only aware of @Manishearth’s docs, and I believe that they would prefer to get out of the business of hosting. The general tracking issue for this is https://github.com/rust-lang/rust/issues/29893, and in particular @alexcrichton posted some specific instructions on what it would take to get “always complete” docs that are updated after every PR.


#13

Hmm, so the idea of being able to check cross-references is really good. I like that. I had forgotten about this issue.

In particular, I would like it if we were able to create various kinds of links between the book and the compiler source:

  • From the book:
    • Reference particular files as well as tests
    • Reference method and type names (e.g., ty::Ty<'tcx>)
    • Maybe even quote sections of the code
  • From the source:
    • Identify a chapter in the book
    • Reference test files

These links would be checked, so that if a method or type is renamed, but the book is not updated, the build fails.
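To make the idea concrete, here is a minimal sketch of what such a check could look like for one direction only (book to source). It is not an existing tool. It assumes the book marks code references with inline backticks, and it assumes a hand-written set of known paths stands in for whatever rustdoc or the compiler would actually export. The names `extract_code_refs`, `unresolved_refs`, and `ty::OldName` are hypothetical.

```rust
use std::collections::HashSet;

/// Collect every inline code span (text between a pair of backticks) in
/// `book`, with generic arguments stripped, so `ty::Ty<'tcx>` becomes `ty::Ty`.
fn extract_code_refs(book: &str) -> Vec<String> {
    let pieces: Vec<&str> = book.split('`').collect();
    // Balanced backticks give an odd number of pieces. If the count is even,
    // the last piece belongs to an unterminated span, so ignore it.
    let closed = if pieces.len() % 2 == 1 {
        pieces.len()
    } else {
        pieces.len() - 1
    };
    pieces[..closed]
        .iter()
        .skip(1) // code spans sit at the odd indices
        .step_by(2)
        .map(|span| span.split('<').next().unwrap_or("").trim().to_string())
        .filter(|span| !span.is_empty())
        .collect()
}

/// Return the references in `book` that do not appear in `known_paths`.
fn unresolved_refs(book: &str, known_paths: &HashSet<&str>) -> Vec<String> {
    extract_code_refs(book)
        .into_iter()
        .filter(|r| !known_paths.contains(r.as_str()))
        .collect()
}

fn main() {
    // Assumption: a hard-coded stand-in for the paths the compiler defines.
    let known_paths: HashSet<&str> = ["ty::Ty", "FnCtxt"].iter().copied().collect();

    // `ty::OldName` is a made-up path that plays the role of a renamed type.
    let chapter = "Types are represented by `ty::Ty<'tcx>`; see also `ty::OldName`.";

    let missing = unresolved_refs(chapter, &known_paths);
    for path in &missing {
        println!("unresolved reference in book: {}", path);
    }
    // A real build step would exit with a non-zero status here when
    // `missing` is non-empty, which is what would make the build fail.
}
```

The same approach could probably run in the other direction, by scanning `//!` comments for chapter links and checking them against the book's table of contents.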

Do we have any capabilities like that already? What exactly are we checking now?


#14

Great idea.

A related question: since our documentation, including the books, is mostly based on Markdown, I wonder if there’s a Markdown-based i18n framework? I feel bad that rust-www and the books have many independent copies in various languages, which people have to bother checking and keeping in sync from time to time. Is it possible to integrate all the translations somehow?


#15

Have you given any thought to how you might like to handle diagramming in such a book?

Options include:

  • Graphviz
  • Mermaid
  • TikZ
  • Inline SVG (ugh…)
  • Bitmaps
  • Ascii-Art (coupled with pre-selected Ascii-Art → SVG converter)
    • Options for converters include svgbobrus, a2s, markdeep, my own mon-artist

My suspicion is that it might be a mistake to force contributors to use any one option in all instances. But at the same time, suggesting one option to start with may help achieve a uniform presentation.


#16

It’s being worked on.


#17

I had not thought about it. =) The obvious answer of course is mon-artist :stuck_out_tongue:


#18

I think prototyping it out of tree and then merging it in once it’s in a good state is a really good idea, much like how incremental builds were done. This would let us get the meat of it done first; once it’s in tree, it can be updated atomically as parts of the compiler change.

It’d be good to have the rustdoc comments as well, but as you know @nikomatsakis from our offline conversations, getting it to work with rust-forge was a pain in the butt, mostly because getting Travis to be able to build the compiler was a pain. I might experiment with setting up something to just build the compiler daily and upload it to my site as a proof of concept of not having out of date docs for the compiler.

I wonder if there would be a way to unite the readmes? @QuietMisdreavus’s work on external doc imports might help here with consolidating readmes into one place if we want to use rustdoc for the repo itself rather than the book you’re describing here.


#19

The emscripten port needs a port to wasm-unknown-unknown and off you go :). https://github.com/skade/em-artiste


#20

I considered this plan for a while but I think it will be far less “discoverable”.