You’re seriously trying to make contributing to a compiler easy? Such blasphemy, that’s supposed to be hard. The world we live in
Three things from me:
I’d like to at least try writing a bit of it. I have some vague notion about what might be happening inside the compiler (not nearly enough to write documentation), but I’m OK reading code. I’d just like to know which part of the code is relevant to be read.
I guess the ToC is just a first draft, but I really miss one chapter: something like the usual workflow. If I clone a crate, every time I make a little change, I run the compilation and see what errors it spits out at me. But compiling the whole of rustc takes ages, so this obviously doesn’t work. I’d like to know how people who hack on the compiler go about issues like this and what works for them.
I myself prefer documentation as close to the code as possible (because of the atomicity of PRs, but also to have the correct documentation when I check out some historic version of the code, for archeological reasons). I don’t think the bors times are a huge problem here, because new documentation won’t break a build, so you don’t have to watch it and iterate if it fails. You can just start on another chapter or branch from what you’ve submitted and rebase onto master later on.
But I have an idea about bors I’ll write down and start a new topic.
Just a wild idea: would it also make sense (in addition to just documenting all the areas) to have a tutorialish section, something like a walk-through of a small bugfix or feature, listing the steps taken by an imagined new contributor (maybe even with a few of the usual catches and dead ends)?
I don’t know. Tutorials usually aren’t for covering the whole thing. They are there to make people comfortable, to help them with the first steps. So maybe even something like fixing some documentation or adding a doc test to the standard library.
I actually made a tool for this and hooked it up to a cron job so my laptop would rebuild the compiler and docs while I’m asleep, then it pushes the generated docs up to GitHub Pages regardless of whether generation failed. (rendered/repo)
Oops, looks like something’s changed since I last touched my rustc-internal-docs program. @QuietMisdreavus, do you have any idea what I’d need to do to make the “unstable” crates appear in the sidebar?
Oh yeah, and I’d also be pretty keen to help out. I’m fairly competent at Rust and have wanted to hack on the compiler for ages but been too intimidated to really try anything. So that probably makes me the perfect guinea pig for this kind of thing
Oh, the Dragon Book: 30% parsing in full theoretical detail (not useful in rustc because we do simple manual recursive descent parsing) and 65% optimizations (not useful in rustc because it’s a frontend and optimizations happen in LLVM, though maybe a bit more useful now that we have MIR passes).
“Second compiler books” like Muchnick are mostly about optimizations too, so I’d say “traditional type system textbooks” like Pierce “Types and Programming Languages” would be more useful for understanding our favorite frontend.
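For readers who haven’t seen the term, here is a minimal sketch of what “manual recursive descent parsing” looks like in general. This is not rustc’s parser: the toy grammar, the `Parser` type, and the `parse` function are all made up for illustration, and the sketch evaluates the expression directly instead of building an AST. The idea is simply one hand-written function per grammar rule, each calling the others.

```rust
// Toy grammar (hypothetical, unrelated to Rust's own grammar):
//   expr   := term (('+' | '-') term)*
//   term   := factor ('*' factor)*
//   factor := NUMBER | '(' expr ')'

struct Parser<'a> {
    chars: std::iter::Peekable<std::str::Chars<'a>>,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser { chars: src.chars().peekable() }
    }

    fn skip_ws(&mut self) {
        while matches!(self.chars.peek(), Some(c) if c.is_whitespace()) {
            self.chars.next();
        }
    }

    // Consume `expected` if it is the next non-whitespace character.
    fn eat(&mut self, expected: char) -> bool {
        self.skip_ws();
        if self.chars.peek() == Some(&expected) {
            self.chars.next();
            true
        } else {
            false
        }
    }

    // expr := term (('+' | '-') term)*
    fn expr(&mut self) -> Result<i64, String> {
        let mut value = self.term()?;
        loop {
            if self.eat('+') {
                value += self.term()?;
            } else if self.eat('-') {
                value -= self.term()?;
            } else {
                return Ok(value);
            }
        }
    }

    // term := factor ('*' factor)*
    fn term(&mut self) -> Result<i64, String> {
        let mut value = self.factor()?;
        while self.eat('*') {
            value *= self.factor()?;
        }
        Ok(value)
    }

    // factor := NUMBER | '(' expr ')'
    fn factor(&mut self) -> Result<i64, String> {
        if self.eat('(') {
            let value = self.expr()?;
            if !self.eat(')') {
                return Err("expected ')'".to_string());
            }
            return Ok(value);
        }
        // `eat` has already skipped leading whitespace; collect digits.
        let mut digits = String::new();
        while let Some(&c) = self.chars.peek() {
            if c.is_ascii_digit() {
                digits.push(c);
                self.chars.next();
            } else {
                break;
            }
        }
        digits.parse::<i64>().map_err(|_| "expected a number".to_string())
    }
}

// Parse and evaluate a whole input string, rejecting trailing garbage.
fn parse(src: &str) -> Result<i64, String> {
    let mut p = Parser::new(src);
    let value = p.expr()?;
    p.skip_ws();
    match p.chars.peek() {
        None => Ok(value),
        Some(c) => Err(format!("unexpected character '{}'", c)),
    }
}

fn main() {
    assert_eq!(parse("1 + 2 * 3"), Ok(7));
    assert_eq!(parse("(1 + 2) * 3"), Ok(9));
    assert!(parse("1 +").is_err());
}
```

Operator precedence falls out of which function calls which, so none of the table-driven machinery the parsing chapters spend their time on is needed.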
I have a quite extensive example of a language enhancement that could be used, along with a brief discussion (to be included) of why extending the language in that way is unneeded and unwise.
My initial use of Rust was for writing a crypto package that would otherwise be written in C. The cipher algorithm at the heart of the package is an extremely efficient military-grade autokeyed stream cipher which uses wrapped add/sub/mul as well as rotate. I thought that there should be lexemes for those operators and their assign variants, so I went through the nightly compiler source finding all the files that would need to be changed to add those operators. I worked out the required changes, preceding each such new or changed line with an appropriate meta-information statement (e.g., #[unstable(feature = "operator_tokens_for_wrapped_add_sub_mul", issue = "0")]) to control inclusion/exclusion. There were a lot of files, because the changes affected the lexeme parser and AST and HIR and MIR.
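For context, the operations in question are already available today as methods on the integer types, without any new lexemes. The sketch below only shows those existing standard-library methods (`wrapping_add`, `wrapping_sub`, `wrapping_mul`, `rotate_left`, `rotate_right`, and the `std::num::Wrapping` newtype). The `mix` function and its constant are made up for illustration and are not the cipher described above.

```rust
use std::num::Wrapping;

// Hypothetical mixing step, invented purely to show the method calls.
fn mix(state: u32, key: u32) -> u32 {
    let a = state.wrapping_add(key);
    let b = a.wrapping_mul(0x9E37_79B1); // arbitrary odd constant
    let c = b.wrapping_sub(key.rotate_left(7));
    c.rotate_right(13)
}

fn main() {
    // Wrapping arithmetic discards overflow instead of panicking.
    assert_eq!(u32::MAX.wrapping_add(1), 0);
    assert_eq!(0u32.wrapping_sub(1), u32::MAX);
    assert_eq!(0x8000_0000u32.wrapping_mul(2), 0);

    // Rotation moves the bits shifted out of one end back in at the other.
    assert_eq!(0x8000_0001u32.rotate_left(1), 0x0000_0003);

    // `Wrapping<T>` gives the ordinary operators wrapping semantics.
    let w = Wrapping(u32::MAX) + Wrapping(1);
    assert_eq!(w.0, 0);

    println!("{:#010x}", mix(1, 2));
}
```

The proposal would have given these dedicated operator tokens; the method and `Wrapping` forms are the existing alternative.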
I concluded that two new lints were needed, but did not develop the expertise to work them out because they required enhancing the compiler to maintain a distinction between unsigned and (probably-)signed expressions.
Even without the lints, this would be an example that goes from source (new lexemes) all the way to LLVM (for LLVM’s rotate intrinsics).
I started to document these proposed enhancements as a pre-RFC, then realized that I should determine how useful the enhancements would be to the general Rust population. I grepped the compiler, as well as the various crypto crates. My conclusion was that only 5% of the uses of add/sub needed wrapping arithmetic, which was too infrequent to justify complicating the language. (There was almost no use of wrapped mul.)
I saved this work; it would be fairly easy to migrate it to the latest stable release and use it as an extensive serial example, first of lexeme parsing, then of AST structure, then of the HIR, followed by the MIR, and ending with calling LLVM’s intrinsics.
I didn’t get especially far besides copying the table of contents in and creating dummy files. I plan to open up some issues with notes and places for people to sign up on what they plan to work on, as discussed. I’m also open to rearranging the TOC, this is just a strawman proposal obviously.
One thing I would like to do is copy/move the existing README content out of rustc and into the appropriate places. If anybody feels like doing a first PR, that’d be a good choice.
The other thing is that I don’t know how to set up the repo so that it auto-renders the book at some convenient place. I’d love it if somebody is up for that.
Feel free to open up some issues for other things!
Adding to that, I think there is generally a lot of great content already available for many topics. As an example for MIR there is this blogpost and this excellent RFC explaining a lot of the motivation and concepts.
However, they are not just copy-paste-able and would require some editing work. On the other hand, we could just reference them as “further reading” or “see also”.
Another thing we possibly have to consider is copyright. I guess most authors would be fine with us reusing their work in some way, but we can’t be certain. Perhaps we can ask authors whether they permit us to reuse their work, and credit them if a chapter is based on their post.
@nikomatsakis great idea and initiative. I definitely would have benefitted (and still will benefit) from a book like this, and look forward to reading and contributing to it.
One thing that just struck my mind: this book is going to keep evolving as compiler internals keep changing. Wouldn’t it be better to have more of a “wiki” format for this kind of thing? I mention it because I was about to add something to one of the pages, and then I realized I’d have to post a PR, and in the meantime someone else could end up working on the same thing and there could be conflicts… it’d be nicer if the edits just showed up immediately. I know GitHub has a wiki feature that could maybe be an option.
Either way, this is a great idea. This is just food for thought about making it as easy as possible for everyone to contribute. Of course, the “curated” approach has its advantages too.
My concern is that it might end up like the OSDev Wiki, which is chock-full of useful knowledge, but isn’t particularly well-organized and has no apparent structure. It also has huge amounts of obsolete knowledge. I like the idea of a book better because it could be more structured, easier to find things in, and easier to keep up-to-date.