I’ll be back with some detail about RLS planning. Re Rustdoc, here are some thoughts on Rustdoc and the RLS working together.
Proposed architecture
I propose that Rustdoc depends on the RLS. This disconnects it from the compiler and provides an easily understandable interface, while sharing a large amount of code with other important projects (primarily IDE support).
RLS background
The RLS is a bunch of libraries which form a layer around rustc and Cargo and provide an interface for tools to get information about Rust programs. The RLS is not a single thing - there are different crates which provide different levels of abstraction and different modes of operation.
- librustc_save_analysis - part of the compiler, turns the compiler’s internal data about a crate into a more easily understood (and slightly abstracted) format. Output can be accessed as an API (to query a single AST node), or as a dump of all info about a crate. This dump can be passed as in-memory data, JSON, or CSV (and can be extended to other formats).
- rls-analysis - takes save-analysis as input (either directly in memory or via the JSON dumps) and presents the data as a Rust API. This involves some post-processing and cross-referencing of data, then storing it in a set of hashtables.
- rls-vfs - a virtual file system for the RLS, not relevant to rustdoc
- rls - a client of rls-analysis, it manages builds using Cargo and rustc, uses rls-analysis to process the results of the builds, and presents this all using the LSP.
- rls-span, rls-data - helper libs, not relevant
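To make the rls-analysis step a bit more concrete, here is a minimal sketch of the kind of cross-referencing it does: taking flat records from a dump and indexing them in hashtables. The names RawDef, Index and build_index are made up for this illustration and are not the actual rls-analysis internals.

```rust
use std::collections::HashMap;

/// Hypothetical flat record, standing in for one definition in a save-analysis dump.
#[derive(Debug, Clone)]
pub struct RawDef {
    pub id: u32,
    pub name: String,
    pub parent: Option<u32>,
}

/// Hypothetical cross-referenced index built from the flat records.
#[derive(Debug, Default)]
pub struct Index {
    /// Definition id -> definition.
    pub defs: HashMap<u32, RawDef>,
    /// Parent id -> ids of its direct children, in dump order.
    pub children: HashMap<u32, Vec<u32>>,
    /// Name -> ids of all definitions with that name.
    pub by_name: HashMap<String, Vec<u32>>,
}

/// Makes one pass over the records and fills all three tables.
pub fn build_index(raw: Vec<RawDef>) -> Index {
    let mut index = Index::default();
    for def in raw {
        if let Some(parent) = def.parent {
            index.children.entry(parent).or_default().push(def.id);
        }
        index.by_name.entry(def.name.clone()).or_default().push(def.id);
        index.defs.insert(def.id, def);
    }
    index
}
```

Once the tables exist, questions like "what are the children of this module" or "which defs are called Foo" become single lookups, which is the property a client like rustw or Rustdoc relies on.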
Example clients:
- rls-vscode - our reference IDE implementation uses the rls lib, communicates over LSP
- other IDE plugins - similar to above, use rls over LSP
- rustw - a web-based code exploration tool, uses rls-analysis from its (Rust) backend, does not use the rls crate and shells out to Cargo directly (will also work with other build systems, which rls does not). I would expect rustdoc to follow this model.
Overview
(This is a very early sketch of how things could look, don’t take it as a concrete proposal, only to illustrate how the RLS and Rustdoc could work together)
I imagine that we would have a backend written in Rust which would act as a web server. I.e., Rustdoc pages would not be statically generated as they are now, but would be generated on demand. The backend would operate on a save-analysis dump, i.e., it does not itself include a build step. (Note that for the distro, this data can be installed by rustup. We should provide a helper program (probably a Cargo plugin) that builds a project and starts the Rustdoc backend with the data.) The backend uses rls-analysis to read and process the dump and does not process the raw data itself. It uses rls-analysis’s API to get information and turns it into something easily digestible by the frontend, provided as a RESTful (ish) http API.
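As a rough illustration of what the RESTful (ish) API could look like, here is a sketch of the routing layer only. The three paths (/roots, /def/<id>, /search/<text>) and the names Route and parse_route are assumptions invented for this example; nothing like this exists yet.

```rust
/// Hypothetical routes for the doc backend; the paths are invented for illustration.
#[derive(Debug, PartialEq)]
pub enum Route {
    /// GET /roots: list the top-level module of each crate.
    Roots,
    /// GET /def/<id>: everything needed to render one 'page'.
    Def(u32),
    /// GET /search/<text>: look up definitions by name.
    Search(String),
}

/// Maps a request path to a route, or None if the path is not recognised.
pub fn parse_route(path: &str) -> Option<Route> {
    let mut parts = path.trim_matches('/').split('/');
    let route = match (parts.next()?, parts.next()) {
        ("roots", None) => Route::Roots,
        ("def", Some(id)) => Route::Def(id.parse().ok()?),
        ("search", Some(text)) if !text.is_empty() => Route::Search(text.to_string()),
        _ => return None,
    };
    // Reject trailing segments such as /def/1/extra.
    if parts.next().is_some() {
        return None;
    }
    Some(route)
}
```

Each route would then be answered by a query against rls-analysis and serialised for the frontend, so one ajax request corresponds to one ‘page’.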
The frontend should be a ‘single page web app’ using standard JS web tech. Personally, I would use React, but we could use Ember, Angular, whatever. It would send ajax requests to the backend to make the docs - one request per ‘page’.
rls-analysis API
We’d need some new APIs, but a lot already exists. The key data structure is a Def, which represents any definition:
pub struct Def {
    pub kind: DefKind,
    pub span: Span,
    pub name: String,
    pub qualname: String,
    pub api_crate: bool,
    pub parent: Option<u32>,
    pub value: String,
    pub docs: String,
    pub sig: Option<Signature>,
}
Note that we already include data about docs, parent (which gives info for ‘up links’), and the signature (more on this below). We have a function for_each_child_def, which gives access to all child defs (e.g., fields in a struct, items in a module, etc.), and find_impls, which gives all the impls for a type (although this is not well-tested and will probably need some bug-fixing).
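For example, the parent field is enough to build the ‘up links’ for a page. The sketch below uses a cut-down stand-in for Def and assumes defs are stored in a hashtable keyed by a u32 id; the up_links function is hypothetical and not part of rls-analysis.

```rust
use std::collections::HashMap;

/// Cut-down stand-in for the Def struct above; only the fields used here.
pub struct Def {
    pub name: String,
    pub parent: Option<u32>,
}

/// Walks the parent chain from `id` to its root and returns the names
/// outermost first, e.g. for rendering 'up links'.
pub fn up_links(defs: &HashMap<u32, Def>, id: u32) -> Vec<String> {
    let mut names = Vec::new();
    let mut current = Some(id);
    while let Some(def) = current.and_then(|i| defs.get(&i)) {
        names.push(def.name.clone());
        current = def.parent;
        // Guard against a malformed dump with a parent cycle.
        if names.len() > defs.len() {
            break;
        }
    }
    names.reverse();
    names
}
```

An unknown id simply yields an empty list, which keeps the backend from failing on stale links.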
We would need new APIs for finding the ‘roots’ (i.e., the top-level modules of each crate; easy) and for text search (we can already search by identifier, but we can’t do the kind of fuzzy search that rustdoc currently does; we want this for IDEs too). We may also need to add more data for the details of impls.
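To give a feel for the text search piece, here is a deliberately simple sketch using case-insensitive subsequence matching. This is a technique I am swapping in for illustration; it is not the algorithm rustdoc currently uses, and fuzzy_match and search are hypothetical names.

```rust
/// Returns true if every character of `query` appears in `name` in order,
/// ignoring ASCII case. A deliberately simple stand-in for fuzzy search.
pub fn fuzzy_match(query: &str, name: &str) -> bool {
    let mut name_chars = name.chars().map(|c| c.to_ascii_lowercase());
    query
        .chars()
        .map(|c| c.to_ascii_lowercase())
        // `any` advances the shared iterator, so matches must occur in order.
        .all(|q| name_chars.any(|n| n == q))
}

/// Filters a list of names (hypothetically taken from Def::name) by `query`,
/// preserving the input order.
pub fn search<'a>(query: &str, names: &'a [&'a str]) -> Vec<&'a str> {
    names.iter().copied().filter(|n| fuzzy_match(query, n)).collect()
}
```

A real implementation would want ranking and an index rather than a linear scan, but the API shape (query in, list of defs out) would be similar for Rustdoc and IDEs.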
Signatures need re-doing. The current version doesn’t really work. The concept is that they would contain enough data to render any item. There is a little design work plus implementation (which touches the compiler) to do here. Perhaps it should be integrated with DefKind. Straw-man sketch:
enum DefKind {
    Fn(FnSignature),
    ...
}
struct FnSignature {
    generics: Vec<(String, Id)>,
    args: Vec<Arg>,
    ret: Option<(String, Id)>,
    vis: Visibility,
}
struct Arg {
    var_name: String,
    var_id: Id,
    ty_name: String,
    ty_id: Id,
}
enum Visibility {
    Pub,
    PubRestricted(String, DefId),
    None,
}
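To check that a signature like this really carries enough data to render an item, here is a small sketch that turns one into text. It restates the straw-man types so that it compiles on its own, and makes a few assumptions: Id is a plain integer, PubRestricted uses Id rather than DefId, and its String holds the restriction path. The render_fn function is hypothetical.

```rust
/// Assumption: Id is a plain integer; the straw-man does not define it.
pub type Id = u32;

pub struct Arg {
    pub var_name: String,
    pub var_id: Id,
    pub ty_name: String,
    pub ty_id: Id,
}

pub enum Visibility {
    Pub,
    /// Assumption: the String is the restriction path, e.g. "crate".
    PubRestricted(String, Id),
    None,
}

pub struct FnSignature {
    pub generics: Vec<(String, Id)>,
    pub args: Vec<Arg>,
    pub ret: Option<(String, Id)>,
    pub vis: Visibility,
}

/// Renders a signature as plain text. A real frontend would use the ids
/// to turn each type name into a link; here they are ignored.
pub fn render_fn(name: &str, sig: &FnSignature) -> String {
    let mut out = match &sig.vis {
        Visibility::Pub => "pub ".to_string(),
        Visibility::PubRestricted(path, _) => format!("pub({}) ", path),
        Visibility::None => String::new(),
    };
    out.push_str("fn ");
    out.push_str(name);
    if !sig.generics.is_empty() {
        let names: Vec<&str> = sig.generics.iter().map(|(n, _)| n.as_str()).collect();
        out.push_str(&format!("<{}>", names.join(", ")));
    }
    let args: Vec<String> = sig
        .args
        .iter()
        .map(|a| format!("{}: {}", a.var_name, a.ty_name))
        .collect();
    out.push_str(&format!("({})", args.join(", ")));
    if let Some((ty, _)) = &sig.ret {
        out.push_str(&format!(" -> {}", ty));
    }
    out
}
```

Note that the function name comes from the Def rather than the signature, which is one argument for integrating signatures with DefKind rather than keeping them separate.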
The big missing piece is ‘logical children’, for example taking into account deref coercions or impls which are not straightforward (e.g., when doc’ing Ty, impl ... for &Ty or blanket impls). I’m not even sure what this should look like. But I don’t think there is anything super-hard here, and we probably want something very similar for compiler-powered autocomplete.
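Purely as a thought experiment, the deref part of ‘logical children’ might look something like the sketch below. It only follows a chain of Deref targets and says nothing about impls for &Ty or blanket impls. Both input maps and the logical_methods function are hypothetical; rls-analysis does not provide this data in this form today.

```rust
use std::collections::{HashMap, HashSet};

/// Collects methods callable on `ty`: its own, then those reachable through
/// a chain of Deref targets. Both maps are hypothetical inputs.
pub fn logical_methods(
    ty: &str,
    methods: &HashMap<String, Vec<String>>,
    deref_target: &HashMap<String, String>,
) -> Vec<String> {
    let mut seen_types = HashSet::new();
    let mut seen_methods = HashSet::new();
    let mut out = Vec::new();
    let mut current = Some(ty);
    while let Some(t) = current {
        // Stop if the deref chain loops back on itself.
        if !seen_types.insert(t) {
            break;
        }
        for m in methods.get(t).into_iter().flatten() {
            // A method on an outer type shadows one of the same name further down the chain.
            if seen_methods.insert(m.as_str()) {
                out.push(m.clone());
            }
        }
        current = deref_target.get(t).map(|s| s.as_str());
    }
    out
}
```

The same walk would serve autocomplete: the list of methods offered after a dot is roughly the list of logical children that docs should show.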
Proof of concept
rustw has a ‘summary’ view which is a bit more source-oriented than rustdoc, but is very similar in concept. It demonstrates that this approach basically works. It is probably not encapsulated enough from the rest of rustw to be a foundation for Rustdoc though.
Handlebars version: https://github.com/nrc/rustw/blob/master/templates/summary.handlebars
React version: https://github.com/nrc/rustw/blob/react/static/summary.js#L93 (WIP)
Rationale
The RLS is used by IDEs and is likely to be continually developed. While the underlying architecture might change (in particular to take advantage of incremental compilation), the RLS is unlikely to disappear, and the API provided by rls-analysis is unlikely to change dramatically. The level of abstraction feels right for rustdoc - it abstracts away a lot of the low-level detail from the compiler’s data structures, but doesn’t lose any info that we might need for Rustdoc.
Rustdoc only talks to the compiler via a data dump and some helper libraries. There is no need to be linked to the compiler directly, nor do you have to worry about versioning (too much). Most developers only need to understand the API of rls-analysis which is fairly small and straightforward. For some debugging, reading JSON is required. But it should be very rare to have to add features to the compiler. There is no unstable code, and no need to be in the same repo as the compiler, or even to be a submodule or whatever (modulo distribution issues).
Using the compiler directly is a bad idea:
- the data structures are very low-level and need a lot of plumbing. You would be mostly duplicating code in librustc_save_analysis to do this.
- the compiler’s API is not stable - breaking changes are expected and these would break rustdoc. Fixing these would require knowledge of compiler internals. The only way to not get such breakage is to keep Rustdoc in-tree.
- you have to build against the compiler which means long build times
- rustdoc would have to be part of the rustup distribution route and could only be installed this way. Using the RLS, it could be installed with Cargo, or built and used from source with any compiler.
- you have to build the project you want to document. This makes projects like rustdoc.org or users building their own std lib docs less convenient.