Design discussion: JSON output for rustdoc


#1

While I was at RustConf, I had a conversation with someone that made me think about rustdoc’s old JSON output format. Up until early 2016, rustdoc was able to output JSON instead of its usual HTML (and also take it as input instead of the lib.rs for a crate), but it was taken out because it was neglected and unused, and the tools team decided it wasn’t worth the effort of planning for its stabilization.

However, I wanted to take another look. We’ve done a lot in the two years since then, and it’s worth considering whether we can properly support something like this. To that end, I wrote some of my thoughts down and tried to think through some of the bigger design questions behind it. I wrote it as a potential RFC, but to be honest that was mainly so that I could have a template to put the hypothetical docs into, since I wanted to start specifying what the potential JSON would look like before trying to write the implementation.

Perhaps the biggest question is: what does this accomplish that save-analysis doesn’t? The reasoning I wrote down was that rustdoc and the RLS care about different things, but the only real example I can think of offhand is that rustdoc’s JSON output would filter out private items by default. To be honest, I’m not familiar enough with the save-analysis data to know whether rustdoc needs something it doesn’t currently show.

Does this seem decent? I’ve seen several requests to put something like this in, so this is basically me thinking through the design process as best I can.


#2

If we do this, I also want it to be trivial to transform the JSON into other representations such as TOML. I’m not sure what restrictions on the JSON would have to apply for all the common formats, or what the common formats even are.


#3

I can’t find info about save-analysis. I didn’t know it existed, and I don’t know if it’s useful :slight_smile:

I wanted to make my own front-end for the documentation, because I have several gripes about Rustdoc’s page layout, but having to mess with the compiler codebase and endure its compilation times is prohibitive for such a project. Being able to get all of Rustdoc’s page data, but in JSON, would be great for quickly developing a front-end as an independent project, where I could focus on HTML/CSS, and not compiler internals.

My immediate short-term use case is to know all function, struct and constant names in a crate, so that I can auto-link them to docs.rs from code blocks in the README on crates.rs.
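A rough sketch of that linking step, assuming the item list has already been pulled out of some JSON output. The `Item` record and `docs_rs_url` helper are hypothetical names for illustration; only the `fn.` / `struct.` / `constant.` page prefixes come from rustdoc's existing HTML layout.

```rust
/// Kinds of items the README linker cares about.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ItemKind {
    Function,
    Struct,
    Constant,
}

/// Hypothetical record standing in for one entry of rustdoc's JSON output.
struct Item<'a> {
    kind: ItemKind,
    /// Module path inside the crate, e.g. ["de"]; empty for the crate root.
    module: &'a [&'a str],
    name: &'a str,
}

/// Build a docs.rs URL for an item, following rustdoc's HTML page naming.
fn docs_rs_url(krate: &str, version: &str, item: &Item) -> String {
    let prefix = match item.kind {
        ItemKind::Function => "fn",
        ItemKind::Struct => "struct",
        ItemKind::Constant => "constant",
    };
    // The directory under the version uses the library name, where
    // hyphens in the package name become underscores.
    let mut url = format!(
        "https://docs.rs/{}/{}/{}/",
        krate,
        version,
        krate.replace('-', "_")
    );
    for segment in item.module {
        url.push_str(segment);
        url.push('/');
    }
    url.push_str(&format!("{}.{}.html", prefix, item.name));
    url
}
```

With a table of names to `Item`s, a README renderer could wrap each matching identifier in a code block with the URL this returns.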


#4

save-analysis is one of the data sources for the RLS. It calls the compiler in the background to get this information and uses racer to fill in when your code can’t be compiled (e.g. when you’re in the middle of typing a line). The actual content is considered unstable, but you can feed it through a crate like rls-analysis to get some of the data out.


#5

This sounds great. I’ve had an idea in the back of my head for a while about a tool to do public API diffing between versions of a codebase (and perhaps warnings for API breakages/semver violations). Every time I think too much about it I get hung up on the complexity of parsing the API and move on to other thoughts.

But having a machine-operable set of data about the public API sounds like it could enable all kinds of awesome things.
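To make that concrete, here is a minimal sketch of the diffing step, assuming each version's public API has already been flattened into a list of item paths. `ApiDiff`, `diff_api` and `is_breaking` are hypothetical names, and treating any removal as breaking is a simplification: it ignores signature changes entirely.

```rust
use std::collections::BTreeSet;

/// Paths that appear in only one of the two versions.
struct ApiDiff {
    added: Vec<String>,
    removed: Vec<String>,
}

/// Compare two lists of public item paths (e.g. "krate::module::Item").
fn diff_api(old: &[&str], new: &[&str]) -> ApiDiff {
    let old: BTreeSet<&str> = old.iter().copied().collect();
    let new: BTreeSet<&str> = new.iter().copied().collect();
    ApiDiff {
        // Present in the new version but not the old one.
        added: new.difference(&old).map(|s| s.to_string()).collect(),
        // Present in the old version but gone from the new one.
        removed: old.difference(&new).map(|s| s.to_string()).collect(),
    }
}

/// Crude semver check: any removed public path counts as a breaking change.
fn is_breaking(diff: &ApiDiff) -> bool {
    !diff.removed.is_empty()
}
```

The hard part stays where it was, in producing those path lists, which is exactly what a machine-readable rustdoc output would take care of.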

I also hadn’t heard about save-analysis, so guess it’s time to look into that, too.

(P.S. I went to your talk at RustConf and it was great!)


#6

This is fantastic, thanks. I have a lot of feels here and not much time to reply, but I’d like to mention the JSON API spec that @wycats and I work on; I think it’s a great fit for this problem.

I’ll elaborate more later.


#7

Save-analysis supports this - it is configurable.

I think save-analysis would be a great fit for this. As far as I can tell, the RLS and Rustdoc want very similar data. The big question to me is what part of the rustdoc pipeline is serialised - is it the input from the compiler (which corresponds to save-analysis) or is it an alternate view of the output? An example of the difference is whether you list impls, or whether you list traits which might apply to a type (the latter being the processed version).
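A toy illustration of the two views, under heavy assumptions: `Impl` is a hypothetical record for the raw "list of impls" form, and `traits_by_type` derives the processed "traits per type" form from it. Real processing would also have to resolve blanket and generic impls, which this sketch ignores.

```rust
use std::collections::BTreeMap;

/// Raw view: one record per `impl Trait for Type` block, as the compiler sees it.
struct Impl<'a> {
    trait_name: &'a str,
    for_type: &'a str,
}

/// Processed view: for each type, the traits implemented for it.
/// Only exact type-name matches are grouped; generics are not resolved.
fn traits_by_type<'a>(impls: &[Impl<'a>]) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut map: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for i in impls {
        map.entry(i.for_type).or_default().push(i.trait_name);
    }
    map
}
```

Serialising the first form keeps the output close to the compiler's data; serialising the second saves every consumer from redoing the grouping.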

I’m also not clear on how the data would be created. It sounds like Rustdoc would both create and consume the JSON, but since save-analysis is not stable, it would be really bad to have two places trying to generate compatible save-analysis data. The save-analysis module in the compiler would need some modification to provide enough data for rustdoc; iirc, it would need information about impls, which is currently lacking. Changing Rustdoc to work from that enhanced save-analysis info would be amazing - it would reduce code duplication and complexity, and mean that building rustdoc wouldn’t need to wait on building the compiler first. It would also let you do cargo src style ‘view src’ pages with jump-to-def. However, I think it would be a huge undertaking (this was basically the plan for the rustdoc 2 idea, but it never happened).


#8

This potential use case just popped up so I’ll share:

I’d like to be able to scrape code fragments from crate documentation; specifically, code fragments marked with a specific language other than Rust. My tool could then take those fragments and compile them in a check similar to rustdoc tests, or in my case, collect grammar snippets into one canonical grammar file.
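As a sketch of the scraping step, assuming the doc text for an item is available as a plain Markdown string: the hypothetical `fragments_for_lang` helper below collects the bodies of fenced blocks whose info string equals a given language tag. It only handles simple triple-backtick fences, not tildes, longer fences, or indented code blocks.

```rust
/// Collect the bodies of fenced code blocks tagged exactly with `lang`.
fn fragments_for_lang(docs: &str, lang: &str) -> Vec<String> {
    let mut out = Vec::new();
    // `Some` while inside a fence whose tag matches `lang`.
    let mut current: Option<String> = None;
    // True while inside a fence with any other tag.
    let mut in_other = false;

    for line in docs.lines() {
        if let Some(info) = line.trim_start().strip_prefix("```") {
            if let Some(body) = current.take() {
                // Closing fence of a matching block.
                out.push(body);
            } else if in_other {
                // Closing fence of a block we are skipping.
                in_other = false;
            } else if info.trim() == lang {
                current = Some(String::new());
            } else {
                in_other = true;
            }
        } else if let Some(body) = current.as_mut() {
            body.push_str(line);
            body.push('\n');
        }
    }
    out
}
```

Joining the returned fragments would give the single grammar file; feeding each one to an external checker would give something like doctests for the other language.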