First, if we want to do autocomplete we need to be able to do "on-demand" type checking (which needs someone to write it), but that pretty much requires a "warm" compiler.
Yeah, I guess my biggest unknown right now is what the compiler team plans to do about the compilation model, independent of the RLS. If the plan is to make the compiler a long-running process anyway, then that changes the outlook for what to do with the RLS.
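To make the distinction concrete, here's a minimal sketch of what a "warm", long-running process looks like compared to a batch compile. All names here are hypothetical (this is not an actual rustc or RLS API): the point is just that analysis state stays in memory between requests, so an on-demand query like autocomplete only pays for the work that changed.

```rust
use std::collections::HashMap;

// Hypothetical analysis result for one file; in reality this would be the
// compiler's type-checked data, not a string-keyed map.
struct FileAnalysis {
    types_by_offset: HashMap<usize, String>,
}

// A "warm" compiler: analysis is kept in memory across requests.
struct WarmCompiler {
    cache: HashMap<String, FileAnalysis>,
}

impl WarmCompiler {
    fn new() -> Self {
        WarmCompiler { cache: HashMap::new() }
    }

    // Re-analyse only the file that changed, keeping everything else warm.
    fn update_file(&mut self, path: &str, _source: &str) {
        // Placeholder: a real implementation would run (incremental)
        // type checking here.
        let analysis = FileAnalysis { types_by_offset: HashMap::new() };
        self.cache.insert(path.to_string(), analysis);
    }

    // An "on-demand" query: answered from in-memory state, no batch compile.
    fn type_at(&self, path: &str, offset: usize) -> Option<&String> {
        self.cache.get(path)?.types_by_offset.get(&offset)
    }
}

fn main() {
    let mut compiler = WarmCompiler::new();
    compiler.update_file("src/main.rs", "fn main() {}");
    // A batch model would re-run the whole pipeline just to answer this.
    println!("{:?}", compiler.type_at("src/main.rs", 3));
}
```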
The old CSV-based save-analysis
It is basically dead at this point. It is only kept around because DXR still uses it, and DXR itself is essentially unmaintained anyway, AFAIK.
but its story for types is imperfect (cough retokenise_span cough).
I view this very much as a detail that can be improved, rather than an important factor. To be clear, from the IDE's perspective, what we do for types is fine. For Rustdoc and similar tools we need to improve, and I have plans there, but I don't think any of the changes are significant. retokenise_span is gross, but it is entirely an implementation detail: it doesn't affect the API in any way and only exists to hack around deficiencies in the compiler.
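For context, what a client actually sees for types is roughly a span plus an already-rendered string. The record below is a simplified, illustrative shape (not the real save-analysis schema); it's the level the API operates at, which is why retokenise_span stays invisible to clients.

```rust
// Simplified, illustrative shape of what a client sees for a definition;
// the real save-analysis records have more fields and different names.
struct Span {
    file: String,
    line_start: u32,
    column_start: u32,
    line_end: u32,
    column_end: u32,
}

struct DefRecord {
    name: String,
    // The type arrives as an already-rendered string, e.g. "Vec<String>";
    // how the compiler worked out spans for it (retokenise_span or
    // otherwise) never shows up at this level.
    type_string: String,
    span: Span,
}

fn main() {
    let def = DefRecord {
        name: "names".to_string(),
        type_string: "Vec<String>".to_string(),
        span: Span {
            file: "src/lib.rs".to_string(),
            line_start: 10,
            column_start: 9,
            line_end: 10,
            column_end: 14,
        },
    };
    println!("{}: {} (in {})", def.name, def.type_string, def.span.file);
}
```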
This quote provides a nice jumping-off point:
Sorry, in this case the quote is misleading. I was very much describing what the RLS can do here, not the more general situation; the post above was not aimed at Rustdoc. The RLS can very much run in a 'bulk' mode and would be perfect for Rustdoc.
or even linking rls against the rustc libs
This is what we do today, although we still use librustc_driver.
The extensive changes I mean are the move from a batch model to a long-running one. I'd be happy to hear that this is not such an extensive change after all; I agree that the 'answering queries' bit is a trivial change.
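Here's a rough sketch of why I see those as separable concerns (hypothetical names, not the actual librustc_driver interface): the query-answering layer is the same function in either model; what changes is whether the process throws away its state after one compile or keeps it alive and loops.

```rust
use std::collections::HashMap;

// Hypothetical compiled state; stands in for whatever the compiler keeps
// after analysis (crate metadata, type tables, etc.).
struct Analysis {
    defs: HashMap<String, String>,
}

// Placeholder "compilation".
fn compile(_source: &str) -> Analysis {
    let mut defs = HashMap::new();
    defs.insert("main".to_string(), "fn()".to_string());
    Analysis { defs }
}

// The "answering queries" part: identical in both models.
fn answer_query(analysis: &Analysis, name: &str) -> Option<&String> {
    analysis.defs.get(name)
}

// Batch model: compile once, answer, exit. State is thrown away.
fn batch_mode(source: &str, query: &str) -> Option<String> {
    let analysis = compile(source);
    answer_query(&analysis, query).cloned()
}

// Long-running model: compile once, keep the state, answer many queries.
fn long_running_mode(source: &str, queries: &[&str]) -> Vec<Option<String>> {
    let analysis = compile(source);
    queries
        .iter()
        .map(|&q| answer_query(&analysis, q).cloned())
        .collect()
}

fn main() {
    println!("{:?}", batch_mode("fn main() {}", "main"));
    println!("{:?}", long_running_mode("fn main() {}", &["main", "foo"]));
}
```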
traversing all the refs in the world shouldn't take more than a few seconds
It would be interesting to get some numbers for this. A large crate (e.g., libstd) on a fast machine takes about 20-30 seconds. Now, that cross-references all defs, not just one, but even for one def you have to search all the refs, so my intuition is that 'find all refs' would take approximately the same time. It can probably be optimised, but I'm not sure by how much (I would expect this to be slower in the compiler than in the RLS, since the data structures there are larger and more complex).
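For anyone who wants to reproduce numbers, the core of 'find all refs' over save-analysis-style data is essentially a linear scan. The sketch below uses simplified types (not the real RLS code), but it shows why answering the query for a single def still touches every ref in the index.

```rust
// Simplified stand-ins for save-analysis-style data; the real structures
// carry more fields (kind, qualified name, docs, ...).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct DefId(u32);

#[derive(Debug)]
struct Ref {
    target: DefId,      // which definition this reference points at
    file: &'static str,
    offset: usize,
}

// "Find all refs" for one def is a scan over every ref in the index.
// This is why a single query is not much cheaper than cross-referencing
// everything: the whole table is visited either way.
fn find_all_refs<'a>(refs: &'a [Ref], def: DefId) -> Vec<&'a Ref> {
    refs.iter().filter(|r| r.target == def).collect()
}

fn main() {
    let refs = vec![
        Ref { target: DefId(1), file: "a.rs", offset: 10 },
        Ref { target: DefId(2), file: "a.rs", offset: 42 },
        Ref { target: DefId(1), file: "b.rs", offset: 7 },
    ];
    // One obvious optimisation is to pre-index refs by DefId, trading
    // memory and index-build time for faster individual queries.
    println!("{:?}", find_all_refs(&refs, DefId(1)));
}
```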
I think the current "RLS" abstraction boundary is ill-defined - what is exactly a ref? There's no good definition,
These are two separate things. I agree the abstraction boundary is not exactly well-specified, but the alternate proposal seems to be no abstraction boundary at all.
IDE "elements" need to be a first-class concern and not "half piggyback on compiler internals".
I have no idea what this means or why the proposed solution is more 'first class' than the current one.
Another relevant thing is that I would like the RLS to become the "repository" for incremental state
Having the RLS supply this from memory rather than disk sounds fine. Having the RLS do more detailed management seems a bit scary to me (I don't think that is what you are proposing, but it has come up in the past).
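As a sketch of what "supply this from memory rather than disk" could mean (a hypothetical interface, not an actual compiler hook): the RLS holds the incremental artifacts keyed by crate, and the compiler asks for them instead of reading its own on-disk cache. Anything much beyond load/store is where it starts to feel scary.

```rust
use std::collections::HashMap;

// Hypothetical: opaque incremental artifacts the compiler would normally
// write to its on-disk cache directory.
type IncrArtifact = Vec<u8>;

// The RLS as a dumb in-memory repository: it stores and returns blobs,
// but does not try to manage or interpret them.
struct IncrStore {
    by_crate: HashMap<String, IncrArtifact>,
}

impl IncrStore {
    fn new() -> Self {
        IncrStore { by_crate: HashMap::new() }
    }

    // Called by the compiler at the end of a compilation session.
    fn store(&mut self, krate: &str, artifact: IncrArtifact) {
        self.by_crate.insert(krate.to_string(), artifact);
    }

    // Called by the compiler at the start of the next session, replacing
    // a read from disk.
    fn load(&self, krate: &str) -> Option<&IncrArtifact> {
        self.by_crate.get(krate)
    }
}

fn main() {
    let mut store = IncrStore::new();
    store.store("my_crate", vec![1, 2, 3]);
    println!("{:?}", store.load("my_crate"));
}
```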
presumably we can all agree with @arielb1 that the save analysis format is "underspecified"
Nope, not at all. I don't see why the compiler team thinks the RLS's data is any less specified than any other API, or what 'specified' vs 'underspecified' really means here.
likely to change as the compiler gets refactored
This I can totally agree with. I am unclear on the degree of refactoring: it seems the compiler team thinks we need major changes, but I am missing a concrete explanation of what those are. It is not clear to me why Datalog or some other format would be better than the current one, but honestly the transmission format doesn't really matter much.
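To illustrate why I think the transmission format is a side issue: the same def/ref facts can be serialised as JSON, as Datalog-style tuples, or as anything else without changing what information crosses the boundary. A minimal, illustrative example (not the real save-analysis schema), assuming the serde and serde_json crates (with serde's `derive` feature) are available:

```rust
use serde::Serialize;

// Illustrative record; the real save-analysis data has many more fields.
#[derive(Serialize)]
struct RefFact {
    ref_file: String,
    ref_offset: usize,
    def_crate: String,
    def_index: u32,
}

fn main() {
    let fact = RefFact {
        ref_file: "src/main.rs".to_string(),
        ref_offset: 42,
        def_crate: "std".to_string(),
        def_index: 1234,
    };

    // The fact as JSON, roughly as a dump-style tool might emit it...
    println!("{}", serde_json::to_string(&fact).unwrap());

    // ...or the same fact as a Datalog-style tuple. Either way the
    // content is identical; only the encoding differs.
    println!(
        "ref({:?}, {}, {:?}, {}).",
        fact.ref_file, fact.ref_offset, fact.def_crate, fact.def_index
    );
}
```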
we have to spend some time thinking out what will be a relatively stable interface for identifying nodes, representing types, and so forth
Again, I'm not really clear on what you see as the problems with the existing approach. From the client's perspective this all works fine right now; it is the incremental stuff and some 'online' queries that will cause problems.
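On "a relatively stable interface for identifying nodes": from the client's point of view, the existing approach already boils down to something like a (crate, definition) pair plus rendered strings for types, which is stable across unrelated edits. Here's a simplified sketch of that kind of identifier; it is illustrative only, not the compiler's actual DefId or def-path-hash machinery.

```rust
// A client-facing identifier: stable as long as the definition itself
// doesn't move, regardless of how the compiler's internal IDs are
// reshuffled between compilations.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct ItemId {
    crate_name: String,
    // Illustrative: a hash of the item's path (e.g. "std::vec::Vec::push"),
    // so the id survives unrelated edits elsewhere in the crate.
    def_path_hash: u64,
}

fn item_id(crate_name: &str, def_path: &str) -> ItemId {
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    let mut hasher = DefaultHasher::new();
    def_path.hash(&mut hasher);
    ItemId {
        crate_name: crate_name.to_string(),
        def_path_hash: hasher.finish(),
    }
}

fn main() {
    let id = item_id("std", "std::vec::Vec::push");
    // Types are reported to clients as rendered strings, which sidesteps
    // the question of a stable cross-compilation type representation.
    println!("{:?} has type {}", id, "fn(&mut Vec<T>, T)");
}
```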