Summary
Create a summary of the public API of a crate in JSON (or a similar) form, either during compilation or from the built rlib. This machine-readable API index shall be available to build scripts and/or procedural macros of dependent crates (possibly upon request).
All symbols will be fully resolved and referenced using their canonical qualified names.
The serialization of stable features should be stabilized while leaving room to add attributes at any level for new features later.
Motivation
The main use would be generating bindings and wrappers. cbindgen currently has to resolve the types itself, so it does not work with aliases, and it is the simple case that only needs to enumerate extern "C" functions and #[repr(C)] types. A more advanced generator (I am thinking of e.g. GObject Introspection here) would benefit from being able to:
- See which types implement certain traits. Doing this with procedural macros requires annotating all such types, which is unwieldy when such annotations don’t actually bring any new information, and problematic for types from dependencies.
- Generate a list of selected items. Normal procedural macros are not suitable for that, as they run for each annotated item separately (and may only run for some of them in an incremental compiler run). There was a proposal for such a collecting procedural macro for use in test frameworks, but it was not completely general.
- Work with resolved symbols, so as not to be thrown off by unusual imports, aliases, reexports and such.
Exporting the item definitions including resolved types in some extensible serialization format would allow the binding generator to easily generate both whatever wrappers are needed for marshaling data across the interface and corresponding declarations for the consuming language (C header, GI XML, VAPI etc.).
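To make the use case above more concrete, here is a minimal sketch of how a binding generator might consume such an index once it has been deserialized. Everything in it is an assumption made for illustration: the `ApiItem` and `ItemKind` types, their fields, and the paths `mycrate::shapes::Point` and `mycrate::ffi::Exported` are hypothetical, since the proposal does not define a format yet.

```rust
// Hypothetical shape of one entry in the API index. None of these names are
// defined by the proposal; they only illustrate the idea.
#[derive(Debug, Clone, PartialEq)]
enum ItemKind {
    Function,
    Struct,
}

#[derive(Debug, Clone)]
struct ApiItem {
    /// Canonical, fully resolved path, e.g. "mycrate::shapes::Point".
    path: String,
    kind: ItemKind,
    /// Canonical paths of the traits this item implements.
    implements: Vec<String>,
}

/// Select every struct that implements the trait with canonical path `trait_path`.
fn types_implementing<'a>(index: &'a [ApiItem], trait_path: &str) -> Vec<&'a ApiItem> {
    index
        .iter()
        .filter(|item| item.kind == ItemKind::Struct)
        .filter(|item| item.implements.iter().any(|t| t == trait_path))
        .collect()
}

/// Emit an opaque C declaration for each item, named after the last path segment.
fn c_forward_declarations(items: &[&ApiItem]) -> String {
    items
        .iter()
        .map(|item| {
            let name = item.path.rsplit("::").next().unwrap_or(item.path.as_str());
            format!("typedef struct {0} {0};\n", name)
        })
        .collect()
}

/// A hand-written stand-in for what the compiler would export.
fn sample_index() -> Vec<ApiItem> {
    vec![
        ApiItem {
            path: "mycrate::shapes::Point".to_string(),
            kind: ItemKind::Struct,
            implements: vec!["mycrate::ffi::Exported".to_string()],
        },
        ApiItem {
            path: "mycrate::shapes::Internal".to_string(),
            kind: ItemKind::Struct,
            implements: vec![],
        },
        ApiItem {
            path: "mycrate::shapes::area".to_string(),
            kind: ItemKind::Function,
            implements: vec![],
        },
    ]
}

fn main() {
    let index = sample_index();
    let exported = types_implementing(&index, "mycrate::ffi::Exported");
    print!("{}", c_forward_declarations(&exported));
}
```

Because the index stores canonical paths, the generator can compare trait names as plain strings and never has to follow imports or aliases itself.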
Explanations
TBD: the format will have to be defined.
Drawbacks
- It is another piece of code in the compiler or compiler-related tool that has to have its backward compatibility maintained.
Rationale and alternatives
- A simple format is needed in which the description of existing item kinds can be stabilized, independent of syntax changes with editions, while new item kinds and new attributes can be added later. JSON seems to fit that requirement well.
  - The advantages of stabilizing the format are that the code for processing it can be evolved as a separate crate on crates.io, and that it can be processed by tools written in languages other than Rust.
- Alternatively, the interface could be specified with types to which the data deserializes, similarly to how it is done with token trees for procedural macros.
  - The advantage of specifying types is that their use is checked by the compiler.
  - However, there still needs to be a split between the extraction of the data, which is necessarily compiler-version-dependent, and the interface, which has to stay backward compatible. Otherwise it puts an extra burden on the processing tool to be quickly updated for each compiler release.
- For the motivating use-case, an alternative could be to use a procedural macro for defining the interface, and have a way of writing some data outside the built library from it.
  - I see some additional use-cases for such a mechanism, but in this case the big disadvantage of procedural macros is that they run before symbols are resolved (which they have to, since they can generate more symbols that will affect that resolution), so it’s difficult to write the information so that the wrapping tool will know the correct symbol names to use in all complex cases.
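For the typed alternative discussed above, the following sketch shows one way an interface crate could stay usable when a newer compiler emits item kinds or attributes it does not know yet. The type names, the tag strings such as "struct", and the `item_from_fields` helper are all hypothetical; the input is assumed to have been parsed into key/value pairs already, so no particular serialization library is implied.

```rust
use std::collections::BTreeMap;

/// Item kinds known to this version of the hypothetical interface crate.
/// `#[non_exhaustive]` forces downstream crates to keep a wildcard match arm,
/// so adding a variant later is not a breaking change for them.
#[derive(Debug, Clone, PartialEq)]
#[non_exhaustive]
enum ItemKind {
    Function,
    Struct,
    Trait,
    /// A kind emitted by a newer compiler that this version does not know.
    Unknown(String),
}

impl ItemKind {
    /// Map a kind tag to a variant, preserving tags that are not recognized.
    fn from_tag(tag: &str) -> ItemKind {
        match tag {
            "function" => ItemKind::Function,
            "struct" => ItemKind::Struct,
            "trait" => ItemKind::Trait,
            other => ItemKind::Unknown(other.to_string()),
        }
    }
}

#[derive(Debug, Clone, PartialEq)]
struct Item {
    /// Canonical qualified name of the item.
    path: String,
    kind: ItemKind,
    /// Attributes this version does not interpret, kept as raw text.
    extra: BTreeMap<String, String>,
}

/// Build an item from already-parsed key/value fields.
/// Returns `None` if the required "path" or "kind" field is missing.
fn item_from_fields(fields: &[(&str, &str)]) -> Option<Item> {
    let mut path = None;
    let mut kind = None;
    let mut extra = BTreeMap::new();
    for &(key, value) in fields {
        match key {
            "path" => path = Some(value.to_string()),
            "kind" => kind = Some(ItemKind::from_tag(value)),
            // Unknown attributes are preserved rather than rejected.
            _ => {
                extra.insert(key.to_string(), value.to_string());
            }
        }
    }
    Some(Item { path: path?, kind: kind?, extra })
}

fn main() {
    let item = item_from_fields(&[
        ("path", "mycrate::Widget"),
        ("kind", "struct"),
        ("stability", "unstable"),
    ])
    .expect("path and kind are present");
    println!("{:?}", item);
}
```

The design choice here is that unknown data is carried along instead of causing an error, which keeps the processing tool working across compiler releases while still giving compiler-checked access to the parts it understands.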
Prior art
- The information is basically what rustdoc writes into documentation, but it does not generate an index in any format that is easy to process further.
- cbindgen parsed the code itself to find the right symbols, and I believe it now utilizes some procedural macros, but it is a relatively simple use case in that it does not allow most complex types.
- There is also wasm-bindgen, which exports functions using procedural macros; I am not sure how complex the types it can process are. As far as I know, it does not currently generate any form of IDL, it just registers the functions. If it were to generate Web IDL, it might benefit from this proposal.
So, does it make sense to work on this?
Note that I’ve tried to ask users whether something like that exists and got no response.