Hi there! I haven’t made much of a name for myself in the Rust world yet, so I hope it’s appropriate for me to bring up this matter. My experience with Rust so far has been wonderful (I’m using it for osdev), but one thing sticks out to me in particular - how Rust code is built.
The norm with other languages that run on LLVM (C and C++ being the obvious ones) is to use headers to describe your API and data structures, and then implement them elsewhere. This has several significant advantages. Among them:
You can compile a single source file into an unlinked object
It follows that you can link them all together later and only build modified files (incremental compilation)
The API is very clearly defined through header files
Alternate implementations of the same API are easy to build and use as a result
You don’t have to go reading through the code to learn the API
You don’t have to have shared libraries installed to compile against them
Interop with other languages is easy
C and C++ and Assembly all get along smashingly without any extra work
Mixed language projects have no additional effort required to maintain
I realize that we may be a bit too entrenched to revisit this, but it concerns me enough to bring it up. With my own project that’s been going against the grain on this issue, I’ve found that it’s entirely feasible to go the header route (or something hacky that resembles it, in my case). There are many benefits to describing your API without forcing the implementation to accompany it, and I can’t think of anything but drawbacks for the opposite position. Can someone shed light on why headers aren’t involved, and if not, is it too late to fix this?
One drawback of headers is information duplication - you have function prototypes in both header and implementation files, and when prototypes change, you have to change two places instead of one.
I don’t claim that it’s enough to justify this particular decision - just outlining the tradeoff.
This isn't really true at all. You can write header files for (extern'd) Rust functions which C/C++ can call, and you can generate prototypes for C functions from C header files using rust-bindgen (or manually). There is no requirement that Rust use header files for this to be possible. And in order to have perfect interop with header files, Rust would have to use a superset of C syntax for them natively, which is not possible at all.
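To make the two directions concrete, here is a minimal sketch. The names `rust_add` and `c_abs` are made up for illustration; `abs` is the C standard library function, with its prototype written by hand in the way rust-bindgen would generate it from a header.

```rust
use std::os::raw::c_int;

// Rust -> C: export a function with the C ABI and an unmangled symbol name.
// A C caller would declare it as: int32_t rust_add(int32_t a, int32_t b);
// (`rust_add` is a hypothetical name used only for this example.)
// Note: the 2024 edition spells the attribute `#[unsafe(no_mangle)]`
// and the block below `unsafe extern "C"`.
#[no_mangle]
pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

// C -> Rust: a hand-written prototype for the C library's `int abs(int)`.
extern "C" {
    fn abs(value: c_int) -> c_int;
}

/// Safe wrapper around the foreign function (hypothetical helper).
/// `abs(INT_MIN)` is undefined in C, so that input is rejected up front.
pub fn c_abs(value: c_int) -> Option<c_int> {
    if value == c_int::MIN {
        return None;
    }
    // Calling a foreign function is unsafe: the compiler cannot check
    // that the prototype above matches the real C definition.
    Some(unsafe { abs(value) })
}
```

Neither direction needs Rust itself to be organised around header files; the C-facing prototype is just another artifact that can be written or generated.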
I don't see how you can even find this useful - C++ header files are completely unusable from C. You have to either only expose a C-friendly interface in the header file, or have two sets of header files. And you kind of miss the point of using C++ if you're always dropping back down to the C subset at every corner. The same is true of Rust.
The second point is easily mitigated by simple tools like tagbar, speedbar, and doxygen. The first point is a legitimate issue in incrementally building Rust projects, but header files are not a good solution.
I cannot name any languages besides C/C++ which use header files on LLVM. The majority of languages I've seen which compile to LLVM are implementations of high level languages which have full module systems.
With respect to incremental compilation, it would make a lot of sense if the compiler extracted the information that is normally found in header files, i.e. the signatures of things, and stored those somewhere. Using this information, each function could be translated independently and thus incrementally and in parallel. There are a few things that need to know about the whole crate (e.g. “coherence”), and the resolve pass for things that are referenced in interfaces/signatures would have to be re-run in order to know what needs to be re-compiled, but those things are relatively light-weight compared to other compiler passes, such as type-checking/inference, “trans”, and especially LLVM’s codegen. If I had a few months of spare time on my hands, I’d give this approach a try right now.
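As a rough illustration of the idea (a toy model, not how rustc works), the sketch below hashes only the signatures of a crate's items. The `Item` type and both function names are hypothetical. If the interface fingerprint is unchanged between two builds, only items whose bodies differ would need to be translated again.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A toy stand-in for one item in a crate (all names are hypothetical).
pub struct Item {
    pub name: &'static str,
    /// What a header would contain, e.g. "fn parse(&str) -> Ast".
    pub signature: &'static str,
    /// The implementation, which a header would leave out.
    pub body: &'static str,
}

/// Hash only the interface: names and signatures, never bodies.
pub fn interface_fingerprint(items: &[Item]) -> u64 {
    let mut sigs: Vec<(&str, &str)> =
        items.iter().map(|i| (i.name, i.signature)).collect();
    sigs.sort(); // make the result independent of declaration order
    let mut hasher = DefaultHasher::new();
    sigs.hash(&mut hasher);
    hasher.finish()
}

/// Names of items present in both versions whose body text differs.
pub fn changed_bodies(old: &[Item], new: &[Item]) -> Vec<&'static str> {
    new.iter()
        .filter(|n| old.iter().any(|o| o.name == n.name && o.body != n.body))
        .map(|n| n.name)
        .collect()
}
```

The stored fingerprint plays the role of the header: callers depend on it, not on the bodies behind it.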
If the idea is for the compiler to read only the headers of imported
modules, then you have a C++-like mess: any generic or (unless you want to
give up on non-LTO operation) inline-worthy functions would have to be
stuck in the header, and you have constant tension between dumping stuff in
there to take advantage of these things and leaving it in the source file
to keep things prettier (and increase compilation speed, but no reason to
think /that/ would be recreated).
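For reference, these are the three kinds of function being contrasted, in a small sketch with made-up names (`largest`, `square`, `checksum`). In Rust the compiler, not the programmer, decides where the bodies travel: the bodies of generic and `#[inline]` functions are kept in the compiled crate's metadata so downstream crates can instantiate or inline them, while for an ordinary function a caller needs only the signature.

```rust
/// Generic: there is no single compiled body. Each use with a new `T`
/// is instantiated in the crate that calls it, so the body must be
/// available to callers (in C++ this forces it into the header).
pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut best = *items.first()?;
    for &item in &items[1..] {
        if item > best {
            best = item;
        }
    }
    Some(best)
}

/// Inline-worthy: the body is made available to other crates so they
/// can inline it without link-time optimization.
#[inline]
pub fn square(x: u32) -> u32 {
    x * x
}

/// Ordinary function: callers in other crates only need the signature.
pub fn checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32))
}
```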
Meanwhile, it’s not actually necessary to do this to get incremental
compilation. On the contrary, a superior method to C’s is to have the
compiler manage it and maintain dependencies at a finer grain than
per-file: thus only one function in a large source file may need to be
recompiled; (critically) modifications in a file containing API or
structure definitions need only force recompilation of code that relies on
the particular items changed, unlike the situation with C++ where touching
some common header often means recompiling the entire project; and
modifications to generic functions need not cause recompilation of
dependencies unless they were actually inlined into them (since this won’t
happen at -O0, this is a nice benefit when prioritizing compilation speed
over all else). Oh, and since this is basically equivalent to LTO, you get
better inlining (i.e. across source files) without needing to redo codegen
from scratch every time, like normal LTO.
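The rules described above can be sketched as a small dependency walk. This is a toy model under stated assumptions, not a description of any real compiler: `Use`, `Change`, and `items_to_recompile` are hypothetical names, and the graph is given by hand rather than computed.

```rust
use std::collections::BTreeSet;

/// What kind of edit was made to an item.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Change {
    Signature,
    BodyOnly,
}

/// One "uses" edge in a hypothetical per-item dependency graph.
pub struct Use {
    pub user: &'static str,
    pub used: &'static str,
    /// True if the body of `used` was inlined into `user`.
    pub inlined: bool,
}

/// Items whose code must be regenerated after `changed` was edited.
/// - A signature change dirties every direct user of the item.
/// - A body-only change dirties only users that inlined the item.
/// - A dirtied user has new code, so it spreads further through
///   whoever inlined *it*, and so on.
pub fn items_to_recompile(uses: &[Use], changed: &str, kind: Change) -> BTreeSet<String> {
    let mut dirty = BTreeSet::new();
    dirty.insert(changed.to_string());
    let mut work = vec![changed.to_string()];
    let mut at_root = true;
    while let Some(item) = work.pop() {
        // Only the originally edited item can have a new signature.
        let signature_changed = at_root && kind == Change::Signature;
        at_root = false;
        for edge in uses.iter().filter(|e| e.used == item.as_str()) {
            if (signature_changed || edge.inlined) && dirty.insert(edge.user.to_string()) {
                work.push(edge.user.to_string());
            }
        }
    }
    dirty
}
```

At -O0 no edge would be marked `inlined`, so a body-only edit dirties just the edited item, which is the compile-speed benefit mentioned above.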
If you do this, use of header files would have negligible benefit to
compilation speed. Rust does not currently have anything of the sort, but
I heard someone was going to work on incremental compilation, which I hope
is something along the lines of the above…
If header files are to be consumed chiefly by humans, as documentation
and/or to more clearly visualize what API is being exposed, that does not
apply. However, I’m not sure how much advantage they have over
Javadoc/librustdoc-like generated documentation. (I can definitely get
behind keeping it in the editor rather than needing to use a slow web
browser. But this doesn’t need to be part of the language.)
Note also that the C++ proposals for modules - both Daveed Vandevoorde’s original one and Doug Gregor’s current one - drop header files in favour of a much more Rust-like approach. Going the other way would be an odd move.
To your “Alternate implementations of the same API are easy to build and use as a result” and “You don’t have to go reading through the code to learn the API” points, where are traits falling short for you?
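For anyone unfamiliar with what is being suggested, a minimal sketch follows. The trait `Console` and both implementing types are invented for this example. The trait states the API with no implementation attached, and code written against it works with either implementation.

```rust
/// The API, stated without any implementation.
pub trait Console {
    fn write_line(&mut self, line: &str);
    fn lines_written(&self) -> usize;
}

/// Implementation 1: keeps every line in memory.
pub struct BufferConsole {
    pub lines: Vec<String>,
}

impl Console for BufferConsole {
    fn write_line(&mut self, line: &str) {
        self.lines.push(line.to_string());
    }
    fn lines_written(&self) -> usize {
        self.lines.len()
    }
}

/// Implementation 2: discards the text and only counts lines.
pub struct CountingConsole {
    pub count: usize,
}

impl Console for CountingConsole {
    fn write_line(&mut self, _line: &str) {
        self.count += 1;
    }
    fn lines_written(&self) -> usize {
        self.count
    }
}

/// Client code depends only on the trait, not on either implementation.
pub fn greet<C: Console>(console: &mut C) {
    console.write_line("hello");
    console.write_line("world");
}
```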
When an import declaration is processed in a D source file, the compiler searches for the D source file corresponding to the import, and processes that source file to extract the information needed from it. Alternatively, the compiler can instead look for a corresponding D interface file.
A D interface file contains only what an import of the module needs, rather than the whole implementation of that module. The advantages of using a D interface file for imports rather than a D source file are:
D interface files are often significantly smaller and much faster to process than the corresponding D source file.
They can be used to hide the source code, for example, one can ship an object code library along with D interface files rather than the complete source code.
I feel like a change like this does not affect the public use of Rust, so it does not need to be in for 1.0.
If you come up with a concrete proposal, I would love to hear it. If you have a sound argument and win people over with hard evidence of better overall software development, I’m sure more people would back you.
For now though, you sound like “if we bring back header files, things will be better!!” which sounds like malarkey. Don’t get me wrong, I follow your work and know that you have a good basis for this knowledge, but you need hard evidence to win over this crowd.
Did that make sense? On mobile and formatting is harder.
C++ has a stupid design where the definition can’t be parsed until it’s seen the declaration in the header, with slight syntax changes like ‘defaults can only be in the header’. It’s utterly infuriating that such an amazingly powerful piece of software can have such a stupid (almost seemingly deliberate) problem in it … if only they made some syntax additions (which only affect parsing), that could be fixed.
On the other hand, caching signatures between builds can be used to speed up recompilation.
Therefore headers are needed both for caching and expressiveness. So there should be two different types (e.g. same syntax but different file extension), to distinguish between the two use cases.