@kornel I do want to see a "safe ABI" subset, which is bigger than C and smaller than fully general Rust.
However, having a shared library for cases like _join would require making the internal _join interface stable, which would increase our stability surface area. And in practice, I don't think most people who want shared library support specifically want it to save disk storage space. They do sometimes want it to save RAM, and a shared library might help a little with that when multiple Rust programs are running simultaneously. But mostly, shared libraries make distribution maintenance easier: you can upgrade a library without rebuilding the world. This wouldn't necessarily solve that problem. And I think it would in practice make distribution of Rust binaries harder for many people, because then they'd need to supply the matching library.
The interesting ABI question when talking about kernels, to my mind, is: Suppose a kernel written entirely in Rust + assembly, instead of C + assembly. What subset of Rust types can safely appear in the signatures of system calls, such that user space programs written in Rust can use them with no friction (and ideally without unsafe), but it's still possible to write user space programs in other languages?
It's important to think about both passing and returning values here. For instance, passing &str and &[u8] is straightforward, but being able to return str and [u8] is also desirable and might be a huge pain.
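To make the passing/returning asymmetry concrete, here is a minimal sketch. `ByteSlice`, `sum_bytes`, and `fill_greeting` are hypothetical names invented for illustration, not part of any existing ABI proposal. Passing a slice reduces to a pointer plus a length; returning unsized data has no equally simple shape, so this sketch falls back on a caller-supplied buffer.

```rust
use std::slice;

/// Hypothetical C-compatible view of a byte slice: pointer plus length.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct ByteSlice {
    pub ptr: *const u8,
    pub len: usize,
}

impl ByteSlice {
    /// Borrow a Rust slice as a C-compatible pair (the "passing" direction).
    pub fn from_slice(s: &[u8]) -> Self {
        ByteSlice { ptr: s.as_ptr(), len: s.len() }
    }

    /// Rebuild the slice on the receiving side.
    ///
    /// # Safety
    /// `ptr` must point to `len` readable bytes that stay valid for `'a`.
    pub unsafe fn as_slice<'a>(&self) -> &'a [u8] {
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }
}

/// Hypothetical "system call" that takes a slice by pointer and length.
pub extern "C" fn sum_bytes(arg: ByteSlice) -> u64 {
    // Assumption: the caller upheld the contract documented on `as_slice`.
    let bytes = unsafe { arg.as_slice() };
    bytes.iter().map(|&b| u64::from(b)).sum()
}

/// Returning unsized data: the callee cannot hand back a bare `[u8]`, so the
/// caller supplies a buffer and gets back the number of bytes written.
pub extern "C" fn fill_greeting(out: *mut u8, cap: usize) -> usize {
    let msg = b"hello";
    let n = msg.len().min(cap);
    // Assumption: `out` points to at least `cap` writable bytes.
    unsafe { std::ptr::copy_nonoverlapping(msg.as_ptr(), out, n) };
    n
}
```

The caller-allocated buffer is only one possible answer to the "returning" half; it avoids deciding who frees the memory, at the cost of a second call when the buffer is too small.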
This got me thinking about other possibilities; Erlang has the ability to hotswap code. For servers, kernels, and other very long running processes, this could be a Good Thing™. Can the current ABI support this kind of use? If not, what would be required to make this a standard?
I want to toss i128/u128 into the mix here, not because they are all that hard, but because they illustrate that Rust has been extended in the past, and could be again in the future (I don't remember when those types became stable, but I do remember them not being available when I started programming in Rust). If this happens again, a stable and immutable ABI will become a burden.
Thinking about all of this really got me thinking about what we're trying to solve. In my mind, we're trying to turn software into a bolt or a screw; a utilitarian object whose interface and guarantees are easy to determine. The ABI is a method of determining the interface and guarantees of the object under examination (library, application, whatever) in a forwards-compatible manner, so that you can decide at runtime if two 'things' are actually able to communicate, and decide if you can replace one instance with another. The closest analogy I can think of is how I can swap one bolt for another bolt in an engine, provided I know both the size and threading of the bolt (the interface), and the yield strength required (the version). If I try to put in the wrong size bolt, it's immediately obvious that the interfaces don't work; likewise, a loader that can't find certain symbols in a library that an application is asking for will fail to load the application.
But finding symbols is the easy part; the hard part is determining if the intent of the interface is unchanged. Semver exists in part to let us know if the intent of an interface has changed; e.g., if fn foo() printed Hello, World in 1.0, but it now prints Frobnicate, the version number tells me whether the intent has changed, something the loader won't be able to determine (e.g., was the change a fix for a spelling error, or will it have a major impact on how the function is used?). The versioning information is actually more important, because computers can't decide intent; only the programmer can. So we need a machine-parseable versioning interface that is guaranteed to remain stable across all versions of the ABI. Ordinary SemVer may not be sufficient (you need a total ordering).
But that is only half the story. If I continue with the machine analogy, think of a large machine like a ship. It is common for a vessel to be undergoing some kind of maintenance while it is in active use. Erlang recognizes this, and has a method for performing maintenance while the machine is in use. But as far as I know, there is no common ABI specification that permits this to happen. We do have something similar in that we start and stop applications, and in some cases different applications are able to save their state in a form that a later version of the application is able to pick up where the first stopped, but it would be nice to be able to replace the security module of a running webserver without having to stop and restart the server.
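As a rough illustration of a machine-parseable version with a total order, here is a sketch. `AbiVersion` and its `satisfies` rule are hypothetical; the rule (same major, at least as new) is an assumption borrowed from common SemVer practice, and pre-release or build metadata is deliberately left out.

```rust
/// Hypothetical version record. Deriving `Ord` compares fields in declaration
/// order (major, then minor, then patch), which gives a total order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct AbiVersion {
    pub major: u32,
    pub minor: u32,
    pub patch: u32,
}

impl AbiVersion {
    /// Parse exactly "major.minor.patch"; any other shape is rejected.
    pub fn parse(s: &str) -> Option<Self> {
        let mut parts = s.split('.');
        let major = parts.next()?.parse().ok()?;
        let minor = parts.next()?.parse().ok()?;
        let patch = parts.next()?.parse().ok()?;
        if parts.next().is_some() {
            return None; // trailing components are not accepted in this sketch
        }
        Some(AbiVersion { major, minor, patch })
    }

    /// Assumed compatibility rule: same major version, and at least as new
    /// as what the consumer asked for.
    pub fn satisfies(self, required: AbiVersion) -> bool {
        self.major == required.major && self >= required
    }
}
```

A loader could run a check like `provided.satisfies(required)` before resolving any symbols. It still cannot judge intent, only whether the author declared a breaking change.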
Most languages use the C ABI for FFI, so we have to guarantee that our structs and enums fit it. If we want to use dynamically sized types, for example &str or &[u8], we might want an analog of C's struct Dynamic { int size; void* data; }, but that much is obvious. Things become more complicated if we want trait objects in syscalls: C doesn't have vtables, but it does have function pointers. We could invent a method-calling scheme: for example, provide a C function pointer void* (*call)(int, void*) that calls a method from the vtable, where the first argument is the index identifying the method and the second points to a tuple of its arguments.
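The index-based dispatcher could be sketched as follows. Everything here (`CObject`, `CounterArgs`, `counter_call`, and the meaning of indices 0 and 1) is hypothetical and only illustrates the `void* (*call)(int, void*)` shape described above.

```rust
use std::os::raw::{c_int, c_void};

/// The function-pointer shape from the post: `void* (*call)(int, void*)`.
pub type CallFn = unsafe extern "C" fn(method: c_int, args: *mut c_void) -> *mut c_void;

/// Hypothetical C-visible "trait object": opaque data plus one dispatcher
/// that stands in for the whole vtable.
#[repr(C)]
pub struct CObject {
    pub data: *mut c_void,
    pub call: CallFn,
}

/// Argument tuple for the two methods of this sketch.
#[repr(C)]
pub struct CounterArgs {
    pub this: *mut i64,
    pub amount: i64,
}

/// Dispatcher for a counter: index 0 adds `amount`, index 1 resets to zero.
/// Unknown indices return a null pointer.
///
/// # Safety
/// `args` must point to a valid `CounterArgs` whose `this` is writable.
pub unsafe extern "C" fn counter_call(method: c_int, args: *mut c_void) -> *mut c_void {
    unsafe {
        let args = &*(args as *mut CounterArgs);
        match method {
            0 => {
                *args.this += args.amount;
                args.this as *mut c_void
            }
            1 => {
                *args.this = 0;
                args.this as *mut c_void
            }
            _ => std::ptr::null_mut(),
        }
    }
}
```

The single entry point keeps the C side trivial, but every call pays for the index match and loses type checking of the argument tuple. That trade-off is one reason a richer non-FFI ABI might want real vtables.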
If we want non-FFI trait objects, then the ABI can be more complex, but I'm not sure what it would look like.
As for dynamically sized values, I don't think we can do that with a plain stack; think of segmented stacks.
If this is the route that we're taking (and I really, really hope it is!), then we should ignore C ABI compatibility completely, and develop a new, clean ABI that addresses current needs. Furthermore, if you do need C ABI compatibility, we already came up with a solution; see A Stable Modular ABI for Rust - #37 by ckaran above.
I tend to disagree, mainly because C already has all the primitives we actually need: we can do both struct and union in C, and that is sufficient to describe anything we need. Slices are represented as a struct of a size and a pointer to the data; I wrote the definition of a slice above.
Next point is how to deal with representation optimizations. The current Rust ABI is intentionally unspecified, to allow these optimizations. We have a bunch of optimizations that change the representation; all of them could be made configurable by a special macro whose effect is determined by its input parameters only. Note that the compiler might then be forced to perform the specified optimizations, as well as give guarantees about what they produce, leaving it little to no room for choices of its own.
I think that Cargo has to support limiting which ABIs can be used, i.e. a new section like [ABIs] listing all ABIs used in the crate except for Rust, C, and maybe a few more. Each ABI item must then specify a provider: a crate of a special type, which is used to produce the Layout for types and to specify the calling convention. This means either exposing compiler internals or creating a semi-stable API solely for that.
Custom #[repr(...)] items are then not as special as they are today. Back to the ABI section: the user could specify which ABIs they use, their providers, and possibly versions. Then all the information from the ABIs section should be bundled in the dylib, giving the consumer the ability to use the lib with ease as well as to constrain the ABI version (#[link(...)] parameters?).
Additionally, #[repr(...)] items could be allowed on modules, selecting the default ABI for their items. This enables easy maintenance of code which has to work with different ABIs. The open question here is the exact rules for the feature.
Yes, you're right in the sense that given enough effort we can lower all calls to fit within the C ABI, but my concern is that if we make C ABI compatibility a priority, then we can't move beyond it either. That said, I'm not an expert in what Rust's ABI is like. @josh, @hanna-kruppe, @kornel, @CAD97, @comex, @matklad, @Tom-Phinney, I see all of you on posts everywhere, and I know you all are deeply involved in the internals of rust, so I'd like to know your opinions. What can the C ABI not support in rust? What is missing?
IIUC, everything layout-wise can be described in terms of the C ABI. In fact, if anyone develops a new ABI, it's typically first described in terms of how it maps to the C ABI.
Tweaking the calling convention, rather than just layout, is where using the C ABI alone no longer works. E.g. if you want to reserve registers for, say, an out parameter for copy elision, that cannot be described solely with the C ABI.
Also, the C standard technically doesn't allow for zero-sized anything.
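For contrast with C, zero-sized types are ordinary in Rust. A small sketch (the `Marker` and `Tagged` types are made up for illustration) shows that they occupy no bytes, which is exactly the case a C-shaped layout description has no direct way to express.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// A unit struct: carries type information but no data.
pub struct Marker;

/// A struct with a zero-sized field next to a real one.
pub struct Tagged<T> {
    pub value: u32,
    pub tag: PhantomData<T>,
}

/// Sizes of `()`, `Marker`, and `Tagged<String>` in bytes.
pub fn zst_sizes() -> (usize, usize, usize) {
    (size_of::<()>(), size_of::<Marker>(), size_of::<Tagged<String>>())
}
```

On current compilers this reports zero bytes for `()` and `Marker`, and `Tagged<String>` is no larger than its `u32` field.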
(To be honest: I'd probably suggest any pluggable/custom/stable ABI work to start by just focusing on the layout part, as that's the main barrier to dynamic code reuse. Changing functions to extern "C" instead of extern "Rust" (or generating shims) is not all that impactful (currently).)
What future optimizations would that preclude? Does C's unoptimized ordering of fields in a struct preclude all those space-time optimizations that a clever human or optimizing compiler might apply?
As an analogy, a 1970s memory model ABI that required all accesses to provide equal delay would have precluded the development of cache technology.
I guess an optimized struct layout would place the most-used field at offset 0, which is possibly a break of the C ABI. Enums are harder: in code where we have Option<Cow<T>>, naive use of the C ABI ends up with a first byte that has 2 possible values and a second byte that again has 2 possible values, while the resulting 4 combinations could still be represented in 1 byte. Even more, if the optimizer knows there are some values of T that will never exist at runtime, it could just use their bit patterns to encode the variants without data. Not to mention that on most modern architectures the upper bits of pointers are unused, because address buses are less than 50 bits wide, so 14 bits are left over and we could use them for encoding additional information. Such optimizations could be represented in the ABI...
IIRC, C requires that the contents of structs be laid out in the order they are defined within the struct itself, so, yes, moving the most used field to offset 0 will result in a breaking change.
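The ordering rule is easy to observe from Rust. In this sketch the struct names are hypothetical, and the exact numbers assume a target where u32 is 4-byte aligned (true on mainstream platforms). `#[repr(C)]` keeps declaration order and pays for padding, while the default representation leaves the compiler free to reorder.

```rust
use std::mem::{offset_of, size_of};

/// Declaration order is preserved, so padding is inserted around `b`.
#[repr(C)]
pub struct InOrder {
    pub a: u8,
    pub b: u32,
    pub c: u8,
}

/// Same fields with the default (unspecified) Rust representation.
pub struct Reorderable {
    pub a: u8,
    pub b: u32,
    pub c: u8,
}

/// Byte offsets of `a`, `b`, `c` in the `repr(C)` struct.
pub fn in_order_offsets() -> [usize; 3] {
    [
        offset_of!(InOrder, a),
        offset_of!(InOrder, b),
        offset_of!(InOrder, c),
    ]
}

/// (size of the `repr(C)` struct, size of the default-repr struct).
pub fn sizes() -> (usize, usize) {
    (size_of::<InOrder>(), size_of::<Reorderable>())
}
```

With 4-byte alignment for u32, the `repr(C)` struct has offsets 0, 4, 8 and a size of 12 bytes. Current compilers typically shrink the default-repr version by moving `b` first, but that result is not guaranteed, which is the whole point of leaving the Rust ABI unspecified.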
I understand why you want to use those bits, but I couldn't be more against this if I tried. This is another form of the problem that @Tom-Phinney just mentioned:
An ABI like what we're talking about will likely last a very long time, possibly until after all of us are dead of old age. When did the C ABI start to solidify? Maybe in the 1970s? Depending on the dates, that's 40-50 years ago, and even if we had a new ABI finished and ready to go at this exact instant, the C ABI would stick around until all compilers and all currently in-use code were converted over, which would likely take decades (witness COBOL's continued existence). So, even though at this moment we're not using the high-order address lines, we very likely will be using all those bits (and more!) before long [1].
In my opinion, this is actually the more important part. Since ABIs connect two different software components together, how much room is there for reordering of fields? Are we talking about an abstract ABI that is interpreted by the loader to produce a concrete ABI for a particular chip? What exactly are we trying to produce in this discussion???
[1] Using Moore's Law and extrapolating wildly, it looks like we'll all have more than 2^64 bytes of RAM in our average personal computers sometime between 2050 and 2070. I also expect that applications will bloat up at about the same rate, so we'll actually be using most of that while running no more applications than we do currently. As a result, computers will have address buses that are at least 64 bits wide, if not wider.
And this is all without discussing how a generalized modular ABI would represent niches, where unused code points in the representation of one enum variant are used to indicate a different enum variant, thereby eliminating the need for a separate, disjoint memory field to store the enum's discriminant. Rust's Option type and its NonNull pointers are the most common Rust-specific uses of such niches.
IEEE floating point can be considered another example of use of niches, one in which each variant of the underlying floating-point enum itself contains a multi-bit data field. To elaborate, ±0, ±∞, NaNs, and denormalized values can each be considered to be enum variants that are encoded in niches of the normalized-value enum variant. Each of those variants (±0, ±∞, NaNs, and denormalized values) has its own sub-decoding within its own multi-bit niche: for the ±0 and ±∞ niches that sub-decoding conveys the sign of the zero or infinity; for the NaN niche that sub-decoding indicates whether it's a signaling or non-signaling NaN and which specific NaN is being conveyed.
A fully generalized ABI description should also permit limited-depth [Edit 1: recursive nested] definition of such niches, such that the variant that is occupying the niche might itself have a representation with a sub-niche. The companion URLO forum occasionally gets queries about the possibility of having such nested niche encodings, IMO most frequently for nested Options.
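The niche behaviour for Option can be checked directly. This sketch relies only on size guarantees documented for the standard library (Option around references, Box, NonNull, and the NonZero integers); the helper names are made up. The nested case is reported but not promised, since that layout is an implementation detail.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;
use std::ptr::NonNull;

/// True when wrapping `T` in `Option` costs no extra space, i.e. `None`
/// is stored in a niche of `T` rather than in a separate discriminant.
pub fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

/// Cases where the standard library documents that `Option<T>` has the
/// same size as `T`.
pub fn guaranteed_cases() -> [bool; 4] {
    [
        option_is_free::<&'static u8>(),
        option_is_free::<Box<u8>>(),
        option_is_free::<NonNull<u8>>(),
        option_is_free::<NonZeroU32>(),
    ]
}

/// Size of a nested niche candidate. `bool` only uses two of 256 bit
/// patterns, so a compiler may fit both `None` levels into one byte,
/// but this is not a documented guarantee.
pub fn nested_option_bool_size() -> usize {
    size_of::<Option<Option<bool>>>()
}
```

A type with no spare bit patterns, such as u32, shows the other side: `option_is_free::<u32>()` is false because the discriminant needs its own storage.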
IEEE floating point can be considered to use nested niche discriminants:
when the nominal exponent field is all zeros, a zero or non-zero mantissa field discriminates between ±0 and denormalized values;
when the nominal exponent field is all ones, a zero or non-zero mantissa field discriminates between ±∞ and NaN values.
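The two-level decoding above can be written out for f32. The `F32Kind` enum and `decode` function are illustrative names; the quiet/signaling distinction assumes the common convention that the top mantissa bit marks a quiet NaN, which a few older architectures invert.

```rust
/// The "variants" of an f32 when viewed as a niche-encoded enum.
#[derive(Debug, PartialEq, Eq)]
pub enum F32Kind {
    Zero { negative: bool },
    Subnormal,
    Normal,
    Infinity { negative: bool },
    Nan { quiet: bool },
}

/// Decode an f32 by looking at its raw fields: 1 sign bit, 8 exponent
/// bits, 23 mantissa bits.
pub fn decode(x: f32) -> F32Kind {
    let bits = x.to_bits();
    let negative = (bits >> 31) == 1;
    let exponent = (bits >> 23) & 0xFF;
    let mantissa = bits & 0x007F_FFFF;
    match (exponent, mantissa) {
        // Exponent all zeros: the mantissa discriminates zero from subnormal.
        (0, 0) => F32Kind::Zero { negative },
        (0, _) => F32Kind::Subnormal,
        // Exponent all ones: the mantissa discriminates infinity from NaN.
        (0xFF, 0) => F32Kind::Infinity { negative },
        // Assumption: top mantissa bit set means "quiet".
        (0xFF, m) => F32Kind::Nan { quiet: m & 0x0040_0000 != 0 },
        // Every other exponent is the "normalized value" variant.
        _ => F32Kind::Normal,
    }
}
```

The outer discriminant is the exponent field and the inner one is the mantissa, which is the nested-niche structure an ABI description would need to be able to express.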
Edit 2: Added signed zero as another IEEE floating-point niche, as zero is clearly not a normalized number. Added the exposition of IEEE floating point as a potential case of nested niche discriminants.
Edit 3: Removed multiple places in the 2nd paragraph that marked the extent of Edit 2, since the above description of Edit 2 is adequate to figure that out.
I don't think we can go beyond the C ABI where layouts are concerned: the C ABI can already describe anything we can place in memory, so the main topics are: a) how do we deal with all those optimizations? b) what do we want from the calling convention, and how do we implement it? c) how do we do trait objects?
There are a few obvious missing features of the C ABI for memory layout:
Layout is independent of context (you can't describe a type that is laid out differently in an array, to achieve SOA, for example).
Fields in a struct cannot overlap, and fields in a union must overlap.
Fields must be laid out contiguously. Imagine trying to describe something like the x86 XSAVE format; logically, it's describable as an array of 32 512-bit vector registers, but the first 16 low-128 bit vectors are stored contiguously, then the first 16 next-lowest 128-bits are stored immediately thereafter, then some unrelated data, then the high 256-bits of the first 16 vector registers are stored, and then the last 16 registers are stored as contiguous 512-bit chunks.
Layout is done in byte-sized chunks; there's no way to say a few bits go here, a few bits go there, etc. Bitfields kind of solve this, but they have a few issues themselves.
Every field must have an address and take up space.
The value in memory must be the same value as in register (you can't say that the exponent of a floating point number is stored as a biased value, for example), up to potential sign extension.
While you can generally write an equivalent set of C structs/unions to describe the layout of a type (context-sensitivity notwithstanding), you don't necessarily want to map every Rust-level field to exactly one field in such types. I think there is value in being able to go richer than the C layout ABI; even if solving all of these issues may not provide enough value to justify cost (the context-sensitivity probably falling in the latter category).
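As one small example of going below byte granularity (the point in the list above about layout being done in byte-sized chunks), here is a hand-rolled packing of two logical fields into one byte. The `Packed` type and its 3-bit/5-bit split are invented for illustration. A C-style layout description would see only a single u8 field here, not the two logical fields.

```rust
/// Two logical fields in one byte: `kind` in the top 3 bits,
/// `level` in the low 5 bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Packed(u8);

impl Packed {
    /// Returns `None` if either value does not fit its bit range.
    pub fn new(kind: u8, level: u8) -> Option<Self> {
        if kind < 8 && level < 32 {
            Some(Packed((kind << 5) | level))
        } else {
            None
        }
    }

    /// The 3-bit field stored in bits 5..8.
    pub fn kind(self) -> u8 {
        self.0 >> 5
    }

    /// The 5-bit field stored in bits 0..5.
    pub fn level(self) -> u8 {
        self.0 & 0x1F
    }
}
```

The accessors carry the layout knowledge in code, so a foreign consumer would need the same shifts and masks spelled out somewhere, which is the gap a richer layout description would fill.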
Imagine being able to specify a vtable layout that stores function pointers as 32-bit offsets from the vtable field; you'd save an awful lot of dynamic relocations in doing so (some discussion on the benefits of doing so in C++ code).
While I fully agree with the points, I don't think the cost of teaching our extended ABI to other languages is affordable; moreover, having union and struct as primitives in the ABI is sufficient to describe everything. We are developing an ABI which is intended for FFI (otherwise there is simply little to no sense in making it), so we have to respect the current capabilities of other languages. The ABI we are trying to develop will be used in "headers": when you do FFI, or syscalls, you always build against a header first and only then link to the dynamic library at runtime. The header is the description of what is inside the library: which types it can return, etc. (Syscalls can be made by hand too, but that is an entire level of abstraction you'd have to take care of.) Our header files are written in Rust, so we actually want to present any fully qualified type to the foreign world, and this is where you cannot go beyond the C ABI.
To the points:
Layout is independent of context because you would need to somehow determine this context, so the options are either to write some complex, deterministic rules for that or to leave it as is.
Contiguous layout of fields is what gives structs constant-time member access, and I don't think anybody wants to break this guarantee.
In the machine world there is no such thing as a ZST: if it takes no space then it doesn't exist. ZSTs are only there for the type system; they don't really affect memory layout.
The value in a register and the value in memory being the same is one of the key properties of the von Neumann architecture.
Due to the headers practice, bitfields will need their own syntax to tell both the compiler and other languages what the content of a bitfield is. So first we need bitfields as a first-class feature of Rust. (Making other languages rely on third-party crates for that doesn't sound good, at least to me.)
Thank you for pointing to the previous posts. I was really concerned about developing and stabilizing at least one ABI, and thus ran ahead. I think concrete ABIs should be provided by users, except for a few really common ones, like the C ABI.
Our task is to provide a way to select the ABI used for data structures and functions, as well as a way to control which ABIs are used in a crate.