A Stable Modular ABI for Rust

Yeah, it's quite long as I wanted to try to cover as much surface area as possible. Here's a summary:

  • A stable ABI would be really nice for Rust
  • But it would also be difficult to do
  • We propose a modular ABI, where compile-time macros determine the layout of data structures so that they are ABI-compliant.
  • We discuss caveats
  • We ask for feedback.

I'll take a look at everything tomorrow.

3 Likes

I applaud any attempt to address some of the use cases commonly associated with "ABI stability" without affecting code that doesn't actually need "ABI stability". Sadly I am too short on time to delve into implementation details, but I can (and feel the need to) give feedback that I often give when the topic of "stable ABI" comes up: this is an overloaded term, and it's important to be precise about which of the several meanings we have in mind at any given point, because they differ greatly in what they enable and what they require from the compiler and from the user.

More concretely: there are broadly two clusters of meanings for "stable ABI", the first referring to how language concepts are mapped to machine code (e.g. data structure layout, calling conventions) and the second to the ability to change (upgrade) a software component without rebuilding all the software that interacts with this component. The second is ultimately the responsibility of programmers (e.g. deleting a function from a library breaks both source compatibility and ABI compatibility), but since it's required for many of the benefits ascribed to "stable ABI" and requires compiler support too, anyone pondering this subject should either explicitly acknowledge that aspect (and everything that it enables) as out of scope, or think about what it entails -- which goes far beyond "freezing" data structure layouts, calling conventions, etc.

In the C world, there is a relatively simple relation between these two aspects: everything you put into ("public") headers is part of the ABI, and if you want the ability to e.g. change the layout of a type without breaking ABI compatibility, then you need to make that struct an opaque type in the headers and e.g. expose getter/setter functions for fields whose existence you want to guarantee.
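The opaque-type pattern described above can also be written from the Rust side. This is only a minimal sketch: `Counter` and the `counter_*` functions are hypothetical names, and a real library would additionally need to export the symbols unmangled and ship a matching C header.

```rust
/// Hypothetical library type. Its fields are private, so consumers never
/// depend on its size or field offsets; they only hold a pointer to it.
pub struct Counter {
    value: u64,
    // Imagine this field was added in a later release: the functions below
    // keep the same signatures, so existing callers keep working.
    step: u64,
}

/// Allocates a counter and returns it as an opaque pointer.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: 0, step: 1 }))
}

/// Getter. Safety: `c` must come from `counter_new` and not have been freed.
pub unsafe extern "C" fn counter_get(c: *const Counter) -> u64 {
    unsafe { (*c).value }
}

/// Mutator. Safety: same requirement as `counter_get`.
pub unsafe extern "C" fn counter_increment(c: *mut Counter) {
    unsafe { (*c).value += (*c).step }
}

/// Frees the counter. Safety: `c` must be null or come from `counter_new`.
pub unsafe extern "C" fn counter_free(c: *mut Counter) {
    if !c.is_null() {
        unsafe { drop(Box::from_raw(c)) }
    }
}

fn main() {
    let c = counter_new();
    unsafe {
        counter_increment(c);
        counter_increment(c);
        println!("{}", counter_get(c));
        counter_free(c);
    }
}
```

The cost is one allocation and one function call per access, which is exactly the trade-off between resilience and performance that the rest of this thread keeps running into.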

In a richer language, however, it's more difficult. The way Rust compiles generics, for example, necessarily inlines lots of "internal" library code (full of hard-coded field offsets, type sizes, etc.) into consumers of the library. Even in non-generic code, innocent changes such as adding an extra field to a struct often invalidate all code that deals with that struct in any way. In contrast, Swift has similar features but achieves dynamic linking anyway by adopting different compilation strategies and a host of neat tricks that Rust lacks. There is a huge design space here: What subset of Rust can we make usable for libraries with stable ABIs? What sorts of changes can be made non-ABI-breaking? What changes can we make to the compiler to allow a bigger subset of Rust and more kinds of changes? How can we communicate these rules to programmers who author libraries with stable ABIs, and how can we make it easy for them to adhere to the rules? What escape hatches should we have for trading off resilience and performance (like Swift's @frozen structs)? etc. etc.
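To make the "hard-coded type sizes" point concrete, here is a small sketch. `PointV1` and `PointV2` are hypothetical stand-ins for two releases of the same library type, and `byte_len` stands in for any generic helper that gets instantiated inside the consumer.

```rust
use std::mem::size_of;

/// Version 1 of a hypothetical library type.
#[repr(C)]
pub struct PointV1 {
    pub x: u32,
    pub y: u32,
}

/// Version 2 adds a field, which changes the size of the type.
#[repr(C)]
pub struct PointV2 {
    pub x: u32,
    pub y: u32,
    pub z: u32,
}

/// Generic helper: it is instantiated for each `T` in whichever crate calls
/// it, so `size_of::<T>()` becomes a constant baked into the caller's code.
pub fn byte_len<T>(items: &[T]) -> usize {
    items.len() * size_of::<T>()
}

fn main() {
    let old = [PointV1 { x: 0, y: 0 }, PointV1 { x: 1, y: 1 }];
    let new = [PointV2 { x: 0, y: 0, z: 0 }, PointV2 { x: 1, y: 1, z: 1 }];
    println!("v1: {} bytes, v2: {} bytes", byte_len(&old), byte_len(&new));
}
```

A consumer compiled against version 1 keeps using the old element size until it is rebuilt, no matter which version of the library is loaded at run time.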

These kinds of questions are crucial if you want to reap the benefits Swift and C are reaping from their "stable ABI", but are only apparent if you go beyond viewing ABI stability as just about the choices the compiler makes while compiling code. You can focus exclusively on that aspect if you want, but in that case you still need to think very carefully to avoid over-stating what use cases your proposal will actually enable.

46 Likes

Addendum: when I wrote this sentence I also fell into the trap of focusing just on the "ABI" in the narrow sense of choices made while mapping Rust to machine code. Just as important is the fact that the library's code is copied into the consumer at all. So if a bug is fixed in the library, even if you can recompile that library and re-link everything (because you went through great effort to make type sizes and calling conventions and so on stable), the bug fix still won't reach the applications that use the library, or will reach them inconsistently depending on where a fresh copy of the generic code was instantiated and where an existing copy of the code was reused (see: the -Zshare-generics flag).

20 Likes

Thanks for taking the time to provide feedback. I'll try to provide clarifications and specify in more exact terms which meanings were intended.

I'll call the first definition of ABI the 'ABI specification', and the latter definition an 'ABI interface'.

An ABI specification isn't really determined by the programming language; rather, it's determined largely by the OS (and toolchain). An ABI specification provides guarantees as to how types are laid out in memory, how functions are called, and how things are named. The definition of 'ABI' used in this proposal largely falls into the ABI specification category, as the proposal discusses modularizing the code that defines Rust's ABI specification.

An ABI interface defines how a user of an ABI specification can upgrade existing code without invalidating the ABI. Cementing an ABI interface is difficult: because ABI specifications are modularized, more than one ABI interface can exist, meaning that each ABI specification has to describe its own interface. This is a complex topic and I'll need some more time to think of an elegant solution, but for now we'll assume that this is out-of-scope and that two ABI interface data types are only compatible if they have the same byte-level layout.

AFAICT, Swift's ABI largely works by generating 'header files' (swiftmodule + interface summary) for the library at compile time. When a library is dynamically linked, the 'header files' are used to communicate with the library, regardless of version. However, because Swift is dynamic, it has much more free rein as to what stability guarantees its ABI is able to provide. For example, adding a field to a struct in C will invalidate the struct's ABI, but doing the same in Swift won't, necessarily.

I don't know the specifics of the Rust linker, so any clarification on how Rust currently works in this regard would be appreciated.

I'll try to answer these questions, but they're fairly tough, so forgive me if I provide a nonsensical answer.

What subset of Rust can we make usable for libraries with stable ABIs? This proposal proposes a modular system-ABI where each basic Rust structure has a macro associated with it that generates layouts for it. This was discussed in the original thread, but the result was inconclusive. I think that the core Rust data types, namely structs, tuples, enums, and basic pointers could all be supported.

What sorts of changes can be made that are non-ABI breaking? Any changes to types that retain the same layout are ABI-specification compliant. Changes to ABI interfaces are more difficult to ascertain, so for now we'll say that an ABI interface change is non-breaking if the corresponding ABI-specification change is non-breaking.
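As a rough illustration of "retaining the same layout", the sketch below compares the (offset, size) pairs of the fields of three hypothetical revisions of a `repr(C)` struct. The `field_layout!` macro and the `Config*` types are made up for this example and are not part of the proposal; renaming fields keeps the layout, while reordering them does not.

```rust
/// (offset, size) of each field, in bytes, in the order the fields are listed.
pub type FieldLayout = Vec<(usize, usize)>;

// Computes field offsets and sizes of a value by comparing addresses.
macro_rules! field_layout {
    ($value:expr, $($field:ident),+) => {{
        let v = &$value;
        let base = v as *const _ as usize;
        let layout: FieldLayout = vec![$((
            (&v.$field as *const _ as usize) - base,
            std::mem::size_of_val(&v.$field),
        )),+];
        layout
    }};
}

#[repr(C)]
pub struct ConfigV1 { pub id: u32, pub flags: u16, pub level: u16 }

// Fields renamed only: same types in the same order.
#[repr(C)]
pub struct ConfigV2 { pub ident: u32, pub flags: u16, pub priority: u16 }

// Fields reordered: padding is introduced and offsets move.
#[repr(C)]
pub struct ConfigV3 { pub flags: u16, pub id: u32, pub level: u16 }

pub fn layout_v1() -> FieldLayout {
    field_layout!(ConfigV1 { id: 0, flags: 0, level: 0 }, id, flags, level)
}

pub fn layout_v2() -> FieldLayout {
    field_layout!(ConfigV2 { ident: 0, flags: 0, priority: 0 }, ident, flags, priority)
}

pub fn layout_v3() -> FieldLayout {
    field_layout!(ConfigV3 { flags: 0, id: 0, level: 0 }, flags, id, level)
}

fn main() {
    println!("v1 == v2: {}", layout_v1() == layout_v2());
    println!("v1 == v3: {}", layout_v1() == layout_v3());
}
```

Equal layouts are only a necessary condition here; two types can share a byte-level layout and still disagree about what the bytes mean.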

Note: I actually spent a long time writing out a huge answer to this question with code examples, discussion of layouts, how traits can be used to provide more concrete interfaces, how the compiler might integrate with the ABI to detect these changes. After writing it all out, I figured that it was quite long-winded and I wasn't fully sure if it was a fully-baked idea, so I removed it in favor of this shorter answer. I'll try to rewrite and repost the original answer as it's quite interesting, but it still needs some work.

What changes can we make to the compiler to allow a bigger subset of Rust and more kinds of changes? To support modularizable ABIs, the compiler would have to be extended to use information provided by proc macros (layout, etc.) to determine the ABI specification. At compile time, macros would be passed data about structs and produce layout information for the compiler to use. For example, consider:

#[repr(abi_rust)]
pub enum Numbers {
    Float(f64),
    Int(usize),
}

The enum macro specified by abi_rust would use information about the Numbers enum to produce a layout. For example, following the conventions in this post, the generated Layout might be:

Enum {
    size: 16,
    discriminant: (0, Discriminant { size: 1 }),
    variants: vec![
        (8, Variant { size: 8, layout: Some(Box::new(USize)) }),
        (8, Variant { size: 8, layout: Some(Box::new(F64))   }),
    ]
}

Of course, this representation is far from perfect (Which variant do we mean?), but it shows the general idea. The compiler can use the provided layout in conjunction with the information about the enum to correctly determine layouts. The compiler would also have to be modified to support dynamic linking (and should be able to report errors when ABI interfaces don't align).
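For anyone who wants to play with the idea, here is one way the layout description above could be written as ordinary Rust types. All of the type and function names are assumptions made for this sketch, the `name` field on `Variant` is an addition to disambiguate variants, and the numbers assume a 64-bit target.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Discriminant {
    pub size: usize,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Variant {
    /// Added for this sketch so that each entry says which variant it describes.
    pub name: &'static str,
    pub size: usize,
    pub layout: Option<Box<Layout>>,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Layout {
    F64,
    USize,
    Enum {
        size: usize,
        /// (offset, description) of the discriminant.
        discriminant: (usize, Discriminant),
        /// (offset, description) of each variant payload, in declaration order.
        variants: Vec<(usize, Variant)>,
    },
}

/// What a hypothetical `abi_rust` enum macro might return for `Numbers`.
pub fn numbers_layout() -> Layout {
    Layout::Enum {
        size: 16,
        discriminant: (0, Discriminant { size: 1 }),
        variants: vec![
            (8, Variant { name: "Float", size: 8, layout: Some(Box::new(Layout::F64)) }),
            (8, Variant { name: "Int", size: 8, layout: Some(Box::new(Layout::USize)) }),
        ],
    }
}

/// Looks up the payload offset of a variant by name.
pub fn variant_offset(layout: &Layout, name: &str) -> Option<usize> {
    match layout {
        Layout::Enum { variants, .. } => variants
            .iter()
            .find(|(_, v)| v.name == name)
            .map(|(offset, _)| *offset),
        _ => None,
    }
}

fn main() {
    let layout = numbers_layout();
    println!("{:?}", variant_offset(&layout, "Float"));
}
```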

When I started the proposal, I was worried about overstating the point while trying to make it. AFAICT, once a modular ABI specification system as proposed is put in place, just about everything I mentioned in the proposal should be possible. The main caveat is that although a modular ABI specification might enable certain things, like ABI interop with Swift, implementing a modular ABI will not guarantee that such a thing works out of the box - much more work will still have to be done to bring Swift ABI support to Rust.

Thank you again for the feedback. It's taken a lot of challenging thinking, writing, and discussion to put this proposal together coherently because of the underlying complexity of the problem of developing a modular ABI - I'd be lying if I said I understood everything completely. Any forward progress we make on this problem is good; although the problem at hand is difficult to solve elegantly, I hope that through the insights and feedback from the many smart people in the Rust community a resolution to this problem may be found, whether it be a solution or a concrete consensus as to why such a thing is impossible.

3 Likes

Really good points @hanna-kruppe.

Whew, that's a difficult one for sure. We may have to, at least initially, limit the ABI stable constructs that you are able to use such as blacklisting generics or something. Obviously some of Rust's most awesome features such as Result<T, E> and Option<T> come from generics, so to say that you couldn't use generics anywhere in your code obviously wouldn't work. So maybe the restrictions would apply only to the public interfaces?

As a library designer, most naturally, I would think that my ABI stays compatible if I don't make any public API changes. Essentially if the Rustdoc doesn't look any different, the ABI stayed the same, but that is just what is intuitive. Maybe when compiling in ABI compatible mode it makes sure that all public interfaces only use types that are ABI compatible ( so if that meant no-generics, your public functions couldn't return Result ).

Also, there could be a tool that compares two commits of your application and makes sure that the resulting ABIs are the same.
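A very rough sketch of what the core of such a tool might do, assuming it can already extract some textual ABI summary per exported item for each commit. The `AbiSummary` type and the summary strings are invented for illustration; how to actually produce them is the hard part and is not shown.

```rust
use std::collections::BTreeMap;

/// Hypothetical ABI summary: exported item name -> signature/layout description.
pub type AbiSummary = BTreeMap<String, String>;

/// Convenience constructor for the example.
pub fn summary(items: &[(&str, &str)]) -> AbiSummary {
    items
        .iter()
        .map(|(name, sig)| (name.to_string(), sig.to_string()))
        .collect()
}

/// Reports items that were removed or whose description changed.
/// Newly added items are treated as non-breaking.
pub fn breaking_changes(old: &AbiSummary, new: &AbiSummary) -> Vec<String> {
    let mut problems = Vec::new();
    for (name, old_sig) in old {
        match new.get(name) {
            None => problems.push(format!("removed: {}", name)),
            Some(new_sig) if new_sig != old_sig => {
                problems.push(format!("changed: {}", name))
            }
            Some(_) => {}
        }
    }
    problems
}

fn main() {
    let old = summary(&[("add", "fn(i32, i32) -> i32"), ("len", "fn(*const u8) -> usize")]);
    let new = summary(&[("add", "fn(i64, i64) -> i64"), ("sub", "fn(i32, i32) -> i32")]);
    for problem in breaking_changes(&old, &new) {
        println!("{}", problem);
    }
}
```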

Absolutely. For this proposal I would think that we should definitely keep our minds open and think about the problem of how we serve ABI stability to the programmer, not just to the binary representation of the ABI, which, like you stated, is a much bigger problem.

Still, if we could achieve just a step in the right direction that would also be useful, even if it doesn't bring in the whole solution.

I would rather that we don't design something that satisfies only the ABI specification ( as @isaac defined the term ) without providing a lot of value to the programmer who really has to deal with the ABI interface. Still, if we have to find a way to work out a more stable ABI specification before we can tackle a stable ABI interface, then it is still useful work.

3 Likes

One super rough idea I have been thinking about recently is that maybe we don't need a "Rust" ABI, but a "System ABI"? Today, the interop language of software components is the C ABI, and it is wholly inadequate: it doesn't even have slices, strings are slow and error-prone, etc.

It seems like developing a language-independent ABI which significantly improves over C, without being Rust specific, is possible. Slices, tagged unions, utf8 strings, borrowed closures are features which immediately come to mind and have obvious-ish implementations.
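For instance, a UTF-8 string slice has a fairly obvious representation that any language with C-compatible structs could consume. The following is a sketch under that assumption; `AbiStr` and `abi_char_count` are hypothetical names, not part of any existing ABI.

```rust
/// Hypothetical "System ABI" string slice: a pointer plus a byte length,
/// with the bytes expected to be valid UTF-8.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct AbiStr {
    pub ptr: *const u8,
    pub len: usize,
}

impl AbiStr {
    /// Borrows a Rust string slice; the result must not outlive `s`.
    pub fn new(s: &str) -> Self {
        AbiStr { ptr: s.as_ptr(), len: s.len() }
    }

    /// Safety: `ptr` must point to `len` readable bytes that live for `'a`.
    /// Returns `None` if the bytes are not valid UTF-8.
    pub unsafe fn as_str<'a>(self) -> Option<&'a str> {
        let bytes = unsafe { std::slice::from_raw_parts(self.ptr, self.len) };
        std::str::from_utf8(bytes).ok()
    }
}

/// Example of passing the slice by value across an `extern "C"` boundary.
/// Safety: `s` must describe valid, live memory.
pub unsafe extern "C" fn abi_char_count(s: AbiStr) -> usize {
    match unsafe { s.as_str() } {
        Some(text) => text.chars().count(),
        None => 0,
    }
}

fn main() {
    let s = AbiStr::new("héllo");
    println!("{} bytes, {} chars", s.len, unsafe { abi_char_count(s) });
}
```

Unlike a NUL-terminated C string, the length travels with the pointer, so there is no scan to find the end and embedded NUL bytes are not a problem.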

Distinction between borrowed and owned (callee must invoke destructor) data and simple lifetimes are somewhat more Rust/C++ specific, but don’t seem too controversial either.

Support for dynamic dispatch (where fat vs thin pointer is a big tradeoff) and templates seems pretty hard, but also desirable and plausible.

It seems like if we had this "C ABI, but you also can have nice things", then interoperability between various software components would be easier and far less error prone.

48 Likes

That actually sounds like a really cool idea. I'm thinking we would still try to work out this proposal for a modular ABI, then we would try to implement this new ABI, that was designed not only for Rust, as an ABI crate.

The spec for that ABI would be a separate proposal, but maybe this proposal would seek to satisfy the requirements for being able to provide that "System ABI" for Rust.

2 Likes

Generics are not a problem as long as all types are known in the interface exposed through the ABI. So it wouldn't be a problem to return Result<i32, String> or even Result<(), Box<dyn Error>> (assuming trait objects have defined ABI).

They're only an issue for functions like pub fn generic<T, E>() -> Result<T, E> that can't be monomorphized on the library side. For these functions the options are:

  • Just forbid them. Library interfaces would have to use dyn Trait or concrete types instead. IMHO this is quite a sensible limitation, especially for an MVP.

  • Require defining ahead of time which parameters can be used, and compile monomorphic versions just for these types (similar to template instantiations in C++)

  • Do what Swift does and compile a universal version of the generic function that uses run time type information to support arbitrary types. It's a very clever approach from an ABI perspective, but it's the equivalent of changing everything into dyn Trait, so it may be a poor fit for Rust.
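The first two options in the list above can be sketched in today's Rust. The function names are hypothetical, and this only shows the shape of the interface; it still uses the ordinary unstable Rust ABI and `String`, so it is not by itself ABI-stable.

```rust
use std::fmt::Display;

/// Fully generic: code is generated per `T` in the calling crate, so a
/// library cannot ship a compiled version of it for unknown types.
pub fn describe<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

/// Option 1: forbid generics in the interface and take a trait object.
/// A single compiled copy exists and dispatches through the vtable.
pub fn describe_dyn(value: &dyn Display) -> String {
    format!("value = {}", value)
}

/// Option 2: the library decides ahead of time which instantiations it
/// exports, similar in spirit to explicit template instantiation in C++.
pub fn describe_i32(value: i32) -> String {
    describe(value)
}

pub fn describe_str(value: &str) -> String {
    describe(value)
}

fn main() {
    println!("{}", describe_dyn(&42));
    println!("{}", describe_i32(7));
    println!("{}", describe_str("hello"));
}
```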

22 Likes

That's nice! :slight_smile:

I think that is rather reasonable.

Maybe this could be optional behavior?

4 Likes

That's not actually too far-fetched: Optionally: declare the ffi-safe traits with #[sabi_trait], used as trait objects in the public interface.

That seems like a reasonable limitation.

IMO, I feel like that might be a comparatively significant extension to the compiler.

Perhaps for only non-monomorphizable functions, explicitly annotated.

1 Like

The Swift ABI isn't specific to Swift, and is an impressive feat of engineering that could fill this slot.

Witness table indirection obviously adds some overhead over static linking or "just repr(Rust)" linking, but it successfully maintains the ability to evolve private implementation details, and it's a necessary cost to do so. (Also, it supports freezing the ABI to remove (most) witness table overhead.)
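To give a feel for the indirection being described, here is a hand-rolled sketch loosely in that spirit. It is not Swift's actual witness table layout; the `Witness` struct and all other names are assumptions for illustration only.

```rust
use std::mem::size_of;

/// Hypothetical per-type table: what a consumer needs in order to handle a
/// value without knowing its layout at compile time.
pub struct Witness {
    pub size: usize,
    pub area: unsafe fn(*const u8) -> f64,
}

// Library-private types: their fields can change between releases.
pub struct Rect { w: f64, h: f64 }
pub struct Square { side: f64 }

/// Safety: `p` must point to a valid `Rect`.
unsafe fn rect_area(p: *const u8) -> f64 {
    let r = unsafe { &*(p as *const Rect) };
    r.w * r.h
}

/// Safety: `p` must point to a valid `Square`.
unsafe fn square_area(p: *const u8) -> f64 {
    let s = unsafe { &*(p as *const Square) };
    s.side * s.side
}

pub static RECT_WITNESS: Witness = Witness { size: size_of::<Rect>(), area: rect_area };
pub static SQUARE_WITNESS: Witness = Witness { size: size_of::<Square>(), area: square_area };

/// Consumer-side code: only sees an untyped pointer and a table.
/// Safety: `value` must point to the type that `witness` was built for.
pub unsafe fn area_of(value: *const u8, witness: &Witness) -> f64 {
    unsafe { (witness.area)(value) }
}

fn main() {
    let r = Rect { w: 2.0, h: 3.0 };
    let a = unsafe { area_of(&r as *const Rect as *const u8, &RECT_WITNESS) };
    println!("area = {}, size = {}", a, RECT_WITNESS.size);
}
```

Every access goes through a function pointer, which is the overhead mentioned above, but the consumer never needs to be recompiled when the private fields change.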

The one thing I don't recall off the top of my head is whether it requires any shared heap object to be managed by Swift atomic reference counting. I think it just uses opaque "clone pointers", though.

"Add repr(Swift) to Rust" is an interesting research topic independent of "Add user-defined repr to Rust".


But, yes, an initial MVP would handle concretized generics fine as "just another type" and punt on unconcrete generics, as you can "just" expose a dyn API yourself.

13 Likes

Since the Swift ABI would be included in any reasonable set of modular Rust ABIs, it probably makes sense to subset the modular ABI task to first just develop the Swift ABI, noting during the development process those places where additional effort would be needed to generalize to additional ABIs beyond the obvious three: Rust's native unstable/unspecified ABI, the stable# C ABI, and the new stable# Swift ABI.

# Of course those externally-specified ABIs are subject to change through their own language specification maintenance processes.

7 Likes

That saves us the development of the proposal, which would be large in-and-of-itself, for a "System ABI". I think that makes a lot of sense.

So the goal for this proposal would be to create a modular ABI system that could support a Swift ABI crate for Rust.

:+1:

6 Likes

The problem with that is that the Swift ABI is much more complicated than the Rust or C ABIs, as it has e.g. different behavior inside the defining compilation unit (static linking) and for things using it via dynamic linking.

And another thing is that IIRC, the Swift ABI requires alloca, which Rust doesn't have yet (and is decently low priority IIUC).

But if a modular ABI is powerful enough to support the Swift ABI, it's probably good enough to support basically any feasible ABI. And it would be nice if the definition of repr(Rust) and/or repr(C) could be separate from rustc, I suppose.

Personally, I'd target the MVP at being able to specify repr(C), stage 2 at (most of?) repr(Rust) (~niche filling), and stage 3 at the (most of?) nongeneric subset of repr(Swift) (witness tables outside the main static link).

6 Likes

IIRC, in some situations where alloca isn't appropriate, another technique is used.

Would it be good to create some Wiki pages somewhere maybe to outline a more formal specification of the proposal or a couple different versions of the proposal that could be critiqued?

It seems like the general consensus is that it is an overall good idea, but the details need to be worked out. In order to work out the details we probably need to start spec-ing stuff out, even if it needs to be heavily revised so that we can work out specific points that need to be addressed.

Also, before even attempting to spec this out we should be sure that we are clear on the goal or "vision" of the proposal. What are we trying to accomplish with it? If necessary we could modify the vision later, if it makes sense because of technical limitations on what we can accomplish in a reasonable proposal, but we should come up with an initial "what is the goal of the proposal?". What should the proposal accomplish?

I think @CAD97's suggestion to target this in stages might be a good idea:

The only potential problem I see of taking the proposal in stages from simpler to more complicated ABIs is that we might have to refactor the plugin API each time we need it to be able to support a more complicated ABI, and I'm not sure if it would be better just to try and take into account the complication necessary for the Swift ABI in the first place.

I am leaning towards doing it in stages, but others might know better what would be the most efficient way to tackle that problem.

5 Likes

Even if there are ABI-defining APIs for the more complicated cases, I think it still would make sense to have convenience APIs for the simpler cases. So the staged design process does need to make sure to leave space for expanding towards being able to define the full Swift ABI, but it doesn't need to actually handle that complexity until later.

5 Likes

The only potential problem I see of taking the proposal in stages from simpler to more complicated ABIs is that we might have to refactor the plugin API each time we need it to be able to support a more complicated ABI,

Given that this would be a plugin-style architecture, we should look at how other plugin architectures solve this problem.

In my experience, it is solved by having a general-purpose communication channel that is flexible enough to not require changes going forward. On top of that, the actual interaction is negotiated using capabilities on one or both sides.

For example:

rustc: what ABI features do you understand?

plugin: how to layout structs and enums

rustc: OK, I won't try to give you "niche" details, and I'll error if someone tries to use this ABI with tuples or arrays.

This allows more features to be added without breaking existing plugins.
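The exchange above could look something like the following on the compiler side. This is purely a sketch: `Feature`, `AbiPlugin` and `check_request` are hypothetical names, and nothing like this exists in rustc today.

```rust
/// Hypothetical list of things an ABI plugin may know how to lay out.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Feature {
    Structs,
    Enums,
    Tuples,
    Arrays,
    Niches,
}

/// Hypothetical plugin interface: the plugin advertises its capabilities
/// instead of the compiler assuming a fixed set.
pub trait AbiPlugin {
    fn name(&self) -> &str;
    fn supported_features(&self) -> &[Feature];
}

/// A plugin that only understands structs and enums.
pub struct SimplePlugin;

impl AbiPlugin for SimplePlugin {
    fn name(&self) -> &str {
        "simple"
    }

    fn supported_features(&self) -> &[Feature] {
        &[Feature::Structs, Feature::Enums]
    }
}

/// Compiler-side check: error out if the code needs something the plugin
/// did not advertise.
pub fn check_request(plugin: &dyn AbiPlugin, needed: Feature) -> Result<(), String> {
    if plugin.supported_features().contains(&needed) {
        Ok(())
    } else {
        Err(format!("ABI plugin '{}' does not support {:?}", plugin.name(), needed))
    }
}

fn main() {
    let plugin = SimplePlugin;
    println!("{:?}", check_request(&plugin, Feature::Enums));
    println!("{:?}", check_request(&plugin, Feature::Tuples));
}
```

New variants can be added to the feature list later without breaking plugins that never advertise them.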

6 Likes

There are lots of good ideas here, with the possibilities of multiple ABIs, one per language. But, as someone who was deeply involved in the C/C++ ABI for the MIPS processor, one thing that people seem to be missing is that there are NxMxTBD ABIs. One per processor (and possibly more if you consider little-endian vs. big-endian, small vs. medium vs. large memory models etc), one per language, and maybe more (the TBD). So, tossing around the term "ABI" hides a world of work. And you need to consider ABIs you're not working on to preclude conflicts with the ones you are.

People touched upon the various things Swift supports; the general question arises about support for things which don't have a Rust equivalent. FFIs work from language A to language B and the reverse. My initial reaction is, if Rust doesn't support it, the ABI doesn't support it. This is just a matter of trying to manage the scope of such a project. This doesn't preclude adding in support later, it just makes things more achievable in the near term.

Having said that, it's an awesome idea. The benefits are large. It allows third-party Rust compilers to emerge which may be highly optimized, and hence useful, for specific architectures, much as using an Intel C compiler may give you better performance in generated code than LLVM or GCC. This means slightly less support for the open source language but significantly more support for Rust.

13 Likes

Yeah, I've seen how these things can be way more complicated than they might seem from the outside. I think this could potentially be very difficult, but I also think that the possible advantages could be enormous so it is worth doing as much evaluation of what we might actually be able to accomplish, even if it is initially more limited than we would hope for the future.

I think that makes sense. The scope of a fully functional and capable modular ABI system is probably massive, so it would be good to focus initially on what we will be able to extend with more features later, and what could actually start bringing in benefits for some use-cases.

1 Like