[Sketch] Minimal pimpl-style "stable ABI"

TL;DR: allow a crate to specify abi = "stable" or similar in Cargo.toml. If it exposes anything that doesn't have a defined ABI (i.e. is repr(Rust)/extern "Rust"), fail compilation.


This is a very unbaked sketch based on a few ideas I had in the wake of @matklad's pimpl post. This is not intended to be a short term target nor really a direct suggestion, rather a starting point for discussion on what a hyper minimal stable ABI could look like for Rust.


My proposal is that we add a new library property to the cargo metadata to indicate that a library is intended to have a stable ABI. (For this post, I use abi = "stable", but any key communicating a similar idea would do.) This would cause compilation of the crate to fail if any publicly exported API allows working with types that don't have a defined ABI.

The main advantage of this is that the crate can now be treated as having a stable ABI. For Rust specifically, this means two main things: linking with a version of the library compiled with a different compiler, and the ability to avoid recompiling downstream when upstream gets an update.

At a glance, this seems almost impossibly restricting. One real restriction is that we can't have any polymorphic API surface area. It would appear that we're also limited to repr(C) types everywhere and the other limitations FFI safety implies, but that's a little too strict.

For T: Sized, the layout of &T is defined. That means that a function with a defined calling convention can take &T and be ABI defined. This enables a "pimpl" pattern: expose public ways of manipulating a type that the outside world knows nothing about.
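A minimal sketch of that pimpl shape in today's Rust, assuming extern "C" stands in for whatever calling convention the proposal would end up using. The names (Counter, counter_bump, and so on) are made up for illustration. The lint is silenced by hand because current rustc flags pointers to repr(Rust) types in extern "C" signatures.

```rust
// Hypothetical upstream crate. `Counter` keeps the default repr(Rust) layout
// and private fields, so downstream code can only hold it behind a pointer.
pub struct Counter {
    hits: u64,
    label: String,
}

impl Counter {
    // Plain Rust constructor used only inside this sketch; how to construct
    // the type across the ABI boundary is a separate question.
    pub fn new(label: &str) -> Counter {
        Counter { hits: 0, label: label.to_owned() }
    }
}

// Each exported function has a defined calling convention and touches the
// opaque type only through a thin reference.
#[allow(improper_ctypes_definitions)]
pub extern "C" fn counter_bump(counter: &mut Counter) {
    counter.hits += 1;
}

#[allow(improper_ctypes_definitions)]
pub extern "C" fn counter_hits(counter: &Counter) -> u64 {
    counter.hits
}

#[allow(improper_ctypes_definitions)]
pub extern "C" fn counter_label_len(counter: &Counter) -> usize {
    counter.label.len()
}
```

Nothing in these signatures depends on the field order or size of Counter, so upstream can change either without touching the caller's side of the boundary.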

For safety, the downstream library consumer treats T as if it were ?Sized (but with a thin pointer) and asks upstream for any information about the type it needs at runtime (such as for executing std::mem::size_of_val::<T>()). For additional safety, T is always treated as having drop glue. This means that even if it doesn't, it's safe to add a member that requires dropping later without worrying about it being forgotten.
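One way to picture "ask upstream at runtime", written by hand in current Rust. Under the proposal the compiler would presumably generate these queries itself; here they are ordinary exported functions with invented names (session_size, session_drop_in_place, and so on), and extern "C" is only a placeholder calling convention.

```rust
use std::alloc::{alloc, dealloc, Layout};

// Hypothetical upstream type. Downstream never learns its fields.
pub struct Session {
    id: u32,
    scratch: Vec<u8>, // owns heap memory, so drop glue matters
}

// Upstream answers layout questions at runtime instead of at compile time.
pub extern "C" fn session_size() -> usize {
    std::mem::size_of::<Session>()
}

pub extern "C" fn session_align() -> usize {
    std::mem::align_of::<Session>()
}

/// Safety: `slot` must be valid for writes and match the reported layout.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn session_init(slot: *mut Session, id: u32) {
    unsafe { slot.write(Session { id, scratch: vec![0; 16] }) }
}

/// Safety: `session` must point to an initialized `Session`.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn session_id(session: *const Session) -> u32 {
    unsafe { (*session).id }
}

/// Safety: `session` must point to an initialized `Session`.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn session_scratch_len(session: *const Session) -> usize {
    unsafe { (*session).scratch.len() }
}

// Drop glue is always exported, even if the type had nothing to drop today.
/// Safety: `session` must point to an initialized `Session`, dropped only once.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn session_drop_in_place(session: *mut Session) {
    unsafe { std::ptr::drop_in_place(session) }
}

// Downstream side: allocate from the reported layout, let upstream
// initialize in place, and always run the exported drop glue before freeing.
pub fn downstream_roundtrip(id: u32) -> u32 {
    let layout = Layout::from_size_align(session_size(), session_align())
        .expect("upstream reported an invalid layout");
    unsafe {
        let slot = alloc(layout) as *mut Session;
        assert!(!slot.is_null(), "allocation failed");
        session_init(slot, id);
        let seen = session_id(slot);
        session_drop_in_place(slot);
        dealloc(slot as *mut u8, layout);
        seen
    }
}
```

Because downstream always calls the drop function, upstream can add a field that needs dropping later without any caller leaking it.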

Additionally, it should be treated downstream as having the safe defaults for autotraits: !Send, !Sync, !Unpin, unless explicitly implemented otherwise.

The biggest remaining hurdle (that the author sees) is that of construction. If a repr(Rust) type can only be handled by an abi = "stable" API by reference, then we need some way to create one and hand it out through an owning pointer.

The answer to that is that Box<T> already has a defined layout for T: Sized (if I'm not mistaken) despite not being FFI safe: exactly that of just a pointer. We may want an extern "RustStable" fn to allow the use of these "stable ABI but not C FFI safe" types (pointers to decidedly not ABI stable types) rather than overloading extern "C". That said, it could also be useful to say that this pattern actually is FFI safe and allow calling these functions from C FFI.
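A sketch of construction through Box<T>, again using extern "C" as a stand-in for the suggested extern "RustStable". As far as I know the Box documentation does guarantee a single-pointer representation for T: Sized. The type and function names here (Parser, parser_new, parser_free) are hypothetical.

```rust
// Hypothetical upstream type with an undefined (repr(Rust)) layout.
pub struct Parser {
    depth: usize,
    stack: Vec<u32>,
}

// The constructor hands ownership across the boundary as a thin pointer.
#[allow(improper_ctypes_definitions)]
pub extern "C" fn parser_new() -> Box<Parser> {
    Box::new(Parser { depth: 0, stack: Vec::new() })
}

#[allow(improper_ctypes_definitions)]
pub extern "C" fn parser_push(parser: &mut Parser, token: u32) {
    parser.stack.push(token);
    parser.depth = parser.stack.len();
}

#[allow(improper_ctypes_definitions)]
pub extern "C" fn parser_depth(parser: &Parser) -> usize {
    parser.depth
}

// Taking the Box back by value runs the drop glue and frees the allocation
// on the upstream side, where the real layout is known.
#[allow(improper_ctypes_definitions)]
pub extern "C" fn parser_free(parser: Box<Parser>) {
    drop(parser);
}
```

A Rust caller can simply let the Box go out of scope. A C caller would hold the pointer and has to call parser_free exactly once.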

Linking would only be defined to work when

  • The version of the "header" being used to link is at most as semver recent as the compiled version, and is semver compatible.
  • The version of the compiler being used to link is at least as recent (in semver terms) as the latest change to abi = "stable" that precedes the version of the compiler used to compile the library.

To help enforce these requirements, these versions can be embedded in the compiled artifact and checked at link time.
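The first rule could be checked roughly like this. This is only a sketch: Version and header_may_link are invented names, the compatibility test follows Cargo's caret convention (same leftmost non-zero component), and the compiler-version rule is left out.

```rust
// Field order matters: the derived ordering compares major, then minor, then patch.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

impl Version {
    pub const fn new(major: u64, minor: u64, patch: u64) -> Version {
        Version { major, minor, patch }
    }

    // Assumption: "semver compatible" means Cargo-style caret compatibility.
    pub fn is_semver_compatible_with(self, other: Version) -> bool {
        if self.major != 0 || other.major != 0 {
            self.major == other.major
        } else if self.minor != 0 || other.minor != 0 {
            self.minor == other.minor
        } else {
            self.patch == other.patch
        }
    }
}

// `header` is the version the downstream crate was compiled against;
// `compiled` is the version embedded in the artifact being linked.
pub fn header_may_link(header: Version, compiled: Version) -> bool {
    header <= compiled && header.is_semver_compatible_with(compiled)
}
```

With both versions embedded in the artifact, a linker-side check would reduce to a single call to something like header_may_link.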

The final question is what level of lifetime generics we allow. The simple solution is to only allow the elided lifetime '_. The author believes it is ok to allow whatever lifetime signatures are desired, as a Rust caller is required to link a compatible "header", and C FFI is used to dealing with complicated lifetime requirements and will benefit from the machine-checked ones. This isn't a dimension we have to restrict for ABI stability, as lifetimes don't exist at runtime.

One might also raise the question of partially ABI stable crates. The author is comfortable delaying that question, especially as a wrapper crate can expose the partial API in an ABI stable manner.


Major potential downsides:

  • Bigger ABI stable surface area of the language
  • Exclusion of traits from ABI stable crates, as std traits don't have ABI stable interfaces
    • Somewhat fixable in the language: export stable ABI wrappers around trait dispatched functions and use those
    • Library workaround: make a normal crate and an ABI stable wrapper
  • "minor" breaking changes become "major" breaking changes when they change ABI
    • The biggest potential issue here is addition of drop glue. Hence the rule about always treating types as having drop glue.
  • Requires a stable mangling scheme or manual mangled symbol names everywhere
    • Language solution: guarantee v0 mangling
    • Issue: moving declarations and re-exporting at the original location changes mangling
      • Solution: just emit duplicated symbols for every re-export location that refer to the same object
    • Library solution: manual mangled names
  • The standard library can't ever have a stable ABI under this system as extern "RustStable" fn cannot unify with extern "Rust" fn
    • Library solution: a stable ABI wrapper for the stable ABI compatible part of the standard library (which is decently small, as it requires no generic types)
  • Puts a foot in the door of a "proper" "stable ABI"

Major potential upsides:

  • Huge incremental compile benefit when touching a simple ABI stable crate in a deep hierarchy (just recompile it rather than the world)
  • Linking of (simple) precompiled Rust crates becomes somewhat practical
  • Puts a foot in the door of a "proper" "stable ABI"

Notable potential extensions:

  • Simple unsized types, such as slices (including &str)
    • (But be careful not to stabilize the "wrong" representation of CStr)
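Until slices are covered, the usual workaround is to spell out the pointer and length by hand. A rough sketch, with StrView and count_spaces as made-up names. The extension would amount to blessing a representation like this for &str itself.

```rust
// Hand-rolled, repr(C) stand-in for `&str`: a pointer plus a byte length.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct StrView {
    ptr: *const u8,
    len: usize,
}

impl StrView {
    pub fn new(s: &str) -> StrView {
        StrView { ptr: s.as_ptr(), len: s.len() }
    }

    /// Safety: the view must still point at `len` bytes of live, valid UTF-8.
    pub unsafe fn as_str<'a>(self) -> &'a str {
        unsafe {
            std::str::from_utf8_unchecked(std::slice::from_raw_parts(self.ptr, self.len))
        }
    }
}

/// Safety: `text` must satisfy the requirements of `StrView::as_str`.
pub unsafe extern "C" fn count_spaces(text: StrView) -> usize {
    let s = unsafe { text.as_str() };
    s.bytes().filter(|b| *b == b' ').count()
}
```

The cost of the workaround is visible here: the lifetime is lost at the boundary and the caller has to uphold it manually, which is what making &str directly usable would avoid.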

At a minimum, I'd like to see rustc/cargo grow enough smarts that at least in debug mode, they can notice when a crate's recompile doesn't change the public ABI, so downstream crates don't have to be recompiled. This improvement doesn't need to expose any knobs to the user. IIRC, just touching a deep dependency causes the world to be recompiled, when we could do with recompiling the one crate, noticing the ABI didn't change at all, and relinking, avoiding recompiling anything downstream.


To be super clear:

This would make &_, &mut _, and Box<_> "ABI safe" for any sized type under the rules above (downstream treats it as ?Sized (thin), Drop, and not automatically having auto traits) alongside all preexisting "FFI safe" types. Further types (notably Rc<_> and Arc<_>) could potentially be added in the future.

Instantiations of a generic type are not generic and are allowed, so long as they follow the other rules for "ABI safety".

Any smart pointer with an into_raw_parts can be hacked into this system via a repr(C) container of said raw parts, and thus should probably be a potential candidate for becoming "ABI safe" in the future.
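For example, Arc can be pushed through a repr(C) container today using Arc::into_raw and Arc::from_raw. The names (Config, ConfigHandle, config_share, and so on) are hypothetical and this is only a sketch of the hack, not a proposal for how Arc would become "ABI safe".

```rust
use std::sync::Arc;

// Hypothetical upstream type with an undefined layout.
pub struct Config {
    retries: u32,
}

// repr(C) container holding the raw parts of an Arc<Config>.
#[repr(C)]
pub struct ConfigHandle {
    raw: *const Config,
}

#[allow(improper_ctypes_definitions)]
pub extern "C" fn config_share(retries: u32) -> ConfigHandle {
    ConfigHandle { raw: Arc::into_raw(Arc::new(Config { retries })) }
}

/// Safety: `handle` must come from `config_share` or `config_clone`
/// and must not have been released yet.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn config_clone(handle: &ConfigHandle) -> ConfigHandle {
    // Bump the strong count without materializing an Arc on this side.
    unsafe { Arc::increment_strong_count(handle.raw) };
    ConfigHandle { raw: handle.raw }
}

/// Safety: same requirements as `config_clone`.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn config_retries(handle: &ConfigHandle) -> u32 {
    unsafe { (*handle.raw).retries }
}

/// Safety: each handle must be released exactly once.
#[allow(improper_ctypes_definitions)]
pub unsafe extern "C" fn config_release(handle: ConfigHandle) {
    // Rebuild the Arc so its destructor decrements the count.
    drop(unsafe { Arc::from_raw(handle.raw) });
}
```

All the reference counting stays on the upstream side, so downstream never needs to know how Arc lays out its counters.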


I think this is basically what rustc's existing incremental compilation feature is meant to do. Someone with more knowledge of rustc internals could chime in about how well it can currently do this and what limitations exist. All I know is that it's limited enough that last time I tried to build rustc with incremental compilation enabled, even a trivial rebuild took forever. :) But this seems essentially orthogonal to a user-facing "stable ABI" feature.

On the other hand, I think a user-facing feature would have a variety of use cases, including:

  • Speeding the initial compile – being able to compile only the interface of a crate before starting to compile its dependent crates.

  • Developer control – being able to explicitly mark which parts of a crate can force recompilation of dependent crates, and which parts can't.

  • Dynamic linking – Linux distros, etc.

  • Dynamically loaded plugins. You can sort of do this already with abi_stable.

  • Binary caching – Doesn't strictly require a stable ABI, but it would be really nice if cargo could recompile some dependencies from source, such as -sys crates (since different systems may have different library paths and such), but still be able to grab prebuilt binaries for crates that depend on them. That would drastically reduce the number of duplicate binaries that a binary cache would have to compile, and make cross-compilation more feasible.

  • Binary-only distribution of proprietary software (I hate this, but it is a use case).

Several of these use cases do seem to require a way to separate interface from implementation within a crate's source code; in other words, the equivalent of a header file. Dynamic linking and binary caching don't strictly require it.


Does it require Rc and Arc to be made into lang items?

Why in the Cargo.toml? Another option (there are probably many) would be to specify #![abi(C)] or #![abi(stable)] at the top of a crate.

At a glance, this seems almost impossibly restricting. One real restriction is that we can't have any polymorphic API surface area. It would appear that we're also limited to repr(C) types everywhere and the other limitations FFI safety implies, but that's a little too strict.

Just to clarify: the library can use traits, repr(Rust), Rust functions, etc. internally. What it cannot do is export any pub function items that aren't extern "C". That would already restrict the arguments of these functions to be repr(C), and since they are types in a public API, they would need to be public and repr(C) as well.

This isn't necessarily incompatible with allowing such a crate to export repr(Rust) structs, traits, etc. as long as these aren't used in any of the function items exposed by the API. When compiling the crate, we can separate the "C" part from the Rust part. The C part would be stable, the Rust part would have an unstable ABI.

For T: Sized, the layout of &T is defined. That means that a function with a defined calling convention can take &T and be ABI defined.

This is currently unspecified, and for example if T is usize we could just as well pass it by copy instead (the function can't modify it, and no other threads can modify it either, so passing it by pointer and loading it inside the function isn't necessary). I think this optimization might be ok for all T: Copy + Freeze types that are sufficiently small (e.g. &(usize, usize) would be ok as well, since we already do this optimization for scalar pairs).

We may want an extern "RustStable" fn to

I think I would prefer a versioned ABI, e.g., extern "Rust=0.1" fn, and maybe at some point, if we decide to stabilize it, provide an extern "Rust=1.0" fn ABI. This would allow us to bump it in the future, and handle both ABIs in the same program. It's unclear to me whether it would be interesting to tie this kind of versioning to editions somehow. Maybe, maybe not.


One issue we also need to consider is how this would interact with panicking. I think it would be ok to initially say that extern "Rust=0.1" fn functions abort on panic, and we can always provide a different ABI where panicking is allowed if we manage to stabilize that.
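For illustration, abort-on-panic at the boundary can be written out explicitly today. The helper abort_on_panic and the function checked_div are invented names. As far as I know, recent compilers already abort when a panic would unwind out of an extern "C" fn, so the guard mostly makes the rule visible.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::process::abort;

// Runs `body`; if it panics, the process aborts instead of unwinding
// across the ABI boundary.
fn abort_on_panic<R>(body: impl FnOnce() -> R) -> R {
    match catch_unwind(AssertUnwindSafe(body)) {
        Ok(value) => value,
        Err(_) => abort(),
    }
}

// Integer division panics on a zero divisor (and on i32::MIN / -1),
// so the body is wrapped in the guard.
pub extern "C" fn checked_div(a: i32, b: i32) -> i32 {
    abort_on_panic(|| a / b)
}
```

An ABI that allows panics to cross the boundary would need a defined unwinding mechanism instead, which is why starting with abort is the conservative choice.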


FWIW I think that doing something in this space is a great idea, and that starting with the simplest possible thing that could work (e.g. an #![abi(stable)] that restricts the public API to repr(C) types and extern "C" functions, and turns the improper_ctypes warning into deny) would already be a great start.

We can then try to figure out how cargo could exploit this, and in parallel, how we can allow such crates to do more stuff (e.g. the extern "Rust=X" fn part).
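Most of that simplest version can be approximated with existing lints. A sketch, assuming improper_ctypes_definitions is the relevant lint for function definitions (improper_ctypes covers extern blocks); the module and type names are made up.

```rust
// Turn the FFI-safety warning into a hard error for everything exported here.
#[deny(improper_ctypes_definitions)]
pub mod stable_api {
    // Public types in the API surface have a defined layout.
    #[repr(C)]
    #[derive(Clone, Copy, Debug, PartialEq)]
    pub struct Point {
        pub x: f64,
        pub y: f64,
    }

    // Public functions use a defined calling convention and repr(C) types only.
    pub extern "C" fn point_add(a: Point, b: Point) -> Point {
        Point { x: a.x + b.x, y: a.y + b.y }
    }

    // Uncommenting this should be rejected by the lint, because `String`
    // has no defined layout:
    // pub extern "C" fn point_name(_p: Point) -> String { String::new() }
}
```

What the lint cannot do is forbid ordinary pub fn items with the default Rust ABI, which is the part a crate-level attribute would have to add.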


For regular static builds, I've got a feeling that incremental compilation should handle it automatically (e.g. downstream crates should know which ABI details they depend on, what they have inlined, and skip rebuild if nothing they used has changed).

However, I welcome all progress towards proper ABI for dynamic libraries.

Could this be changed to be an ABI for dynamic linking? e.g. a Rust crate compiled as .so would export only items marked as pub(dyn), and pub(dyn) would enforce the ABI stability limitations.


Rust should explore "stabilizable" layout for both data types and trait objects, including cleaner type erasure, but actually stabilizing anything sounds years away.

I think https://github.com/rust-lang/rfcs/issues/600 shows that pub(dyn) makes no sense, but annotations like #[repr(redox-v1)] pub enum .. and #[repr(redox-v1)] dyn Trait or #[repr(redox-v1)] trait .. make sense.

There is much to learn from Swift à la https://gankra.github.io/blah/swift-abi/ too, especially their improved handling of unsized types on the stack, and their LLVM hacks adding return path optimizations for Result<_,SomeDynError>. Yet, I think their implicit getters and setters approach violates Rust's core objectives.

In my view, Rust should instead pursue

  • fields in traits with field offsets in trait objects eventually, including bit offsets for bools, discriminants, etc.
  • manual annotations for layout, including only partially annotated layout; note that manual enum discriminants were accepted in https://github.com/rust-lang/rfcs/pull/2363
  • improve handling of unsized types, ideally even on the stack

If you need ABI stability then you'd either specify a #[repr(..)], a layout for all pub fields, and sometimes a size, or else define an object safe trait with a #[repr(..)].

In principle one could even make traits like this object safe by adding an invisible fn process_size(&self) -> usize and fn new_size(self: &dyn(vtable_only) Foo) -> usize methods, along with some &dyn(vtable_only) Trait type.

#[repr(Rust-v2)]
pub trait Foo : ?Sized {
    x,y: f64;
    mut okay : bool;
    fn new() -> dyn Self;
    fn process(self: dyn Self) -> dyn Self;
}
