#[repr(Interoperable_2024)]

This is a continuation of the discussion in Discussion: Editions in Rust-GCC (and other Rust compilers), which diverged from the original topic.

Summary

In the Discussion: Editions in Rust-GCC (and other Rust compilers), we arrived at the idea that we should have some kind of stabilized representation that is fully and unambiguously specified. Since that wildly diverged from the original topic, @comex and @scottmcm made the excellent point that instead of hijacking that thread, discussions of stabilized representations should be split off into their own topic (this one).

The Issue

We want Rust binaries to be able to link with other binaries, whether those are compiled Rust code or compiled from other languages. The normal way to handle this is to mark the shared items as #[repr(C)] and call it a day. However, if we had a stabilized representation that was fully and unambiguously specified, we'd have the opportunity to go beyond what #[repr(C)] is able to guarantee and provide. Moreover, since the new representation would be separate from #[repr(C)], we'd have the opportunity to fix any ambiguity that might have cropped up. Finally, since we have the concept of editions, we can bake the layout's version information into the layout itself. This would allow forwards and backwards compatibility.
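For reference, here is a minimal sketch of the status quo described above: a type and a function shared across a language boundary using #[repr(C)] and extern "C". The names `Point` and `point_norm_squared` are hypothetical and exist only for illustration.

```rust
/// A struct laid out according to the platform's C rules, so that C (or any
/// language with a C FFI) can read and write it directly.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// A function using the platform C calling convention.
/// A real export would also need an unmangled symbol name; that attribute is
/// left out here to keep the sketch independent of the Rust edition in use.
pub extern "C" fn point_norm_squared(p: Point) -> i64 {
    let (x, y) = (i64::from(p.x), i64::from(p.y));
    x * x + y * y
}
```

Everything beyond plain integers and structs (slices, strings, enums with data) has to be lowered by hand to raw pointers today, which is the gap a stabilized representation would aim to close.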

So What Now?

First, go read the Discussion: Editions in Rust-GCC (and other Rust compilers) if you haven't already. That will give you the background for this topic (and it means that I don't have to expand the summary as much).

If enough people think it's a good idea, I'll file an issue to get this started as a new initiative.

Edit

@steffahn told me how to create a poll. Thank you @steffahn! So, on with a proper poll!

Stabilized representation
  • Yes, we need a stabilized representation
  • No, we don't need a stabilized representation
  • Something else (comment below)


Edit 2 - References

I'm going to try to maintain a reference section of other places where the layout/ABI effort has been proposed before. That way we can all go back to it as needed.

Edit 3 - 22 August 2022 - Summary of discussions so far.

This is a summary of discussions both here and on Zulipchat. No decisions seem to have been made yet. Also, Zulip prevents me from mentioning more than 10 people at a time, so I had to remove names to make it happy. I apologize about that.

  • There seems to be a lot of interest in developing a stable cross-language ABI.
  • AT-josh suggests that we should start with extern "C" as a base-line and develop a proper superset of its capabilities. This would ensure that code that is currently extern "C" can be linked to and used by the new ABI.
    • AT-jhpratt echoed this sentiment, suggesting that the initial ABI should be the absolute minimum necessary to clarify extern "C"'s behavior.
  • AT-comex pointed out that WebAssembly still requires an ABI, and that maybe we should start with what they have. AT-yigal100 strongly argued for more use of WebAssembly as the best layout, and that we should all be working towards that in every case.
  • AT-programmerjake thought it would be neat to "end up with an ABI that works across different ISAs, such that I could have a function in x86_64 call a function in AArch64 call a function in RV64GC, etc."
  • Everyone seems to have their own favorite name for this ABI layer, from AT-josh's extern "safe-1", to AT-ckaran's extern "interoperable_<YEAR>", to AT-nacaclanga's dual version number approach.
    • There are concerns about what happens if the version or year numbers of the ABIs aren't advanced in lock-step with one another. AT-zackw suggests that the version numbers should only be advanced in lockstep when one or the other needs to 'catch up' to the more advanced number. AT-jjpe was concerned that this would lead to confusion when new users look for the corresponding ABI, only to find that there isn't one. AT-ckaran was strongly in favor of advancing both in lock-step with one another, even if no changes were made.
  • There is interest in having two ABIs, one for static linking and one for dynamic linking. AT-josh has suggested that we concentrate on the static linking part first (which should be easier to do), and work on the dynamic linking part later. AT-ckaran requested that any dynamic linking ABI be a proper superset of the corresponding static linking ABI.
  • There is some discussion on what needs to be stabilized over and above what extern "C" currently does. Ideas include:
    • AT-CAD97 was concerned about how zero-sized types (ZSTs) and zero-sized arrays might be handled as both are illegal in standard C
    • AT-zackw and AT-josh both wanted proper representation of enums, in particular their tag sizes and layouts.
    • AT-CAD97 pointed out that there is a lack of consensus in the C ecosystem on what comes first when passing an array: the pointer or the length argument.
  • AT-CAD97 came up with the idea of a canonical layout/calling convention that is then lowered to the actual calling convention. AT-ckaran voiced strong support for this idea.
  • AT-ckaran mentioned that it is possible to have supported extensions in a similar manner to OpenGL's extension policy. At the same time, he was opposed to the idea, saying that he only mentioned it to document the possibility of it.

If I've missed anything in this summary, please PM me so I can make corrections.


There's been widespread interest in supporting some kind of safe interoperable ABI for a long time.

I think it's important to distinguish between two different cases, both of which are important:

First, an extern "safe-1" (naming bikeshed and versioning conventions aside, and leaving aside associated reprs) that's designed to interoperate between languages. This should be defined as an ABI that uses extern "C" as a baseline, and then introduces representations for additional types, such as counted known-to-be-UTF-8 strings, counted slices, and so on. This would be useful even if it starts out as a relatively small subset of Rust and grows slowly and incrementally, because the primary goal is to talk between safe languages without having to drop down to raw pointers and unsafe.

Second, an extern "rust-dynamic-1" or similar (again, naming bikeshed and versioning conventions aside, and leaving aside associated reprs), which is designed to be a large subset of Rust, without consideration for interoperability with any language other than other versions of Rust. This would exist primarily for use cases like dynamic linking, plugins, and similar cases of separable compilation. For this, we'd want to support things like dynamic trait objects (similar to how Swift's ABI handles generics by handling them all as the equivalent of dyn), for instance.

I think we should treat those as two largely separate efforts, both valuable, with different goals and primary use cases. extern "safe-1" needs to be conservative in what it enables, so that a wide variety of languages can reasonably support all of it, so that we don't have to increase its version number often, and so that it can be implemented more easily. extern "rust-dynamic-1" should be expansive and cover as much of Rust as possible, and we shouldn't expect anything other than Rust to implement it.

I'd love to see both of these. I think extern "safe-1" is a smaller-scoped project that we could accomplish much sooner. extern "rust-dynamic-1" is a much more ambitious project. Both would have a great deal of value.


I can accept this, with the caveat that I want extern "rust-dynamic-1" to be guaranteed to be a superset of extern "safe-1", and that will be true for all future versions (i.e., for any x, extern "rust-dynamic-x" is guaranteed to be a superset of extern "safe-x"). There are dark corners of C++ where it is not a superset of C, and if you are unlucky enough to end up in those dark corners, you will be very, very unhappy. Let's not repeat those mistakes.

That said, given how much work will be involved, I would strongly prefer that we concentrate our initial efforts on extern "safe-1" for now. There is a lot of work in terms of process, etc. that will need to be set up for such an initiative because we're not just talking about Rust here; done right, whatever we come up with could be the basis for many other languages to interoperate with each other, disregarding Rust entirely. I am not enough of an expert to know what the ambiguous corner cases of extern "C" layout are, but given what I know of C, I'm reasonably sure that there are a few. If we are able to solve those layout issues here and now, then I think we'll have done a great service for the world at large.


I do agree that anything we support in the safe ABI should also be supported in the rust-dynamic ABI. I do want to avoid guaranteeing that we'll bump revisions at the same rate; I'd expect that by the time we have safe-2 we might have rust-dynamic-6. But in general, I agree that we shouldn't add something to the safe ABI that isn't supported in the rust-dynamic ABI.

Agreed.

I don't think we should be trying to re-specify extern "C" for any given platform; we should incorporate the platform C calling convention by reference (with references to documents where those documents exist). With some effort, I'm hoping that extern "safe-1" can be largely platform-agnostic. I'd expect the definition of the safe ABI to say things like "tuples are passed as if they were a struct defined with the same fields in the same order", or "slices are passed as if they were two arguments, a pointer and a usize, in that order", or "strings are passed as if they were a slice of bytes; strings are not guaranteed to be nul-terminated, and may contain internal nul characters; strings must be UTF-8". (We'll also need to have a clear and unambiguous definition of usize...)
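To make the "passed as if" wording concrete, here is a rough sketch of what a hand-written lowering looks like today with extern "C". The function names `byte_sum` and `char_count` are hypothetical, and the UTF-8 check is an assumption of this sketch: a safe ABI would presumably make validity a caller guarantee instead.

```rust
use std::{slice, str};

/// Hand-lowered form of `fn byte_sum(data: &[u8]) -> u64`:
/// the slice arrives as a pointer and a usize, in that order.
///
/// # Safety
/// When `len` is non-zero, `ptr` must point to `len` readable bytes.
pub unsafe extern "C" fn byte_sum(ptr: *const u8, len: usize) -> u64 {
    if len == 0 {
        return 0; // do not touch `ptr` for empty input
    }
    let data = unsafe { slice::from_raw_parts(ptr, len) };
    data.iter().map(|&b| u64::from(b)).sum()
}

/// Hand-lowered form of `fn char_count(s: &str) -> usize`:
/// the string is a counted byte slice, not nul-terminated, and may contain
/// interior nul bytes. Returns `usize::MAX` if the bytes are not UTF-8
/// (a sentinel chosen only for this sketch).
///
/// # Safety
/// When `len` is non-zero, `ptr` must point to `len` readable bytes.
pub unsafe extern "C" fn char_count(ptr: *const u8, len: usize) -> usize {
    if len == 0 {
        return 0;
    }
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    match str::from_utf8(bytes) {
        Ok(s) => s.chars().count(),
        Err(_) => usize::MAX,
    }
}
```

With a safe ABI, both signatures could presumably be written with `&[u8]` and `&str` directly, and the unsafe blocks would disappear from user code.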


Some limited subset of enums and generic types, sufficient e.g. to express Result<[u8;N], SmallPositiveInteger> is highly desirable as well, particularly for specifying a kernel-user interface in terms of safe types. I realize this drags in niche optimizations and discriminants.

Might it be less confusing to skip numbers as necessary so that "safe-N" and "rust-dynamic-N" are always paired, when both exist, but one side or the other doesn't always exist? In your hypothetical, we have safe-1 and rust-dynamic-1, and then rust-dynamic-{2,3,4,5} with no corresponding safe-X, and then safe-6 paired with rust-dynamic-6.


This would very likely lead to questions like "I'm using safe-4 but can't find rust-dynamic-4, how can I do X" at some point down the line.

I'm aware they're meant for different use cases, but going with the use cases sketched out for both above, you never know when someone decides to write a new language that both interops with Rust using safe-x as essentially a safer and more featureful repr(C) and also would like to use rust-dynamic-x to do dynamic linking e.g. for plugin support.

EDIT: Of course the entire point might be moot if safe-x is featureful enough to support both of those things on its own.


While I agree w.r.t. calling convention/ABI, I would still prefer that we guarantee that #[repr(C)] layout is the simple in-order repeated application of Layout::extend. This allows us to avoid the messy and inconsistent handling of ZSTs and zero-sized arrays (as zero-sized types are illegal in standard C), as well as the fact that we already diverge from what some C compilers do, since #[repr(align)] is treated differently than #pragma aligned in certain cases.
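As a sketch of what "in-order repeated application of Layout::extend" means in practice, the helper below folds a list of field layouts into a struct layout using only the standard library. The name `in_order_layout` and the `Example` struct are hypothetical.

```rust
use std::alloc::{Layout, LayoutError};

/// Lays fields out in definition order: each field is placed at the next
/// offset that satisfies its alignment, and the total size is finally
/// rounded up to the struct's alignment.
pub fn in_order_layout(fields: &[Layout]) -> Result<(Layout, Vec<usize>), LayoutError> {
    let mut layout = Layout::from_size_align(0, 1)?;
    let mut offsets = Vec::with_capacity(fields.len());
    for &field in fields {
        let (extended, offset) = layout.extend(field)?;
        layout = extended;
        offsets.push(offset);
    }
    // `extend` does not add trailing padding; `pad_to_align` does.
    Ok((layout.pad_to_align(), offsets))
}

/// A struct to compare against the computed layout.
#[repr(C)]
pub struct Example {
    a: u8,
    b: u32,
    c: u16,
}
```

Under this rule a zero-sized field simply contributes an offset and no size, so ZSTs need no special case; whether that matches a given platform's C compiler is exactly the divergence that a lint would have to flag.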

Now, we can and probably should lint the cases known to diverge from platform C as improper-c-types, but it seems better to use the simple and predictable layout and leave the edge cases to lints rather than complicating our C layout.

(I do understand the benefit of C layout matching the "platform C" layout exactly, but chasing bug-for-bug compatibility of the historically underspecified layout algorithm doesn't seem a useful allocation of anybody's time.)


I don't want to discourage anyone's efforts here, but I do feel we're firmly in the diminishing-returns phase for the two use cases mentioned above. Interoperability today and into the future has to have the security and trust concerns addressed. So while addressing safety is important and required, it is insufficient imo for broad adoption.

We can already use wasm as a compilation target to cater for most of the practical applications today. Yes, it still has a few limitations and some overheads. These limitations are being actively worked on and the overheads are very likely insignificant given that interoperability with other languages usually means less performant languages.

For example, I would much rather see a potential Rust clone of notepad++ use wasm modules for its plugin system. Currently, it uses native DLLs, which means that if I load a faulty or malicious plugin it could crash the host app or take advantage of its capabilities to attack my system. Even if Rust already supported the suggested "dynamic-rust-1", why would the developer bother when they can simply use a crate such as wasmer and provide a much better solution that also covers portability using a few lines of code?

I do realise this doesn't cover 100% of use cases and people are of course free to volunteer their time to work on whatever they find interesting. That's what open source is all about! I just think that given the effort this requires and, as I say, diminishing returns for most Rust users, this should not be a priority for the Rust project.


My concern with an extern "safe" ABI is that other languages will have to support it for it to become useful, and I don't know how many of them will, or even how many of them can.


One use case for dynamic-rust-1 is "I want to build a Linux distribution in binary form, and not rebuild 100% all my Rust packages every time a new Rust compiler comes out". That use case is not solved by WebAssembly.

And the "safe" ABI has even more use cases.


I agree that we should specify Option and Result, as well as enums (e.g. how the discriminant and associated variant data are laid out).

I think we should be cautious specifying niche optimizations and similar, though, lest we increase the complexity too much. It wouldn't be the end of the world if repr(safe_1) effectively used repr(C) layout, which mostly doesn't use niches.

I do think we should specify "pointer that can't be NULL", and require that Option<pointer that can't be NULL> has the same layout as a pointer. I don't think we should go much beyond that in the first version.
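For what it's worth, Rust already documents this guarantee for `Option<NonNull<T>>` (with sized `T`), so the proposal amounts to carrying it over into the new ABI. The helpers below are a small sketch; the names `option_nonnull_is_pointer_sized` and `as_raw` are hypothetical.

```rust
use std::mem::{align_of, size_of};
use std::ptr::NonNull;

/// Checks that `Option<NonNull<T>>` has the size and alignment of a raw pointer.
pub fn option_nonnull_is_pointer_sized<T>() -> bool {
    size_of::<Option<NonNull<T>>>() == size_of::<*mut T>()
        && align_of::<Option<NonNull<T>>>() == align_of::<*mut T>()
}

/// Converts the optional non-null pointer to the nullable pointer a C caller
/// would see: `None` becomes the null pointer.
pub fn as_raw<T>(p: Option<NonNull<T>>) -> *mut T {
    p.map_or(std::ptr::null_mut(), NonNull::as_ptr)
}
```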


This is pretty much my thinking with regard to an initial ABI. Do the absolute minimum. Reorder fields to minimize padding, preferring the definition order in the case of a tie. enums are just tagged unions. Pointers are a logical exception. Anything more than that? I don't think it's worth it for ABI v1.
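One possible reading of "reorder fields to minimize padding, preferring the definition order in the case of a tie" is a stable sort by descending alignment. That rule is an assumption of this sketch rather than something settled in the thread, and `reordered_layout` is a hypothetical name. Because a Rust type's size is always a multiple of its alignment, this ordering leaves no padding between fields, only at the end.

```rust
use std::alloc::{Layout, LayoutError};
use std::cmp::Reverse;

/// Returns the total layout plus `(definition index, offset)` pairs in
/// memory order, after sorting fields by descending alignment.
pub fn reordered_layout(
    fields: &[Layout],
) -> Result<(Layout, Vec<(usize, usize)>), LayoutError> {
    let mut order: Vec<usize> = (0..fields.len()).collect();
    // `sort_by_key` is stable, so fields with equal alignment keep their
    // definition order.
    order.sort_by_key(|&i| Reverse(fields[i].align()));

    let mut layout = Layout::from_size_align(0, 1)?;
    let mut placed = Vec::with_capacity(fields.len());
    for i in order {
        let (extended, offset) = layout.extend(fields[i])?;
        layout = extended;
        placed.push((i, offset));
    }
    Ok((layout.pad_to_align(), placed))
}
```

For fields `(u8, u32, u16)` on a typical target, this places the `u32` first and shrinks the struct from 12 bytes (definition order) to 8.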


The one more thing is standardizing on (ptr, length) for slices. I suppose v1 could always defer to a user #[repr(safe-v1)] (*const T, usize) rather than define a span type itself, but the C ecosystem's lack of consensus on ptr, len or len, ptr argument ordering (despite having an obscure language bias[1] towards len first!) serves as evidence that having a standard early for such a core bit of vocabulary is important.


  1. It's possible to write void foo(size_t len, uint8_t arr[len]) and the language actually recognizes that as the minimum number of elements that the argument pointer points to. It's not possible to write it in the other order without old-style function declarations (which have been removed from C2x). ↩︎
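A user-defined span type of the kind mentioned above can already be approximated with #[repr(C)]. This is only a sketch: the name `Span` and its methods are hypothetical, and the pointer-first field order is the choice under discussion, not an established standard.

```rust
use std::marker::PhantomData;

/// A borrowed, counted view of `T` values: pointer first, then length.
/// `PhantomData` is zero-sized, so the struct occupies exactly two words.
#[repr(C)]
pub struct Span<'a, T> {
    ptr: *const T,
    len: usize,
    _lifetime: PhantomData<&'a [T]>,
}

impl<'a, T> Span<'a, T> {
    /// Builds a span from a Rust slice.
    pub fn from_slice(s: &'a [T]) -> Self {
        Span { ptr: s.as_ptr(), len: s.len(), _lifetime: PhantomData }
    }

    /// Recovers the Rust slice.
    pub fn as_slice(&self) -> &'a [T] {
        // SAFETY: within this sketch a `Span` can only be built by
        // `from_slice`, so `ptr` and `len` describe a slice valid for `'a`.
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}
```

A span received from foreign code would need its validity to be guaranteed by the ABI contract itself, which is part of the argument for standardizing the type rather than leaving it to each user.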


As I said, I don't disagree that there are valid use cases for both ABI variants.

The example above regarding a binary Linux distro is overstated though imo. One pre-existing option is indeed a source code distro. Nothing new about that idea. The other option is a precompiled binary distro. Debian compiles all their packages for the end user. They don't have to do it for each release of Rust though, and they have been maintaining patches on top of the upstream projects in order to support more platforms and backporting security fixes for their longer lived stable releases for decades now. Rust didn't create this "problem".

Regarding WebAssembly: npm has apparently already started supporting wasm artefacts and so do some (all?) of the cloud container registries (e.g. Azure supports this). There are of course also wasm specific cloud platforms. A simple first step for Rust imo should be to add similar support to cargo and crates.io. I reckon it will solve a whole swath of use cases:

  1. Cargo install is low hanging fruit for this. Just install a runtime and your tool of choice as a portable wasm module.
  2. Adding support for rustc to load wasm modules would allow us to have precompiled proc-macros. This was discussed already in the past.
  3. I suspect that given this possibility, a lot of other use cases will emerge. It will incentivise people who want to use dynamic libraries to use pre-existing wasm modules in the ecosystem.

Firefox for instance is already using wasm modules to isolate and sandbox internal components from each other. This is the way forward instead of trusting 3rd party binary artefacts which we know is a major potential risk from a security perspective.

Edit (as I forgot to mention before): The Linux kernel has their own custom solution with similar objectives - they have a scripting facility to write robust and safe driver modules. It is less efficient than loading a binary module but for a lot of devices this is an excellent trade-off. So even in kernel space the use case for native dynamic libraries is reduced.

There are use cases for dynamic Rust libraries. There are also use cases for WebAssembly. This thread and extern "rust-dynamic" are trying to solve the former problem. If you'd like to advocate the latter, please take it to another thread, rather than telling people what they shouldn't be working on or trying to invalidate other people's use cases.


Explicit definition of enum tag sizes. And no field reordering at all would probably be simpler, I think.
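Rust can already express an explicitly sized tag on a data-carrying enum, which gives a feel for what this would look like. In the sketch below the `Message` type and its `tag` method are hypothetical; the layout is a `u8` tag followed by a union of the variant payloads, with no reordering.

```rust
/// A tagged union with an explicit one-byte tag.
#[repr(C, u8)]
pub enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(u16),
}

impl Message {
    /// Reads the tag the way a foreign consumer would: as the first byte.
    pub fn tag(&self) -> u8 {
        // SAFETY: with an explicit `u8` representation, the value starts
        // with its `u8` tag.
        unsafe { *(self as *const Self).cast::<u8>() }
    }
}
```

Tags are assigned in declaration order starting from zero unless explicit discriminant values are given, so a consumer in another language can match on the byte without knowing anything about Rust's default enum layout.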

At the risk of further derailing, it's worth noting that linking between multiple WebAssembly modules, or between WebAssembly code and native code, still requires an ABI. The main difference between WebAssembly and native platforms in this respect is that WebAssembly is defining a standard ABI which supports some functionality that existing C ABIs don't. That includes some WebAssembly-specific constructs such as handles and push/pull buffers, but also some constructs that just aren't in C, such as enums, guaranteed-UTF-8 counted strings, and counted slices.

The latter part is quite similar to @josh's described use case for "safe-1". Honestly, I wish the interface types effort had been designed as cross-platform from the start rather than narrowly focused on WebAssembly. I suspect there is still an opportunity to reuse some of their work: it'd be cool to see wit-bindgen ported to target native platforms, though that's more elaborate than what's being proposed here. But as regards this effort, we may want to see if we can make the representation of enums, strings, and slices in "safe-1" compatible with WebAssembly's "canonical ABI", at least when targeting WebAssembly. While this compatibility isn't necessary for WebAssembly's ABI to work (since they are currently using an approach based on generated bindings that perform conversions, rather than directly sharing struct definitions), it would allow their Rust bindings to be nicer/simpler in the future, and would avoid having two competing ABIs. And since WebAssembly's ABI effort is strongly inspired by Rust, there's a good chance that it's compatible with what we'd do anyway. ...Though, the canonical ABI says that enum discriminants are always u32, which seems unlikely to be what we want…

Edit: Requiring enum discriminant sizes to be explicitly specified, as @josh just suggested, would sidestep that problem.


Imho, it would be really neat to end up with an ABI that works across different ISAs, such that I could have a function in x86_64 call a function in AArch64 call a function in RV64GC, etc. This mostly requires memory layouts (structs, enums, etc.) to be identical; different register layouts and calling conventions can be handled using trampolines. Trampolines can't handle translating between different memory layouts in the general case, since some things just can't be moved, such as a shared atomic variable.

I have some ideas for how to implement a processor that has a reprogrammable decoder such that it can easily be made to run any ISA you can think of...

https://bugs.libre-soc.org/show_bug.cgi?id=841

I tend to agree with @jjpe on this; somewhere down the line, people are going to be confused as to why they can use the safe version, but can't find the dynamic version (or vice-versa).

My proposal is the exact opposite of @zackw's (and contrary to @josh's); whenever we release a version of "safe", we also release a version of "rust-dynamic", (and vice-versa), even if there are no changes to the other spec. Somewhere in the bowels of the compiler will be a hashmap that maps these no-op representations to the earlier version (it could literally be string substitution) so that the compiler requires minimal effort to maintain in those cases. The actual written release of the spec sheet could just be a page that says something like "safe-12 is a synonym for safe-5. Click here to go to safe-5's documentation". Doing this is minimal effort, and deals with the inevitable questions about why "rust-dynamic-X" works, but "safe-X" doesn't.
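The alias table could be as small as the sketch below. It is purely illustrative and says nothing about how rustc is actually structured: `resolve_abi` and `example_aliases` are hypothetical names, and the table entries are made up apart from the "safe-12 is a synonym for safe-5" example.

```rust
use std::collections::HashMap;

/// Follows alias entries until it reaches a name with no alias, i.e. the
/// version whose spec actually contains content.
pub fn resolve_abi<'a>(aliases: &HashMap<&'a str, &'a str>, name: &'a str) -> &'a str {
    let mut current = name;
    // Bounded by the table size so that an accidental cycle cannot loop forever.
    for _ in 0..=aliases.len() {
        match aliases.get(current) {
            Some(&target) => current = target,
            None => break,
        }
    }
    current
}

/// A made-up alias table: no-op releases point at the release they repeat.
pub fn example_aliases() -> HashMap<&'static str, &'static str> {
    HashMap::from([
        ("safe-12", "safe-5"),
        ("safe-11", "safe-5"),
        ("rust-dynamic-3", "rust-dynamic-2"),
    ])
}
```

Looking up "safe-12" yields "safe-5", while a name with no entry resolves to itself, so real and no-op versions go through the same code path.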


Emphasis is mine.

I am strongly against this because a spec needs to be clear, complete, and unambiguous or there is no point to it. If the C specs for a platform are ambiguous or missing, then we need to write our own specs as our own "safe" version for that platform. Done right, our version will be adopted by the given platform as its own spec.

Licensing Thoughts

This is going sideways, but I want to address this ASAP. There have been cases in the past (Rambus lawsuits, embrace, extend, and extinguish (EEE)) where good ideas were weaponized to wreck a spec. Can we require that all contributions to any spec be subject to a DCO, and that the spec be licensed under some appropriate license? I'm not sure if MIT/Apache 2.0 is the right license for a spec, possibly something like CC BY-ND would be better to prevent EEE.

Basically, I want to make sure that if someone is claiming to use "safe-X", then what they produce is all and only what is in the spec for "safe-X". No producing a compiler that creates code that only other users of that compiler can consume!
