A Stable Modular ABI for Rust

Even if there are ABI-defining APIs for the more complicated cases, I think it still would make sense to have convenience APIs for the simpler cases. So the staged design process does need to make sure to leave space for expanding towards being able to define the full Swift ABI, but it doesn't need to actually handle that complexity until later.

5 Likes

The only potential problem I see of taking the proposal in stages from simpler to more complicated ABIs is that we might have to refactor the plugin API each time we need it to be able to support a more complicated ABI.

Given that this would be a plugin-style architecture, we should look at how other plugin architectures solve this problem.

In my experience, it is solved by having a general-purpose communication channel that is flexible enough to not require changes going forward. On top of that, the actual interaction is negotiated using capabilities on one or both sides.

For example:

rustc: what ABI features do you understand?

plugin: how to lay out structs and enums

rustc: OK, I won't try to give you "niche" details, and I'll error if someone tries to use this ABI with tuples or arrays.

This allows more features to be added without breaking existing plugins.
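A minimal sketch of that negotiation in Rust, purely for illustration. The `Capability` enum, the `AbiPlugin` trait, and the `negotiate` function are hypothetical names invented here; nothing like them exists in rustc today. The idea is that the only fixed part of the interface is the capability query, so new capabilities can be added without breaking older plugins.

```rust
use std::collections::HashSet;

/// Hypothetical ABI features a plugin may declare support for.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Capability {
    StructLayout,
    EnumLayout,
    TupleLayout,
    ArrayLayout,
    NicheInfo,
}

/// Hypothetical plugin interface: the capability query is the stable entry point.
pub trait AbiPlugin {
    fn name(&self) -> &str;
    fn capabilities(&self) -> HashSet<Capability>;
}

/// A plugin that only knows how to lay out structs and enums.
pub struct SimplePlugin;

impl AbiPlugin for SimplePlugin {
    fn name(&self) -> &str {
        "simple"
    }

    fn capabilities(&self) -> HashSet<Capability> {
        HashSet::from([Capability::StructLayout, Capability::EnumLayout])
    }
}

/// Compiler side: report an error if the crate needs something the plugin
/// did not advertise, instead of handing it data it cannot understand.
pub fn negotiate(plugin: &dyn AbiPlugin, required: &[Capability]) -> Result<(), String> {
    let supported = plugin.capabilities();
    for cap in required {
        if !supported.contains(cap) {
            return Err(format!(
                "ABI plugin `{}` does not support {:?}",
                plugin.name(),
                cap
            ));
        }
    }
    Ok(())
}

fn main() {
    let plugin = SimplePlugin;
    // Structs and enums are fine for this plugin.
    println!("{:?}", negotiate(&plugin, &[Capability::StructLayout]));
    // Tuples were never advertised, so this yields an Err.
    println!("{:?}", negotiate(&plugin, &[Capability::TupleLayout]));
}
```

A plugin written against this interface keeps working when the compiler later learns about more capabilities, because it is never asked about anything it did not list.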

6 Likes

There are lots of good ideas here, with the possibilities of multiple ABIs, one per language. But, as someone who was deeply involved in the C/C++ ABI for the MIPS processor, one thing that people seem to be missing is that there are NxMxTBD ABIs. One per processor (and possibly more if you consider little-endian vs. big-endian, small vs. medium vs. large memory models etc), one per language, and maybe more (the TBD). So, tossing around the term "ABI" hides a world of work. And you need to consider ABIs you're not working on to preclude conflicts with the ones you are.

People touched upon the various things Swift supports; the general question arises about support for things which don't have a Rust equivalent. FFIs work from language A to language B and the reverse. My initial reaction is, if Rust doesn't support it, the ABI doesn't support it. This is just a matter of trying to manage the scope of such a project. This doesn't preclude adding in support later, it just makes things more achievable in the near term.

Having said that, it's an awesome idea. The benefits are large. It allows third-party Rust compilers to emerge which may be highly optimized, and hence useful, for specific architectures, much as using an Intel C compiler may give you better performance in generated code than LLVM or GCC. This means slightly less support for the open source language but significantly more support for Rust.

13 Likes

Yeah, I've seen how these things can be way more complicated than they might seem from the outside. I think this could potentially be very difficult, but I also think that the possible advantages could be enormous, so it is worth evaluating as thoroughly as we can what we might actually be able to accomplish, even if the initial result is more limited than we would hope for in the future.

I think that makes sense. The scope of a fully functional and capable modular ABI system is probably massive, so it would be good to focus initially on what we will be able to extend with more features later, and what could actually start bringing in benefits for some use-cases.

1 Like

This sounds a little like Cap'n Proto to me, which could actually be an option for implementing the plugin protocol. I'm not sure whether building the plugins into the compiler through macros might be a better idea, though. That would probably make distributing the ABI plugins as crates easier. (This doesn't preclude us from using a capabilities-based ABI plugin implementation model.)

Also, WASM is another idea. It has already shown promise as a way to compile and use procedural macros, with a successful POC, I think (there's a post around here somewhere about it).

3 Likes

watt

2 Likes

I think this is a great idea and I hope it succeeds. But I think it is important to clarify the goals of this modular ABI project. I reread this article and I think it provides a great overview of the Rust ABI problem space. This is also very relevant.

The questions I have:

  1. Does the modular ABI deal only with type layout or also with calling conventions?
  2. Is it handled as annotation per struct similar to #[repr(C)] or is it a global compiler/cargo option or a singleton similar to global_allocator? Making repr(C) just another ABI module would be very elegant, but it limits the usability if every struct has to be annotated, especially in dependencies.
  3. If the ABI is applied globally, does it apply to all types in the crate or only the public ones? What about re-exported types from dependencies?
  4. How are different targets handled? Is there one ABI module per target?

6 Likes

I'm going to answer these the best I can. Maybe @isaac or someone else has more input:

Not sure because I don't understand the technical aspects fully, but I think it would deal with both.

I think we would have options for both an annotated version such as #[repr(ABI)] and a global flag that could be used to compile a whole crate as a dynamic library with that ABI.

I think it would only be necessary for the public ones, but there might be reasons to do the private ones as well. Restricting it to the public ones would let the rest of the types benefit from any optimizations the Rust compiler team adds to the default Rust type layouts.

From what I've heard, it sounds like, yes, we might need an ABI module per target, but I don't know.

1 Like

As far as I understand, we may want to specify the ABI for everything that crosses a boundary to anything that isn't statically linked with the current crate. Currently, only types that are marked #[repr(C)] can safely cross that boundary. But if more ABIs were supported, it would still be the same part of the code that would cross the boundary, and thus the same places that would need to be annotated.

There are two types of places where ABI is crossed:

  • where you are consuming types or calling functions with a known ABI (for example when your code calls a C library, or a Swift library)
  • where you are consumed by something else (for example when your code is called as a C library, or a Swift library)

In the first case, you want to explicitly specify the ABI used (#[repr(C)] or #[repr(Swift)] respectively), since you know how the other library is called.

However, in the second case, I think it's better to leave the ABI unspecified, and control it with a global flag. Types/functions could be annotated with something like #[repr(ABI)]. The concrete ABI would be selected at the rustc (cargo?) level. If more flexibility is required, #[cfg(...)] flags could be used.
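Something close to this can already be approximated in today's Rust with `cfg_attr`, shown here as a rough sketch only. The Cargo feature name `c_abi`, the `Point` type, and `point_norm_sq` are all made up for illustration, and the feature stands in for the global ABI switch; `#[repr(ABI)]` itself does not exist.

```rust
// Assumption: a Cargo feature named `c_abi` plays the role of the global
// ABI flag. When it is off, the type keeps the default Rust layout.
#[cfg_attr(feature = "c_abi", repr(C))]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Built with the C calling convention only when `c_abi` is enabled.
#[cfg(feature = "c_abi")]
pub extern "C" fn point_norm_sq(p: Point) -> f64 {
    p.x * p.x + p.y * p.y
}

/// Same function with the default Rust calling convention otherwise.
#[cfg(not(feature = "c_abi"))]
pub fn point_norm_sq(p: Point) -> f64 {
    p.x * p.x + p.y * p.y
}

fn main() {
    let p = Point { x: 3.0, y: 4.0 };
    // The call site looks the same whichever ABI was selected at build time.
    println!("{}", point_norm_sq(p));
}
```

The source stays the same for every consumer; only the build configuration decides which layout and calling convention the exported items get.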

If you have a type that doesn't have a representation for a given ABI (for example String cannot be currently used with the C ABI), you can use #[cfg(...)] flags to manually implement a fallback. In the future, the compiler could also automatically do the conversion for you if it is considered useful (in the case of String, it could for example be copied into a null-terminated u8 buffer).
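For the String case, the manual fallback can be written today with `std::ffi::CString`, which copies the bytes into a new NUL-terminated buffer. The sketch below is only an illustration: `to_c_buffer` and `c_side_len` are hypothetical helper names, and `c_side_len` merely stands in for a real C function on the other side of the boundary.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Copy a Rust string into a newly allocated, NUL-terminated buffer.
/// Returns None if the string contains an interior NUL byte, which a
/// C string cannot represent.
pub fn to_c_buffer(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

/// Stand-in for a C function that receives the pointer (hypothetical).
/// It counts the bytes before the terminating NUL.
pub extern "C" fn c_side_len(ptr: *const c_char) -> usize {
    // SAFETY: callers in this example always pass a pointer obtained from a
    // live CString, so it is valid and NUL-terminated.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_bytes().len()
}

fn main() {
    let owned = String::from("hello");
    // The CString must outlive every use of the raw pointer.
    let buf = to_c_buffer(&owned).expect("string contains an interior NUL");
    println!("{}", c_side_len(buf.as_ptr()));
}
```

An automatic conversion by the compiler, as suggested above, would presumably have to make the same copy and face the same interior-NUL corner case.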

To sum up:

  • if you are writing a library that can be consumed by anything, you would use #[repr(ABI)]
  • while if you are writing a wrapper for an existing library, you would use the concrete ABI (for example #[repr(C)] or #[repr(Swift)]).

3 Likes

I have some questions, please forgive me if they're stupid:

Does that also include extern "ABI" and extern "Swift"?

Should it be possible to support multiple ABIs from a crate, e.g. when the crate is dynamically linked to a C and a Swift library?

How should standard library types such as Option, Result and u128 be supported?

Should ABIs be handled by rustup (so you can do rustup abi install Swift && cargo install -- --abi=Swift) or as cargo dependencies (e.g. swift_abi = { version = "1", abi = "Swift", optional = true })?

1 Like

Honest questions aren't stupid! :slight_smile:

Yes.

I think so, if you use per-struct and per-function annotations such as #[repr(Swift)] and #[repr(C)].

That's a good question. We were thinking about allowing you to build the whole crate with a given ABI through a cargo flag or something similar, which seems like it would be best managed with rustup. At the same time, we need to be able to version these ABI crates with SemVer and apply per-struct annotations in the crate, which suggests that a crate dependency is the way to go.

If the ABI crates were implemented as macros, I think crates would be a good fit. I'm leaning towards that.

2 Likes

Having it somewhere within the Cargo.toml file will also ensure repeatable builds, something I'm fighting with at work right now. So a big :+1: from me for putting it in the Cargo.toml file!

I like this idea, but I want to be sure that it interacts well with the idea of storing it as a dependency within Cargo.toml. That is, I want to be certain that there is no way to specify conflicting ABIs in the Cargo.toml file and the code.

Come to think of it, how would workspaces be handled? What if you include a crate as a dependency whose ABI is different from your own crate? How do we decide which ABI (if any) 'wins'?

2 Likes

Well, just including the crate wouldn't do anything to change the crate ABI, you could even include multiple ABIs without having any effect until you add annotations to the crate.

I'm thinking you would have to add a crate-level attribute such as #![repr_crate(Swift)] if you wanted to change the ABI of the whole crate, and #[repr(Swift)] on structs (and I think functions, too) if you wanted to change the representation and calling convention for those specific items.

I think at that point it would be easy enough to statically guarantee non-conflicting ABIs at compile time.

1 Like

Ah, thank you, I thought there was some kind of special magic going on that meant that specifying the dependency would also implicitly set #![repr_crate(Swift)] on the crate.

1 Like

Maybe I'm completely brain glitching here, but does that mean something like

#[repr(C)]
#[repr(Swift)]
pub struct Blah {
    field_1: u8
}

would be expected to work? Or would there be some kind of configuration switch that needs to be selected so that you could 'turn on' one or the other ABI?

1 Like

No, that would be a conflict, just as you can't (as far as I know) do this in today's Rust:

#[repr(C)]
#[repr(Rust)]
pub struct Blah {
    field_1: u8
}

A struct could not have two memory representations at the same time and the compiler should tell you that.
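As a rough illustration of why one struct can only carry one representation, the sketch below declares the same fields twice, once with #[repr(C)] and once with the default layout, and prints their sizes. The type names `BlahC` and `BlahRust` and the extra fields are made up for the example. The figures quoted in the comments for the #[repr(C)] version assume a target where u32 is 4-byte aligned; the default layout is unspecified, so no figure is claimed for it.

```rust
use std::mem::{align_of, size_of};

/// Fields are laid out in declaration order, with C-style padding.
#[repr(C)]
pub struct BlahC {
    pub a: u8,
    pub b: u32,
    pub c: u8,
}

/// Default Rust layout: the compiler may reorder fields, and the result
/// is not guaranteed to stay the same between compiler versions.
pub struct BlahRust {
    pub a: u8,
    pub b: u32,
    pub c: u8,
}

fn main() {
    // With 4-byte-aligned u32: 1 + 3 padding + 4 + 1 + 3 padding = 12 bytes.
    println!(
        "repr(C):    size {}, align {}",
        size_of::<BlahC>(),
        align_of::<BlahC>()
    );
    // Whatever the compiler chose; often smaller because fields get reordered.
    println!(
        "repr(Rust): size {}, align {}",
        size_of::<BlahRust>(),
        align_of::<BlahRust>()
    );
}
```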

2 Likes

Makes sense, but I have to admit I was kind of hoping it would work... you could then use Rust to create glue between compiled objects that are in one ABI and get them to work with a different ABI. I fully admit that would be a niche application (only useful for code where you've lost the source, but still have the binary).

EDIT

I just realized that there could be another option. Would the following be expected to work?

#[repr(C)]
pub struct Blah_cabi {
    field_1: u8
}

impl From<Blah_rustabi> for Blah_cabi {
    // ...implementation details...
}

#[repr(Rust)]
pub struct Blah_rustabi {
    field_1: u8
}

impl From<Blah_cabi> for Blah_rustabi {
    // ...implementation details...
}

If so, it would be relatively easy to write up a derive macro that would generate the glue code, so you could have something like the following:

#[derive(ReprC, ReprRust)]
pub struct Blah {
    field_1: u8
}

4 Likes

Yes, translating between ABIs is possible using different structures for the different layouts.
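A runnable version of that pattern in today's Rust might look like the sketch below. The types are renamed to `BlahCAbi` and `BlahRustAbi` to follow Rust naming conventions, and the `From` bodies are just one plausible way to fill in the elided implementation details, namely copying field by field.

```rust
/// Layout fixed by the C ABI, suitable for handing across an FFI boundary.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BlahCAbi {
    pub field_1: u8,
}

/// Default Rust layout, used inside the crate.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BlahRustAbi {
    pub field_1: u8,
}

impl From<BlahRustAbi> for BlahCAbi {
    fn from(b: BlahRustAbi) -> Self {
        // Field-by-field copy; the compiler handles the differing layouts.
        BlahCAbi { field_1: b.field_1 }
    }
}

impl From<BlahCAbi> for BlahRustAbi {
    fn from(b: BlahCAbi) -> Self {
        BlahRustAbi { field_1: b.field_1 }
    }
}

fn main() {
    let internal = BlahRustAbi { field_1: 7 };
    // Convert to the C layout at the boundary, then back again.
    let exported: BlahCAbi = internal.into();
    let back: BlahRustAbi = exported.into();
    println!("{:?} -> {:?} -> {:?}", internal, exported, back);
}
```

A derive macro along the lines suggested earlier would essentially generate the second struct and the two `From` impls from a single definition.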

5 Likes

Sounds like Rust could easily become a useful tool for writing glue code! :smiley:

4 Likes

This use as inter-ABI glue was mentioned in a post in the URLO thread that gave rise to this IRLO thread, as part of the motivation for pursuing modular ABIs and permitting multiple such ABIs to be in use concurrently (just as Rust and C ABIs can be used concurrently in present-day Rust).

1 Like