Discussion: Editions in Rust-GCC (and other Rust compilers)

What this requires is a forwards compatible rlib format that says which internal version of the Rust ABI is in use, such that new rustc versions can read it and know what ABI to expect from symbols in that library.
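To make the idea concrete, here is a minimal sketch of what such a version check could look like. Everything in it is an assumption for illustration: `LibHeader`, `OLDEST_KNOWN_ABI`, `NEWEST_KNOWN_ABI`, and `check_abi` are hypothetical names, and nothing here reflects the real rlib format or any rustc internals.

```rust
/// Hypothetical header a library file might carry (NOT the real rlib format).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LibHeader {
    /// Internal Rust ABI revision the library's symbols were compiled against.
    pub abi_version: u32,
}

/// Oldest and newest ABI revisions this hypothetical compiler can call into.
pub const OLDEST_KNOWN_ABI: u32 = 1;
pub const NEWEST_KNOWN_ABI: u32 = 3;

#[derive(Debug, PartialEq, Eq)]
pub enum AbiError {
    /// The library was built by a newer compiler than this one understands.
    TooNew { found: u32 },
    /// Support for this old revision has been dropped.
    TooOld { found: u32 },
}

/// Returns the ABI revision to use for the library's symbols,
/// or an error if this compiler does not know that revision.
pub fn check_abi(header: LibHeader) -> Result<u32, AbiError> {
    let found = header.abi_version;
    if found > NEWEST_KNOWN_ABI {
        Err(AbiError::TooNew { found })
    } else if found < OLDEST_KNOWN_ABI {
        Err(AbiError::TooOld { found })
    } else {
        Ok(found)
    }
}
```

The forward-compatibility requirement is that the header itself never changes shape, so that even a much newer compiler can still read `abi_version` from an old library.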

How would the old rlib work with a newer core/alloc/std (assuming these will not be ABI-frozen)? AFAIUI rlibs can contain inlined core/alloc/std code that depends on their private implementation details (the ABI being only part of the problem).

Not being able to make them work together would seriously reduce the usefulness of mixing versions.

I was still thinking about an ABI that is predictable and preferably usable by external tooling and languages. It's just that for each version update, the FFI wrappers need to be adjusted, just as for any other ABI break in a library. If the wrapper is generated by a bindgen-like tool from metadata emitted for the Rust code, that metadata could specify the version automatically. The only difference is that rather than annotating every single item with the ABI used, the compiler receives a global switch that applies to every ABI-friendly item. But yes, I also see the downsides of my approach here.

The question is, though: is it a good idea to stick to the current concept, in which every language writes its own header file that hopefully matches the actual ABI and then links the library in a hopefully correctly guessed fashion? Or should we rather define a new linking protocol in which the ABI is described in some metadata and the application checks whether the linked ABI matches the expected one when linking at runtime? In the latter case, headers pretty much must be generated automatically from metadata, and adjusting the ABI of everything for a major version upgrade would probably be worth considering.
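A rough sketch of the second option, assuming a made-up metadata shape: the consumer carries a description of each item it expects, the library carries a description of what it provides, and the two are compared before any call is made. `ItemAbi` and `mismatches` are hypothetical names, and describing a signature by (size, align) pairs is a simplification chosen only to keep the example short.

```rust
/// Hypothetical ABI description of one exported function.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ItemAbi {
    pub symbol: String,
    /// Version of the ABI rules the item was compiled under.
    pub abi_version: u32,
    /// (size, align) of each parameter, in order.
    pub params: Vec<(usize, usize)>,
    /// (size, align) of the return value.
    pub ret: (usize, usize),
}

/// Returns the symbols the consumer expects but the library does not
/// provide with an identical ABI description.
pub fn mismatches<'a>(expected: &'a [ItemAbi], provided: &[ItemAbi]) -> Vec<&'a str> {
    expected
        .iter()
        .filter(|want| !provided.iter().any(|have| have == *want))
        .map(|want| want.symbol.as_str())
        .collect()
}
```

A loader would refuse to link (or fall back to a compatibility shim) whenever `mismatches` returns a non-empty list, instead of trusting a hand-written header.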

I see your point, but my concern is that edition boundaries are where we introduce 'big' changes. People expect that. So I'm not opposed to "safe-0", "safe-1", etc., but I would prefer that when the changes are introduced, they are introduced on an edition boundary.

Though there is one handy part about tying the introduction of an ABI to an edition; it gives a clue for future readers as to when the code was created. This may help people in the far future when they are trying to understand the code that we write today.

There are a few options, none of which are particularly great:

  • Only allow the use of versioned ABI items from a versioned ABI item
  • Utilize a polymorphic versioned ABI a la Swift's
  • Disable inlining of non-versioned ABI items into versioned ABI items and delay monomorphizations until assembling the binary picks a std
  • The versioned ABI library implementation has its own internal copy of the non-versioned items, which isn't shared with the outside world

The last one is probably the most reasonable. Keep in mind that this is just for effectively dynamic linking against an object file with import headers; the common case would still (hopefully) be static linking to a source-compiled rlib.

Is the assumption that you should be able to, e.g., emit code for calling into a library with hand-written "headers" without reading the library file? My understanding is that in theory library files could be mostly self-describing, e.g. like .NET assemblies, while still giving different compilers and languages the ability to lay out fields and parameters however they like. Microsoft later reused the .NET assembly metadata format for WinRT, so it doesn't seem to require a JIT.

Rust already has multiple crate type options. Using a stable ABI will likely have a cost, and in most cases the current "rlib" type with its unstable ABI is sufficient, so an unstable ABI will continue to be used there. The stable ABI would be used with "staticlib", "cdylib", and maybe also "dylib", or with new crate types designed for this purpose.

Not sure I understand? I was talking about whatever other languages do end up consuming, regardless of what that is. Maybe that applies to future rustc, but as you say that already has the code for old ABIs, so it's not as big a deal.

I don't want this topic to fall into oblivion. What are our next steps? Form a working group? Put together an RFC?

I suspect that trying to nail down a spec for a representation is going to take quite a bit of work and time, so my vote is not to try to really solve it, but to get the processes in place to support specifying representations. My proposal is that for the first version of the representation (#[repr(InteroperableRust_2024)]) we write up a specification document that fully nails down the current #[repr(C)] and makes #[repr(InteroperableRust_2024)] a synonym for #[repr(C)]. There would be two reasons for this exercise:

  1. If there are any ambiguities in #[repr(C)], the specification for #[repr(InteroperableRust_2024)] will remove them. By the time the process is done, and the specification is published, mathematicians and computer scientists should not be able to find any conflicts in definitions within the spec, nor should they be able to find any ambiguities. Proofs based on the spec should be sound. Languages other than Rust should be able to reference the spec and use it for their own layout specifications, and be able to interoperate with Rust using it.
  2. We need to set up the processes to support writing future iterations of the spec. Since that will take time on its own, I don't want to try to simultaneously extend the representation beyond that of #[repr(C)]. My feeling is that right now we have a chance to go well beyond the rustc and gccrs communities all the way out to the various vendors in the world. Setting up all that will take time.

Is this agreeable to everyone?

If the first generation #[repr(safe)] is just #[repr(C)], it seems better to just specify #[repr(C)] more precisely rather than make a new repr attribute.

(Or maybe to make it #[repr(simple)] to keep #[repr(C)] as the more abstract "it matches the system C compiler"... though that still isn't quite always the case anyway.)

repr(C) in fact is already explained as code in one place in the stable documentation as an example. Being the individual who added that example, I'm pretty sure that the intent was that it would be a non-normative example rather than defining repr(C) layout formally. IIRC, the reference or nomicon contains a similar bit of pseudocode to define repr(C), though I don't remember exactly where and can't find it at the moment.

The current effective definition of #[repr(C)] is as given in the following code block. I don't see it being able to be changed, even to match the "system C compiler"'s behavior, though compatibility lints are quite possible.

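use std::alloc::{Layout, LayoutError};

/// Overall layout plus the offset of each field, in declaration order.
pub struct ReprC<const N: usize> {
    pub layout: Layout,
    pub field_offsets: [usize; N],
}

pub fn repr_c<const N: usize>(fields: [Layout; N]) -> Result<ReprC<N>, LayoutError> {
    let mut layout = Layout::from_size_align(0, 1)?;
    let mut field_offsets = [0; N];
    for (i, field) in fields.into_iter().enumerate() {
        // `extend` pads up to the field's alignment and returns the field's offset.
        (layout, field_offsets[i]) = layout.extend(field)?;
    }
    // Round the total size up to a multiple of the overall alignment.
    layout = layout.pad_to_align();
    Ok(ReprC { layout, field_offsets })
}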

[playground]
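As a quick check of that definition, the sketch below restates the algorithm with a named result type (`ReprC`, introduced here only so the example compiles on its own) and compares its output with what the compiler reports for a real `#[repr(C)]` struct. The `Example` struct and `example_layout` helper are made up for illustration, and `std::mem::offset_of!` requires Rust 1.77 or later.

```rust
use std::alloc::{Layout, LayoutError};

/// Named stand-in for the result: overall layout plus per-field offsets.
pub struct ReprC<const N: usize> {
    pub layout: Layout,
    pub field_offsets: [usize; N],
}

/// Lays fields out in declaration order, padding each to its alignment.
pub fn repr_c<const N: usize>(fields: [Layout; N]) -> Result<ReprC<N>, LayoutError> {
    let mut layout = Layout::from_size_align(0, 1)?;
    let mut field_offsets = [0; N];
    for (i, field) in fields.into_iter().enumerate() {
        (layout, field_offsets[i]) = layout.extend(field)?;
    }
    layout = layout.pad_to_align();
    Ok(ReprC { layout, field_offsets })
}

/// Illustrative struct with padding after `a` and after `c`.
#[repr(C)]
pub struct Example {
    pub a: u8,
    pub b: u32,
    pub c: u16,
}

/// Computes the layout of `Example` from its field layouts alone.
pub fn example_layout() -> Result<ReprC<3>, LayoutError> {
    repr_c([
        Layout::new::<u8>(),
        Layout::new::<u32>(),
        Layout::new::<u16>(),
    ])
}
```

Comparing `example_layout()?.field_offsets` against `offset_of!(Example, a)`, `offset_of!(Example, b)`, and `offset_of!(Example, c)`, and the computed `layout` against `Layout::new::<Example>()`, should show the two agreeing, since both follow the same declaration-order rule.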

I can go with either solution, as long as it's completely nailed down. Like I said earlier, I don't see the first iteration as really extending the layout, I see it as laying the groundwork so that we can extend things in the future.

Come to think of it, the representation would have to differ slightly from #[repr(C)] anyway, as we still need some way of enabling forward compatibility, which in my mind means some way of telling others what version of the representation is in use...

I too would love to see a stabilized repr that supports more features than repr(C), but I think this belongs in a new thread.

I would suggest moving out of a topic called "editions in rust-gcc", for starters 🙃

As a language addition, the first step would be a new initiative. What you need for that is a persuasive argument that there's a problem space that needs to be addressed.

Importantly, specifics of the exact solution are not needed for that. A couple of sketches of possibilities can help, but mostly it's about what's too hard to do now, and what the goals (and non-goals!) are for a successful solution.

I would also suggest looking at this thread, which had a lot of good ideas, but unfortunately didn't lead to any concrete proposals.
