ABI discussion for w65

I'm opening this thread for a preliminary discussion of a proposed ABI for use with the w65 architecture. It is specifically for discussing the implications for rust implementations of such (though this may necessarily extend into discussions of other mentioned languages and the respective abis). The document defining the ABI is here: SNES-Dev/abi.md at main · SNES-Dev/SNES-Dev · GitHub.

Notable effects on rust:

  • usize (isize), in this abi, does not correspond to size_t (signed size_t). This may affect FFI code that relies on this correspondence. usize (isize) does and continues to correspond with the uintptr_t (intptr_t) type, defined in the <stdint.h> header.
    • The maximum size for objects in the ABI is 65535 bytes, or just less than 1 bank. This was done to ensure that every object could be allocated within a single bank. The restriction on the size of the type ensures it can be passed in a hardware register, rather than a specially reserved memory address.
    • Pointers in this abi are 4 bytes, with the high byte 0 for an inbounds pointer. NOTE: pursuant to the discussion on this zulip thread (archive), this is not a validity invariant on the pointer type itself, but in the current version all inbounds pointers have a high-byte of zero.
  • There is no standard rust type corresponding with the uint_fast8_t and int_fast8_t types specifically (all other fixed-sized, least-sized, and fastest least-sized integer types have a standard corresponding type). Implementations could provide an (unstable) mechanism to access the underlying type, but this abi does not specify that mechanism. This is not deemed to pose a significant issue to Rust code or Rust FFI.
  • ZST Parameters are ignored for the purposes of ABI
  • the AtomicU8 and AtomicU16 types can be defined by core as they are always lock-free. No other atomic types can be defined by core (notably, AtomicUsize/AtomicPtr<T> are not defined as atomic operations of size 32 are not lock-free).
  • The "Rust" ABI section contemplates the creation of an extern "w65-interrupt" abi, which restricts the parameters it can be called with, and also forbids calls to the function. This could (ideally) be done statically to avoid unsoundness.
    • Interrupt handlers defined with this ABI have significant restrictions on behaviour (corresponding to the restrictions on async signal handlers from C11). This is deemed not to pose a soundness issue, as in order to be useful, they must be defined with particular link names, and both #[export_name] and #[no_mangle] are deemed to be fundamentally unsafe.
  • The NULL pointer is notably valid, and (on one layout) represents the highest byte of the stack. It is not deemed necessary to be accessible (as any code that would allocate anything there is about to overflow the stack into read only memory anyways).
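
To make the first point concrete, here is a minimal sketch of how FFI code might convert a Rust length to the C size type under this abi. The alias w65_size_t and the helper to_c_size are hypothetical names used only for illustration; the sketch assumes size_t is 16 bits wide, as described above, and runs on a host where usize is wider.

```rust
// Hypothetical alias: under the proposed abi, C's size_t is assumed to be 16 bits.
#[allow(non_camel_case_types)]
type w65_size_t = u16;

/// Convert a Rust length to the C size type, returning None if it does not fit.
fn to_c_size(len: usize) -> Option<w65_size_t> {
    w65_size_t::try_from(len).ok()
}

fn main() {
    // Lengths up to the 65535-byte object limit convert losslessly.
    assert_eq!(to_c_size(1024), Some(1024));
    assert_eq!(to_c_size(65_535), Some(65_535));
    // Anything larger cannot be expressed as size_t and must be rejected.
    assert_eq!(to_c_size(65_536), None);
}
```

Code that declares size_t parameters as usize would skip this conversion, which is where the mismatch becomes an FFI hazard.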

For formal approval, this abi has been presented to the llvm-mos project, to avoid abi issues between the SNES-Dev project (my project for supporting Rust, C, and others on specifically the SNES), and the llvm-mos project (which is adding support for the m6502 and, in the future, w65 architectures to llvm), so I would want to await formal approval until discussions have occurred there. However, I would be curious as to the process for that. I assume adding the extern "w65-interrupt" would require a T-lang (and possibly T-compiler) RFC. Would everything else require an RFC, or just a sign off by lang (and possibly compiler) teams (perhaps an MCP)?

I presume, though, that defining actual targets would be waiting on RFC 3145, though I would like to confirm that no further changes to the ABI are necessary before starting on the C compiler (which I intend to do in the near future as I finish up with the assembler/linker), so if there is currently a process for the approval, that would be useful.

Everything in this definition sounds reasonable to me from a language perspective.

extern "w65-interrupt" would need a lang team signoff to implement (though not necessarily a full RFC), but it sounds entirely reasonable.

NULL being valid seems fine as long as you only access that via a raw pointer and a specialized memory access function that handles NULL, not via a reference or a normal raw pointer dereference.

For compatibility, you may want to consider defining a global lock that allows 32-bit atomic operations, so that AtomicUsize and AtomicPtr work.

You could define this as a Tier 3 target (and I'd love to see that). The only reason to go for the "Candidate Target" approach would be if you don't think you can meet the Tier 3 requirements anytime soon.

Doesn't rust require that atomic types be always lock-free? Otherwise, there isn't really a reason that any of the atomic types couldn't exist (though, strictly, they are less useful, as atomics are primarily useful for communication with interrupt handlers, which strictly cannot access non-lock free atomics - though my planned impl will allow access from the irq handler).

Tier 3 requires an implementation in rustc, does it not? That's really the main thing, as that's blocked on either llvm-mos or gcc (the latter of which is really waiting on "do I need to make changes to abi"). It also raises the question of whether I could upstream a target in rustc that doesn't have upstream support in llvm/gcc. Beyond that, I don't see a reason it couldn't (eventually) be a Tier 3 target.

You're right, it does. I did not realize that was a requirement. Given that, it does sound like there's no reasonable way to implement them. Code targeting w65 will just have to live without such atomics and data structures built atop them.

Yes, though that implementation may still be a work-in-progress.

I think it'd be acceptable for a tier 3 target to require a patched LLVM; a tier 2 or higher target couldn't, but a tier 3 target could. Nothing in the Target Tier Policy prohibits requiring out-of-tree LLVM changes. I think it'd be desirable for those changes to be on a path towards inclusion, but they don't have to be merged yet.

I'm glad to hear that.

This seems like a pure footgun. I believe there is no other such target supported by Rust today, and a lot of FFI code relies on usize being size_t. In fact, bindgen even has an option to codegen size_t as usize in FFI declarations.

How does this interact with niche optimizations? I would think this breaks many assumptions about the layout of Option<NonNull<T>> and the like, since the null pointer does not have the value zero.

The validity is simply an artifact of the memory layout (on the SNES specifically, it mirrors the first byte of WRAM). It's not necessary that it be treated as valid at any level higher than this. I probably could have clarified that, though. The main concern with rust is that on layouts that put the stack in the higher part of this mirror, this is technically unsound, as it can produce an automatic storage duration object at address zero (which can be borrowed and :boom:). However, as mentioned, the bigger issue is that the stack is about to wrap to address ffff, which on the SNES is typically read-only memory and the interrupt vectors (and of course, a rust implementation could set up a stack sentinel there).

This is a bit more of a concern. The issue is that the upper limit on object size is 65536 bytes (of which it's fair to lose 1 byte), due to how memory works (it may be advantageous to split linker allocation at bank boundaries). Given that, using 4 bytes for size_t, which only ever stores a 2-byte value, is a waste of both memory and time: 4-byte values need to be copied into specially reserved memory regions 2 bytes at a time, whereas 2-byte values can just be stored in registers. It's also more efficient to index pointers by a 2-byte value, as that can be done using one of the indexing operations rather than a 4-byte add. Those are the reasons for choosing size_t to be 2 bytes instead of 4. usize is 4 bytes because rust defines it as the pointer-sized type, so it needs to be the same size as *mut u8, which is 4 bytes because pointers are 3 bytes but extended to 4 for alignment. I'm unsure whether this would be reasonable to solve.

As an example of why making size_t 2 bytes is beneficial, here is the assembly for memset and memmove with size_t vs. usize:

//
memset:
    txa
    memset._L0:
    sta (__r1,%D),%Y
    dey
    beq memset._L0
    ldx __r1,%D
    stx __r0,%D
    ldx __r1+2,%D
    stx __r0+2,%D
    sep #$20
    rtl
memcpy:
memmove:
    lda (__r2,%D),%Y
    sta (__r2,%D),%Y
    dey
    beq memmove
    ldx __r1,%D
    stx __r0,%D
    ldx __r1+2,%D
    stx __r0+2,%D
    sep #$20
    rtl

and here is just memset with usize:

memset:
    txa
    memset._L0:
    pha
    jsl __add_int32
    lda 0,%S
    sta (__r0,%D)
    jsl __dec_r2
    beq memset._L0
    ldx __r1,%D
    stx __r0,%D
    ldx __r1+2,%D
    stx __r0+2,%D
    rtl

I'm not even sure what code would be generated for memcpy/memmove in that case (I'd have to let the compiler figure it out).
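
The difference between the two listings can also be modelled in Rust. This is only an illustrative sketch: FarPtr, index16 and index32 are hypothetical names, and the model assumes what the abi description states (a 4-byte pointer whose high byte is zero, holding a bank byte and a 16-bit offset, with every object contained in a single bank). It does not attempt to model the hardware's actual addressing modes.

```rust
/// Hypothetical model of a w65 pointer: high byte zero, then bank, then 16-bit offset.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FarPtr(u32);

impl FarPtr {
    fn new(bank: u8, offset: u16) -> Self {
        FarPtr(((bank as u32) << 16) | offset as u32)
    }

    fn bank(self) -> u8 {
        (self.0 >> 16) as u8
    }

    fn offset(self) -> u16 {
        self.0 as u16
    }

    /// Index by a 2-byte value: only the offset changes, and leaving the bank is an error.
    fn index16(self, i: u16) -> Option<Self> {
        let offset = self.offset().checked_add(i)?;
        Some(FarPtr::new(self.bank(), offset))
    }

    /// Index by a 4-byte value: a full-width add that may carry into the bank byte.
    fn index32(self, i: u32) -> Option<Self> {
        let raw = self.0.checked_add(i)?;
        if raw >> 24 == 0 { Some(FarPtr(raw)) } else { None }
    }
}

fn main() {
    let p = FarPtr::new(0x7e, 0x2000);
    assert_eq!(p.index16(0x10), Some(FarPtr::new(0x7e, 0x2010)));
    // A 16-bit index can never move the pointer into another bank.
    assert_eq!(p.index16(0xf000), None);
    // A 32-bit index has to handle the carry into the bank byte.
    assert_eq!(p.index32(0xf000), Some(FarPtr::new(0x7f, 0x1000)));
}
```

Since no object spans more than one bank, the 16-bit form is sufficient for any in-bounds index, which is the argument for a 2-byte size_t.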

I get that, the problem I see is specifically with the value of the null pointer being not zero, and thus changing the otherwise guaranteed layout of types which currently rely on it being zero, if I understand correctly.

Ah yes. I could perhaps have clarified that as well. The null pointer will remain address 0, for a few reasons, chief of which is that rust defines it to be so. I was just mentioning that it is a valid address that could in theory have something allocated there.
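
For reference, the layout guarantee being discussed can be checked on any host. The sketch below relies only on the standard library's documented representation of Option<NonNull<T>>; the helper name none_as_raw is made up for the example.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Reinterpret `None::<NonNull<u8>>` as a raw pointer to observe its representation.
fn none_as_raw() -> *mut u8 {
    let none: Option<NonNull<u8>> = None;
    // SAFETY: Option<NonNull<u8>> and *mut u8 have the same size, and the standard
    // library documents that None is represented as the null pointer.
    unsafe { std::mem::transmute::<Option<NonNull<u8>>, *mut u8>(none) }
}

fn main() {
    // The niche optimization: no extra discriminant is stored.
    assert_eq!(size_of::<Option<NonNull<u8>>>(), size_of::<*mut u8>());
    // None occupies the null (address 0) bit pattern.
    assert!(none_as_raw().is_null());
    // Conversely, a null raw pointer can never become a NonNull.
    assert!(NonNull::new(std::ptr::null_mut::<u8>()).is_none());
}
```

Keeping the null pointer at address 0 means these assertions hold unchanged on w65; the only consequence is that an object placed at address 0 could not be referred to through NonNull or a reference.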

This is true for most architectures.

Even on x86 there can be something allocated there. Most OSes just don't allow you to map the zero page. On Linux for example you need to change mmap_min_addr to 0 before it is allowed.

(Off-topic, but I’m curious where the rationale and motivation comes from to develop for the SNES in Rust?)

I've been working for quite a while to be able to develop homebrew for the SNES using modern languages like C and C++. When I first learned rust, I found the advantages obvious - I could use a modern language that had powerful zero-overhead abstractions but didn't have overly burdensome requirements on freestanding environments (cough C++ freestanding cough).

As I have begun working on the gcc backend, and am getting ready to finalize many of the details, I'd like to seek formal input/approval from T-lang on whether rust can formally support the proposed abi, and any target using such an abi (though I do not intend at this current time to propose any targets at any tier, though I intend to do so after the gcc backend is complete and work on porting rustc has begun). Is this something that can be done at this time, and, if so, through what process would that be (MCP, RFC, otherwise)?

Specific questions that I would like addressed within reasonable time, as they affect details of the C compiler as well as rust:

  • Incompatibility between usize and size_t, with respect to FFI and the issues presented here related to that.
  • Preliminary issues related to extern "w65-interrupt"
  • Currently, raw pointers and usize differ in their alignment requirement, with usize having alignment requirement 2, and pointers having alignment requirement 4. Is this an issue for rust?

Other questions that would eventually need to be addressed, but are not as immediate (as they solely impact rust and would not impact work on the C compiler or adjacent support libraries):

  • Does rust permit atomic types that are defined as "non lock-free" but do not use a global mutex/spinlock? In particular, can certain atomic types be defined to be "non lock-free" in order to restrict their use from certain interrupt handlers, while being implemented in a way that allows being defined in rust (for example, by using a cpu flag to lock interrupt handlers)?
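
To illustrate what the last question has in mind, here is a minimal sketch of a 32-bit counter whose operations run with interrupts masked instead of taking a global lock. Everything here is hypothetical: MaskedU32 and with_interrupts_masked are invented names, and the interrupt-disable flag is simulated with a thread-local bool so the sketch can run on a host. A real implementation would save and restore the cpu flag, and would need its own justification for being shared with handlers.

```rust
use std::cell::Cell;

thread_local! {
    // Simulated interrupt-disable flag; real hardware would use a cpu status bit.
    static IRQ_MASKED: Cell<bool> = Cell::new(false);
}

/// Hypothetical critical section: mask interrupts, run `f`, restore the previous state.
fn with_interrupts_masked<R>(f: impl FnOnce() -> R) -> R {
    let previous = IRQ_MASKED.with(|m| m.replace(true));
    let result = f();
    IRQ_MASKED.with(|m| m.set(previous));
    result
}

/// A 32-bit value whose read-modify-write operations cannot be interrupted.
struct MaskedU32(Cell<u32>);

impl MaskedU32 {
    const fn new(v: u32) -> Self {
        MaskedU32(Cell::new(v))
    }

    fn load(&self) -> u32 {
        with_interrupts_masked(|| self.0.get())
    }

    /// Add `n` (wrapping) and return the previous value, like AtomicU32::fetch_add.
    fn fetch_add(&self, n: u32) -> u32 {
        with_interrupts_masked(|| {
            let old = self.0.get();
            self.0.set(old.wrapping_add(n));
            old
        })
    }
}

fn main() {
    let counter = MaskedU32::new(u32::MAX);
    assert_eq!(counter.fetch_add(1), u32::MAX);
    assert_eq!(counter.load(), 0);
}
```

Such a type is not lock-free in the usual sense, which is exactly why the question of whether rust permits it needs an answer.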

Just for clarity on this one: usize is still 4 bytes to match the size of the pointers, right? (Since it says pointers are 4 bytes two bullets later.)

Aren't those just typedefs in C anyway?

I don't think there's any formal guarantee of alignment requirements for different rust types. (I wouldn't be surprised if u64 or u128 were different on different targets, for example.)

That said, I wouldn't be surprised at all for there to be extant rust code that allows things like &usize <-> &*const T, assuming it's fine.
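
As a sketch of what such code would have to check to stay correct on a target where the alignments differ: the function name usize_ref_as_ptr_ref is hypothetical, and on typical hosts both checks pass trivially because usize and pointers share size and alignment.

```rust
use std::mem::{align_of, size_of};

/// Reinterpret `&usize` as `&*const u8`, but only if size and alignment allow it.
fn usize_ref_as_ptr_ref(v: &usize) -> Option<&*const u8> {
    let addr = v as *const usize;
    let size_ok = size_of::<usize>() == size_of::<*const u8>();
    // On the proposed abi, a usize is only guaranteed 2-byte alignment while a
    // pointer requires 4, so this check could fail at runtime there.
    let align_ok = (addr as usize) % align_of::<*const u8>() == 0;
    if size_ok && align_ok {
        // SAFETY: the sizes match and the address satisfies the pointer type's alignment.
        // The resulting pointer value is only inspected, never dereferenced.
        Some(unsafe { &*(addr as *const *const u8) })
    } else {
        None
    }
}

fn main() {
    let x: usize = 0x1234;
    let p = usize_ref_as_ptr_ref(&x).map(|p| *p as usize);
    println!("{:?}", p);
}
```

Code that performs the cast unconditionally would be relying on an alignment equality that this abi does not provide.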

Yes, it matches the size of pointers, and, as mentioned, corresponds with the C typedef uintptr_t (which is unsigned long, and likewise corresponds with the size of void*).

Another question I just thought of and should probably raise: The runtime library that supports various arithmetic features leaves some of them optional, and these correspond to target-feature flags. Some of these impact features required by the rust language (though by default, they are available). In particular:

  • Floating-point support may be disabled in the library with the configuration option --without-float. This corresponds to the float target feature, though float operations are also available with the hardfp target feature. This affects the ability to perform operations on f32 and f64.
  • 128-bit integer support may also be disabled in the library with the option --without-int128. This corresponds to the int128 target feature, and affects arithmetic on i128.

It is recommended that unusable features be ill-formed (otherwise, the result would be a link-time error if the library is compiled without the appropriate support). This should not affect rust code without the consent of the user, however, as the default will be that the features are enabled. Is this sufficient to avoid issues?