Pre-proposal: `#[arm]`/`#[thumb]` function annotations for ARM targets


#1

The old syntax for target_feature allowed enabling or disabling arbitrary features. That was convenient on ARM platforms, where choosing per function between the ARM and Thumb ISAs by toggling the thumb-mode feature was possible and even desirable. (For example: I’m writing Rust code which ends up running on a Game Boy Advance. Most code is compiled as Thumb for speed since the Game Pak only has a sixteen-bit bus, but if I provide an interrupt dispatcher it has to be compiled as ARM code because of how Nintendo wrote the firmware.)

The new stable target_feature adds a whitelist requirement, and doesn’t define syntax for disabling features; this makes it better suited to the things it was mostly being used for. I don’t think this is a problem in itself: while avx is a binary feature (either a target has it or doesn’t have it), thumb-mode is a toggle switch where both settings are meaningful.

As such, rather than extending target_feature in ways which risk trying to make it into both a floor wax and a dessert topping, it seems more appropriate to suggest another way to configure ARM/Thumb status: two new function attributes, #[arm] and #[thumb]. These would disable and enable thumb-mode for the annotated function in the same way #[target_feature = "±thumb-mode"] used to; annotating the same function with both results in a compile-time error. (I can’t imagine anyone doing this deliberately, and it’s good for the compiler to catch typos.) As far as I know LLVM already knows how to provide the necessary interworking shims, so everything should just work.
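
To make the intended semantics concrete, here is a rough model of the rule in plain Rust. It is only a sketch: the `Isa` enum and the `resolve_isa` function are hypothetical names for illustration and nothing like them exists in rustc, and the attributes are represented as plain strings.

```rust
/// Instruction set a function is compiled for (hypothetical model type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Isa {
    Arm,
    Thumb,
}

/// Hypothetical model of the proposed rule: `#[arm]` forces ARM, `#[thumb]`
/// forces Thumb, neither keeps the target default, and both is an error.
pub fn resolve_isa(attrs: &[&str], target_default: Isa) -> Result<Isa, String> {
    let arm = attrs.contains(&"arm");
    let thumb = attrs.contains(&"thumb");
    match (arm, thumb) {
        (true, true) => {
            Err("function is annotated with both #[arm] and #[thumb]".to_string())
        }
        (true, false) => Ok(Isa::Arm),
        (false, true) => Ok(Isa::Thumb),
        (false, false) => Ok(target_default),
    }
}
```

An unannotated function keeps whatever the target defaults to, so existing code would be unaffected.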

An obvious extension of this (which was supported under the old syntax for target_feature) is to use cfg!(arm) and cfg!(thumb) to generate different code in ARM and Thumb mode, but this is a problem for inline assembly: while the ARM and Thumb ISAs share many mnemonics, there are often significant differences in encoding. An example: the Game Boy Advance provides several library functions using the swi instruction, but in ARM mode the argument needs to be left-shifted sixteen bits, and the shifted value produces an invalid instruction if the file is assembled in Thumb mode. Currently it’s not really a problem, but if we allow cfg!(arm)/cfg!(thumb) it’s something that needs to be sorted out.
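
The swi difference can be sketched in plain Rust, without any inline assembly. The `Isa` enum and both helper functions are hypothetical names made up for this illustration; the field widths are the 8-bit immediate of the Thumb swi encoding and the 24-bit comment field of the ARM one.

```rust
/// Instruction set the surrounding code is assembled for (hypothetical model type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Isa {
    Arm,
    Thumb,
}

/// Immediate to write after `swi` for a given firmware function number.
/// Thumb takes the number as-is; ARM needs it shifted left sixteen bits.
pub fn swi_immediate(isa: Isa, bios_fn: u8) -> u32 {
    match isa {
        Isa::Thumb => u32::from(bios_fn),
        Isa::Arm => u32::from(bios_fn) << 16,
    }
}

/// Whether an immediate can be encoded in a `swi` for the given instruction
/// set: the Thumb encoding has an 8-bit field, the ARM encoding a 24-bit one.
pub fn swi_immediate_fits(isa: Isa, imm: u32) -> bool {
    match isa {
        Isa::Thumb => imm <= 0xFF,
        Isa::Arm => imm <= 0x00FF_FFFF,
    }
}
```

Taking function number 0x06 purely as an illustration, the ARM form is `swi 0x060000`, and that immediate does not fit the Thumb field, which is why the same assembly text can’t simply be reused in both modes.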


That’s about where my knowledge runs out; I’d appreciate some input on things I may have missed.


#2

This seems cool.

Assuming that others don’t think you’re full of it, you should try filling out the RFC template and seeing how that goes. Particularly, the “Motivation” section might be kinda thin, unless all ARM devices can benefit from this?


#3

In theory there’s no reason all ARM devices can’t benefit from this; it’s just that other devices generally don’t have quirky bus characteristics which penalise executing ARM code from some sections of the address space but not others. (The only reason you’d have to write any ARM code at all on the GBA is that the firmware doesn’t change the T bit when it calls the user-provided interrupt dispatcher, but some games do write some ARM code into IWRAM and execute it.)


#4

Nice write-up, @Ketsuban. I agree this is definitely needed in some form. I have thought about this before, but never made any proposal. I think rather than arm and thumb, it would be better to use the more modern and official names a32 and t32; arm is a very overloaded term, so I think it’s better to avoid it here. I also think that the existing target feature syntax is sufficient, i.e., #[target_feature(enable = "a32")] and #[target_feature(enable = "t32")] mapping to LLVM -thumb-mode and +thumb-mode respectively.


#5

Agreed. However, for the sake of helping people who know the name “thumb”, we could still have a “did you mean” suggestion for people who use that.

I do think it makes sense for these to remain names under an appropriate non-target-specific attribute like target_feature rather than becoming top-level attributes.
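
As a rough sketch of how the naming and the suggestion could fit together, here is a lookup from the proposed feature names to the LLVM feature strings mentioned above. The function name `llvm_feature_for` and the error wording are hypothetical; this is not how rustc is structured, just the mapping written out.

```rust
/// Hypothetical lookup from the feature names suggested in this thread to the
/// LLVM feature string each would map to.
pub fn llvm_feature_for(name: &str) -> Result<&'static str, String> {
    match name {
        "a32" => Ok("-thumb-mode"),
        "t32" => Ok("+thumb-mode"),
        // Help people who only know the older name.
        "thumb" => Err("unknown feature `thumb`; did you mean `t32`?".to_string()),
        other => Err(format!("unknown feature `{}`", other)),
    }
}
```

Anything not on the list is still rejected, which keeps the whitelist behaviour of target_feature intact.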


#6

Shouldn’t this be part of the function’s calling convention?


#7

As far as I know LLVM already generates interworking shims where necessary, so it doesn’t actually have to be. It might be nice to have, though.


#8

I’m not opposed to a solution involving target_feature—it hadn’t occurred to me that Rust target features didn’t have to map one-to-one to LLVM target features. I’d like to understand these terms a little more, though—the Thumb I know is sixteen bits, what’s the significance of the 32?


#9

No, I don’t think so; they use the same calling convention (AAPCS).


#10

The 32 refers to the register/pointer size. For example, A64 is the (only) instruction set on AArch64.