Stabilization path for asm!()?


#1

Inline assembly is critical functionality for a significant class of potential Rust applications, even if in only a few functions at the core of some central library. We’ll want to support it on stable eventually. But is the current macro the design we want forever? Are we OK living with what LLVM hands to us for this?

The current implementation needs a lot of love and tests before I’d be comfortable shipping it, if we want to stick with it.


#2

Last time we had a discussion about this, we weren’t sure inline assembly was important, as you could still link to an external assembly file.

Given that the macro was just written for Rust 0.6 and then barely updated, I doubt it’s great at the task.


#3

Linking an external asm file requires an assembler and doesn’t allow for inlining of the defined functions, which is important for reducing overhead.


#4

So I think there is a basic decision to be made here. We can choose between:

  1. Try to do something nice, like D did.
  2. Try to do something compatible, basically copying what clang and gcc do (which I understand is roughly the same? feel free to correct me.)
  3. Try to do our own thing.

IIRC, D lets you basically just write x86 assembly inline, referencing variables and so forth, and everything Just Works. It’s pretty sweet. But it’s a lot of work, and it’s not portable across architectures.

OTOH, gcc/clang offer these bizarre templates that are basically cut-and-paste assembly into the output and they use a cryptic notation for communicating with the register allocator which essentially nobody understands (hyperbole, but not by much). However, it’s portable, and if we hew closely to what they’ve done, it’s plausible you can mostly copy-and-paste an existing solution from stackoverflow and get on with your life much of the time.

There is probably some middle ground, where we do something like gcc but try to improve on the opaque notation, making it easier to use, or making common cases more intuitive.

I think what I favor is adopting the gcc notation as closely as we can (which I think is what we’ve done…?) to start, and then considering unstable extensions that are essentially front-ends for this more cryptic variety. I feel like if we have the gcc stuff, people can write syntax extensions in cargo that translate to it from some more natural notation fairly easily and so forth, so this may never have to be part of the Rust compiler proper (that’d be ideal).


#5

Yes, what we expose right now is almost exactly what LLVM exposes for clang to use, which in turn closely (exactly?) adheres to what gcc does. I’m comfortable doing this, although it’d be really nice if we could improve the notation in some meaningful way. And also fully document it, as documentation (let alone good documentation) is rather hard to come by for inline asm.


#6

My fear is that if we don’t do something radically better, then we’ve just produced an even more obscure notation (even if it’s better, once you learn it). This is why I was thinking maybe it’s best to “innovate” out of tree, desugaring into the current stuff, and perhaps eventually adopt the best solution as the official plan.


#7

Note that clang actually implements two versions of inline ASM: gcc-style, and MSVC-style.

gcc-style has the previously noted problems of having extremely ugly syntax, and obscure symbol combinations to denote what exactly an input or output means. It has the advantage that it provides a lot of control over the compiler’s interpretation of inline asm for experts.

MSVC-style (which falls under the rough category of “nice” inline asm) has its own problems. It requires extra frontend work for each supported architecture: you have to actually parse the asm to figure out the correct interpretation. Also, it doesn’t provide as much control over how inputs and outputs are interpreted.

I’m sort of doubtful that we can come up with something that’s enough of an improvement over gcc-style to make it worthwhile: writing inline asm that performs well is a bit of a black art anyway, and adding our own variant of inline asm seems like it would make things harder rather than easier. That said, it’s probably not a good idea to stabilize the current version of asm!: if we’re going to stabilize gcc-style inline asm, it should be roughly the same as what gcc supports, not LLVM’s partially-lowered version.


#8

For one thing, GCC and Clang already allow named operands, like asm("nop %[foo]" :: [foo]"r"(123)). Rust currently does not; I suggest adding support, and given how annoying it is to make sure the numbers correspond to the order of operands, making them mandatory.

Then, there are a few things I think could be cleaned up without being too disruptive:

  • You have to specify all outputs followed by all inputs… but the string for each (like “=r”) already indicates whether an operand is an output or input, and (at least in GCC) read-write operands, prefixed with “+”, don’t naturally fit in either section (but somewhat arbitrarily have to be put in the output section). It would be better, without being overly surprising, to allow both types of operand in any order.

  • The colons and brackets are pretty ugly and idiosyncratic in both C and Rust. Would be much nicer to at least do something like

asm!("nop %[foo]", foo: "r"(123))

or maybe going a bit further to make it resemble existing syntax:

asm!("nop %[foo]", foo: ("r", 123))

I think inline assembly would seem a lot less mysterious with these changes, without actually having to delve into the constraint implementation and such. Just my two cents.
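To make the "any order, named operands" idea a bit more concrete, here is a minimal sketch of how a front-end could lower that notation into the gcc-style layout (all outputs, then all inputs, referenced by position). Everything here is hypothetical and for illustration only: `Operand`, `Lowered`, `is_output` and `lower` are made-up names, not anything rustc or LLVM provides, and the sketch ignores details such as `%%` escapes and clobbers.

```rust
/// A named operand as in the proposal above, e.g. `foo: ("r", 123)`.
/// Hypothetical type for illustration; not part of rustc.
#[derive(Debug, Clone, PartialEq)]
struct Operand {
    name: String,
    /// gcc-style constraint string, e.g. "=r", "+r" or "r".
    constraint: String,
    /// Source text of the expression bound to the operand.
    expr: String,
}

/// Result of lowering: a template using positional references, plus the
/// operands split into the outputs-then-inputs order gcc-style asm expects.
#[derive(Debug, PartialEq)]
struct Lowered {
    template: String,
    outputs: Vec<Operand>,
    inputs: Vec<Operand>,
}

/// An operand counts as an output if its constraint starts with '='
/// (write-only) or '+' (read-write); anything else is treated as an input.
fn is_output(constraint: &str) -> bool {
    constraint.starts_with('=') || constraint.starts_with('+')
}

/// Rewrites every `%[name]` in `template` to `%N`, where N is the operand's
/// position once outputs are placed before inputs.
/// Simplification: `%%` escapes are not handled.
fn lower(template: &str, operands: &[Operand]) -> Result<Lowered, String> {
    // The caller may list operands in any order; split them by kind here.
    let (outputs, inputs): (Vec<Operand>, Vec<Operand>) = operands
        .iter()
        .cloned()
        .partition(|op| is_output(&op.constraint));

    // Positional index of a named operand: outputs first, then inputs.
    let index_of = |name: &str| -> Option<usize> {
        outputs
            .iter()
            .chain(inputs.iter())
            .position(|op| op.name == name)
    };

    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("%[") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find(']')
            .ok_or_else(|| "unterminated `%[` in template".to_string())?;
        let name = &after[..end];
        let idx = index_of(name)
            .ok_or_else(|| format!("unknown operand `{}`", name))?;
        out.push('%');
        out.push_str(&idx.to_string());
        rest = &after[end + 1..];
    }
    out.push_str(rest);

    Ok(Lowered { template: out, outputs, inputs })
}
```

For example, lowering the template `mov %[src], %[dst]` with `src` declared first as `"r"` and `dst` declared second as `"=r"` moves `dst` into the output section and rewrites the template to use positions, so the person writing the asm never has to count operands by hand.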

[Off topic ‘fun’ fact I just discovered: the following string causes this forum’s Markdown parser to drop part of the text:

    - Next line must be blank
    
    - Must be two bullets
    ```
    code
    ```
    I am lost!

]


#9

Last time we had a discussion about this, we weren’t sure inline assembly was important, as you could still link to an external assembly file.

I’d cite Nadeko as the sort of use case where inline assembly would be very helpful:

I’ve described Nadeko as “Black F&@#ing Magic” in the past, but it really shows what’s possible with inline assembly. That is to say: Nadeko implements a sort of “crypto compiler” for a limited subset of Rust (basically arithmetic and simple if-based branching) that emits x86-64 assembly with “constant time” (i.e. data-independent timing) semantics simply by adding a “#[const_time]” attribute to your code.

It may be possible to ensure code compiled by rustc proper has similar semantics (Galois SAW looks amazing for this purpose), but aside from that, the only other options are assembly code (inline or otherwise) or hand-inspection of rustc generated assembly. Nadeko is nice because it solves the problem of “constant time” assembly generation in a way that users shouldn’t have to get their hands dirty with assembly itself.


#10

Just wanted to mention that there was a proposal for a different asm! syntax about a year ago: RFC: refine the asm! extension #129


#11

I would love to see that in production, as I’m writing a crypto lib that needs that functionality.