[Pre-RFC #2]: Inline assembly

I don't think that should be our semantics; we can do much better than copying.

One major reason for register constraints is to let the compiler give you a register it was already using to hold the value. For instance, if you have out(reg) r and then you return r, LLVM will just have you write the value directly to the register it uses for the return value (rax) and then return, without copying the value. And in(reg) on a function parameter should use the register that the target ABI puts that parameter in. No copying required.

I don't want to lose that property.

Keep in mind that out destinations are places, which in general (always, as far as the Rust semantics are concerned) live in memory, even if you restrict yourself to just place expressions that refer to locals. Thus there needs to be a copy in general; @Amanieu just described when and in what order those copies happen.

None of this conflicts with the optimizations you want, since copies are not observable effects. For example, this snippet:

let x: i32;
asm!("mov {}, 1", out(reg) x);
return x;

would naively be lowered to LLVM IR like

%x = alloca i32
%1 = call i32 asm ...
store i32 %1, i32* %x ; copy asm result to x
%2 = load i32, i32* %x ; get value of x and ...
ret i32 %2 ; ... return it

which can readily be optimized to (without even taking note of inline asm being involved, this is just regular SSA construction)

%1 = call i32 asm <...>
ret i32 %1

Actually eliding any reg-reg copies is the job of the register allocator, but this IR is as good as it gets for that purpose.
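
For anyone who wants to try this, here is a minimal sketch of the same snippet as a complete function. It assumes the `asm!` macro as it exists in `std::arch` today rather than the exact pre-RFC syntax, and x86_64 with the default Intel syntax; the function name `one` and the non-x86 fallback are made up for illustration.

```rust
/// Hypothetical helper: the asm block writes 1 into a register chosen by the
/// compiler, and the function returns that value.
#[cfg(target_arch = "x86_64")]
pub fn one() -> u32 {
    use std::arch::asm;

    // Left uninitialized on purpose: the asm block is the only writer.
    let x: u32;
    // SAFETY: the instruction only writes the output register.
    unsafe {
        // `{0:e}` asks for the 32-bit name of the chosen register (e.g. eax).
        asm!("mov {0:e}, 1", out(reg) x, options(nomem, nostack, preserves_flags));
    }
    x
}

/// Fallback so the sketch still compiles on other architectures (assumption).
#[cfg(not(target_arch = "x86_64"))]
pub fn one() -> u32 {
    1
}
```

Semantically the result is copied from the register into `x` and then into the return slot; whether any of those copies survive is up to the optimizer and the register allocator.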

I don't think that it is possible to avoid post-monomorphization errors for inline assembly. Off the top of my head:

  • Register allocation failures, which occur late in the backend.
  • Out of range immediates, which occur in the assembler.
  • Invalid assembly instructions or directives, or any other assembler error.

Since asm code is only generated after monomorphization, I don't think it is even possible to check against these errors ahead of time.

I guess. I could imagine some zany schemes to catch most errors before monomorphization, but they would likely not catch all errors and possibly have some false positives too.

Actually, speaking of inline asm in generic code, should the following be OK depending on monomorphizations, or is it a type error? If it is an error, is there any trait bound that could make it OK?

fn only_call_with_small_ints<T>(x: T) -> T {
    asm!("add {0}, {0}", inout(reg) x);
    x
}

(I think it should be an error and trying to make it work with trait bounds is probably hard and low priority.)

Fwiw the current behavior on nightly with LLVM inline asm is... a rustc segfault. =D

That works even for dynamically-selected rounding modes. From the RISC-V unprivileged-ISA spec, p. 63:

[explanation added]

I can't seem to find any mention of "volatile" asm blocks. Are those planned? Or are those implied? What's the story there?

One minor bikeshed on the format of operand types: in, late and out seem mostly orthogonal to each other (all possible combinations are valid except latein), so why not separate them? Additionally, you could use attribute syntax, so register operands would look something like #[in, late] reg(<reg>) <expr>.

Edit: going one step further, how about attribute syntax for everything, e.g., #[in, late, reg(<reg>)] <expr>?

Implied, or rather the default. pure opts out of it (although discussion upthread indicates it might be specified overly aggressively in the pre-RFC).
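
To make the default concrete, here is a rough sketch assuming the `asm!` macro as it exists in `std::arch` today, where the opt-out is spelled `options(pure, nomem)` rather than a bare `pure` as in the pre-RFC. The function name `add_one` and the non-x86 fallback are made up; x86_64 with Intel syntax is assumed.

```rust
/// Hypothetical example contrasting the default with an explicit opt-out.
#[cfg(target_arch = "x86_64")]
pub fn add_one(x: u64) -> u64 {
    use std::arch::asm;

    let y: u64;
    // SAFETY: neither block touches memory or the stack.
    unsafe {
        // No options: the block is assumed to have side effects, so the
        // compiler keeps it even though it has no outputs.
        asm!("nop");

        // `pure` + `nomem`: the outputs depend only on the inputs, so the
        // compiler may drop the block if `y` is unused or merge duplicates.
        asm!(
            "lea {y}, [{x} + 1]",
            x = in(reg) x,
            y = lateout(reg) y,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    y
}

/// Fallback so the sketch still compiles on other architectures (assumption).
#[cfg(not(target_arch = "x86_64"))]
pub fn add_one(x: u64) -> u64 {
    x.wrapping_add(1)
}
```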

Is it wise to pick the opposite default from GCC and LLVM?

I'm not too thrilled about Rust getting a GCC-like asm "manually declare my clobbers" approach, but I can also see why compiler implementors would prefer to do that: that makes it easier to implement.

The problem, though, is that code bases in C/C++ using asm blocks too often have wrong clobbers. They're usually good enough for the function in which they are declared, but as soon as inlining is involved (which LTO gives more opportunities for), things can go bad very quickly.

The above is not hypothetical. I've had to fix multiple asm blocks in Firefox when we enabled LTO, some of which were old.

That seems like it would make the syntax much more verbose. What benefit would that provide?

Good question. It's not clear to me whether, for inline assembly, the default should favor volatile memory or RAM/ROM-like non-volatile memory. Is it safer to assume that any targeted memory is volatile, or not? Also, which choice leads to more verbose code?

From my perspective, volatile is really an attribute that applies to a real address range, and thus to any virtual address range that maps to that real address range. In the embedded world, much of the use of assembly is for dealing with memory-mapped device registers where read and write accesses often trigger external side effects (which is what Rust classifies as volatile).

For me, other uses of inline assembly tend to be more register-oriented, where intrinsics could be used (e.g., u32 * u32 -> u64), or for constant-time crypto algorithms that involve compare-and-mask rather than variable-timing compare-and-jump.
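
As an illustration of the register-oriented kind of use, here is a sketch of a widening multiply (u64 * u64 -> u128), assuming the `asm!` macro as it exists in `std::arch` today and x86_64. The function name `widening_mul` and the non-x86 fallback are made up for illustration.

```rust
/// Hypothetical widening multiply built on the x86_64 `mul` instruction,
/// which multiplies rax by its operand and leaves the result in rdx:rax.
#[cfg(target_arch = "x86_64")]
pub fn widening_mul(a: u64, b: u64) -> u128 {
    use std::arch::asm;

    let lo: u64;
    let hi: u64;
    // SAFETY: only registers are read and written.
    unsafe {
        asm!(
            "mul {}",
            in(reg) a,
            // `b` goes in via rax, the low half of the product comes out of it.
            inlateout("rax") b => lo,
            // The high half of the product is written to rdx.
            lateout("rdx") hi,
            options(pure, nomem, nostack),
        );
    }
    ((hi as u128) << 64) | lo as u128
}

/// Fallback so the sketch still compiles on other architectures (assumption).
#[cfg(not(target_arch = "x86_64"))]
pub fn widening_mul(a: u64, b: u64) -> u128 {
    (a as u128) * (b as u128)
}
```

In practice plain `(a as u128) * (b as u128)` does the same job without assembly, which is the point about intrinsics above.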

That seems like it would make the syntax much more verbose. What benefit would that provide?

Consistency with the rest of Rust; it just feels like at least in, late and out should be attributes.

Any syntax that automatically checks clobbers would require parsing the assembly and having at least a high-level understanding of every instruction. And even then, it could only detect a conservative version; the compiler can't analyze arbitrary assembly control flow to know if the code correctly saves and restores a register. So you'd still want a syntax to override and un-clobber a register the compiler detected as clobbered.

I would propose that to the extent we can reliably detect this, we could add it as a lint later on. (And that lint would need a means of saying "no, in this case I really don't clobber that register even though you think I do".)
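
For reference, a manually declared clobber could look like the sketch below. It assumes the `asm!` macro as it exists in `std::arch` today and x86_64 with Intel syntax; the function name `double_via_rcx` and the non-x86 fallback are made up for illustration.

```rust
/// Hypothetical example: the asm body uses rcx as a scratch register, so rcx
/// has to be declared as clobbered by hand.
#[cfg(target_arch = "x86_64")]
pub fn double_via_rcx(x: u64) -> u64 {
    use std::arch::asm;

    let y: u64;
    // SAFETY: only registers are touched, and every written one is declared.
    unsafe {
        asm!(
            "mov rcx, {x}",
            "shl rcx, 1",
            "mov {y}, rcx",
            x = in(reg) x,
            y = out(reg) y,
            // Manual clobber: without this line the compiler is free to keep
            // a live value in rcx across the block, e.g. after inlining.
            out("rcx") _,
            options(pure, nomem, nostack),
        );
    }
    y
}

/// Fallback so the sketch still compiles on other architectures (assumption).
#[cfg(not(target_arch = "x86_64"))]
pub fn double_via_rcx(x: u64) -> u64 {
    x.wrapping_shl(1)
}
```

Nothing checks that the `out("rcx") _` line matches what the template actually does, which is exactly the class of bug described upthread.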

No, generic types would not be allowed (except *mut T-style pointers and references). We can't really make it a trait bound since anyone could implement that trait on their type and we only accept integers, pointers, floats and SIMD vectors as asm operands.
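
The non-generic version of the example upthread would then look roughly like this. It is a sketch assuming the `asm!` macro as it exists in `std::arch` today and x86_64; the function name `add_to_self` and the non-x86 fallback are made up for illustration.

```rust
/// Hypothetical concrete-type counterpart of `only_call_with_small_ints`:
/// the operand has a fixed integer type, so the compiler knows which
/// register class and width to use before monomorphization.
#[cfg(target_arch = "x86_64")]
pub fn add_to_self(mut x: u64) -> u64 {
    use std::arch::asm;

    // SAFETY: only the operand register and the flags are modified.
    unsafe {
        asm!("add {0}, {0}", inout(reg) x, options(pure, nomem, nostack));
    }
    x
}

/// Fallback so the sketch still compiles on other architectures (assumption).
#[cfg(not(target_arch = "x86_64"))]
pub fn add_to_self(x: u64) -> u64 {
    x.wrapping_add(x)
}
```

Supporting several integer widths this way means one function (or one macro expansion) per type, rather than one generic function.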

GCC's asm volatile is actually completely unrelated to volatile memory. It just indicates whether an asm block has side effects. See GCC's documentation.

The one issue with "system assembler" is when it comes to Windows.

The closest Windows has to a "system assembler" is MASM, which comes with Visual Studio. However, according to a 2010 post by Chris Lattner:

Nope, llvm's .s output is only compatible with GAS and other at&t syntax assemblers. It turns out that MASM syntax is highly ambiguous and MASM is not production quality for use by a compiler. This is why visual studio doesn't go through it. Long term, we'd like LLVM to be able to write out .o files directly [..]

In the 9 years since then, LLVM has indeed become able to write out .o files directly, but its assembly format is still based on GAS (GNU as), and incompatible with MASM. This applies to both the assembly LLVM generates (e.g. with --emit asm) and the assembly it accepts in inline assembly blocks. The incompatibility doesn't have to do with instructions themselves: MASM uses Intel syntax whereas GAS defaults to AT&T syntax, but GAS and LLVM's assembler both support switching to Intel syntax with the .intel_syntax directive.

The issue is with things like assembler directives and the syntax for declaring symbols. For example, to declare a symbol named msg containing a nul-terminated string:

MASM syntax: msg db "Hello", 0
GAS/LLVM syntax: msg: .asciz "Hello"

Now, you could argue that MASM isn't a true "system assembler" anyway. According to the post, not even MSVC goes through it – I don't have the expertise to confirm that or know whether it's changed since then, although I doubt it's changed. (However, MSVC does support emitting "listing" files that at least look like MASM-flavored assembly.) And Windows has a history of developers using a variety of assemblers rather than standardizing on one; in addition to MASM there's NASM, FASM, TASM, etc., which have their own custom syntaxes. If there's no "system assembler", then we could be justified in ignoring MASM and continuing to use GAS syntax.

Alternately, we might say MASM does count as a system assembler but ignore it anyway. It's more convenient to use the syntax already supported by LLVM, and I doubt many people are particularly attached to MASM and its "highly ambiguous" syntax, or enthusiastic about having to use different syntax depending on the OS. But such a decision does feel somewhat arbitrary and Unix-biased.

That's a very good point. Maybe we should explicitly say that the assembler syntax is that of GAS (with .intel_syntax on x86)? I don't think that it's too big of a portability issue since there are 2 implementations of it: one in binutils and one in LLVM.
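
To show what the two GAS flavours look like side by side, here is a sketch assuming the `asm!` macro as it exists in `std::arch` today, where Intel syntax is the default on x86 and `options(att_syntax)` switches to AT&T. The function names and the non-x86 fallbacks are made up for illustration.

```rust
/// Hypothetical example: load 42 using GAS Intel syntax (destination first).
#[cfg(target_arch = "x86_64")]
pub fn forty_two_intel() -> u64 {
    use std::arch::asm;

    let x: u64;
    // SAFETY: the instruction only writes the output register.
    unsafe {
        asm!("mov {0}, 42", out(reg) x, options(nomem, nostack, preserves_flags));
    }
    x
}

/// Hypothetical example: the same instruction in GAS AT&T syntax
/// (source first, `$` prefix on immediates, `%` prefix on registers).
#[cfg(target_arch = "x86_64")]
pub fn forty_two_att() -> u64 {
    use std::arch::asm;

    let x: u64;
    // SAFETY: the instruction only writes the output register.
    unsafe {
        asm!(
            "movq $42, {0}",
            out(reg) x,
            options(att_syntax, nomem, nostack, preserves_flags),
        );
    }
    x
}

/// Fallbacks so the sketch still compiles on other architectures (assumption).
#[cfg(not(target_arch = "x86_64"))]
pub fn forty_two_intel() -> u64 {
    42
}

#[cfg(not(target_arch = "x86_64"))]
pub fn forty_two_att() -> u64 {
    42
}
```

Both variants are GAS syntax; neither would be accepted by MASM once directives or symbol declarations are involved.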

That would be acceptable if the eventual inline asm documentation included guidance on how to obtain an appropriate assembler for various host machines and various target machines.

IMHO defaulting to volatile is more consistent with the Rust philosophy of safe defaults. Personally, I've almost never used a non-volatile asm block either...
