[Pre-RFC]: Inline assembly

I’ll just refer readers to an earlier post on the matter.

I also suggest we drop AT&T x86 syntax support. I do not want alternative back ends to require support for multiple syntaxes. That is also helpful for readers of Rust code, since they would only have to deal with one x86 syntax.

That seems like it would introduce a lot of redundancy when inputs, outputs, clobbered registers, etc. occur more than once, with all the usual problems of redundancy.

Just specify it on the first usage, not on every instruction.

It can be flexible - some may want to specify all of them at the end of the block, some may want to interleave inputs/outputs/clobbers with the asm code itself.

But each input/output/clobber should be specified only once per asm block.

This looks pretty good, I'm happy that someone is picking this up.

  • Others have mentioned this too: it should be added to the drawbacks section that assembly code using {, } is harder/confusing to write. Why do I say confusing? I definitely always mess up the number of $ in my immediates when using inline assembly.

  • I like the explicit specification of lateout/inlateout, but I don't like the syntax. I think the “late” part should be specified separately from the inout part.

  • GCC doesn't allow you to specify an input and a clobber of the same register. We need to either just support this in a logical way, or error on it and have another way to specify what's needed. For example an inclobber direction. inout is not sufficient because inout will presumably require the binding be mutable. inclobber would also allow an immutable binding which becomes invalid (as if moved) after the assembly block.

Nits:

  • First example of flags(volatile) missing closing parens.
  • “Names should be speaking” do you mean “Names should be words”?

@Amanieu

Some architectures use special characters in register name, so it might be better to put register names in quotes: in("eax").

This is also a nice way to distinguish between general and specific constraints, i.e. reg vs. "r10".

@Amanieu

Figuring out a short name for some constraints is not trivial; it might be easier to just stick with the existing GCC single-letter constraints. In particular, some constraints can be very complex:

I think the idea is to not allow these kinds of constraints and force the programmer to just use explicit register names if they want anything more advanced than just any GPR.

@Amanieu

An asm! with no outputs is meaningless if the volatile flag isn’t specified. It should be a compile-time error for a non-volatile asm! to have no outputs. Previous discussion

+1. It would also be nice if the unused lint applied to asm statements as well, because specified but unused outputs will get optimized away too.

@Amanieu

Template argument modifiers are absolutely required in practice. I make heavy use of them in my code (ARM64 assembly). I think that we can just reuse LLVM’s single-letter modifiers here since these are used in the format string: mov {0:w}, {1:x}

I've never used this, could you please explain what this does?

@Amanieu

I suggest adding an additional direction specification tmp to deal with temporary registers and clobbered inputs:

+1

@main

I would generally separate two kinds of constraints: Those that select one specific register (eax) and those that merely constrain the compiler’s selections to a set of registers (reg). Parameters that are not directly referenced (“excess parameters”) should only be allowed if they belong to the first group.

+1. I'd also like to propose that explicit register constraints can't be used for replacement in the template. After all, the programmer already knows what to write in the template.

@matthieum

I would generally expect an asm! call to be extremely platform specific; would it make sense to restrict the usage of the asm! macro to functions which are platform specific, for example, so that it is made clear that this piece of code is only valid for x86/x86_64 and cannot be compiled to ARM?

This meshes well with the portability lint proposal

@Zoxc

I also suggest we drop AT&T x86 syntax support.

Please no.

I think this is fine since, even in ARM assembly, use of { and } is quite rare. The main benefit is that it matches the Rust format string syntax and you will probably need braces anyways to specify template modifiers.

LLVM/Clang actually allows an input & clobber of the same register. What it doesn't allow is an output and clobber of the same register.

Also your proposed inclobber constraint is the same as the tmp constraint that I suggested in my previous post.
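For comparison, here is a minimal sketch of how a temporary register and a consumed input can be written in today's stable asm! syntax, which differs from the syntax proposed in this thread. The function name times_six is hypothetical, the assembly is x86-64 only, and a portable fallback is assumed for other targets. `out(reg) _` plays the role of the suggested tmp direction, and `inout(reg) x => result` accepts an immutable binding as input while writing the final register value somewhere else.

```rust
// Hypothetical helper: computes x * 6 (wrapping) with one scratch register.
#[cfg(target_arch = "x86_64")]
fn times_six(x: u64) -> u64 {
    let result: u64;
    // SAFETY: the block only touches the registers declared as operands.
    unsafe {
        std::arch::asm!(
            "mov {tmp}, {x}",   // tmp = x
            "shl {tmp}, 1",     // tmp = 2 * x
            "shl {x}, 2",       // x   = 4 * x
            "add {x}, {tmp}",   // x   = 6 * x
            // Input comes from the immutable `x`; output goes to `result`.
            x = inout(reg) x => result,
            // Scratch register: allocated by the compiler, value discarded.
            tmp = out(reg) _,
            options(pure, nomem, nostack),
        );
    }
    result
}

// Portable fallback so the sketch builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn times_six(x: u64) -> u64 {
    x.wrapping_mul(6)
}
```

Calling `times_six(7)` yields 42 on either path. Each operand is declared exactly once, even though `x` and `tmp` appear in several instructions.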

It causes the general-purpose register names to be printed with a w or x prefix, which indicates the register width to use (w = 32, x = 64): mov w4, x9. This is similar to eax vs rax in x86.

The mov example I showed will effectively truncate the second argument to 32 bits while moving it.
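As a rough x86-64 analogue of the ARM64 w/x modifiers, today's stable asm! syntax (not the syntax under discussion here) offers an `e` modifier that prints the 32-bit name of a register, e.g. eax instead of rax. The sketch below is an assumption-laden illustration: low_32_bits is a hypothetical name, and the non-x86-64 fallback exists only so the example builds everywhere.

```rust
// Hypothetical helper: keeps only the low 32 bits of `src`.
#[cfg(target_arch = "x86_64")]
fn low_32_bits(src: u64) -> u64 {
    let dst: u64;
    // SAFETY: a register-to-register move with no memory or stack access.
    unsafe {
        std::arch::asm!(
            // `:e` selects the 32-bit register name. Writing a 32-bit
            // register on x86-64 zero-extends into the full 64-bit register.
            "mov {dst:e}, {src:e}",
            dst = out(reg) dst,
            src = in(reg) src,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    dst
}

// Portable fallback with the same observable behavior.
#[cfg(not(target_arch = "x86_64"))]
fn low_32_bits(src: u64) -> u64 {
    src & 0xFFFF_FFFF
}
```

The modifier changes only how the register name is printed in the template; the operand itself is still declared as a 64-bit value.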

I see a lot of potential problems with trying to abstract over the back-end’s native minilanguage for specifying what registers, memory locations, etc. can be used for each operand. The most basic problem is that it may be impossible to express “use this specific register in this instruction.” LLVM’s documentation is inadequate for me to tell whether it has this problem, but I know for a fact that GCC’s inline assembly can only target inputs or outputs to specific registers on x86, and even then only for the original 8 integer registers (not the r8-r15 extension that comes with x86-64). This is a fundamental limitation in the low-level “RTL” IR that GCC uses. The short version is that the constraint codes are defined by each individual architecture’s ultimate back end, the “machine description”, so if that has no need to express “this specific register” prior to register allocation, neither can an inline assembly operation.

Allowing arbitrary immediates to come in from the surrounding code is also risky. I see someone already pointed out how wacky the rules can get for which immediates are allowed by which instructions. The worst this can do is give you a weird error message, but it might be such a weird error message that someone thinks the compiler is buggy.

Relatedly, input/output/clobber semantics as specified by the language can’t deviate even a little tiny bit from the actual semantics understood by the back end, whatever it is, or we’ll have subtly incorrect code generation under high register pressure.

p.s. @main @zoxc I really do prefer ATT syntax to Intel syntax for x86 assembly language and I would object to dropping support for ATT syntax. This is not because of my past working on GCC, but because I learned MC68000 and SPARC assembly languages (both of which are quite similar syntactically to x86-ATT) first.

LLVM allows specifying all registers by name. Backends can change. If GCC wants to support Rust and this RFC ends up being how Rust does inline assembly, GCC will need to improve their inline assembly backend to have all the features needed by Rust.

You can target arbitrary registers with GCC; it just requires a completely different construct (register variables):

https://godbolt.org/g/3qmgzX

I would generally expect an asm! call to be extremely platform specific; would it make sense to restrict the usage of the asm! macro to functions which are platform specific, for example, so that it is made clear that this piece of code is only valid for x86/x86_64 and cannot be compiled to ARM?

The language already has the tools to do that (e.g. conditional compilation #[cfg(target_arch = "powerpc64")]). The toolchain also detects many errors (e.g. embedding ARM inline assembly into an x86 binary fails to compile) but these errors are reported late (during LLVM translation, while invoking the assembler, etc.).
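A minimal sketch of that conditional-compilation pattern, written with today's stable asm! syntax rather than the syntax proposed in this thread. The function name wrapping_add_u32 is hypothetical; the assembly path is compiled only on x86 and x86_64, and every other target gets a plain Rust fallback.

```rust
// Compiled only when targeting x86 or x86_64.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn wrapping_add_u32(a: u32, b: u32) -> u32 {
    let sum: u32;
    // SAFETY: only the declared registers are read or written.
    unsafe {
        std::arch::asm!(
            "add {a:e}, {b:e}", // 32-bit add; wraps on overflow
            a = inout(reg) a => sum,
            b = in(reg) b,
            options(pure, nomem, nostack),
        );
    }
    sum
}

// Compiled on every other architecture, so the crate still builds there.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn wrapping_add_u32(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}
```

Callers see a single function regardless of target. Dropping the fallback would make the crate fail to build on other architectures with an unresolved-name error, which is at least reported early by rustc itself.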

The language "run-time" will be able to give you more precise information about which assembly instructions you can actually execute. For example, the coresimd library already provides run-time feature detection on x86 in core::, so you can use that to detect AVX2 support at run-time and execute some assembly instructions only if the host supports them.
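A sketch of run-time dispatch along those lines, assuming today's std::arch::is_x86_feature_detected! macro (which lives in std in current Rust). The function names are hypothetical, and the AVX2 path is ordinary Rust compiled with the feature enabled rather than hand-written assembly, to keep the example short.

```rust
// Portable path, usable on any target.
fn sum_portable(values: &[u32]) -> u32 {
    values.iter().fold(0u32, |acc, &v| acc.wrapping_add(v))
}

// Same loop, but compiled with AVX2 enabled for this one function.
// It must only be called after AVX2 support has been confirmed.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(values: &[u32]) -> u32 {
    values.iter().fold(0u32, |acc, &v| acc.wrapping_add(v))
}

// Picks an implementation when called, based on the host CPU.
fn sum(values: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: the check above confirmed that the host supports AVX2.
            return unsafe { sum_avx2(values) };
        }
    }
    sum_portable(values)
}
```

The cfg attribute decides what is compiled, while the feature check decides what is executed; inline assembly that uses AVX2 instructions would need both.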

What rustc doesn't have is a way to verify that some assembly code is going to compile successfully, for example, that you generate x86 assembly only in a part of the code that is protected by a #[cfg(any(target_arch = "x86", target_arch = "x86_64"))] attribute. What rustc also doesn't have is a way to verify that some assembly code is safe.

There has been some progress towards eliminating common uses of unsafe with respect to using intrinsics, e.g., see RFC 2122, but extending the approach being pursued there to inline assembly would be a very long shot.

I agree that the issue you raise is a problem. But I don't think it is a problem worth solving "right now". Solving it requires a lot of work (rustc uses LLVM for inline assembly, so we would at least need to teach it to use the appropriate assembler to verify inline assembly "early"). Also, in all cases I can think of, if you screw up, the code won't compile. That's bad, but it's not horrible, particularly given that the asm! macro will be unsafe and that everyone should be assuming that any code that uses the asm! macro is not portable.

We could always solve this later in a backwards compatible way by adding a checker for inline assembly code. This checker could warn on portability issues (e.g. asm! macro invocations not protected by a cfg(target_arch)) and produce errors on broken assembly (if the code was already broken it is not backwards incompatible to error on it).

I think that the proposal must state what is an error and what is a warning, describe the current state of affairs, and maybe discuss what can be done to improve it. But "better error messages" is a quality-of-implementation issue, and I wouldn't want to block the stabilization of inline assembly on that.

To offer better error messages we need to verify the inline assembly in rustc somehow, e.g., by directly using the system's assembler, parsing its errors, and trying to expose them in a good way (with line information, etc.). Even if we do that, there might be cases in which the LLVM backend calling the system assembler still fails. All of this is doable, but it is a lot of work, and nobody has volunteered to do it. This doesn't mean that error messages must suck forever either; maybe by stabilizing the syntax we'll give inline assembly more exposure, and the current error messages will prompt somebody motivated enough to improve them.

Rust already reports errors from inline assembly: https://play.rust-lang.org/?gist=35b145b367eafe528018f4237cdb9e44&version=nightly

The LLVM backend passes the error messages back to rustc so that rustc can display them.

The only (minor) downside is that this checking is not done for cargo check, only a full cargo build. I think this is fine as it is.

This is my fault for not filing the bugs, but I believe that anybody writing any amount of inline assembly was either extremely lucky or must have hit a lot of rough edges. These are some examples I was able to come up with in 2 minutes:

Ah sorry I misunderstood. You are talking about error-checking on the constraint placeholders rather than the assembly code itself.

I agree, the current state of affairs is pretty poor in that regard. Resolving it is pretty simple, however: all we need to do is port the constraint validation code that Clang uses to rustc.

I was more concerned about portability than errors.

I was thinking that guaranteeing an error if the assembly is not used in a cfg(target_arch)-enabled function (somehow) would in turn make it much easier to have an analysis pass on crates that could automatically identify which architectures a crate supports.

@hanna-kruppe

I guess this is a classic "how much shall we lean on the language vs programmer?" question... I don't have strong feelings, but I do prefer to lean on the language a bit more...

To be sure, I agree that commenting well is important and that the developer should read carefully, but assembly is tricky, and it's easy to forget/miss stuff. For me personally, I would find it very helpful to have some help from the language.

@main

Hmm... I see your point. I guess my complaint splits into 2 parts:

  • Not all of an inline assembly block is "homogeneous" in some sense, I think. What I mean is that while some block of inline asm may have some property, I think it is also possible for a block to have a property specifically because of a single instruction. For example, the whole block may clobber memory because of a single instruction that clobbers memory. If that instruction were removed, the block would not need the clobber any more. In such a case, I think it is useful for maintenance and documentation purposes to put the clobber on only that instruction.

  • Using positional arguments to define the interface is a huge pain (IMHO). There should be a more structured way of defining the interface.

I personally prefer ATT syntax, mostly because that's what I learned, and I've gotten used to it from GCC. I don't have a really strong attachment, though. If we decided to only use Intel syntax, though, would everyone have to switch from GCC to something else? I don't know how that would work....

I started off actually writing my assembly in a separate .S file and later moved to inline assembly. Since the assembly was already "correct", I didn't really see these errors :P

Perhaps others have had the same experience?

@zackw has a good point about the capabilities of the backend that has to actually make the inline asm work. I think it would be prudent to try and explore what limitations or behavioral differences future backends are likely to have and see if we can design around those. For example, if gcc can’t handle inputs in certain registers, can rustc automatically insert workarounds for this? If not, is gcc likely to lift this limitation if there’s enough demand? This is something I’d like the eventual RFC to at least talk about, although I realize that an answer probably requires enormous amounts of research and experience.

The compiler can't check the annotations you're suggesting, at least not without knowing the machine state, instruction semantics, and assembler syntax (which won't happen). I gave one example of this earlier (x86 mul writes EDX and EAX) but there are myriad more. Realistically, what you're proposing can only provide a formal framework for the programmer to correlate their unchecked constraints, clobbers, and flags with specific instructions (just as they might otherwise do in comments or by visual inspection). The only thing the compiler can do for them is accumulate these into a single list for the whole inline asm. It's possible that this is enough of an advantage to justify offering it as a variant, but it's certainly not a slam-dunk case.
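To make the mul case concrete, here is a sketch in today's stable asm! syntax (again, not the syntax proposed in this thread). The 64-bit form of mul implicitly reads RAX and writes RDX:RAX, and nothing in the template text tells the compiler so; the programmer has to declare both registers by name. The function name mul_wide is hypothetical, and the fallback is only there so the example builds on other targets.

```rust
// Hypothetical helper: full 64 x 64 -> 128-bit unsigned multiply.
#[cfg(target_arch = "x86_64")]
fn mul_wide(a: u64, b: u64) -> u128 {
    let lo: u64;
    let hi: u64;
    // SAFETY: every register the instruction touches is declared below.
    unsafe {
        std::arch::asm!(
            // `mul` names only one operand. RAX (input and low result) and
            // RDX (high result) are implicit, so they are declared by hand.
            "mul {factor}",
            factor = in(reg) b,
            inout("rax") a => lo,
            out("rdx") hi,
            options(pure, nomem, nostack),
        );
    }
    ((hi as u128) << 64) | (lo as u128)
}

// Portable fallback with the same result.
#[cfg(not(target_arch = "x86_64"))]
fn mul_wide(a: u64, b: u64) -> u128 {
    (a as u128) * (b as u128)
}
```

If the `out("rdx")` operand were omitted, the code would still assemble, but the compiler would be free to keep a live value in RDX across the block. That is exactly the kind of mistake it has no way to detect.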

Format string syntax lends itself to named arguments quite nicely: for example, asm!("mov rsi, {ptr} [...]", ptr=slice.as_ptr());
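A fuller sketch of the named-argument style, written in today's stable asm! syntax, where each named argument also carries its direction and register class. This is an illustration under assumptions, not the syntax proposed above: first_element is a hypothetical name, the assembly is x86-64 only (Intel syntax), and a fallback covers other targets.

```rust
// Hypothetical helper: loads the first element of a slice via a named pointer operand.
#[cfg(target_arch = "x86_64")]
fn first_element(slice: &[u64]) -> Option<u64> {
    if slice.is_empty() {
        return None;
    }
    let value: u64;
    // SAFETY: the slice is non-empty, so `ptr` points at a readable u64.
    unsafe {
        std::arch::asm!(
            "mov {value}, qword ptr [{ptr}]",
            ptr = in(reg) slice.as_ptr(),
            value = out(reg) value,
            options(pure, readonly, nostack, preserves_flags),
        );
    }
    Some(value)
}

// Portable fallback so the sketch builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn first_element(slice: &[u64]) -> Option<u64> {
    slice.first().copied()
}
```

The names appear in the template exactly as they would in a format string, so reordering the operand list does not change the meaning of the assembly.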

This is more of what I was going for, and I agree that it's not ideal, but I think better than nothing...

Yes, but this is more verbose, and I can't imagine it scaling well if you have to name all intermediate values that get passed between instructions (e.g. it seems like we would end up with ptr, ptr1, ptr2, etc., which seems like it should be considered bad practice). Using positional arguments scales but is annoying for the reasons I mentioned before...