[Pre-RFC #2]: Inline assembly

Lokathor · November 20, 2019, 7:32pm

Volatile asm isn't naturally safer, and again it's the opposite of what everyone else coming to Rust will expect. It also puts people who start on Rust and then move to other languages in a bad spot.

If we're using existing assemblers and existing syntax, there should be a very strong case about not using other existing conventions.

mcy · November 20, 2019, 8:50pm

Can you clarify? When working with inline assembly, the "safest" mode you can operate in is one in which you tell the compiler to be maximally conservative. IMO non-volatile assembly should only ever be used in situations where being able to treat the black box as "pure" provides tangible performance gains; having pure as the default is kind of asking for trouble.

atagunov · November 20, 2019, 10:17pm

auto-detect clobbering of (some of?) the flags?
be conservative: if in doubt consider them clobbered

The hope is that this would:

have minuscule negative impact on resulting code performance
make writing asm! easier

gbutler · November 20, 2019, 10:25pm

Is that necessarily a useful goal? Wouldn't write-once/read-many apply more? That is, being as explicit as possible might be worse for writing, but better for reading and understanding, as well as for correctness.

atagunov · November 20, 2019, 10:29pm

Of course! But for flags?.. Carry, Zero, Sign, Overflow? Even direction.. Would you rather I was explicit about them when I write asm! that you're going to read?..

zackw · November 20, 2019, 10:45pm

Something's been bugging me about this syntax and I finally figured out what it is. You can have an operand that's a sym, an imm, or a register. Only registers accept direction specifications. But with a register, you write the direction first and the register (which will often be just reg, which looks like a keyword from the same class as imm and sym) second. This is likely to be a source of confusion. For instance, it confused me into not being able to figure out how to write a sym operand for about five minutes while I was writing comment #43 on this thread, because I was looking at the wrong part of the BNF.

Can we please swap the positions of dir_spec and reg_spec in reg_operand? Making no other changes, i.e.

reg_operand :=  reg_spec "(" dir_spec ")" operand_expr

Rewriting some of the examples from the proposal in this form:

asm!("mov {}, 5", reg(out) x);

asm!("
        mov {0}, {1}
        add {0}, {2}
    ", reg(out) o, reg(in) i, imm 5);

asm!("
        mov {o}, {i}
        add {o}, {number}
    ", o = reg(out) o, i = reg(in) i, number = imm 5);

asm!("add {0}, {number}", reg(inout) x, number = imm 5);

asm!("out 0x64, {}", "eax"(in) cmd);

asm!(
    "cpuid",
    "eax"(in) 4, "ecx"(in) 0,
    "ebx"(lateout) ebx, "ecx"(lateout) ecx,
    "eax"(lateout) _, "edx"(lateout) _
);
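To compare the two orderings side by side, here is a minimal sketch that models the three operand kinds and renders a register operand both ways. The `Operand` and `Dir` types and the `render` function are hypothetical names made up for illustration; they are not part of the proposal or of any compiler API.

```rust
/// Direction of a register operand (a subset of the proposal's dir_spec).
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum Dir { In, Out, InOut, LateOut }

impl Dir {
    fn as_str(self) -> &'static str {
        match self {
            Dir::In => "in",
            Dir::Out => "out",
            Dir::InOut => "inout",
            Dir::LateOut => "lateout",
        }
    }
}

/// The three operand kinds; only `Reg` carries a direction.
#[allow(dead_code)]
enum Operand<'a> {
    Sym(&'a str),
    Imm(i64),
    Reg { dir: Dir, spec: &'a str, expr: &'a str },
}

/// Render an operand as text; `reg_first` selects the swapped ordering.
fn render(op: &Operand<'_>, reg_first: bool) -> String {
    match op {
        Operand::Sym(path) => format!("sym {}", path),
        Operand::Imm(value) => format!("imm {}", value),
        Operand::Reg { dir, spec, expr } => {
            if reg_first {
                format!("{}({}) {}", spec, dir.as_str(), expr)
            } else {
                format!("{}({}) {}", dir.as_str(), spec, expr)
            }
        }
    }
}
```

With `Operand::Reg { dir: Dir::Out, spec: "reg", expr: "x" }`, `render(&op, false)` gives the ordering currently in the proposal and `render(&op, true)` gives the swapped one, so the same operand list can be printed in both styles for comparison.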

mark-i-m · November 20, 2019, 11:31pm

Personally, I don't think it should be a goal to make writing asm super ergonomic, especially with hard-to-impl features like inferring clobbers, for the following reasons:

When I write assembly, it is usually for weird things like modifying MSRs or interrupt delivery. Trying to make everything implicit sounds dangerous here.
Modifying things like flags is especially dangerous because on some platforms (e.g. x86), they can do important things like enabling interrupts. Calling things like that out to the reader seems worthwhile.
Writing out the clobbers makes me think twice about the correctness of my assembly, especially with respect to concurrency
asm is kind of the dreaded (but necessary) last resort that I avoid using. If we are going to put a lot of effort into ergonomics, I would rather it go to something like the borrow check that I interact with all the time.

josh · November 21, 2019, 1:28am

"eax"(in) or reg(in) reads very strangely to me, much like index[array] rather than array[index].

Also, the direction seems like by far the most important information about an operand, and having it first helps to quickly survey an assembly statement. It feels easier to quickly distinguish in(reg) and out(reg) rather than distinguishing reg(...) and reg(...).

I don't think most people will look at the BNF to write assembly, and we can easily make this clear in documentation.

Another way of looking at it: I feel like imm and sym and in and out and similar are the top-level tags for each argument, and then (reg) or ("eax") is a detail of the in/out/inout/lateout/etc. (where you want the value to be: an arbitrary register or a specific register).

(Random idea: I wonder if we could make (reg) optional, and allow in expr to mean the same as in(reg) expr. Then you'd only need to specify a parenthetical if you need a specific register, or in the future, memory.)

gbutler · November 21, 2019, 2:48am

Honestly, yes. I've found over the years, working across many different languages and styles of programming that I ALWAYS regret not being explicit when I can be. I've had to improve, maintain, resurrect, finagle, and jury-rig all kinds of what can best be described as "Garbage Code" over the years and most of the "Garbage" comes from lack of explicitness, stringly-typed things, lack of proper constraints (database, types, etc, etc.). When things are explicit, the relationships become clear and apparent and refactoring and re-engineering are MUCH, MUCH, MUCH easier.

Basically, after 25+ years of software development, I consider that anyone who says implicit is better hasn't had to actually maintain complex software for any significant length of time, nor have they had to clean up others' messes very much.

Amanieu · November 21, 2019, 11:34am

I've expanded that section with an example of what the generated code could look like:

Amanieu:

Difficulty of support

Inline assembly is a difficult feature to implement in a compiler backend. While LLVM does support it, this may not be the case for alternative backends such as Cranelift (see this issue).

However it is possible to implement support for inline assembly without support from the compiler backend by using an external assembler instead. Take the following (AArch64) asm block as an example:

unsafe fn foo(mut a: i32, b: i32) -> (i32, i32)
{
    let c;
    asm!("<some asm code>", inout(reg) a, in("x0") b, out("x20") c);
    (a, c)
}

This could be expanded to an external asm file with the following contents:

# Function prefix directives
.section ".text.foo_inline_asm"
.globl foo_inline_asm
.p2align 2
.type foo_inline_asm, @function
foo_inline_asm:

// If necessary, save callee-saved registers to the stack here.
str x20, [sp, #-16]!

// Move the pointer to the argument out of the way since x0 is used.
mov x1, x0

// Load input values
ldr w2, [x1, #0]
ldr w0, [x1, #4]

<some asm code>

// Store output values
str w2, [x1, #0]
str w20, [x1, #8]

// If necessary, restore callee-saved registers here.
ldr x20, [sp], #16 

ret

# Function suffix directives
.size foo_inline_asm, . - foo_inline_asm

And the following Rust code:

unsafe fn foo(mut a: i32, b: i32) -> (i32, i32)
{
    let c;
    {
        #[repr(C)]
        struct foo_inline_asm_args {
            a: i32,
            b: i32,
            c: i32,
        }
        extern "C" {
            fn foo_inline_asm(args: *mut foo_inline_asm_args);
        }
        let mut args = foo_inline_asm_args {
            a: a,
            b: b,
            c: mem::uninitialized(),
        };
        foo_inline_asm(&mut args);
        a = args.a;
        c = args.c;
    }
    (a, c)
}
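The marshalling pattern above can be tried without an external assembler. The following is a minimal sketch under stated assumptions: the body of `foo_inline_asm` is a hypothetical pure-Rust stand-in for the assembled routine (it adds `b` to `a` and writes 20 to `c`, both arbitrary choices for illustration), and `MaybeUninit` is used in place of `mem::uninitialized()`, which has since been deprecated.

```rust
use std::mem::MaybeUninit;

/// Argument block shared with the out-of-line routine. With `repr(C)` the
/// field order fixes the offsets (a at 0, b at 4, c at 8) used by the
/// loads and stores in the assembly listing.
#[repr(C)]
struct FooInlineAsmArgs {
    a: i32,
    b: i32,
    c: MaybeUninit<i32>, // output only, so it starts uninitialized
}

/// Hypothetical stand-in for the externally assembled function.
/// A real build would declare it in an `extern "C"` block instead.
unsafe extern "C" fn foo_inline_asm(args: *mut FooInlineAsmArgs) {
    // Stand-in for "<some asm code>": a += b; c = 20.
    (*args).a = (*args).a.wrapping_add((*args).b);
    (*args).c = MaybeUninit::new(20);
}

unsafe fn foo(a: i32, b: i32) -> (i32, i32) {
    let mut args = FooInlineAsmArgs { a, b, c: MaybeUninit::uninit() };
    foo_inline_asm(&mut args);
    // Reading `c` is sound only because the routine always writes it.
    let c = args.c.assume_init();
    (args.a, c)
}
```

Calling `unsafe { foo(1, 2) }` goes through the same copy-in, call, copy-out sequence as the expansion above; only the callee differs.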

Amanieu · November 21, 2019, 4:34pm

I'd like some feedback on two possible approaches for handling tied/inout operands.

Option 1

This is what is currently described in the RFC:

Input/output operands are inout(reg) expr where expr is a place expression (lvalue).
We may support expr being a value expression (rvalue), which initializes the register but discards the output value (i.e. a clobbered input).

Option 2

Input/output operands are inout(reg) expr_in => expr_out, where expr_in is a value expression (rvalue) and expr_out is a place expression (lvalue).
inout(reg) expr is a shorthand for inout(reg) expr => expr except that the expression is only evaluated once.
A clobbered input can be described with inout(reg) expr => _, which matches the syntax for discarding an out.

In both cases

in("eax") expr1, lateout("eax") expr2 works as a way of implicitly tying 2 operands through the same fixed register. It makes sense to allow this since it is valid for in(reg) expr1, out(reg) expr2 to be assigned to the same register. "eax" acts as a register class containing only one register.
in("eax") expr1, in("eax") expr2 is not allowed. Same with out/out and in/out. Only in/lateout can share the same register.
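The sharing rule for fixed registers can be written down as a small check. This is only a sketch of the rule as stated in this post, covering in, out and lateout; `Dir` and `check_fixed_regs` are hypothetical names, and nothing here reflects how a compiler would actually implement it.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Debug)]
enum Dir { In, Out, LateOut }

/// Accept an operand list only if every fixed register is used once,
/// or is shared by exactly one `in` and one `lateout` operand.
fn check_fixed_regs(operands: &[(Dir, &str)]) -> Result<(), String> {
    let mut seen: HashMap<&str, Vec<Dir>> = HashMap::new();
    for &(dir, reg) in operands {
        let dirs = seen.entry(reg).or_default();
        dirs.push(dir);
        let ok = match dirs.as_slice() {
            // A single use of a register is always fine.
            [_] => true,
            // The only permitted sharing: one input plus one late output.
            [Dir::In, Dir::LateOut] | [Dir::LateOut, Dir::In] => true,
            _ => false,
        };
        if !ok {
            return Err(format!("register {} has conflicting operands: {:?}", reg, dirs));
        }
    }
    Ok(())
}
```

Under this check the operand list of the cpuid example earlier in the thread passes, since eax and ecx are each used by one in and one lateout, while in/in, out/out and in/out on the same register are rejected.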

josh · November 21, 2019, 5:03pm

I don't think we should support inout(reg) expr with an rvalue and treat it as an implicit clobber, any more than we should support out(reg) expr with an rvalue; that seems error-prone. rvalues should only work with in(reg).

I do like the proposed => syntax. Would that work with both inout and inlateout?

atagunov · November 21, 2019, 5:27pm

For completeness (and in @zackw's spirit) how about this?

reg in(expr_in) out(expr_out)
reg inout(expr) //shorthand for reg in(expr) out(expr)

comex · November 22, 2019, 12:10am

The following is probably a bad idea, but it's been kicking around in my head so I thought I might as well mention it. There's been a lot of discussion about syntax, so... what if we didn't add new syntax?

Starting point:

fn inline_asm<const FMT: &'static str>() { ... }

Example usage:

inline_asm::<"nop">();

Well, that's kind of ugly; it would be better to implement some kind of "const arguments" feature so it could be inline_asm("nop") instead. Such a feature would be useful for other things too; SIMD intrinsics currently use an unstable feature to imitate it (rustc_args_required_const). But for now let's go with the existing syntax.

What about constraints? They could be passed as arguments to the inline_asm function. Ideally the function would be variadic so you could pass any number of constraints. Well, no need for new language features for that; it can already be simulated:

struct inline_asm<const FMT: &'static str>;

impl<Args: TupleOfInlineAsmConstraints,
     const FMT: &'static str>
    FnOnce<Args> for inline_asm<{FMT}> {
    type Output = ();
    extern "rust-call" fn call_once(self, args: Args) {
        // call intrinsic here
    }
}

where TupleOfInlineAsmConstraints is defined like this:

trait InlineAsmConstraint { }

trait TupleOfInlineAsmConstraints {}
// For now, manually implement for different sizes of tuple...
// size 0
impl TupleOfInlineAsmConstraints for () {}
// size 1
impl<T1: InlineAsmConstraint>
    TupleOfInlineAsmConstraints for (T1,) {}
// size 2
impl<T1: InlineAsmConstraint, T2: InlineAsmConstraint>
    TupleOfInlineAsmConstraints for (T1, T2) {}
// etc...

What do the constraints themselves look like? Something like this:

struct InReg<T, const NAME: &'static str>(T);
struct OutReg<'a, T, const NAME: &'static str>(&'a mut T);
struct InOutReg<'a, T, const NAME: &'static str>(&'a mut T);

Out and in-out variants take a mutable reference because they mutate their argument. The assembly code itself would not receive a reference; it would receive a register name that you write into, like usual. On one hand, this could be confusing. On the other hand, with the existing proposals, the idea that passing out(reg) x causes x to be mutated is, IMO, also confusing. There are very few language constructs that mutate lvalues without requiring you to explicitly take a mutable reference: the = operator and its variants, and the . operator. I can't think of any others.

Anyway, you could use constraints like this:

    let mut foo = 4;
    let mut bar = 0; // need to initialize with
                     // dummy value :\
    inline_asm::<"mov {bar}, {foo}">(
        OutReg::<_, "bar">(&mut bar),
        InReg::<_, "foo">(foo));

Ouch – for the constraints the turbofishes are even uglier. With "const arguments", the whole thing could look much nicer:

    inline_asm(
        "mov {bar}, {foo}",
        OutReg("bar", &mut bar),
        InReg("foo", foo));

Even with that, I'm not at all convinced that the benefits (of technically not adding any new syntax) outweigh the drawbacks. But I'm posting this anyway just as food for thought.
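For anyone who wants to play with the shape of this API, here is a rough sketch that avoids `&'static str` const generics by passing the names as ordinary values. It is an assumption-laden toy: the `describe` and `describe_all` methods are invented for the example, and `inline_asm` only returns a textual description of the call instead of invoking any intrinsic.

```rust
/// One operand constraint; `describe` reports its direction and name.
trait InlineAsmConstraint {
    fn describe(&self) -> String;
}

#[allow(dead_code)]
struct InReg<T>(&'static str, T);
#[allow(dead_code)]
struct OutReg<'a, T>(&'static str, &'a mut T);

impl<T> InlineAsmConstraint for InReg<T> {
    fn describe(&self) -> String { format!("in {}", self.0) }
}
impl<'a, T> InlineAsmConstraint for OutReg<'a, T> {
    fn describe(&self) -> String { format!("out {}", self.0) }
}

/// Implemented by hand for small tuples, as in the post.
trait TupleOfInlineAsmConstraints {
    fn describe_all(&self) -> Vec<String>;
}
impl TupleOfInlineAsmConstraints for () {
    fn describe_all(&self) -> Vec<String> { Vec::new() }
}
impl<T1: InlineAsmConstraint> TupleOfInlineAsmConstraints for (T1,) {
    fn describe_all(&self) -> Vec<String> { vec![self.0.describe()] }
}
impl<T1: InlineAsmConstraint, T2: InlineAsmConstraint>
    TupleOfInlineAsmConstraints for (T1, T2) {
    fn describe_all(&self) -> Vec<String> {
        vec![self.0.describe(), self.1.describe()]
    }
}

/// Stand-in for the intrinsic: returns the template and its constraints.
fn inline_asm<A: TupleOfInlineAsmConstraints>(fmt: &str, args: A) -> String {
    format!("{} [{}]", fmt, args.describe_all().join(", "))
}
```

A call then looks like `inline_asm("mov {bar}, {foo}", (OutReg("bar", &mut bar), InReg("foo", foo)))`. The extra pair of parentheses is the price of simulating variadics with a tuple in a plain function, which the `FnOnce` trick in the post avoids.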

CAD97 · November 22, 2019, 12:24am

TBF, the macro form doesn't add any new syntax either, because a macro accepts an arbitrary token stream

I think the best way to push this forward would be for someone to implement the fallback implementation (C calling convention function via external assembler) as a proc macro. Then we can implement the "optimization" of targeting LLVM style asm and gain experience using it. (And building a tool to convert between the two would lower the cost of experimentation!)

Amanieu · November 22, 2019, 12:27am

Unfortunately that's not really possible since asm! needs to know the types of input/output operands so that it can print the correct register name for a given type (e.g. on ARM s0 vs d0 for f32 and f64).
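To make the dependency on types concrete, here is a small sketch of type-directed register naming. The `FloatRegName` trait and `name_for` function are hypothetical and exist only for this example. The point is that the choice between `s0` and `d0` is made from the static type of the operand, which is information a proc macro does not have, since it only sees tokens before type checking.

```rust
/// Maps an operand type to the register name the template should print.
trait FloatRegName {
    fn reg_name(index: u8) -> String;
}

impl FloatRegName for f32 {
    // Single-precision view of the register.
    fn reg_name(index: u8) -> String { format!("s{}", index) }
}

impl FloatRegName for f64 {
    // Double-precision view of the register.
    fn reg_name(index: u8) -> String { format!("d{}", index) }
}

/// Pick the name from the static type of the operand expression.
fn name_for<T: FloatRegName>(_operand: &T, index: u8) -> String {
    T::reg_name(index)
}
```

Here `name_for(&1.0f32, 0)` and `name_for(&1.0f64, 0)` differ only in the type of the argument, yet the text substituted into the assembly template differs.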

197g · November 22, 2019, 12:48am

I like the idea. We must not forget that mimicking C (and the legacy carried over to C++) constrains us to a compilation model with very restricted constant evaluation. Envisioning a truly different design could provide much better usability. How about turning it into a standard const object available through const eval? That way it composes much better with user crates and is more orthogonal to existing features.

// Intended possible syntax, and usage:
const CPUID_SOURCE: &'static str = "
    mov $1, %eax
    cpuid
    mov %eax, 0(%{0})
    mov %ebx, 4(%{0})
    mov %ecx, 8(%{0})
    mov %edx, 12(%{0})";

// This is just a builder/descriptor. Real magic happens when this is 
// used as a const parameter to the intrinsic, see below.
const ASM: Assembly = Assembly::new()
    // Target-arch specific set of register clobbers, available like SIMD
    .with_clobbers(&[Reg::EAX, Reg::EBX, Reg::ECX, Reg::EDX])
    // Request one input register, referenced by index 0. Adds memory to
    // clobbers due to mutable reference? Maybe a worthwhile idea.
    .with_input::<&mut Cpuid>(0)
    // Request the compiler internal assembler.
    .from_source(CPUID_SOURCE);

fn cpuid() -> Cpuid {
    let mut cpuid = Cpuid::default();
    intrinsic::call_asm::<{ASM}>(&mut cpuid);
    cpuid
}

And note how this even leaves the possibility open to not require the compiler itself to do the assembly, by providing a from_relocatable_instructions finish method as an alternative to from_source.
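The builder half of this idea can be approximated with `const fn` alone. The sketch below is a simplified stand-in, not the proposed API: `Assembly`, `Reg` and the method names mirror the post, but `with_input` here only counts inputs rather than taking a type parameter, and there is no `call_asm` intrinsic, so the descriptor is built at compile time and nothing is ever assembled.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Debug)]
enum Reg { Eax, Ebx, Ecx, Edx }

/// Compile-time description of an assembly block.
#[derive(Clone, Copy)]
struct Assembly {
    source: &'static str,
    clobbers: &'static [Reg],
    inputs: usize,
}

impl Assembly {
    const fn new() -> Self {
        Assembly { source: "", clobbers: &[], inputs: 0 }
    }
    /// Record the registers the block overwrites.
    const fn with_clobbers(self, clobbers: &'static [Reg]) -> Self {
        Assembly { source: self.source, clobbers, inputs: self.inputs }
    }
    /// Request one more input register (simplified: only a count is kept).
    const fn with_input(self) -> Self {
        Assembly { source: self.source, clobbers: self.clobbers, inputs: self.inputs + 1 }
    }
    /// Attach the assembly text and finish the description.
    const fn from_source(self, source: &'static str) -> Self {
        Assembly { source, clobbers: self.clobbers, inputs: self.inputs }
    }
}

// The whole chain is evaluated at compile time.
const ASM: Assembly = Assembly::new()
    .with_clobbers(&[Reg::Eax, Reg::Ebx, Reg::Ecx, Reg::Edx])
    .with_input()
    .from_source("cpuid");
```

Because `ASM` is an ordinary constant, user crates could compose or inspect such descriptors like any other const value, which is the composability argument made above.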

spunit262 · November 22, 2019, 1:46am

We can't do it with proper syntax, but we could at least do it with mandatory type ascription as a proof of concept.

petrochenkov · November 24, 2019, 6:38pm

So, this is not directly about the assembly specifics, but keep in mind that the output of asm!() will eventually need to be a token stream rather than AST.

That means there should be some "native" syntax that the asm!() macro expands to.
That syntax may be entirely unstable and unergonomic, but it should still be parseable without ambiguities when arbitrary expressions are passed to it, should be somewhat readable, and should probably use context-dependent identifiers ("weak keywords") sparingly.

I'm not sure what constraints this puts on the syntax accepted by the asm!() macro.

(Having a native syntax also means a possibility to implement alternative "frontend" asm macros, perhaps with an alternative input/output syntax, or something like that.)

mark-i-m · November 24, 2019, 9:39pm

It seems like it could just compile to a call to some permanently unstable intrinsic that takes the asm as a string and some struct describing the inputs and outputs.
