[Pre-RFC]: Inline assembly

glandium · August 29, 2018, 3:23am

While in the middle of determining whether more assembly blocks than the two I already found are broken in the Firefox code base, I’d like to note how much of a footgun input, output and clobber definitions are, when they could be derived from the assembly itself. That was not possible in the old GCC model, where a C file is preprocessed, compiled into assembly, and then translated by an assembler into machine code. In the LLVM model, where the framework contains the assembler, this could be treated differently. It is a bummer that clang doesn’t do this for GCC-style assembly (at least not now; at the very minimum, I wish it warned about misplaced inputs/outputs and missing clobbers), but it does for MSVC-style assembly, which doesn’t come with inputs, outputs, and clobbers: there’s a chunk of code in clang that parses the assembly and generates them from that.
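
For readers less familiar with the GCC-style syntax, here is a minimal sketch of the three lists in question. The function `add_u64` is a hypothetical helper made up for illustration, and the asm path assumes x86-64 with GCC or clang; other targets fall back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: returns a + b (wrapping), for illustration only. */
uint64_t add_u64(uint64_t a, uint64_t b) {
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__("addq %[b], %[a]"   /* AT&T syntax: a += b */
            : [a] "+r"(a)       /* outputs: a is both read and written */
            : [b] "r"(b)        /* inputs: b is only read */
            : "cc");            /* clobbers: add modifies the flags */
    return a;
#else
    /* Portable fallback so the example builds on other targets. */
    return a + b;
#endif
}
```

None of the three lists is checked against the instruction text by the compiler, which is exactly the footgun: the asm string and its constraints can silently disagree.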

You might say that people writing assembly should write their inputs, outputs and clobbers correctly, but it is telling that I was able to find problems in two different third-party code bases used in Firefox, one of which is 8 years old, and that they went undetected for so long. These bugs only cause problems when things align in a certain way. The ones I found in Firefox just happened to be fine so far, but enabling LTO made things go in unexpected ways. The most interesting part is that it didn’t even break consistently across platforms, because things were not aligning the same way everywhere. The last one I’m dealing with at the moment only showed up visibly on Mac; it could very well have happened on Linux, but didn’t. That’s why I’m now doing a more systematic scan.

josh · August 29, 2018, 8:35pm

I’d love to have rustc scan assembly and attempt to give warnings if the clobbers seem incomplete. I don’t think we can do that in the completely general case (consider code that saves and restores a register), but we could provide best-effort warnings for common cases.

I don’t, however, see how we could handle most cases of inputs or outputs.

Amanieu · August 29, 2018, 8:47pm

The consensus on previous threads regarding inline asm has been to expose a low-level API first, on top of which other people can build high-level APIs such as ones that automatically derive constraints from the assembly text.

glandium · August 30, 2018, 12:33am

FWIW, the two cases I fixed in Firefox were, in large part, bad inputs/outputs. To be specific: variables declared as inputs but modified by the assembly, like a count decremented in a loop. Those need to be declared as outputs instead, with a + modifier.
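
A minimal sketch of that pattern follows. It is not the actual Firefox code; `sum_words` is a hypothetical function, and the asm path assumes x86-64 with GCC or clang. The loop advances the pointer and decrements the count, so both are declared with `+r` rather than as plain inputs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: sums n 64-bit words starting at p. */
uint64_t sum_words(const uint64_t *p, size_t n) {
    uint64_t total = 0;
    if (n == 0) {
        return 0;               /* the asm loop below requires n >= 1 */
    }
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__("1:\n\t"
            "addq (%[p]), %[total]\n\t"  /* total += *p */
            "addq $8, %[p]\n\t"          /* p++ : the asm modifies p */
            "decq %[n]\n\t"              /* n-- : the asm modifies n */
            "jnz 1b"
            /* Everything the asm writes must be an output. "+r" means
               read and written. Listing p or n as a plain "r" input
               would be the bug described above. */
            : [total] "+r"(total), [p] "+r"(p), [n] "+r"(n)
            :
            : "cc", "memory");           /* flags change; memory is read */
#else
    /* Portable fallback so the example builds on other targets. */
    while (n--) {
        total += *p++;
    }
#endif
    return total;
}
```

With plain `"r"` inputs, the compiler may assume the registers still hold the original `p` and `n` after the block and reuse them, which is the kind of thing that only goes wrong when register allocation happens to line up that way.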

josh · August 30, 2018, 2:28am

Ah, I see. Yeah, those would either need to become clobbers or (earlyclobber) input-outputs.
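
As a rough illustration of the earlyclobber part, assuming x86-64 with GCC or clang (`twice_plus` is a hypothetical function, not from any of the code bases mentioned): the `&` modifier tells the compiler that an operand is written before all inputs have been consumed, so it must not share a register with any input.

```c
#include <stdint.h>

/* Hypothetical example: returns 2*a + b (wrapping). */
uint64_t twice_plus(uint64_t a, uint64_t b) {
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t out;
    __asm__("movq %[a], %[out]\n\t"   /* out is written here ...          */
            "addq %[a], %[out]\n\t"
            "addq %[b], %[out]"       /* ... but b is still read here     */
            /* "=&r": earlyclobber output. Without the '&' the compiler
               would be free to put out and b in the same register, and
               the first mov would destroy b before it is used. */
            : [out] "=&r"(out)
            : [a] "r"(a), [b] "r"(b)
            : "cc");
    return out;
#else
    /* Portable fallback so the example builds on other targets. */
    return 2 * a + b;
#endif
}
```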

glandium · August 30, 2018, 2:32am

The downside to this approach is that the high-level stuff needs to have its own assembler parser and database of instructions, which LLVM already has.

CAD97 · August 30, 2018, 4:02am

It wouldn’t be breaking to add auto-clobbers later, though, would it? Anything that doesn’t already specify its clobbers would be UB. So not providing them now is a conservative first step that also doesn’t require all of that work yet.

hanna-kruppe · August 31, 2018, 1:51pm

Could you please link this code? I went looking for it but couldn't find it, though I am not all that familiar with clang's internal organization so I may have looked in the wrong places.

glandium · August 31, 2018, 10:00pm

https://github.com/llvm-mirror/clang/blob/abdbb605f2c3cbe63cd589da230f648535dff76b/lib/Parse/ParseStmtAsm.cpp#L395

          ///         ms-asm-block ms-asm-statement
          ///
          /// [MS]  ms-asm-block:
          ///         '__asm' ms-asm-line '\n'
          ///         '__asm' '{' ms-asm-instruction-block[opt] '}' ';'[opt]
          ///
          /// [MS]  ms-asm-instruction-block
          ///         ms-asm-line
          ///         ms-asm-line '\n' ms-asm-instruction-block
          ///
          StmtResult Parser::ParseMicrosoftAsmStatement(SourceLocation AsmLoc) {
            SourceManager &SrcMgr = PP.getSourceManager();
            SourceLocation EndLoc = AsmLoc;
            SmallVector<Token, 4> AsmToks;
          
            bool SingleLineMode = true;
            unsigned BraceNesting = 0;
            unsigned short savedBraceCount = BraceCount;
            bool InAsmComment = false;
            FileID FID;
            unsigned LineNo = 0;

zackw · September 11, 2018, 5:37pm

I agree with @glandium: it is really hard to write GCC-style input/output/clobber lists correctly, even if you know exactly what you’re doing. I don’t have much experience with MSVC-style asm blocks, but I wouldn’t be at all surprised if there were a bunch of unwritten rules you have to follow to make those do the Right Thing as well.

Anything rustc and/or clippy can do to catch mistakes ahead of time would be valuable.
