It appears that LLVM doesn’t have intrinsics for unaligned loads and stores, so the user of a high-level language can’t talk directly to LLVM to request unaligned loads and stores via something like the `link_llvm_intrinsics` feature. Instead, the compiler for the high-level language needs to provide the means to have the kind of LLVM IR generated that eventually compiles to unaligned load/store instructions.
In clang’s `emmintrin.h`, e.g. the Intel-defined `_mm_loadu_si128` SSE2 intrinsic doesn’t map to a `__builtin` call but to a dereference of a pointer to a single-member struct annotated with `__attribute__((__packed__, __may_alias__))`. The `simd` crate uses the same pattern for the same purpose with a single-member `#[repr(packed)]` Rust struct. In debug builds, this works if the result of the dereference is assigned to a local variable before extracting the single member. Using an expression without the intermediate variable fails, though. Furthermore, AFAICT, debug mode doesn’t actually emit a `MOVDQU` instruction but accomplishes the results of the computation by other means.
At present (did it work pre-MIR?), that pattern fails in release mode. The load is emitted as `MOVDQA`, which requires 16-byte alignment.
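For concreteness, the packed-struct pattern discussed above can be sketched roughly as follows. This is a minimal illustration, not the `simd` crate's actual code: `PackedU32x4` and `load_u32x4_unaligned` are hypothetical names, and `[u32; 4]` stands in for a real 128-bit SIMD type. Which instruction this compiles to depends on the compiler version and build mode, as described above.

```rust
/// Hypothetical single-member wrapper. `packed` lowers the struct's
/// alignment requirement to 1, so a pointer to it may be unaligned.
/// `[u32; 4]` is a stand-in for a 128-bit SIMD vector type.
#[repr(C, packed)]
#[derive(Clone, Copy)]
struct PackedU32x4 {
    v: [u32; 4],
}

/// Loads 16 bytes from `src`, which does not need to be aligned.
///
/// Safety: `src` must be valid for reading 16 bytes.
pub unsafe fn load_u32x4_unaligned(src: *const u8) -> [u32; 4] {
    // Dereference into a local variable first, then extract the single
    // member, mirroring the form reported above to work in debug builds.
    let tmp: PackedU32x4 = unsafe { *(src as *const PackedU32x4) };
    tmp.v
}
```

A caller would pass a pointer at any byte offset, e.g. `load_u32x4_unaligned(bytes.as_ptr().add(1))` inside an `unsafe` block.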
Given the past and the clang approach, the obvious way forward would be to make the `#[repr(packed)]` pattern work with MIR. However, making things work for packed structs generally seems over-complex considering the narrower goal of accomplishing unaligned SIMD loads/stores, and too much of an obscure incantation from the language user perspective.
From the language user perspective, it seems to me that having unaligned load and store methods on `*const` and `*mut` would be more obvious and would be consistent with the volatile variants.
Looking at the LLVM IR clang generates for `_mm_loadu_si128`, it seems that the difference between eventual `MOVDQU` and `MOVDQA` instruction generation is annotating the LLVM `load` and `store` instructions with `align 1` instead of `align 16`. It seems to me that it should be possible to add `unaligned_load` and `unaligned_store` next to `volatile_load` and `volatile_store` and make the new intrinsics generate LLVM `load` and `store` with `align 1`. Then these could be exposed on `*const` and `*mut` in the same manner as the volatile variants.
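To make the shape of the proposed API concrete, here is a rough sketch of how such methods might look from the user's side. Everything here is an assumption for illustration: the traits `UnalignedLoad` and `UnalignedStore` are hypothetical, and the bodies emulate the operation with a byte-wise copy in library code rather than with the new compiler intrinsics proposed above.

```rust
use std::mem::{self, MaybeUninit};
use std::ptr;

/// Hypothetical extension trait standing in for a method on `*const T`.
pub trait UnalignedLoad<T> {
    /// Safety: the pointer must be valid for reading `size_of::<T>()` bytes.
    unsafe fn unaligned_load(self) -> T;
}

/// Hypothetical extension trait standing in for a method on `*mut T`.
pub trait UnalignedStore<T> {
    /// Safety: the pointer must be valid for writing `size_of::<T>()` bytes.
    unsafe fn unaligned_store(self, value: T);
}

impl<T> UnalignedLoad<T> for *const T {
    unsafe fn unaligned_load(self) -> T {
        let mut tmp = MaybeUninit::<T>::uninit();
        unsafe {
            // A byte-wise copy makes no alignment assumption about `self`.
            ptr::copy_nonoverlapping(
                self as *const u8,
                tmp.as_mut_ptr() as *mut u8,
                mem::size_of::<T>(),
            );
            tmp.assume_init()
        }
    }
}

impl<T> UnalignedStore<T> for *mut T {
    unsafe fn unaligned_store(self, value: T) {
        unsafe {
            ptr::copy_nonoverlapping(
                &value as *const T as *const u8,
                self as *mut u8,
                mem::size_of::<T>(),
            );
        }
        // Ownership of the bytes moved to the destination; skip the destructor.
        mem::forget(value);
    }
}
```

With these traits in scope, a call site would read `p.unaligned_store(x)` or `p.unaligned_load()` inside an `unsafe` block, which is the same shape as the volatile variants.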
Does this seem like an OK way forward?