I came here to make the same suggestion as @scottmcm.
As far as I know, a SIMD type in memory has the same layout as a fixed-length array, except for stricter alignment. Is that correct?
In that case, to avoid a combinatorial explosion of new nominal SIMD types – i8x16, i16x8, and so on, which we would presumably have to keep extending whenever larger register sizes are introduced – we can just define a single type:
#[lang = "the_simd_type"] // hypothetical lang item name
struct Simd<T>(pub T);
This immediately lets us express any SIMD type we could ever possibly need as Simd<[i8; 16]>, Simd<[i16; 8]>, and so on, without any further definitions required.
Semantically, the Simd type behaves the same way as every newtype. You can use it with any T, put any value in, access it, and take it back out. It’s just a transparent wrapper.
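To make the "just a newtype" point concrete, here is a minimal sketch in today's Rust. It leaves out the lang attribute (so there is no special alignment handling at all), and `round_trip` is a name I made up purely for illustration:

```rust
/// Stand-in for the proposed lang item: an ordinary generic newtype.
/// The real proposal would additionally let the compiler adjust alignment.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Simd<T>(pub T);

/// Hypothetical helper: wrap values, read a field, and unwrap again,
/// exactly as with any other newtype.
pub fn round_trip() -> ([i8; 16], &'static str) {
    let v = Simd([7i8; 16]);      // a payload shaped like a valid SIMD type
    let s = Simd("not a vector"); // any other T is accepted as well
    let Simd(inner) = v;          // take the value back out by pattern matching
    (inner, s.0)                  // or by plain field access
}
```

Nothing here is SIMD-specific; the wrapper only becomes interesting once the compiler attaches alignment to the valid choices of T.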
For choices of T that correspond to valid SIMD types on the target architecture – that is, [Prim; N] where Prim is a fixed-size primitive integer or floating-point type, N is an appropriate power of 2, and the total size comes to 128, 256, 512, or however many bits the hardware supports – Simd<T> has the alignment of the corresponding machine SIMD type.
For all other choices of T, the alignment doesn’t really matter, and could be left unspecified, could be the same as that of T itself, or could be equal to size_of::<T>() (which I believe is always the case for SIMD types? so it would be consistent in that way).
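The layout claim (same size as the array, stricter alignment) can be approximated today with a hand-written, non-generic type. This is only a sketch: `F32x4` and `layout_facts` are hypothetical names, and the 16-byte alignment is my assumption for a 128-bit vector. A generic Simd<T> cannot pick its alignment from T in ordinary code, which is why the proposal needs compiler support.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical hand-written vector type. `repr(align(16))` emulates the
/// stricter alignment the compiler would choose for `Simd<[f32; 4]>`
/// (16 bytes is assumed here for a 128-bit vector).
#[repr(C, align(16))]
pub struct F32x4(pub [f32; 4]);

/// Returns (array size, array alignment, vector size, vector alignment).
pub fn layout_facts() -> (usize, usize, usize, usize) {
    (
        size_of::<[f32; 4]>(),  // four f32 lanes
        align_of::<[f32; 4]>(), // same alignment as a single f32
        size_of::<F32x4>(),     // same size as the array
        align_of::<F32x4>(),    // raised to 16 by repr(align)
    )
}
```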
This immediately gives us SIMD vector construction in terms of plain array construction – per-element Simd([1, 2, 3, 4]), splatting Simd([42; 4]) – and deconstruction and lane extraction as array element access: simd.0[0], simd.0[1], simd.0[2], simd.0[3], and so on.
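A small sketch of construction and lane access, again using a plain newtype as a stand-in for the lang item (the function names are made up for the example):

```rust
/// Plain newtype standing in for the proposed lang item.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Simd<T>(pub T);

/// Builds one vector lane by lane, one by splatting, and reads a single lane.
pub fn build_and_read() -> (Simd<[i32; 4]>, Simd<[i32; 4]>, i32) {
    let per_lane = Simd([1, 2, 3, 4]); // per-element construction
    let splat = Simd([42; 4]);         // the same value in every lane
    let third = per_lane.0[2];         // lane extraction is array indexing
    (per_lane, splat, third)
}

/// Deconstructs all four lanes at once with an array pattern.
pub fn lane_sum(v: Simd<[i32; 4]>) -> i32 {
    let Simd([a, b, c, d]) = v;
    a + b + c + d
}
```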
Now the important bit: all of the exposed intrinsics would still be defined in terms of concrete choices of types. That is, with typedefs for convenience:
type f32x4 = Simd<[f32; 4]>;
type f64x2 = Simd<[f64; 2]>;
...
// I'm not familiar with all the vendor intrinsic names so I made some up for example's sake
fn _mwhatever_add_four_floats(a: f32x4, b: f32x4) -> f32x4;
fn _mwhatever_add_two_doubles(a: f64x2, b: f64x2) -> f64x2;
...
The win is just that we don’t need to define a completely separate struct type for every single valid SIMD type, along with all of its construction, casting, and similar operations.
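Roughly how the aliases and the concrete, non-generic operations fit together. The function names are invented, and the bodies are scalar stand-ins I wrote so the sketch compiles; real intrinsics would map to single machine instructions:

```rust
/// Plain newtype standing in for the proposed lang item.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Simd<T>(pub T);

// Aliases only: no new struct definitions are needed per vector type.
#[allow(non_camel_case_types)]
pub type f32x4 = Simd<[f32; 4]>;
#[allow(non_camel_case_types)]
pub type f64x2 = Simd<[f64; 2]>;

/// Hypothetical name; scalar stand-in for a four-lane f32 add intrinsic.
pub fn add_four_floats(a: f32x4, b: f32x4) -> f32x4 {
    let (x, y) = (a.0, b.0);
    Simd([x[0] + y[0], x[1] + y[1], x[2] + y[2], x[3] + y[3]])
}

/// Hypothetical name; scalar stand-in for a two-lane f64 add intrinsic.
pub fn add_two_doubles(a: f64x2, b: f64x2) -> f64x2 {
    Simd([a.0[0] + b.0[0], a.0[1] + b.0[1]])
}
```

Each operation accepts exactly one concrete instantiation, so passing a Simd<[f64; 2]> to add_four_floats is a type error, just as it would be with separate nominal types.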
If and when the type system is suitably extended with e.g. const generics, it will also be straightforward to define the signatures of generic operations such as fn simd_mul<T: Primitive, const N: usize>(a: Simd<[T; N]>, b: Simd<[T; N]>) -> Simd<[T; N]>, without having to define even more separate types for that purpose. There’s no reason that needs to be done in the first iteration, though. (Just because the Simd type itself is generic doesn’t mean that the operations defined over it need to be!)
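Assuming a compiler with const generics and `std::array::from_fn`, the generic signature could look roughly like this. The bound `Copy + Mul<Output = T>` is my stand-in for the `Primitive` trait, which is not defined anywhere, and the body is a lane-by-lane scalar loop rather than a real vector instruction:

```rust
use std::ops::Mul;

/// Plain newtype standing in for the proposed lang item.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Simd<T>(pub T);

/// Lane-wise multiply over any lane type and lane count.
/// `Copy + Mul<Output = T>` is an assumed substitute for `Primitive`.
pub fn simd_mul<T, const N: usize>(a: Simd<[T; N]>, b: Simd<[T; N]>) -> Simd<[T; N]>
where
    T: Copy + Mul<Output = T>,
{
    // Build the result array one lane at a time.
    Simd(std::array::from_fn(|i| a.0[i] * b.0[i]))
}
```

The same Simd type serves both the concrete and the generic signatures, which is the point: nothing about the type has to change when the generic operations arrive later.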