Check if target features are available at start

A program compiled with a recent target feature enabled can still be executed on a machine that does not support that feature. It will run until it hits an unsupported instruction, which leads to a program crash in the best case and to undefined behavior otherwise.

Some programs therefore check at startup whether the machine they are running on supports all features they were compiled with, so that they can fail early and safely.

Shouldn't this also be possible in the pre-main runtime startup? Or is there any reason not to include this check (even optionally)?
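For reference, a manual version of such a startup check might look roughly like the sketch below. The function name `missing_compiled_features` is made up for illustration, and only three features are covered. As far as I know, `is_x86_feature_detected!` short-circuits to `true` for features that are already enabled at compile time, so the sketch reads CPUID directly instead. AVX-family features are left out because they also need OS support. Note that a check written in user code is itself compiled with the enabled features, so this is best effort only.

```rust
/// Hypothetical helper: names of a few x86-64 features that were enabled at
/// compile time but that CPUID does not report on the running machine.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)] // `__cpuid` may be a safe fn on newer toolchains
pub fn missing_compiled_features() -> Vec<&'static str> {
    use std::arch::x86_64::{__cpuid, __cpuid_count};

    let max_leaf = unsafe { __cpuid(0) }.eax;
    let leaf1_ecx = unsafe { __cpuid(1) }.ecx;
    // Leaf 7 only exists if the CPU reports a maximum basic leaf >= 7.
    let leaf7_ebx = if max_leaf >= 7 {
        unsafe { __cpuid_count(7, 0) }.ebx
    } else {
        0
    };

    let mut missing = Vec::new();
    // cfg!(target_feature = ...) is true only if the feature was enabled at
    // compile time, e.g. via -Ctarget-feature or -Ctarget-cpu.
    if cfg!(target_feature = "sse4.1") && (leaf1_ecx & (1 << 19)) == 0 {
        missing.push("sse4.1"); // CPUID.1:ECX bit 19
    }
    if cfg!(target_feature = "popcnt") && (leaf1_ecx & (1 << 23)) == 0 {
        missing.push("popcnt"); // CPUID.1:ECX bit 23
    }
    if cfg!(target_feature = "bmi2") && (leaf7_ebx & (1 << 8)) == 0 {
        missing.push("bmi2"); // CPUID.(7,0):EBX bit 8
    }
    missing
}

/// On other architectures this sketch has nothing to check.
#[cfg(not(target_arch = "x86_64"))]
pub fn missing_compiled_features() -> Vec<&'static str> {
    Vec::new()
}
```

A binary would call this first thing in `main` and exit with an error message if the returned list is not empty.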

This would be hypothetically possible to implement but we would need to first deliberately remember all features we were compiled with. This also would only be fully effective for binaries, as non-binaries will not necessarily execute the runtime startup.

And finally, asm! can always work around it, of course.

Oh, and the final detail is that we don't actually know if target_feature(enable) code will be hit or not, you're supposed to check manually anyways?

This also would only be fully effective for binaries, as non-binaries will not necessarily execute the runtime startup.

This, of course, only applies to binaries and executables with std, as the check would be done in the runtime startup. One could argue that this is good enough, because calling DLLs can only be done using unsafe (and there you have to make sure that certain conditions are met), and rlibs are compiled with the same set of target features anyway.

Oh, and the final detail is that we don't actually know if target_feature(enable) code will be hit or not, you're supposed to check manually anyways?

It's still possible to use all features even if a target feature is not set. But if it is executed, it has to be gated by checking at runtime if the platform supports the given feature. On the contrary, when specifying target features at compile time, the target must support all the features, as the compiler itself is allowed to use these features. That's also why it's important to do this check as early as possible in the execution.
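To make the first case concrete, here is a minimal sketch of runtime gating in the style of the `std::arch` documentation. The function names `sum`, `sum_avx2` and `sum_portable` are invented for this example. Only `sum_avx2` is compiled with AVX2 enabled, so the rest of the binary stays at the baseline and the runtime check is meaningful.

```rust
/// Compiled with AVX2 enabled for this one function only; the compiler may
/// vectorize the loop with AVX2 instructions.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[u32]) -> u32 {
    let mut acc = 0u32;
    for &x in data {
        acc = acc.wrapping_add(x);
    }
    acc
}

/// Baseline version that runs on every CPU the target supports.
fn sum_portable(data: &[u32]) -> u32 {
    let mut acc = 0u32;
    for &x in data {
        acc = acc.wrapping_add(x);
    }
    acc
}

/// Picks the AVX2 version only if the running machine reports AVX2.
pub fn sum(data: &[u32]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just confirmed at runtime.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_portable(data)
}
```

With `-Ctarget-feature=+avx2` on the command line there is no such gate: every function, including `sum_portable`, may contain AVX2 instructions.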

As an example, suppose target-feature contains SSE 4.1 but the machine the binary is running on doesn't support it. It will most likely crash at some random point where LLVM decided to use SSE 4.1. You can't really check this manually. (I mean, you could do it in your main, but doing it manually is error-prone, and it's a good thing to have in every generated binary.)

The actual implementation is a different story, and I'm not too involved in the internal structure of the Rust compiler, etc. This has to be generated when compiling the final executable, but when it is introduced, it will most likely be feature-gated in std, which requires recompilation of the std lib anyway.

On the contrary, when specifying target features at compile time,

Ah, you mean setting e.g. -Ctarget-feature=+avx2 on the command line specifically. Yes, that introduces it for the entire binary. The only problem is that this means that, if we also use, say, -Zbuild-std, it's now permissible to optimize the stdlib's runtime startup implementation using this knowledge about features.

calling DLLs can only be done using unsafe (and there you have to make sure that certain conditions are met),

...huh, is that really true? I thought you could call Rust dynamic libraries (NOT cdylibs) without unsafe if the function was defined as safe.

Exactly, which wouldn't actually be a bad thing, you would only have to ensure somehow that the generated assembly for the check doesn't use extensions.

Ideally, though, you would generate a symbol that contains, for example, a bitset of all required features, and the rt would just check, through bit magic, that the value at that symbol's address is compatible with the value the rt was built with.
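The "bit magic" for such a bitset could be as small as the sketch below. Everything here is hypothetical: the `FEAT_*` constants, their bit assignments and the function names are made up for illustration, and nothing like this exists in rustc or std today. How the `required` value would reach the runtime (a linker symbol, for instance) is left open.

```rust
// Hypothetical bit assignments, one bit per target feature.
pub const FEAT_SSE3: u64 = 1 << 0;
pub const FEAT_SSE4_1: u64 = 1 << 1;
pub const FEAT_AVX: u64 = 1 << 2;
pub const FEAT_AVX2: u64 = 1 << 3;

/// Bits that the binary requires but the machine does not provide.
pub fn unsupported(required: u64, available: u64) -> u64 {
    required & !available
}

/// Ok if every required feature is available, otherwise the missing bits.
pub fn check_features(required: u64, available: u64) -> Result<(), u64> {
    match unsupported(required, available) {
        0 => Ok(()),
        missing => Err(missing),
    }
}
```

For example, a binary built with `FEAT_SSE4_1 | FEAT_AVX2` running on a machine that only reports `FEAT_SSE3 | FEAT_SSE4_1` would get `Err(FEAT_AVX2)` and could abort before any AVX2 code runs.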

Ah you're right, never seen dylibs in the wild. Is their ABI even stable across Rust versions?

ABI is not stable as far as I know. I believe Bevy and Fyrox can both use this though to speed up incremental builds. The engine is compiled into a dylib which your game then links. There was a video the other day on reddit of hot code reload in Fyrox even, where the user code was compiled into a dylib as well.

In these cases the dylib(s) are part of the same workspace as your program, built by the same compiler version. So less ABI concerns. I'm not a game developer so I haven't tried either of these myself though.

Exactly, which wouldn't actually be a bad thing, you would only have to ensure somehow that the generated assembly for the check doesn't use extensions.

I mean, yeah, but we'd really want to panic tho', with a useful (albeit possibly very terse) message, and that now means all the panic code has to be compiled with a specific weakened ISA too...

There is rtprintpanic (rust/library/std/src/rt.rs at master · rust-lang/rust · GitHub). But yes, it would have to be compiled with the bare minimum ISA, or just x86-64-v1, which is the default for Rust at the moment anyway (I assume it might be raised in 5-10 years). It could also just be a stub.

My main reason for checking target features is to allow distributing binaries built for a more recent microarchitecture level without getting ambiguous issues. Sure, you could do it yourself in main, but you can't predict the behavior. Solving this problem once and for all is, in my opinion, a better solution.

The goal of this post is just to get a general opinion on whether it's worth investing time in.

Btw. x86-64-v1 doesn't even contain SSE3, SSE4, AVX and BMI

The goal of this post is just to get a general opinion on whether it's worth investing time in.

I think it's reasonable yeah, if you can figure out a way to handle the "split the target feature with the baseline of the target and the actual compiled value" thing inside the compiler. Which is possible to do, to be clear, the LLVM modules we emit allow it.

Minor implicit context: using rustc flags like -Ctarget-cpu=native is implicitly unsafe, because it places a requirement on the binary author to ensure it's only run on target CPUs with at least the host's target-feature set.

I can absolutely see people who are not intimately familiar with native compilation seeing a recommendation to use -Ctarget-cpu=native for maximum performance without any further caveats (I've never actually seen the caveats spelled out explicitly) and then running the built binary on an older machine, opening them up to UB. This is trivially possible if using "latest" CI machines to build binaries, and I don't know whether cargo-dist or other such installer-providing tooling handles multiversioning installs based on target CPU milestones in addition to by target triple.

So even if a check is only performed at std's #[lang = "start"] before entering fn main, and entirely ignores dylib usage which would require "true" life-before-main, checking the detected feature set satisfies the globally enabled feature set would be an improvement to the status quo.

This check can't really be done soundly in non-std library, because fn main is entered with the features enabled. But by virtue of being precompiled, std is compiled with the minimal (well, default) set of target features for the target triple. It would thus need some compiler supported sidechannel to actually perform the check (either a feature mask or the actual test function linked from the crate compiling fn main), but it can do the check where non-std can't do so properly. (Namely, even with asm!, fn main is entered with the target features enabled, meaning it's LLVM level UB for them not to be available.)

Not quite, it's compiled for the default target-cpu for the target, for example in the case of x86_64-unknown-linux-gnu that's x86-64 (the suffix-less v1 of the 4 standard feature sets) which has a few features:

> rustc --print target-cpus | rg default
    x86-64                  - This is the default target CPU for the current build target (currently x86_64-unknown-linux-gnu).

> rustc --print cfg | rg target_feature
target_feature="fxsr"
target_feature="sse"
target_feature="sse2"

Which is itself very necessary because some features such as sse change the ABI of functions, so linking crates compiled with/without it active together is UB IIUC.
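The same baseline can be observed from inside a program with `cfg!(target_feature = ...)`, which is evaluated at compile time. The sketch below is only an illustration; the function name `compiled_features` is made up, and the list of probed features is arbitrary.

```rust
/// Hypothetical helper: which of a few x86 features this crate was
/// compiled with. Each cfg! is resolved at compile time.
pub fn compiled_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    if cfg!(target_feature = "fxsr") {
        enabled.push("fxsr");
    }
    if cfg!(target_feature = "sse") {
        enabled.push("sse");
    }
    if cfg!(target_feature = "sse2") {
        enabled.push("sse2");
    }
    if cfg!(target_feature = "sse4.1") {
        enabled.push("sse4.1");
    }
    if cfg!(target_feature = "avx2") {
        enabled.push("avx2");
    }
    enabled
}
```

Built for x86_64-unknown-linux-gnu with no extra flags, this should return the three baseline features listed above; adding -Ctarget-feature=+avx2 should make "avx2" show up as well.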

Not to forget -Zbuild-std also, which I hope one day becomes the default way of working.

Arguably, those features are part of the minimum feature set of the x86_64-unknown-linux-gnu target triple, specifically because they impact ABI and std (core) is compiled with them. If you want a target without those features, you need to use a different target triple, e.g. perhaps x86_64-unknown-unknown.

This is independent of whether rustc allows you to compile a binary with the target features turned off.

However, those features are required on all x86-64 CPUs; they exist also as target features because they are optional on 32-bit x86, and it's nice if you can unconditionally use the target feature to select the version of a function that benefits from SSE or SSE2 during autovectorization on x86 or x86-64 (instead of having to have one for pre-SSE x86 and one for x86-64).

If it wasn't for 32-bit x86 compatibility, those features would be entirely implied by x86-64.

They also exist so that kernels can be compiled with them disabled. That way they don't have to save and restore vector registers across syscalls.

I'm currently prototyping this approach and cleaning up std_detect. I might also have found a bug: is_x86_feature_detected!("avx512f") returns true even if the OS doesn't support it, because it only checks CPUID, but the operating system also has to support this feature. On Windows this can be checked via IsProcessorFeaturePresent(PF_AVX512F_INSTRUCTIONS_AVAILABLE), as is done for AArch64.

Also, are the tsc and mmx features not registered as compile-time features? Checking them at compile time fails when running ./x test library/std.

MMX support has been removed in

as the intrinsics are very hard to use correctly and in some cases even impossible to use correctly due to the compiler being allowed to reorder floating point operations into the region where MMX support is enabled. See

for more info.

Arguably it is then also unsafe to even build binaries for embedded. You could load an image on the wrong device for example. This could cause it to do all sorts of interesting things. For example you might make a binary for a device you specify has 8 MB SPI attached RAM (as some Esp32 boards do) and then load it on one with 4 MB SPI RAM.

And this is keeping within Rust's narrow definition of unsafe. In a more everyday sense of unsafe, it is easy to blow up electronics when working with embedded.

So I would say that while technically true, it is not a particularly useful observation.

IIRC it's handled by checking contents of the extended control register using the XGETBV instruction.
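A rough sketch of what such an XGETBV-based check looks like, written against `std::arch::x86_64` rather than copied from std_detect: the function names are made up, and the actual std_detect code may differ. The bit positions are the architectural ones: CPUID.1:ECX bit 27 (OSXSAVE) says the OS has enabled XGETBV, XCR0 bits 1 and 2 cover SSE and AVX state, and bits 5 to 7 cover the AVX-512 state.

```rust
/// True if XCR0 says the OS saves and restores SSE and AVX register state.
pub fn xcr0_has_avx_state(xcr0: u64) -> bool {
    xcr0 & 0b110 == 0b110
}

/// True if XCR0 additionally covers the AVX-512 state
/// (opmask, ZMM_Hi256, Hi16_ZMM: bits 5, 6 and 7).
pub fn xcr0_has_avx512_state(xcr0: u64) -> bool {
    xcr0 & 0xE6 == 0xE6
}

/// Reads XCR0, or returns None if the OS has not enabled XSAVE/XGETBV.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)] // `__cpuid` may be a safe fn on newer toolchains
pub fn read_xcr0() -> Option<u64> {
    use std::arch::x86_64::{__cpuid, _xgetbv};

    let ecx = unsafe { __cpuid(1) }.ecx;
    if ecx & (1 << 27) == 0 {
        return None; // OSXSAVE clear: XGETBV would fault
    }
    // SAFETY: OSXSAVE is set, so XGETBV with XCR index 0 is available.
    Some(unsafe { _xgetbv(0) })
}

/// Other architectures have no XCR0.
#[cfg(not(target_arch = "x86_64"))]
pub fn read_xcr0() -> Option<u64> {
    None
}
```

A detection routine would then report avx512f only if both the CPUID feature bit is set and `read_xcr0().map_or(false, xcr0_has_avx512_state)` holds.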
