Pre-RFC: stabilization of target_feature

(If you couldn’t tell, I’ve found this work to be inordinately humbling!)

Another option would be to make the unsafety specific to each feature. Then if we make all the currently supported features safe now, and one later comes along that breaks like @gnzlbg mentions, it could be an unsafe one. AFAIK Rust makes no claims about safe code not segfaulting; I think you can quite easily cause a segfault by overflowing the stack into a guard page.

I would personally not be a fan of that. It sounds too easy to make a mistake. I’d really like to chase a general solution to this problem. I don’t want our memory safety guarantees to be steeped in “well, if you use this feature and that feature then it’s safe, but if you add this other feature, it becomes unsafe.”

I guess we should settle the following questions:

  • Is there a way for us to reliably detect at run-time whether a platform supports some target feature or not?
  • If there is, can we use this to detect that a call to a function tagged with #[target_feature] will fail, without introducing a run-time cost?

I don’t think so (to both). Other languages have cpuid libraries that try their best to query the CPU for the features it supports, but I don’t think the CPU is required to announce all the features it supports, and I don’t know whether these libraries can handle CPUs / features they do not know about (e.g. do these libraries become outdated?). I think we need to do some research here.
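
For what it’s worth, the mechanics of the query are not the hard part on x86; coverage and staying up to date are. A minimal sketch of what such a query does, assuming a __cpuid helper that executes the instruction and returns the eax/ebx/ecx/edx result registers (the cpuid libraries all wrap something equivalent):

#[cfg(target_arch = "x86_64")]
fn cpu_has_sse2() -> bool {
    // CPUID leaf 1 reports baseline feature flags; EDX bit 26 is SSE2.
    let regs = unsafe { __cpuid(1) };
    (regs.edx >> 26) & 1 == 1
}

Note that this only answers questions the library already knows how to ask; a feature bit it has never heard of stays invisible to it.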

Anyhow, even if we could, we would have to insert these checks before the function is called, and perform them only if the function is actually going to be called. I don’t think we can do that either without introducing a run-time cost.

If these things turn out not to be possible, then I think we are very constrained in what we can actually do. Forgetting about ABI and LLVM issues for a second, we could:

  • make #[target_feature] unsafe to use by requiring functions annotated with it to also be annotated with unsafe fn. This makes our lives significantly easier:

  • specification becomes a no-brainer: if the function is called on a platform that does not support the feature, the behavior is undefined. To introduce undefined behavior, a user will need to write unsafe. Safe wrappers could then be written on a best-effort basis using third-party cpuid libraries, and those who do not want to pay for safety don’t have to. But if you get a segfault, the reasoning that it must come from unsafe code is still correct.

  • it gives the compiler a lot of leeway with respect to quality of implementation: it allows emitting warnings for issues that can be detected at compile-time, and even automatically inserting run-time instrumentation in some builds to detect issues and gracefully panic (using some best-effort cpuid library)

  • If ABI issues introduce undefined behavior, “that’s kind of ok” because the user wrote unsafe code. Yet because undefined behavior allows us to do anything, we could insert instrumentation to automatically detect these issues in some builds and, e.g., panic with a nice error message. Maybe we could do even better here and detect them at link-time, but even if we can’t, I think this solution is still ok-ish.

  • make #[target_feature] safe to use: that is, calling a function that is marked with #[target_feature] is safe (unless it’s explicitly marked unsafe fn):

  • We need to specify the behavior of Rust code on platforms that do not support the feature. The Rust-y thing to do would be to panic, but this might require catching SIGILL exceptions on some platforms and turning them into a panic, which might introduce a small cost (which for some will be unacceptable). It’s also not clear what to do on platforms that might do something completely different from raising an exception. If we can detect the lack of a feature at run-time (e.g. during process initialization) we might be able to query cpuid flags at run-time before a function call, but I don’t know whether we can do this without introducing a run-time cost.

I am not 100% sure about the answers to the two questions above, but if the answer is that those things cannot be done reliably and efficiently, I think the best solution is to only allow tagging unsafe functions with the #[target_feature] attribute.

If we ever find a way to make it safe without introducing a run-time cost, we can always drop this requirement in a backwards-compatible way. Otherwise, we could introduce a #[safe_target_feature] attribute that does not require unsafe but does introduce a run-time cost.
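
For illustration, the safe-wrapper pattern mentioned above might look something like this (just a sketch; cpu_supports stands in for a hypothetical third-party cpuid crate, and must return false whenever it is unsure):

#[target_feature = "+avx"]
unsafe fn sum_avx(xs: &[f32]) -> f32 {
    xs.iter().sum() // stand-in body; imagine a hand-vectorized AVX loop
}

fn sum(xs: &[f32]) -> f32 {
    if cpu_supports("avx") {
        // The runtime check above is what justifies this unsafe call.
        unsafe { sum_avx(xs) }
    } else {
        xs.iter().sum() // portable fallback
    }
}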

EDIT: pinging @nikomatsakis because whether we make #[target_feature] safe or unsafe does depend on what the execution model for Rust assumes/expects/guarantees with respect to illegal instructions, and also because this interacts with the unsafe guidelines: segfaults might not introduce memory unsafety per se, but since we cannot guarantee, in general, that we are going to get a segfault, on some platforms an illegal instruction might lead to memory unsafety. Whether those platforms will ever exist and whether Rust might want to support them is an open question; I think we need guidance from the Rust team on this issue.

@gnzlbg Could you say something about whether you think unsafe is sufficient at all? If it can possibly leak through safe APIs, then that would be bad. But I don’t have the expertise to judge that claim.

I personally wouldn’t want to forget what @parched mentioned above about compiler flags. Notably, it’s “considered safe” today to compile code with -C target-feature=... or things like -C target-cpu=native. We can debate language vs implementation here, but there are real practical ramifications: what will we document about those flags? Is it safe or unsafe to pass them?

I would not personally conclude that running code with a target feature that your CPU doesn’t support is memory unsafe without more data supporting such a conclusion. I guess it’s theoretically possible that some future CPU supports a SIMD instruction whose encoding an older CPU treats as otherwise-normal instructions, but does this happen? Is there any data for this? Are we ready to completely hamstring Rust’s SIMD story based on this theoretical outcome?

I would ideally like to see a proof of concept showing that if you mis-apply #[target_feature] it results in memory unsafety or compelling evidence that it’s possible. Otherwise I do not think we should have such a restriction about requiring unsafe to use #[target_feature] and SIMD at all.

Another thought. Let’s say we stabilize that it’s safe to use #[target_feature]. Let’s then say that in 10 years a fancy new CPU comes out which violates our safety guarantees around #[target_feature]. You could imagine us working around this problem with:

  • The compiler has a whitelist of “known bad” #[target_feature] directives. These features, when enabled but run on the wrong cpu, may result in memory unsafety.
  • Consequently, the compiler injects runtime support to detect these features existing.
  • Each function tagged with #[target_feature] has a boolean check at the top (load a global, see if it’s false) and if so then it panics immediately.
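
Concretely, the injected prologue might look something like this (a sketch with invented names; FEATURE_OK would be set once at startup by compiler-inserted detection code):

use std::sync::atomic::{AtomicBool, Ordering};

static FEATURE_OK: AtomicBool = AtomicBool::new(false); // filled in at startup

#[target_feature = "+newfeature"] // imagine this feature is on the "known bad" list
fn kernel(data: &mut [u8]) {
    // Compiler-injected: load a global, panic if the CPU lacks the feature.
    if !FEATURE_OK.load(Ordering::Relaxed) {
        panic!("target feature `newfeature` is not supported by this CPU");
    }
    // ... the actual SIMD work on data ...
}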

The hypothetical cost here is quite small (SIMD functions are all about doing all your work inside anyway) and it seems like a reasonable restriction to me for a ‘well maybe at some point in the future this is relevant’ situation. This further makes me think we should push forward with "#[target_feature] is safe to use, it may generate SIGILL". We would have plausible scenarios to dig ourselves out if the worst happens.

Do you have some particular example in mind? The only things I can think of that could make it leak would be if LLVM decides to use the target features on surrounding safe code (e.g. due to inlining or a bug), which AFAIK is not the case.

I also think that safe wrappers around target feature will probably never be perfect, but they can err on the safe side. For example, a cpuid crate that tells you whether a CPU supports some feature should return false if it doesn't know (e.g. because it doesn't know the CPU or the feature, it cannot parse the flags, there was some system error, ...). Anyhow, if there is a problem (e.g. the cpuid crate has a bug), you can always follow the unsafe code to find out where that bug comes from.

Does that make sense? Or do you mean something else by "leak" ?

I don't think this is considered safe; it is just not considered. If something goes wrong, the behavior is, to the best of my knowledge, completely undefined. If you are lucky you'll get an LLVM error, which is something that should never happen. I consider these flags a way of talking to LLVM directly, while defining some cfg! macros at the same time.

The most important thing we should document is that they define some cfg! values, which is the only thing that directly affects Rust-the-language. We should also document that they pass some flags to the LLVM code generation backend, and that users should understand what the LLVM preconditions on those flags are, what their semantics are, and how they should be used so that they do not trigger undefined behavior within LLVM or Rust.

This answer might be unsatisfying, but I don't think we can do better. On one hand, these things are implementation-defined; more often than not even LLVM doesn't know what the exact semantics of some of these flags are. On the other hand, these features exist precisely to let users do this kind of thing. Both are in tension.

I would not personally conclude that running code with a target feature that your CPU doesn't support is memory unsafe without more data supporting such a conclusion. [...] Are we ready to completely hamstring Rust's SIMD story based on this theoretical outcome?

Making #[target_feature] require unsafe fn only implies that calling a function on a CPU that doesn't support the features it uses can potentially introduce memory unsafety. I only work on platforms that raise a SIGILL exception when this happens, so in my experience this is never the case, because the process is guaranteed to crash. I don't know whether this holds everywhere, but even if the worst thing that can happen is a crash, I don't think making them unsafe is that bad (the worst thing that can happen on a null pointer dereference is often also a crash). Yet, and this is my opinion, a truly safe wrapper wouldn't just crash, but rather panic with a nice error message.

It seems to me, however, that you meant to say what you need is proof that if we make #[target_feature] safe it would be possible to use it to create "real" memory unsafety in pure safe Rust code. I agree that we should try to see if this is actually the case, and in particular, which target features currently available in Rust do we actually need to create it.

@burntsushi showed an example above in which the semantics of the program are changed by using target feature. I think it is possible to adapt his example to create memory unsafety by encoding pointers as integers and back, and using those to read from memory. With the target feature enabled, the value of those pointers would change, producing memory unsafety when the memory is read. The main issue I have with this example is that it does require writing unsafe code. However, the unsafe code is correct with the target feature disabled, and incorrect with it enabled. Maybe somebody can come up with a better example during the RFC process that does not require unsafe.
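
A hypothetical sketch of that adaptation (it relies on the caller/callee ABI mismatch discussed later in this thread, and on extract, the lane accessor of the nightly SIMD types):

#[target_feature = "+avx2"]
fn produce() {
    let x = 42u64;
    // Smuggle a pointer through the vector, encoded as an integer lane.
    consume(u64x4::splat(&x as *const u64 as u64));
}

fn consume(v: u64x4) {
    // If caller and callee disagree on the ABI for u64x4, the lane read
    // back here may be garbage rather than the original pointer value.
    let p = v.extract(0) as *const u64;
    unsafe { println!("{}", *p); } // potentially reads arbitrary memory
}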

Another thought.

Replying to your second post, you seem to be implying that making #[target_feature] unsafe somehow makes SIMD in Rust worse. I don't see how: SIMD intrinsics are unsafe anyway, and a safe SIMD wrapper can provide a safe abstraction over target feature in the same way that it provides one over those intrinsics.

Since #[target_feature] will need to support multiple, sometimes very different, target features, across possibly different backends in the future, I would be wary of making it safe, at least initially, and then having to work around a soundness bug in the language later.

I am with @burntsushi in that we should make target_feature either safe or unsafe, but not "feature-combination dependent". I also think that making it unsafe won't be that bad in practice, and given that he already found a trivial way to modify the behavior of a program using target feature, I would rather err on the safe side for an initial RFC. Still, we shouldn't make things unsafe lightly, and I will definitely write this up as an issue that must be resolved before stabilization. I hope that during the RFC process some people will try to prove that a safe target_feature is unsound, at least with some of the features available on nightly, and I also hope that they succeed, since that would make it clear that the feature must be unsafe to use. Not because I like unsafe, but because I don't think we can ever prove that it is safe to use, and if nobody manages to break it, we will never really know whether making it safe is sound or unsound.

Hm I'm not sure I quite understand this. I do think we can do better, and I have proposed how we can do better. Do you feel this proposal is insufficient, however?

This is what @burntsushi meant would generate a hard error above. The previous example is an invalid program that will either be rejected by the compiler or fixed internally. To make sure we're on the same page though, we're only talking about one thing: what happens when a CPU executes an instruction it does not support? This can happen through a number of means:

  • You pass -C target-feature=+avx and then LLVM auto-generates AVX instructions during auto-vectorization.
  • You have a function you tagged with #[target_feature = "+avx"] and LLVM auto-vectorized it.
  • You used an AVX intrinsic itself, generating that instruction.

But again to be clear, we're only talking about unsupported instructions. There's some CPU in existence today which supports the instruction, but the CPU we're currently running on does not. Given that, we're talking about one and only one case of memory unsafety: an older CPU interprets a new instruction and does something different than what the instruction originally intended.

This seems... like something that may not even exist. We're talking purely about hypotheticals. I don't think we should assume that this happens unless we actually find data that it does.

Yes I do personally feel this. Leveraging unsafe in Rust is something that should always be avoided at all costs. It's a huge downside to otherwise 100% safe code, as all of a sudden every time you read/write it you have to understand "why is this unsafe?" and often that question is quite nuanced.

Safe, fast, usable SIMD would be a huge boon for Rust. I feel that blanket declaring all usage unsafe without much concrete evidence supporting such a conclusion is premature and harmful to adoption. It's making everyone reach for unsafe where it would otherwise not be necessary. For example I translated an example SSE4.1 powered hex encoder and it doesn't need any unsafe at all.

If we make these functions all unsafe, we need a clear and crisp statement as to why they are unsafe. Right now we know for a fact they can generate SIGILL, and that's not memory unsafety. We speculate that some combination of future development and old CPUs may generate memory unsafety, but this is quite vague and not strong enough rationale in my mind for justifying unsafety.

I think this is the wrong foot to start off on. By construction, isn't it impossible to prove that this is 100% sound? The possible unsafety we're talking about is an older CPU misinterpreting an instruction intended for a future CPU without generating a SIGILL. With this problem statement the answer seems trivially "yes, it's possible to design something that does this", but the real question is: does any CPU actually do this? I would hazard a "no" guess and focus energy on having a contingency plan in case this does happen (which I believe we have).

I really don't think that we can value safety enough, especially in domains like this. Making "SIMD in Rust" a massive pile of unsafe because "one day a CPU may exist that violates it" seems backwards. Let's work with what we have today, where (a) it's unlikely anything actually exhibits memory unsafety and (b) if a weird CPU appears we have a contingency plan to support it.

or simply getting --target wrong, which I have seen people do when cross-compiling for ARM.

If this ever happens we will have problems without target feature anyway, because if you try to run code compiled for the older target on the new one it will break.

IIUC your proposal is to make target-feature safe, is that correct?

Ah no! It is kind of late here, sorry about that. I was talking about whether #[target_feature] is safe to use in general. I'll try to keep this short because it's been a long day.

IIUC this example is currently accepted, and whether or not one uses target feature changes its semantics. @parched proposed a solution that works for this case. I don't know if it works for all cases.

When talking about what happens on an illegal instruction, we can really just say "a SIGILL exception will be raised". If we ever want to support a CPU that does not do this (which might never happen), we will be able to work around it somehow (maybe with an additional run-time cost), so I am 100% with you on this. I just wasn't sure whether we already support CPUs that might not do this; I only have experience with ARM and x86, and I really have no idea what microcontroller CPUs do in this case.

But I was not talking about SIGILL. I was talking whether there are other ways to introduce memory unsafety using target feature. My concerns stem from @burntsushi's example and the ABI issues he mentions.

#[target_feature] is a function attribute that is saying "compiler you are allowed to use special instructions if you want to", so IMO if the compiler uses instructions that change the semantics of the program, this is already a compiler bug independently of whether they introduce memory unsafety or not.

I don't want to dig in on this, but I think the SIMD example is not particularly good, because SIMD intrinsics are unsafe and any safe API over them needs to deal with unsafety anyway, so in this case whether #[target_feature] is unsafe or not is kind of irrelevant in my opinion. Anyhow, I agree that we shouldn't make things unsafe lightly.

EDIT: Summarizing: if we can just specify that calling a function decorated with #[target_feature] on a target that does not support the feature raises an illegal instruction exception, then it is a no-brainer that we should make it safe (as long as we can resolve the ABI issues, which I think we can). So IIUC we both agree.

Wouldn’t it be simpler and more user-friendly to have #[target_feature] mean “compile this code with the feature enabled and only run it at runtime if the feature exists”? That way there is no risk of SIGILL, because the runtime check is always guaranteed by the compiler. And any implementation of Rust already has the code to detect features in order to implement -C target-cpu=native, so pushing the runtime checks to crates is just duplication. Maybe even duplicate the exact syntax of the compile-time variant:

if cfg!(runtime_target_feature = "bmi2") {
    ...
} else {
    ...
}

@pedrocr

For that we need to be able to reliably detect target features at run-time. This introduces run-time costs that not everybody wants to pay: at binary initialization, on each function call, in missed optimizations (it might prevent inlining of the function), and also in binary size, since the run-time needs to carry the feature-detection code.

It would be great if we could do this with only slightly increased binary size and some upfront cost at binary initialization (without cost on function calls, and without missed inlining optimizations).

@gnzlbg oh thanks for clearing up! Sorry if I was getting a little forceful as well, I forgot that we hadn’t really spelled this out much yet!

I figure though it may be worth writing down more of this issue in detail. A lot of this boils down to the way LLVM generates functions and works with arguments. Let’s say you’ve got a function like so:

pub fn foo(arg: u64x4) {
    // ...
}

Here we just have a simple function which is taking a 256-bit vector argument. This function importantly works on all platforms, even those that don’t have support for 256-bit vectors. Essentially LLVM has an emulation layer where it’ll recreate operations with other primitives, depending on what’s available.

So for now I’ll focus on x86_64 for ease, and we can be in one of three situations:

  • We’ve got AVX2 instructions and access to 256-bit vectors natively
  • We’ve only got SSE2 which gives us 128-bit vector
  • We have nothing, no SIMD registers

In each of these three situations LLVM will codegen the above function differently. It’s important to note that LLVM’s IR representation is identical regardless of enabled target features. What’s happening here is that LLVM is taking a function and then creating machine code based on the enabled target features for that function. Each function can then have a different set of target features enabled.

So for example for each of the three situations we have:

  • If we have AVX2, arg is passed to the function in %ymm0
  • If we only have SSE2, arg is passed in %xmm0 and %xmm1, two 128-bit registers
  • If we don’t have anything, arg is passed in four registers %rdi, %rsi, %rdx, %rcx (I think these are the first four parameter registers on x86_64)

So given that LLVM will codegen all of these differently, there’s a very big problem if we mismatch them! As far as I know LLVM will not catch this problem for us. I think it’s taking a “naive” view of the world and simply generating code for each function in isolation, without checking for cross-function mismatches.

This means, for example, that this code has a mismatched ABI:

#[target_feature = "+avx2"]
fn bar() {
    foo(u64x4::splat(0))
}

#[target_feature = "-sse2"]
fn foo(arg: u64x4) {
}

In this situation bar will call foo by passing the parameter in the %ymm0 register, but foo will expect the arg in the first four argument registers instead. While this isn’t memory-unsafe yet, it’s not hard to imagine it becoming memory-unsafe quickly! Also note that inlining only helps to some degree: due to separate compilation and the #[target_feature] attribute, we’ll always have this scenario in one way or another.

So this all sounds clearly bad, what can we do about it? My thinking is that rustc is the one generating all these call instructions. Namely we can know:

  • What types differ in ABI-passing depending on CPU features (e.g. u64x4 is a “simd type”)
  • What cpu features are enabled for the caller, which in this case is bar
  • What cpu features are enabled for the callee, which in this case is foo

Given all this information, the compiler can detect that the function foo called by bar uses a type u64x4 whose ABI changes, and hence the call will not work. The compiler can take one of two options here. First, it could emit an error saying that this is invalid. Alternatively, it could insert a shim like so:

#[target_feature = "+avx2"]
fn bar() {
    foo_shim(&u64x4::splat(0))
}

#[target_feature = "-sse2"]
fn foo_shim(arg: &u64x4) {
    foo(*arg)
}

#[target_feature = "-sse2"]
fn foo(arg: u64x4) {
}

Notably the shim, foo_shim, has the same ABI as the callee, in this case foo. Additionally, none of its arguments rely on SIMD registers; in this case the argument is passed by reference (it’ll be stored on the stack).

So while this works it’s obviously a performance hit, and probably not intended at all for SIMD usage. I’d personally be in favor of starting off with a hard error here and then taking the shim route if we really need it. This strategy has a few consequences, however:

  • Errors show up during monomorphization. This is very rare for Rust where we avoid this as much as possible. This means that the crate-at-fault could be way upstream in your dependency graph and you have little-to-no recourse to fix it yourself. The saving grace here, however, is that SIMD types as arguments tend to be very local to a crate and don’t propagate much. Functions that do actually return or take SIMD types are often #[inline] which means the compiler has more leeway as well. Again though I also think the errors here can be a stopgap to a solution if we need it. You’re almost for sure doing something wrong if you hit this error, and likely want to be notified of it anyway.

  • I think this means that all SIMD types are basically banned from FFI. We won’t know how the callee is generated, so we don’t know what ABI it’s using to expect the SIMD argument (or return it). This is sort of like how many types just aren’t FFI-safe today, though, so there may not be much to worry about here.

The overall tl;dr, though, is that this issue (a) originates in LLVM translation, (b) should be detectable by rustc, and (c) can be statically ruled out from ever happening. So faults or “ABI passing” problems between functions that don’t match on #[target_feature] are, I believe, eliminated. This only leaves us with the “is it safe to execute an unknown instruction” question, which I detailed more above.

Let me know though if any of that doesn’t make sense!

This detail helped tremendously. Before, I thought it was a given that something as low-level as #[target_feature] would have to be unsafe without some heroic efforts that should not block an initial RFC. But now this error-detection approach sounds feasible enough that we probably should try to rely on it in the initial implementation.

What I wanted to ask was how you imagine this error detection logic actually working. I assume you’re not proposing that we require rustc to literally know what sets of registers LLVM would use for each SIMD type on each architecture in order to detect these errors. Is the idea that every #[target_feature] annotation corresponds to a set of allowed features, and foo() can only call bar() if bar’s “feature set” is a superset of foo’s? Since #[target_feature] can both add and remove features, does that mean there’s a “default feature set” all code assumes by default, and the annotations are effectively editing that default set?

Sure! I'd imagine something like this:

  • Metadata for each crate records -C target-cpu and -C target-feature. Additionally the metadata for each function records #[target_feature], if present.
  • All logic goes in rustc_trans and happens during translation and/or monomorphization.
  • This check happens whenever rustc translates a function call
  • If none of the arguments are a type tagged #[repr(simd)] then everything is ok, no need to take action
  • The caller has its own set of {-C target-cpu, -C target-feature, and #[target_feature]}
  • The callee's set of the above is loaded from metadata
  • If the caller and the callee don't match, an error is emitted, otherwise translation continues.
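
In pseudocode the whole check might read like this (every internal name here is invented for the sketch):

fn check_simd_abi_match(caller: &FeatureSet, callee: &FeatureSet, sig: &FnSig) -> Result<(), Error> {
    // Only #[repr(simd)] arguments/returns can change ABI with features.
    if !sig.inputs_and_output().any(|ty| ty.is_repr_simd()) {
        return Ok(());
    }
    // Compare the combined {-C target-cpu, -C target-feature, #[target_feature]} sets.
    if caller != callee {
        return Err(Error::SimdAbiMismatch);
    }
    Ok(())
}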

This is a pretty coarse error but I think it'll work. Almost all code passing around SIMD types will be compiled with the same "set" of pieces above and will be exactly the same. This could also perhaps one day be like a true set relation of some form. The compiler also uses #[repr(simd)] as a proxy for "weird registers will be used for this" which I believe works out in practice here. Types like u64x4 are tagged #[repr(simd)] and I believe we don't intend on stabilizing #[repr(simd)].

Does that make sense?

Oh wow, I thought "target features" were broader than just SIMD instructions. If that's all they are then this makes total sense.

Or there are other features, but the SIMD ones are the only features that can lead to ABI mismatches?

Just to check, you're only ruling out user-defined SIMD types here, and we do intend to stabilize the "primitive SIMD types" like u64x4, right?

I think @alexcrichton might have already clarified this, but it's important: certainly some vendor intrinsics are unsafe---for example, any that accept a raw pointer---but most of them are completely safe. For example, _mm_max_epu8 takes two u8x16 vectors and returns a u8x16 vector. No memory unsafety is possible there, provided #[target_feature] itself is guaranteed to be memory safe.
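
A sketch to make that concrete, assuming the intrinsic itself is exposed as a safe function (so any remaining unsafety could only come from #[target_feature]):

#[target_feature = "+sse2"]
fn max_bytes(a: u8x16, b: u8x16) -> u8x16 {
    // Value in, value out: no pointers and no memory access, so if
    // #[target_feature] is memory safe, this whole function is too.
    _mm_max_epu8(a, b)
}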

Correct. (And @alexcrichton is correct that we don't intend to stabilize repr(simd).)

This is a really good point. Not every vendor intrinsic actually accepts or returns a SIMD vector. For example, the _mm_crc32_u64 intrinsic accepts two u64 values and returns a u64. But couldn't the compiler decide to use the SSE2 ABI and pass both parameters in a single xmm0 register? It sounds like @alexcrichton's solution still works, but you need to remove the "only do the check if the function has SIMD types in its signature" condition.

@gnzlbg

For that we need to reliably be able to detect target features at run-time.

Indeed, but rustc already needs to be able to do that somehow to implement -C target-cpu=native. Having that already and then pushing the same problem onto the user to solve independently doesn't seem ideal.

This will introduce a run-time cost that not anybody wants to pay

I'd say the opposite. I don't really see a situation where you want to use #[target_feature] but don't want to do runtime detection and have alternative branches. If you're fine with your binary just crashing if the feature is not available just enable the feature for the whole compile and be done with it, fine grained control doesn't really seem worth it.

A good question! AFAIK yes because this isn't a problem in practice for anything else, but I couldn't speak to the exhaustiveness of that.

Oh, keep in mind that target-cpu=native is a compile-time option, not a runtime option. My guess is that LLVM doesn't do anything fancy for target-cpu=native on architectures like ARM or MIPS. While local feature detection is pretty easy on x86, I believe it's not easy at all on ARM (read: platform-specific), for example.

I agree with @gnzlbg that we cannot inject runtime detection in the code, that's going to need to be the job of third-party libraries.