Optimization barriers suitable for cryptographic use

The current wording forbids cryptographic use, even in cases where it is probably the best choice for getting the codegen we want, when we are manually inspecting the output.

There seems to be some confusion where compiler authors are interpreting "not forbidden for cryptographic use" as "provides guarantees for cryptographic use".

I'm not asking for guarantees. I'm asking for the documentation to not forbid use in cryptographic code.

I want the connotation to be "you are on your own and solely responsible for using this correctly" as opposed to "never, ever, ever use this for cryptography".

I would appreciate it if you could incorporate some nuance in your wording.

There's a distinction between "the current wording could be misinterpreted" and "the current wording actually forbids".

I have made the case that it is only the former and have not heard any refutation.

It says "This immediately precludes any direct use of this function for cryptographic or security purposes".

That sure sounds like it forbids cryptographic use to me.

"precludes use for purposes" != "precludes use"

All the words in a sentence matter, not just the most salient ones. It just means you cannot use it to achieve some security purpose. You can still use it; it just won't have any effect, or at worst it will even have a negative effect, towards achieving that purpose.

For example, we could implement black_box as

use std::sync::atomic::{AtomicU64, Ordering::Relaxed};

fn black_box(val: u64) -> u64 {
   static ESCAPED: AtomicU64 = AtomicU64::new(0);
   // extremely unlikely, easy to branch-predict, so doesn't impact benchmarks much
   if val.rotate_left(23).wrapping_mul(0x121a8b7c677d0175) % 0xfe040f2bb4f27e3c == 0 {
      // not a great barrier, but at least forces the compiler to materialize the intermediate result of a computation
      // best we can do on this platform
      ESCAPED.store(val, Relaxed);
   }
   val
}

That might be better than nothing for benchmarking, but it would be very much unfit for cryptographic use.

And this is still allowed with the wording on nightly.

So yes, it is precluded for security purposes, or even contraindicated (even though we have not written that), unless you ignore the API contract and the possibility of future incompatibilities, pin the compiler, and inspect the output to ensure it does what you want. That is what you have been doing all along. But, as I have argued repeatedly, if you're already doing all that, then you're ignoring everything the API contract says, and its language does not matter.

There are two worlds:

A) You write Rust™ source code, only make use of behavior explicitly guaranteed by the specifications and API contracts and can rely on the compiler performing a semantics-preserving transformation from your input to machine code. The semantics are also guaranteed to be preserved by future compiler versions, on different platforms or with different optimization settings.

B) You shovel arbitrary inputs into rustc, get some outputs, check if the outputs have the desired semantics. Making use of any existing API documentation is merely an optimization, a search-guiding heuristic to find inputs that get rustc to produce the desired output, beyond that they're irrelevant and can be ignored. The downside is you may need to re-execute the search when targeting different versions or platforms.

Everything the black_box documentation says is entirely correct in world A). By operating in world B) you can already ignore it anyway, therefore its words are meaningless.

In neither world is a change necessary.

I don't think rewording the sentence as:

This immediately precludes any direct use of this function for cryptography or security.

changes anything about the meaning over:

This immediately precludes any direct use of this function for cryptographic or security purposes

Or in other words, the mere inclusion of the word "purpose", at least to me, does not meaningfully change what's expressed in the sentence, but perhaps that's my misinterpretation.

I think what you're describing is closer to "fitness for purpose".

At least we have found the crux.

But I don't think chopping off words of sentences and then assuming it doesn't alter their meaning is a good way to read specification texts.

Especially if you take into account the two preceding sentences, which talk about correctness and relying on it to control program behavior. The sentence in question is a corollary of those.

To me this seems like a case of selective out-of-context quoting to create a scarier thing than what it actually says.

Indeed, I have already said so earlier.

The quote says "This immediately precludes any direct use [...]", so it's not a ban on any use. It's just saying it doesn't provide cryptographic security directly. You're not using it directly to get any guarantees, you're using it as a heuristic, and the real process is inspecting the assembly output.

But a change to the wording has already been merged anyway.

Indeed, that's a classic rule of statutory interpretation:

If possible, every word and every provision should be given effect. None should be ignored and none should needlessly be given interpretation that causes it to duplicate another provision or to have no consequence.

https://www.law.georgetown.edu/wp-content/uploads/2018/12/A-Guide-to-Reading-Interpreting-and-Applying-Statutes-1.pdf

What makes a "direct use" different from a "use"?

I continue to maintain that "cryptography or security" vs "cryptographic or security purposes" do not have a meaningful distinction, thus adding the word "purposes" does not meaningfully change the definition of the sentence. But I guess we'll have to agree to disagree on that one.

I'd say this is all moot given the changes to the wording that have been merged, which I agree with, but it seems like they're being potentially re-litigated, which is a bit worrisome to me.

A direct use for cryptographic purposes would be where some cryptographic properties are provided by the primitive. In other words, if your proof of cryptographic security includes "I used this primitive and therefore this code has the following cryptographic property".

If you happen to use it in the code but your proof of cryptographic properties doesn't rely on the properties of the function, that's not direct use for cryptographic purposes.

That is a particularly idiosyncratic and non-obvious definition of the word "direct".

Is there a document that outlines terms like "direct" and "purpose" being defined in this manner? I'd look to something like RFC2119 as such a document for defining terms in such an idiosyncratic manner.

Drawing from that RFC, and based on my understanding of how people are defining terms in this thread, I believe I can draw the following conclusions, but I'm not sure they actually make sense:

"black_box MAY be used for cryptography or security, but black_box MUST NOT be used for cryptographic or security purposes"

"black_box MAY be used for cryptographic or security purposes, but black_box MUST NOT directly be used for cryptographic or security purposes"

I understand and strongly sympathize with the desire for precision here, as such precision is a mandatory prerequisite for correctly implementing cryptography. But when I look at the statements above, I do not think they clearly delineate the intended properties.

A lexicon/glossary of the specific terminology, especially idiosyncratic usages of words where the intent is to impart more than the colloquial meaning, would be very helpful (i.e. a document containing canonical definitions, as opposed to people giving ad hoc definitions on a forum)

I think the previous wording was at best ambiguous. You are not supposed to rely on using black_box to directly provide any strong cryptographic/security properties, but it's not forbidden from use as part of obtaining such properties. The strongest factor is that everything in the docs about how to use functionality that is not in a # Safety section is an implied SHOULD, and violating such requests is at worst a class of "safe misbehavior" appropriate to the violation.

Restated more formally: You MAY use black_box for its codegen side-effects (i.e. as an optimization barrier). The side-effects MUST NOT impact any behavior considered observable by the Rust AM (i.e. it does not remove UB). You MUST NOT blame the Rust project if the side-effects change in a way which impacts secondary codegen properties of your code.
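To make the "optimization barrier" reading concrete, here is a minimal sketch, assuming the stable std::hint::black_box and a hypothetical helper name (ct_eq_sketch). It only shows where such a hint might be placed. The result is the same with or without black_box, and nothing here promises constant-time codegen, so the output assembly would still need to be inspected.

```rust
use std::hint::black_box;

/// Hypothetical helper: compares two byte slices without an early exit on
/// the first mismatch. `black_box` is used purely as a codegen hint; it does
/// not change the result and does not guarantee any particular machine code.
pub fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    // Lengths are treated as public information in this sketch.
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate every differing bit; the hint discourages the optimizer
        // from reasoning about `diff` and introducing a short-circuit.
        diff = black_box(diff | (x ^ y));
    }
    black_box(diff) == 0
}
```

Removing both black_box calls leaves the observable behavior unchanged, which is the sense in which the barrier is invisible to the Rust AM.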

The new wording is improved. It's difficult in general to document "weak" properties of an API well, even when it's much more straightforward (e.g. what can happen when ordering properties don't properly hold) than black_box and properties the compiler pipeline considers immaterial. It'd be incorrect to even call the properties "best effort," imo; it's in effect much closer to "reasonable effort" to behave as usually expected.

Can you link those term rewriting systems that can provide similar guarantees?

Most of the ones I know of are concatenative languages (like cat and kitten) which took inspiration from Joy. The classic reference here would be Manfred von Thun's A Rewriting System For Joy.

Where is this specified? Does this only apply to rustc, and not any of the things that rustc calls? Would this not be broken by standard instruction reordering done by most assemblers?

Does this also apply to LTO?

Yes

It doesn't apply to the assembler level. This rule exists for making self-modifying and self-inspecting inline asm possible.

https://doc.rust-lang.org/reference/inline-assembly.html#rules-for-inline-assembly

The compiler cannot assume that the instructions in the asm are the ones that will actually end up executed.

  • This effectively means that the compiler must treat the asm! as a black box and only take the interface specification into account, not the instructions themselves.
  • Runtime code patching is allowed, via target-specific mechanisms.

I don't understand what you think that would change. An empty asm block provides no guarantees for what optimizations will do to the surrounding code. Specifying and reasoning about asm blocks is extremely complicated, so documenting black_box in terms of asm is trying to explain something subtle in terms of something way worse; it is not going to help people understand black_box -- it likely will mean people think they understand it because they think they understand asm, but they are probably wrong.
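For readers who have not seen the pattern being discussed, this is roughly what an "empty" asm block used as a value barrier looks like. It is a sketch under assumptions: the name launder is hypothetical, the template holds only an assembler comment so that the operand counts as used, and the asm path is limited to x86_64 and aarch64 with std::hint::black_box as the fallback elsewhere. As the post above says, this gives no guarantee about what optimizations do to the surrounding code.

```rust
/// Hypothetical helper: routes `x` through an asm block that emits no
/// instructions. The compiler must assume the register may have been
/// rewritten, but this says nothing about the code around the call.
#[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
pub fn launder(mut x: u64) -> u64 {
    // SAFETY: the template contains only an assembler comment, so nothing is
    // executed; `x` is left unchanged in its register.
    unsafe {
        core::arch::asm!(
            "/* {0} */",
            inout(reg) x,
            options(nomem, nostack, preserves_flags)
        );
    }
    x
}

/// Fallback for other targets (an assumption of this sketch): use the
/// standard library hint instead of inline asm.
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
pub fn launder(x: u64) -> u64 {
    std::hint::black_box(x)
}
```

The value that comes out always equals the value that went in; only the optimizer's knowledge of it is meant to be reduced, and even that is not something the language promises.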

This couldn't be further from the truth. The libs docs authors very much understand that distinction. But we are concerned that users reading the docs will not understand what "no guarantees" means -- some people get very creative when interpreting docs. So we explicitly tell them "if you think about using this for crypto, stop, because we can't promise it does anything there". In that sense the function is unfit for cryptographic use. People who know what they are doing can always decide to use it for cryptographic code anyway, we can't stop you -- but we want to make sure that if this fails to block some optimization, you understand that it is your responsibility to tweak the code further to avoid that.

When choosing between docs that confuse experts and docs that confuse novices, I think we should pick the former. Ideally we confuse nobody, but it seems that is hard. I had not expected the docs to stop a crypto expert from doing anything, as crypto experts know and understand that this doesn't actually guarantee anything, and they will validate the output assembly so whatever the docs say is entirely irrelevant -- if the output is good, all is fine.

An often-repeated piece of advice is "never ever roll your own crypto". And yet of course crypto experts roll their own crypto. You should interpret these docs the same way. You could say "rolling your own crypto is hard and you are likely going to get it wrong", which is more nuanced and more correct, but it is less effective at preventing people who don't know what they are doing from rolling their own crypto. Sometimes you have to engage in a bit of hyperbole to make sure people don't hurt themselves. Crypto experts do it, and libs doc authors do it, too.

It seems a perma-unstable core intrinsic that performs something along these lines has appeared: Add `select_unpredictable` to force LLVM to use CMOV by Amanieu · Pull Request #128250 · rust-lang/rust · GitHub

It leverages this work, which adds "unpredictable" metadata to selects to ensure that the X86CmovConversion pass does not rewrite predicated instructions as branches: D118118 [SDAG] Preserve unpredictable metadata, teach X86CmovConversion to respect this metadata

We've tried to implement something similar in the RustCrypto cmov crate using a combination of inline assembly with a portable fallback based on masking (where the latter is the source of problems this thread is asking for solutions to): crates.io: Rust Package Registry
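As an illustration of what a masking-based portable fallback can look like, here is a minimal sketch. The function name select_masked is hypothetical, and this is not claimed to be the cmov crate's actual code. The source contains no branch, but the optimizer remains free to recognize the pattern and emit one, which is the problem this thread is about.

```rust
/// Hypothetical portable fallback: returns `a` if `cond` is true, else `b`,
/// written without a branch in the source. The generated machine code may
/// still contain a branch; that has to be checked in the assembly.
pub fn select_masked(cond: bool, a: u32, b: u32) -> u32 {
    // `true` becomes 1 and negates to 0xFFFF_FFFF; `false` stays 0.
    let mask = (cond as u32).wrapping_neg();
    // Keep the bits of `a` where the mask is set, and of `b` elsewhere.
    (a & mask) | (b & !mask)
}
```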

I sure hope there is eventually some stable API for such functionality.

If it were to be stabilized it'd probably be a hint too, so no security guarantees, similar to black_box.
