[Pre-RFC] always_panic feature and lint rule

dingelish · April 8, 2019, 11:01pm

Update: #[stub(...)] seems to be more popular, such as

#[stub(warning="This always returns 4, determined by fair dice roll"]
#[stub(fail="This function always panic and should not be called")]
#[stub(info="This function process privacy input. Please be careful")] And #[allow(stub...)] or similar stuffs still need.

Hi all,

I would like to propose a new feature gate always_panic as well as a lint rule to check calls to unsupported/unimplemented or similar functions.

Feature Name: always_panic
Start Date: 2019-04-08
RFC PR: (left empty)
Rust Issue: (left empty)

Summary

Add a new feature gate always_panic works on unimplemented/unsupported functions to trigger compile-time warnings.

Motivation

Compile-time warnings are always better than runtime failures. However, with the current design of libstd, we may encounter inevitable runtime panics without compile-time warnings. For example, on target wasm32-unknown-unknown, we can compile the following code without warnings:

#[wasm_bindgen]
pub fn greet() {
    let _ = std::fs::File::open("foo.txt").unwrap();
    alert("Hello, wasm-game-of-life!");
}

It always compiles perfectly, but turns out to panic due to fs is unsupported in wasm32-unknown-unknown. Similar panics would happen on platforms have unsupported features, such as sgx does not support process, wasi does not support net, etc.

With the proposed new feature, rustc would generate compile-time warnings on invoking such always_panic functions, similar to the feature gate deprecated. This will make Rust one step closer to if it compiles, it works.

Guide-level explanation

The proposed feature gate is pretty straight-forward. The one who designs platform-specific codes (such as codes under libstd/sys) may add this feature to unsupported functions or not yet implemented ones. And others could help on tagging existing codes as well. One concrete example of libstd/sys/wasm/thread.rs is as follows:

impl Thread {
    // unsafe: see thread::Builder::spawn_unchecked for safety requirements
    #[always_panic]
    pub unsafe fn new(_stack: usize, _p: Box<dyn FnBox()>)
        -> io::Result<Thread>
    {
        unsupported()
    }
}

Then if rustc meets the following code:

#[wasm_bindgen]
pub fn greet() {
    let _ = std::thread::Thread:new(......);
    alert("Hello, wasm-game-of-life!");
}

rustc would generate a warning like:

warning: use of always_panic item 'std::thread::Thread::new'. Please check if it is needed.
  --> src/lib.rs:22:5
   |
22 |     let _ = std::thread::Thread:new(......);
   |             ^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: #[warn(always_panic)] on by default

This change may cause many compile-time warnings during the porting of a crate. In the past, the ported crate would compile smoothly. With the proposed feature gate and properly tagged functions, when porting a crate with #[deny(warnings)] attribute, the compilation would fail. This would alert the developer about the incompatibility of this crate and the target platform.

Reference-level explanation

The proposed solution may need the following changes of Rust:

A new feature gate in libsyntax. Its template may be deprecated:

 552     // Allows `#[deprecated]` attribute.
 553     (accepted, deprecated, "1.9.0", Some(29935), None),

One implementation could be:

// Allows `#[always_panic]` attribute.
(accepted, always_panic, "1.xx.0", Some(_), None),

A default lint rule in librustc.

Since this feature works like a built-in lint rule, we should add it to librustc. In librustc/lint/builtin.rs:

declare_lint! {
    pub ALWAYS_PANIC,
    Warn,
    "detects use of items which always panics (unsupported or unimplemented)",
    report_in_external_macro: true
}

report_in_external_macro: true here is the same as feature deprecated, intended to warn the developers about more potential incompatibility.

Whether to keep the implementation inside librustc, or put it in librustc_lint remains a question. I cannot fit it into any lint groups of "nonstandard_style", "unused", "rust_2018_idioms", "rustdoc" defined in librustc_lint. So I intend to put it in librustc, if possible.

Basically, it works similar to the deprecated tag, but irrelevant to stability issues.

always_panic should be orthogonal to existing feature gates because it adds additional semantics to the compiler.

Drawbacks

Potential drawbacks would be

Increasing the number of compile-time warnings/failures on some platforms with limited sys support.

Rationale and alternatives

Why is this design the best in the space of possible designs?

Basically, it is a step we can take to get closer to 'if it compiles, it works'.

Rust is being used in more and more platforms which provide limited or no support to net/process/thread etc. Basically, there are ways to solve this: (1) Re-factor libstd to make it's mods optional. This seems not friendly to existing Rust ecosystems. (2) 'std-awared' cargo would solve this problem in another way, but left stealthy in-compatibility on upstream targets. (3) The proposed always_panic feature gate, which would solve this gently and neatly.

From my experience of security code auditing (~5 years), I think this feature and its checker is similar to MSVC's checker on dangerous functions (such as strcpy), and also a bunch of security tools provides rules against CWE-676 Use of Potentially Dangerous Function. Though panic is not dangerous, it still hide the incompatibility between crates and target platforms which would lead to unexpected behavior. As a compiler, rustc has the ability to prevent this and this approach would increase the trustworthiness of Rust much.

What is the impact of not doing this?

Without the additional semantics of always_panic, we could never know if a crate is compatible to a platform -- until it triggers a runtime panic, maybe once in a week and we don't know what happened. It's better to get aware of such problem from rustc during compilation.

Prior art

Discuss prior art, both the good and the bad, in relation to this proposal. A few examples of what this can include are:

For language, library, cargo, tools, and compiler proposals: Does this feature exist in other programming languages and what experience have their community had?

Microsoft's Security features in the CRT
Coverity's checker on "CWE-676". I cannot find it's document which is open to all.
sgx_tstd of rust-sgx-sdk, and Parity's pwasm-std . To eliminate such uncertainty of calling unsupported functions, these projects removed unsupported functions from their standard libraries to prevent unpredictable runtime panics.

For community proposals: Is this done by some other community and what were their experiences with it?

Don't have an answer yet.

For other teams: What lessons can we learn from what other communities have done here?

Papers: Are there any published papers or great posts that discuss this? If you have some relevant papers to refer to, this can serve as a more detailed theoretical background.

Rust SGX SDK: Towards Memory Safety in Intel SGX Enclave This paper is accepted as a poster on the SIGSAC's Computer and Communications Security conference, which is a well-known 1st tier system security conference. To eliminate such stealthy incompatibility, the standard library sgx_tstd removed the unsupported mod to generate compile-time failures.

Unresolved questions

What parts of the design do you expect to resolve through the RFC process before this gets merged? What parts of the design do you expect to resolve through the implementation of this feature before stabilization? What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?

Future possibilities

Another dimension of trustworthiness/certainty

Rust seems to be the best language for production-level trusted computing. To accommodate trusted computing with Rust, we need compile-time trustworthiness/certainty semantics. always_panic is a start which alerts between incompatible codes and target platforms. In future, we may have untrusted_function to annotate functions brings uncertainty/untrusted inputs to the environment, such as timing in Intel SGX, more specifically, the future mesatee-sgx and mesatee-optee target (soon available maybe next months) which aims at providing hardware-assisted trusted execution environment with Rust's help. Additional features would help us achieve higher-level of trustworthiness and get away from uncertainty much more effectively.

With Trusted Computing, the computer will consistently behave in expected ways, and those behaviors will be enforced by computer hardware and software.

(Chris Mitchell (2005). Trusted Computing. IET. ISBN 978-0-86341-525-8.)

comex · April 9, 2019, 1:33am

, but “always_panic” seems a little too specific. It seems like this would also be useful for functions that are “stubbed out” in a way other than panicking, e.g. always returning the same value or always returning an error code, so perhaps it could be named something like #[stub]. Alternately, perhaps it could be a general-purpose “warn if called” attribute that doesn’t assume any particular reason for the warning.

Prior art:

GCC supports marking a function deprecated using __attribute__((deprecated("explanatory text"))), which produces a warning if the function is used, but it also has more generically-named attributes: __attribute__((warning("explanatory_text"))) and even __attribute__((error("explanatory text")).
Clang is even cuter and supports a warning attribute which can be conditional based on the arguments passed; example from the docs:
```
int abs(int a)
  __attribute__((diagnose_if(a >= 0, "Redundant abs call", "warning")));
```

Lokathor · April 9, 2019, 2:22am

We can probably start small with just a no-reason version (gives a generic warning) and a version with a message (gives that message as the warning message if called).

#[stub]
#[stub(warning="This always returns 4, determined by fair dice roll")]

dingelish · April 9, 2019, 4:59am

Thanks @comex, @Lokathor !

I really like the idea of #[stub(...)]! It seems more generic than my first proposal. I think it makes much more sense.

@comex Could I add your ‘Prior art’ to the Pre-RFC document? Thanks!

zackw · April 9, 2019, 1:32pm

I also really like #[stub(...)].

Would it make sense to infer #[stub] for fns that do nothing but invoke unimplemented!() ? That might reduce the amount of churn involved in annotating the stdlib with this attribute.

CAD97 · April 9, 2019, 2:06pm

Along the same lines, should we infer #[stub] for functions that only call one other #[stub] function? That call only #[stub] functions? That call a #[stub] function at all?

The former is useful for e.g. a library that uses their own #[stub] fn unsupported() -> ! that gives a more useful message than unimplemented!(). The latter is useful to propagate stub warnings through ease-of-use APIs that call the stubbed functionality.

I realize I’m kind of slippery-slope-ing the inference here. But I think all of these variants of inference make sense. (And it’s just used for better diagnostics, right?)

dingelish · April 9, 2019, 9:27pm

#[stub(...)] style seems to have a limitation: it may not pair to #[allow(...)] effectively. How to solve it?

scottmcm · April 9, 2019, 9:44pm

I think this is solving the wrong problem. We have a way to specify always_panic: -> !

So really I think the problem is that the stubs exist, and thus the actual domain is that of the std facade or the portability lint.

comex · April 10, 2019, 1:32am

Sure.

dingelish · April 10, 2019, 6:34am

Hi @scottmcm, I think these two ways are different.

I think -> ! means the type of a return value is !. Besides, the bang type also tells the compiler about control flow termination and helps on dead code detection. It’s something like a “syntax-level” helper.

Here we want to solve a problem: fixed type functions (the type of its return value and arguments are cannot be modified) are not supported. It’s another semantics other than return value. It seems more like a “semantic-level” helper.

ckaran · April 10, 2019, 1:58pm

I actually really like your inference idea cascading through all the functions! Assuming that there was added support for it, you could then do a reverse topological sort to figure out a ‘bottom up’ order for implementing/testing your code. That would allow nearly instant agile top-down design, with bottom-up instantiation of code. E.g., you write out the docs & stubs for a pile of functions/types, run the (hypothetical) command rustc --infer-stubs | tsort | tac or cargo --infer-stubs | tsort | tac, and you instantly know what the next function/class you need to expand is. Keep expanding & running rustc --infer-stubs | tsort | tac to see what priority your work queue has to be in. If there was some way of adding in support for doctests, or RLS, that would be an amazing productivity booster!

scottmcm · April 10, 2019, 1:58pm

But why is the type of the function fixed? Why does it even need to be provided? Why can't it be cfged out? If it's just because of stability, then it seems like the answer is "deprecate it and add something new", as is the normal solution to "we have something incorrect".

RustyYato · April 10, 2019, 2:04pm

This is meant for functions that are defined but not yet implemented, and will be implemented with the given signature. This is not for the final public api, rather for intermediate work.

So cfging them out is counter productive, as we want to be able to call these functions so we can write out more important logic first, then get to the details specified by these stubbed out functions.

Centril · April 10, 2019, 5:27pm

It would be one thing to have rustc_always_panic as a hack only the standard library can use. (and notably this does not even require an RFC) However, adding always_panic for intermediate work as a substitute for std-aware cargo and such doesn't seem justified.

dingelish · April 10, 2019, 5:49pm

Such cases exist in libstd, such as Condvar in target wasm. Current design of libstd does not allow us to cfg anything out.

dingelish · April 10, 2019, 5:54pm

Hi @Centril,

The first idea of always_panic seems be too narrowed – #[stub(...)] seems to be more reasonable and provide more information to help developers.

Another good usage of #[stub(...)] is to mark up functions directly processing sensitive data, such as user privacy. In many cases we want such functions do not return/propagate privacy and the tag propagation can help us a lot.

For the std part, I think #[stub(...)] cannot replace std-aware cargo, and vice versa. If we have a full functional std-aware cargo, we can easily use any crates as std, but we cannot make the built-in std more safe, or say “if it compiles, it works”.