Pre-RFC: making unsafe more safe to use


#1

Introduction

I want to suggest this here first and discuss it with you before I start writing a full RFC.

We all know that there is this little dangerous keyword named unsafe.

Officially unsafe means something like:

I, the programmer who wrote this, have looked at every single aspect of this code and I am 120% sure that it works as it should and will not introduce any memory unsafety or similar problems.

After having read this thread, I am sure that there is a whole lot of work ahead of us to get this thinking into people's minds.

For the lazy ones, the particular code in question is the following (have a look at the lifetimes):

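use std::mem;

// Unsound: claims a 'static lifetime for data borrowed from `v`,
// and skips UTF-8 validation.
fn fast_str(v: &[u8]) -> &'static str {
    let x: &str = unsafe { mem::transmute(v) };
    x
}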
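
For comparison, here is a minimal sketch of a safe version, assuming the bytes are expected to be UTF-8. The name checked_str is made up for illustration; std::str::from_utf8 is the standard library function. The returned slice borrows from v, so it cannot outlive the input the way the 'static version claims to.

```rust
use std::str::{self, Utf8Error};

// Hypothetical safe counterpart to fast_str (the name is illustrative).
// The elided lifetime ties the returned &str to the input slice.
fn checked_str(v: &[u8]) -> Result<&str, Utf8Error> {
    // Validates the bytes as UTF-8 instead of assuming it; no unsafe needed.
    str::from_utf8(v)
}
```

The cost is one validation pass over the bytes, and callers have to handle the error case.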

The Change

Behaviour

What I want is that the Rust compiler itself reminds the programmer to look at what they’ve done and question their own code.

Every time you write an unsafe-block or function, the compiler asks you to:

  1. read the above definition of unsafe
  2. look at the code again
    • should probably also print the following checklist
      • is there any existing solution
      • are your lifetimes correct
      • is there a possible memory leak
      • is the unsafe even needed
      • to be expanded
  3. write a summary comment for why this unsafe is needed and what the code within does
    • only ask for that if there is no comment there yet
    • write the comment directly into the source code
  4. the checksum of your unsafe block is calculated
    • this checksum is written right after the comment inserted in step #3
    • if needed, overwrites old checksum

These steps only run if:

  • the comment is missing
  • the checksum is missing or has changed

The code does not compile until these steps have been completed.
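
The comment-plus-checksum check described here could be sketched roughly as follows. Everything in it is an assumption for illustration: the proposal names neither a hash function nor a comment format, so this uses 64-bit FNV-1a over the block's source text, and the names fnv1a64 and needs_review are made up.

```rust
/// 64-bit FNV-1a over the raw source text of an unsafe block.
/// The hash choice is an assumption; the proposal does not specify one.
fn fnv1a64(text: &str) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for byte in text.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Returns true when the checklist would have to be shown again:
/// the comment is missing or empty, or the checksum is missing or stale.
fn needs_review(block_src: &str, comment: Option<&str>, recorded: Option<u64>) -> bool {
    let has_comment = comment.map_or(false, |c| !c.trim().is_empty());
    !has_comment || recorded != Some(fnv1a64(block_src))
}
```

Note that hashing the raw text means a pure reformatting of the block would also trigger the review; a real implementation would probably want to hash something less whitespace-sensitive.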

Result

This has the following consequences:

  • new users not used to Rust are immediately introduced to what it does
  • new unsafe blocks are not silently introduced
  • all unsafe blocks are commented so you can look up what they are actually doing
  • encourages people to refrain from using it if possible
  • experienced programmers still write comments for those parts
  • experienced programmers are encouraged to check their code
  • unsafe code is (most likely) not accidentally modified

Heads Up

The programmer does not need to do anything if:

  • there is no change of code

Review

This might be a good or a bad idea; I don’t know. Please help me work it out a bit more so that it is solid enough to be proposed as an RFC.


#2

This increases the amount of “bureaucracy” to write unsafe code, but it doesn’t make it safer.

For me making the unsafe code a little safer means something very different:

  • Improving the type system and standard library to reduce the amount of unsafe code you have to write;
  • Adding (static) type system parts that work inside the unsafe part of the code too, to avoid bugs;
  • Adding automatic (and/or on-demand) run-time checks inside the unsafe code in debug builds (like in Cyclone and the many sanitizers that Clang now offers to C++ code) to help catch some bugs at run-time during the debug builds.
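
As a rough illustration of the last point, Rust code can already opt into such checks by hand with debug_assert!, which is compiled out when debug assertions are disabled (the default for release builds). The function below is a made-up example, not something from the standard library.

```rust
/// Hypothetical helper: sums the first `n` elements without bounds checks.
///
/// # Safety
/// The caller must guarantee that `n <= values.len()`.
unsafe fn sum_first_unchecked(values: &[u32], n: usize) -> u32 {
    // Checked in debug builds only; catches a violated contract early.
    debug_assert!(n <= values.len(), "n exceeds slice length");
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: i < n <= values.len() by the caller's contract.
        total = total.wrapping_add(unsafe { *values.get_unchecked(i) });
    }
    total
}
```

An automatic version of this would have the compiler insert the check itself rather than relying on the author to remember it.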

#3

This could and should be implemented as an out-of-tree lint. Now that attributes on expressions have landed, they could be used instead of comments.

https://github.com/kmcallister/launch-code contains some old code (not using expression attributes, but attrs on the function)


#4

I’m strongly against this, unless it is at least a lint that can be disabled.

Why? Very simple: my wayland-client crate contains more than 3000 lines of FFI code, generated from XML files, with maybe half of them containing the unsafe keyword.

How would I even be able to compile my lib with such a huge burden?


#5

The Rust compiler is already patronizing enough (though, in most cases, for good reasons); lecturing people on the follies of unsafe and requiring them to write an essay explaining the presence of each unsafe block is just excessive. This can be enforced pragmatically on a project-by-project basis or implemented as an out-of-tree lint like others have suggested…


#6

Sorry for the misleading title, I just couldn’t come up with a better one.


#7

Okay. I agree with you. I didn’t think of that before.

Thank you for your advice.


#8

I agree that it should be possible to disable this on a per function/file/crate basis, similar to how you can disable lints.


#9

I, the programmer who wrote this, have looked at every single aspect of this code and I am 120% sure that it works as it should and will not introduce any memory unsafety or similar problems.

Why does unsafe mean this? People argue that unsafe means this, but it should mean something more sinister, in my opinion…

I, the programmer, need to do something outside the boundaries of what Rust can guarantee. The following code is at least partially unchecked by the compiler, and should therefore be met with the most scrutiny possible.

The first definition basically says ‘Hey guys, I’m the programmer and I say this works!’ But, like C, it has a chance of being incorrect. The second one properly states the intent of unsafe, in my opinion.

Also, -1 to this proposal, for the same reasons as reply #1.


#10

I’m not a fan of a programmer-checked checklist. Every time I write a bug I think I know what I’m doing :)

Perhaps unsafe could be made safer by splitting it into narrower, less unsafe variants for specific purposes, e.g. unsafe for calling foreign functions, unsafe for transmuting/casting, unsafe for unchecked bounds access, etc. That is, allow only the unsafe features that the programmer explicitly asked for, so that the other unsafe actions can still be caught.


#11

To deter thoughtless unsafes, there could be something like use unsafe or an explicit #![allow(unsafe)] in the file (module, crate) header.
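
Something close to this opt-in can be approximated today with rustc's existing unsafe_code lint. The sketch below is only an illustration: the module and function names are made up, and the lint is applied to a module rather than a whole crate to keep the example self-contained.

```rust
// Unsafe code is rejected anywhere in this module unless explicitly allowed.
#[deny(unsafe_code)]
mod parsing {
    /// Safe code compiles as usual.
    pub fn first_byte(v: &[u8]) -> Option<u8> {
        v.first().copied()
    }

    /// The opt-in is visible right at the item that needs it.
    #[allow(unsafe_code)]
    pub fn first_byte_unchecked(v: &[u8]) -> u8 {
        assert!(!v.is_empty());
        // SAFETY: the assert above guarantees index 0 is in bounds.
        unsafe { *v.get_unchecked(0) }
    }
}
```

Removing the #[allow(unsafe_code)] attribute turns the unsafe block into a compile error, so every exception has to be written down next to the code it covers.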

Proposing to add a big checklist to the compiler output and to require checksums before every unsafe{} block feels like an “April 1st RFC”. It may look funny the first time, but not the Nth.


#12

That feels rather bureaucratic, too. Also, should these be user-classifiable?

On the greater scope of this proposal: I’m against it. I would be writing “I’m calling a C function marked unsafe, duh!” about a hundred times.