Half-baked idea: postfix/monadic `unsafe`

BatmanAoD · May 15, 2019, 8:50pm

Niko raises an interesting point about unsafe in the ongoing discussion about await:

This makes me wonder if a variant postfix syntax for unsafe that takes an expression rather than a block might be useful; building on that, as an analogy to futures, I wonder if it might be reasonable to support a monad-like way of performing unsafe operations as some kind of first-class citizen, permitting the creation of special unsafe closures.

An example

My knee-jerk response is that this would not be a good idea, because it would obscure unsafe code somewhat, and because it would separate the unsafe code itself from the location where the keyword is used. But just as a thought experiment, here's what it could look like, introducing a dummy keyword and type:

fn read_mem(ptr: *const u32) -> Unsafe<impl Fn() -> u32> {
    ptr.unsafe(core::ptr::read)
}

Here, unsafe is a postfix keyword (maybe?) that creates an object of type Unsafe. The keyword operates on an expression and takes a callable as an argument; the callable must be able to accept the expression as an argument. Unsafe itself is the "unsafe monad", which supports an interface something like this:

fn get_val(op: Unsafe<impl Fn()->u32>) -> u32 {
    op.do_unsafe
}

Here, do_unsafe is a postfix keyword, like .await.

Note, of course, that Unsafe<FnOnce> and Unsafe<FnMut> would also exist.

If the user wanted to execute the unsafe bit immediately instead of constructing an Unsafe object and postponing the unsafe operation, then the keywords could be used in conjunction, like so:

ptr.unsafe(core::ptr::read).do_unsafe

Here is some Playground code demonstrating, essentially, the "desugared" version of the above example (with an extra unsafe that wouldn't be necessary if this feature were adopted).
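
The linked Playground code is not reproduced in this thread. As a rough sketch only, and not the original Playground code, a desugared version in today's Rust might look something like the following. The `Unsafe` struct, its `new` constructor, and the `do_unsafe` method are hypothetical stand-ins for the proposed type and postfix keywords.

```rust
/// Hypothetical stand-in for the proposed `Unsafe<F>` type: it stores an
/// operation without running it.
pub struct Unsafe<F>(F);

impl<F> Unsafe<F> {
    /// Stand-in for the postfix `.unsafe(...)` keyword: wraps the operation.
    pub fn new(op: F) -> Self {
        Unsafe(op)
    }

    /// Stand-in for the postfix `.do_unsafe` keyword: runs the operation.
    ///
    /// # Safety
    /// The caller must uphold whatever the wrapped operation requires.
    pub unsafe fn do_unsafe<T>(self) -> T
    where
        F: FnOnce() -> T,
    {
        (self.0)()
    }
}

/// Defers a raw-pointer read; nothing is dereferenced until `do_unsafe`.
pub fn read_mem(ptr: *const u32) -> Unsafe<impl FnOnce() -> u32> {
    // This inner `unsafe` is the "extra" one the proposal would make unnecessary.
    Unsafe::new(move || unsafe { core::ptr::read(ptr) })
}
```

Note that in this sketch the safety requirements of the read (a valid, aligned pointer) are no longer visible at the point where `do_unsafe` is called.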

RustyYato · May 15, 2019, 11:01pm

I don’t think that it is a good idea to generalize unsafe like this. unsafe works on the idea that you must uphold some rules, when you generalize unsafe like this you lose that ability to uphold the rules.

The main way to limit how many unsafe blocks you have (which is a horrible metric of how unsafe some code is) is to move the unsafe blocks to the construction of the values, rather than where the values are used. (See the Pin API for an example of this.)
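
A minimal sketch of that pattern, loosely modelled on how `Pin::new_unchecked` is unsafe while later use of the `Pin` is safe. `ValidPtr` and its methods are hypothetical names chosen for illustration.

```rust
/// Hypothetical wrapper: the safety obligation is discharged once, at
/// construction, so every later use can be a safe method call.
pub struct ValidPtr(*const u32);

impl ValidPtr {
    /// # Safety
    /// `ptr` must be non-null, aligned for `u32`, and point to an initialized
    /// `u32` that stays valid for as long as the returned `ValidPtr` is used.
    pub unsafe fn new(ptr: *const u32) -> Self {
        ValidPtr(ptr)
    }

    /// Safe to call: validity was promised when the value was constructed.
    pub fn read(&self) -> u32 {
        // SAFETY: guaranteed by the contract of `ValidPtr::new`.
        unsafe { core::ptr::read(self.0) }
    }
}
```

The single `unsafe` at the call to `ValidPtr::new` is where the rules are stated and checked; `read` can then be called any number of times from safe code.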

Also, your playground example is horribly broken. Even if we implement this feature, your example is broken. Can you please provide a better example as motivation for this feature?

I cleaned up the api in your playground example a bit and added FnOnce and FnMut

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=d624d00eb5e2d1a82d48f4779fbafb6d

BatmanAoD · May 16, 2019, 12:36am

How so?

I agree that's not a good metric of anything, and I'm also not convinced it's a desirable goal; why are you bringing it up?

Sorry, but I don't actually know what's broken about it; why do you say so?

The example isn't intended as motivation. I don't actually see a compelling motivation, which is why I wrote:

If anything, the main potential benefit I see from even thinking about this syntax is to explore why .await seems good but .do_unsafe doesn't.

RustyYato · May 16, 2019, 12:58am

You can use it to cause UB in safe code, but given

I don't think it matters for this discussion

Because the unsafe code depends on implementation details and you can't really generalize over implementation details.

Looking back, I'm not sure what was going through my head when I wrote that

BatmanAoD · May 16, 2019, 1:11am

Do you mean that the stand-alone program I wrote invokes undefined behavior as-is, or that the interface to the unsafe monad isn’t safe? If the latter, the point is for do_unsafe to replace unsafe, so that’s by design.

RustyYato · May 16, 2019, 2:23am

the latter, and if the interface isn’t safe, shouldn’t accessing the value inside Unsafe<_> be marked unsafe?

I don’t quite see how this parallels async

BatmanAoD · May 16, 2019, 3:44am

You can’t access the value inside Unsafe directly; you can only invoke the operation and access the result value. The hypothetical postfix-keyword do_unsafe would be how the access is “marked”.

The relationship to .await is that both are postfix keywords that perform some operation that can’t be done normally. .await modifies control flow, while .do_unsafe (which, like .await, wouldn’t actually be a method call or have parens) evaluates a single unsafe expression.

RustyYato · May 16, 2019, 3:51am

Ok, but how do you convey the rules that the unsafe is supposed to follow? For example, in the code you provided, you must guarantee that the *const u32 is aligned and points to some 4 bytes of allocated memory (either on the stack or heap) at the very least, but by wrapping it in Unsafe<...> you lose that documentation.
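
For comparison, here is a sketch of how those rules are usually attached to an unsafe function today, via a `# Safety` doc section. The function name `read_u32` is hypothetical, and the debug assertions are only a cheap sanity check, not a proof of validity.

```rust
/// Reads a `u32` through a raw pointer.
///
/// # Safety
/// `ptr` must be non-null, aligned for `u32`, and point to 4 bytes of
/// initialized, allocated memory that stay valid for the duration of the call.
pub unsafe fn read_u32(ptr: *const u32) -> u32 {
    // Debug-only checks for the two conditions that can be tested cheaply.
    debug_assert!(!ptr.is_null());
    debug_assert_eq!(ptr as usize % core::mem::align_of::<u32>(), 0);
    // SAFETY: the caller promised the conditions listed under `# Safety`.
    unsafe { core::ptr::read(ptr) }
}
```

With a generic `Unsafe<...>` wrapper there is no obvious place to hang such a contract, which seems to be the concern raised here.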

There is a reason that we don’t see unsafe fn(...) -> _ in the wild.

I think the hypothetical value.try syntax would be a better comparison.

H2CO3 · May 16, 2019, 8:58am

I think, as often, "Might it be useful?" is not the right question when it comes to adding a feature to the language, or changing an existing one. Almost everything people can come up with can be useful. The right question is whether its existence results in advantages that outweigh the potential downsides.

What do you exactly mean by "first-class"? The unsafe construct is a language built-in today and can be used wherever any other expression can.

Independent of "first-class-ness", I think it's a very bad idea to make unsafe postfix. I see why people argued for postfix in the case of await, but all those arguments of ergonomics and composability break down here because the very purpose of unsafe is to stand out like a sore thumb. I don't think that introducing more alternative syntaxes for it helps this goal; furthermore, the prefix, block-like syntactic construct encourages a much more visible and obvious style.

I also strongly disagree with Niko's conclusion about the original intention of unsafe, and with the perceived discord between that and its current usage. There are many legitimate use cases for bundling together several unsafe operations, e.g. it's typical when the code is heavy on FFI calls or raw pointer manipulation. I've seen such code in the wild and have even written it myself. I don't think that "only a single expression within the unsafe block" is the one true style that should exist, and it's certainly not the only one that does exist.

In addition, his last cited point seems to indicate a very dangerous direction:

I'd very much not like to lose the delimiters for this exact reason, and I don't think it needs any sort of "reconsideration".

Centril · May 16, 2019, 12:25pm

While I agreed with the rest of what @nikomatsakis said, I agree with you here entirely. I believe await should be ergonomic and need not stick out so much. But unsafe { ... } should indeed stand out, be well commented, and communicate "WARNING WARNING" because undefined behavior is not just any regular bug.

scottmcm · May 16, 2019, 7:48pm

Two things come to mind for that:

1. Because it's affecting a block, not a value. This is the same way that it's async { ... }, not { ... }.async -- it's telling you context you need to understand the block. Similarly, { ... break x ... }.loop would be surprising because there wasn't the signpost warning you that the break was coming the way there is in loop { ... break x ... }. (And though I'm probably not in favour of actually doing this for break, note that loop { ... x.break ... } doesn't have the same "surprise" factor, the same way that async { ... x.await ... } doesn't.)
2. Because async/await is a delayable effect, but unsafe isn't. With async/await, you can call the function to get the impl Future outside an async block, then later choose whether or not to call .await on it. Similarly, you can get your Result, put it in a Vec, and later decide to ? on it. On the contrary, unsafe needs to be "proven" before the call -- you of course can't call the unsafe function from safe code then later somehow only use the result if it was sound. (const also happens like this, where you can't make something const later; it has to be const from the beginning.)
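
To illustrate the "delayable effect" point with Result, here is a small sketch; the function names `parse_all` and `sum_later` are made up for this example. The fallible calls happen in one place, the results are stored in a Vec, and the decision to apply ? is made later.

```rust
use std::num::ParseIntError;

/// Runs every fallible parse now, but does not react to failures yet.
pub fn parse_all(inputs: &[&str]) -> Vec<Result<i32, ParseIntError>> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}

/// Only here, later, is the error effect actually "performed" with `?`.
pub fn sum_later(results: Vec<Result<i32, ParseIntError>>) -> Result<i32, ParseIntError> {
    let mut total = 0;
    for r in results {
        total += r?;
    }
    Ok(total)
}
```

There is no analogous way to call an unsafe function first and decide afterwards whether the call was sound.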

BatmanAoD · May 20, 2019, 4:40pm

The "first class" modifier applies to the whole concept, most especially the monad. And as I said above, I'm not actually advocating for this, I just thought it would be interesting to explore. The comments (yours and others) have persuaded me that my initial instinct that it was a bad idea was correct.

comex · May 21, 2019, 6:13am

As a simpler alternative, how about just allowing the braces to be omitted? So instead of

unsafe { foo.bar(); }

you could write

unsafe foo.bar();

Centril · May 21, 2019, 8:06am

Note that due to backwards compatibility, this would likely need to be added as a new syntactic form in the AST and an exception in the parser to avoid interpreting unsafe $block.foo as unsafe ($block.foo) as opposed to the current (unsafe $block).foo. A similar consideration exists with try $block but in that case we don't have backwards compatibility to deal with.
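
A small sketch of the current binding, using a made-up function name: the method call applies to the value produced by the unsafe block, so the call itself sits outside the unsafe scope.

```rust
/// Returns the second element plus one (wrapping on overflow).
pub fn second_plus_one(v: &[u32]) -> u32 {
    assert!(v.len() > 1);
    // Parsed today as `(unsafe { ... }).wrapping_add(1)`:
    // only the indexing is inside the unsafe block.
    // SAFETY: the assertion above guarantees index 1 is in bounds.
    let y = unsafe { *v.get_unchecked(1) }.wrapping_add(1);
    y
}
```

A brace-less `unsafe expr` form would have to keep this meaning for existing code that writes `unsafe { ... }.method()`.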

DDOtten · May 21, 2019, 8:30am

Isn’t it so that the only difference between

unsafe { /**/ }.foo()
// and
unsafe { { /**/ }.foo() }

is that the second is also allowed if foo is unsafe? (I may be missing something with macros.) In that case we could interpret unsafe expr in the way that we do unsafe { expr } today. That way I can see people writing

let x = foo(unsafe { y.get_unchecked(2) }, z).bar();
// as
let x = unsafe foo(y.get_unchecked(2), z).bar();

more often even though only get_unchecked is unsafe (this is also possible today but less ergonomic as it requires braces around the entire line). I personally don’t think this is a bad thing, as I think one unsafe per line is enough to tell you to pay attention, and losing the {} increases readability. Also note that a .unsafe is not necessary with this version, as one unsafe at the start of the line allows you to do unsafe things for the entire line.
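
Both scopes can already be written today with braces. The sketch below uses hypothetical helper names (`add`, `narrow`, `wide`) to show that widening the unsafe block does not change the result, only how much code is exempt from the safety checks.

```rust
fn add(a: u32, b: u32) -> u32 {
    a + b
}

/// Narrow scope: only the unchecked indexing is inside `unsafe`.
pub fn narrow(y: &[u32], z: u32) -> u32 {
    assert!(y.len() > 2);
    // SAFETY: the assertion above guarantees index 2 is in bounds.
    add(unsafe { *y.get_unchecked(2) }, z).pow(2)
}

/// Wide scope: the whole expression is inside `unsafe`, which is roughly
/// what a brace-less `unsafe expr` would mean under this suggestion.
pub fn wide(y: &[u32], z: u32) -> u32 {
    assert!(y.len() > 2);
    // SAFETY: the assertion above guarantees index 2 is in bounds.
    unsafe { add(*y.get_unchecked(2), z).pow(2) }
}
```

The trade-off is that in `wide`, any unsafe call later added to that line would compile without a new `unsafe` marker.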

Centril · May 21, 2019, 8:37am

Hmm; that’s interesting – in the case of async { ... } and try { ... } it definitely matters how things bind but in this case it might not because you are only widening the scope of the unsafe { ... }. Barring some corner case it could technically work.

DDOtten · May 21, 2019, 8:49am

Currently we already have two kinds of unary prefix keywords. We have unsafe and soon async that have the form

keyword { expr }

and we have break and return of the form

keyword expr

so moving unsafe to the second form doesn’t introduce an inconsistency. An interesting idea at least.

scottmcm · May 21, 2019, 8:29pm

Related to what I described earlier, there is actually a difference here.

Currently if you have keyword { expr }, it cannot be changed to let x = expr; keyword { x }:

for loop it either does something different or fails to compile if there's a break
for unsafe then it'll no longer compile (if it was needed)
for async it'll no longer compile if the expr includes an await.
for try (on nightly) it'll do something different (assuming there was a ? in the expr)

If the keyword is break or return, though, let x = expr; keyword x always does the same thing.

So I think there's actually a useful difference.
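
A sketch of that difference for return and unsafe, with made-up function names. Hoisting the expression out of return changes nothing, while the hoisted unsafe variant (left commented out) would be rejected by the compiler.

```rust
/// `return expr` written directly.
pub fn direct(a: i32) -> i32 {
    return a * 2 + 1;
}

/// The same expression hoisted into a `let`: identical behaviour.
pub fn hoisted(a: i32) -> i32 {
    let x = a * 2 + 1;
    return x;
}

/// `unsafe { expr }`: the unchecked access is inside the block.
pub fn in_block(v: &[u8]) -> u8 {
    assert!(!v.is_empty());
    // SAFETY: the assertion above guarantees index 0 is in bounds.
    unsafe { *v.get_unchecked(0) }
}

// Hoisting the expression out of the block does not compile, because
// the call to the unsafe method is no longer inside an unsafe block:
//
// pub fn hoisted_unsafe(v: &[u8]) -> u8 {
//     let x = *v.get_unchecked(0); // error[E0133]
//     unsafe { x }
// }
```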

That said, as prior art, C#'s unchecked keyword can be used as either unchecked { statements } or unchecked(expr).

system · August 19, 2019, 8:29pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
