Is synthesizing zero sized values safe?

Manishearth · December 18, 2019, 11:19pm

We've deprecated mem::uninitialized, but I'm not sure what the recommended way to synthesize zero-sized types is.

If a type is known to be zero-sized (e.g. a Fn type with no capture) is it safe to initialize an instance of it with mem::zeroed or mem::uninitialized?

Manishearth · December 18, 2019, 11:19pm

cc @RalfJung

mcy · December 18, 2019, 11:27pm

I mean, semantically, I would hope not, especially if you're using a ZST as some kind of scoped witness type. I think it's about as bad as forging a raw pointer.

You probably need to qualify what kind of thing you want to pull out of the aether better than "any zero-sized type".

CAD97 · December 18, 2019, 11:45pm

Of course, for arbitrary types, it's not necessarily sound, as arbitrary types can add arbitrary safety invariants (not validity).

Even without mem::zeroed/uninitialized, you can always mem::transmute any other ZST. The layouts of all ZSTs are necessarily the same.
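As a rough illustration of the transmute point, here is a minimal sketch assuming a hypothetical unit struct `Marker` with no invariants of its own. `mem::transmute` only checks at compile time that the source and destination sizes match, so any zero-sized source value is accepted:

```rust
use std::marker::PhantomData;
use std::mem;

/// Hypothetical zero-sized type used only for illustration.
#[derive(Debug, PartialEq)]
struct Marker;

/// Builds three `Marker` values, each transmuted from a different ZST.
fn marker_from_other_zsts() -> (Marker, Marker, Marker) {
    // SAFETY: every source type has size 0, `Marker` is inhabited,
    // and `Marker` carries no safety invariant in this sketch.
    unsafe {
        (
            mem::transmute::<(), Marker>(()),
            mem::transmute::<[u64; 0], Marker>([]),
            mem::transmute::<PhantomData<String>, Marker>(PhantomData),
        )
    }
}
```

Whether doing this is sound for a type you did not define depends entirely on the invariants its author attached to it.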

scottmcm · December 18, 2019, 11:57pm

It's valid to synthesize them. zeroed works, or you can do other things like NonNull::dangling().as_ptr().read() or mem::transmute(()) or ...

It's not safe to synthesize them, as they might be used as tokens for a semaphore or similar.

EDIT: the reply after this one makes the good point that −∞-sized types appear to be 0-sized, and those are not valid to synthesize.
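A minimal sketch of two of the approaches listed above, assuming a hypothetical unit struct `Token` that is inhabited and has no invariants. The function names are made up for the example:

```rust
use std::mem;
use std::ptr::NonNull;

/// Hypothetical inhabited zero-sized type, for illustration only.
#[derive(Debug, PartialEq)]
struct Token;

/// Synthesizes a `Token` from the all-zero bit pattern (zero bytes long).
fn via_zeroed() -> Token {
    // SAFETY: `Token` is an inhabited ZST, so there are no bytes to get wrong.
    unsafe { mem::zeroed() }
}

/// Synthesizes a `Token` by reading through a dangling but aligned pointer.
fn via_dangling_read() -> Token {
    // SAFETY: reads of zero-sized types touch no memory; the pointer only
    // needs to be non-null and aligned, which `NonNull::dangling` guarantees.
    unsafe { NonNull::<Token>::dangling().as_ptr().read() }
}
```

Both functions hide the `unsafe` behind a safe signature only because `Token` is local to this sketch and has nothing to protect.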

mbrubeck · December 19, 2019, 12:25am

enum Void {} is zero-sized (according to size_of).
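For reference, a small sketch of that observation. The helper name `void_size` is only for the example:

```rust
use std::mem;

/// An uninhabited type: it has no variants, so no value of it can exist.
enum Void {}

/// Reports the size that `size_of` gives for `Void`.
fn void_size() -> usize {
    mem::size_of::<Void>()
}
```

So a size check alone cannot tell an ordinary ZST from an uninhabited one, and producing a `Void` by any of the tricks above would be undefined behavior.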

Manishearth · December 19, 2019, 1:29am

Yeah, I should have asked whether it's UB to operate on them, not whether it's unsafe. It should definitely be unsafe to deal with zero-sized tokens and such.

I don't have to deal with never-sized types here, since the crate I'm writing does have access to values of this type before it kicks off the series of events that lead to the value eventually being conjured. The only reason I can't pass that specific value down is that closures aren't serializable and this is cross-process.

bill_myers · December 19, 2019, 3:55pm

The whole approach in the mitosis crate is broken, because there is no guarantee that the closure code exists and is at the same address in the new process (e.g. it could be from a dynamically loaded library that could not be loaded or loaded at an arbitrary address, and the "ASLR mitigation" doesn't work unless you know which library the code is from).

What you should do is simply pass the closure to call in the child to the init function and not to the spawn function, and only let the user specify a serializable message at the spawn point.

If you really insist on putting the code at the spawn point, then define a macro and have it generate a C++ global constructor that registers a bare fn associated with a GUID with a central register, and send the GUID to the child process. It will still not work in the case of libraries that are not loaded, but it will fail gracefully rather than starting to execute code at an arbitrary address.

Manishearth · December 19, 2019, 4:29pm

Uh, that's why the actual pointer being shipped over is a function pointer that I control, not an arbitrary function pointer being created by the user. The user can only pass down zero sized closures and fn item types, which get wrapped in a monomorphized fn that I convert to a pointer. The ASLR mitigation is operating on pointers from the same library.

This is what I was trying to say in https://github.com/Manishearth/mitosis/issues/5

Passing the closure to init defeats the purpose of this crate, this crate needs to be able to call spawn multiple times for different closures.
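The wrapping described above could look roughly like this. It is a sketch under stated assumptions, not the mitosis crate's actual code: `erase` and this `run_func` are illustrative helpers, the `Copy` bound is an extra restriction chosen to keep the sketch simple, and the `i32` return type is only there so the result can be observed.

```rust
use std::mem;
use std::ptr::NonNull;

/// Monomorphized per closure type `F`; its address is what would be shipped.
/// Must only be reached through `erase`, which proves a value of `F` existed.
fn run_func<F: Fn() -> i32 + Copy>() -> i32 {
    assert_eq!(mem::size_of::<F>(), 0, "only zero-sized callables are supported");
    // SAFETY: `F` is zero-sized and `Copy`, and `erase` was handed a real
    // value of `F`, so conjuring another copy reads no memory and duplicates
    // nothing that has a destructor.
    let f: F = unsafe { NonNull::<F>::dangling().as_ptr().read() };
    f()
}

/// Turns a zero-sized closure or fn item into a plain function pointer.
fn erase<F: Fn() -> i32 + Copy>(_f: F) -> fn() -> i32 {
    assert_eq!(mem::size_of::<F>(), 0, "closure must not capture data");
    run_func::<F>
}
```

For example, `erase(|| 42)` yields an ordinary `fn() -> i32` that returns 42 when called, while a closure capturing a non-zero-sized value trips the size assertion at run time.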

mcy · December 19, 2019, 8:33pm

Separately, regarding your crate...

I wonder what it would take to make it possible to ship across a capturing, serializable (FnOnce() -> R), i.e., having a way to decompose a closure into its captures and its body, and to implement traits (like serialization) on the closure captures... I seem to recall a discussion about this on the order of 6-12 months ago??

bill_myers · December 19, 2019, 9:12pm

Sorry, I got confused a bit, but I think it may still be broken, because the run_func() code is going to be generated during monomorphization in the crate calling spawn, whereas init() is generated in the mitosis crate, and I think the two could end up in different dynamic libraries (unless Cargo is prevented from ever doing so, which may be the case already?).

A macro still seems better, since it lets you statically assert that the closure has no data by trying to convert it to a bare fn (passing a bare fn doesn't work, because then run_func is no longer monomorphized, and the bare fn could be in any library, which is what I was mistakenly thinking about).
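A minimal sketch of such a macro, assuming a hypothetical name `capture_free!` and a fixed `fn() -> i32` signature for simplicity. It relies on the fact that only closures without captures coerce to a bare fn pointer, and it hands back the original closure so the caller keeps the unique closure type for monomorphization:

```rust
/// Hypothetical macro: fails to compile if the closure captures anything.
macro_rules! capture_free {
    ($closure:expr) => {{
        let closure = $closure;
        // Compile-time check: this coercion is rejected for capturing closures.
        // A non-capturing closure is `Copy`, so `closure` stays usable below.
        let _check: fn() -> i32 = closure;
        closure
    }};
}
```

Something like `let x = 1; capture_free!(move || x)` is rejected by the compiler, whereas `capture_free!(|| -> i32 { 5 })` passes and still has its own zero-sized closure type.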

Manishearth · December 19, 2019, 9:30pm

Cargo doesn't do rust dylibs, so this won't be an issue.

comex · December 20, 2019, 12:00am

Pretty sure it does do Rust dylibs, if Cargo.toml contains crate-type = ["dylib"]. mitosis presumably doesn't have that, but I think it could still break if someone makes an intermediate crate that is a dylib, depends on mitosis, and re-exports its symbols.

Manishearth · December 20, 2019, 12:11am

It allows you to create dylibs, but not depend on them from rust code.

comex · December 20, 2019, 12:54am

No, you can depend on them, if they're dylib rather than cdylib. (I tested it.)

Manishearth · December 20, 2019, 3:46am

Huh. Might need to do something else then

Lokathor · December 20, 2019, 8:32pm

To answer your question directly:

If the ZST is Copy you can safely make as many of them out of thin air as you like.
If the ZST is not Copy then you must make them in whatever way the crate that designed that type allows you to make them (and only in that way).

mcy · December 20, 2019, 8:43pm

I don't think that's even true, in general. What if the semantic is that the ZST is a token that a one-time initialization has happened, and you can hand it out like candy to anything that needs to know about the initialization?

But this is about safety: regarding validity, I think that point is already addressed.
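To make the witness scenario concrete, here is a small sketch. The `runtime` module, the `Initialized` token, and `do_work` are all hypothetical names invented for the example:

```rust
mod runtime {
    use std::sync::atomic::{AtomicBool, Ordering};

    static READY: AtomicBool = AtomicBool::new(false);

    /// Zero-sized, `Copy` proof that `init` has run. The private field
    /// prevents safe code outside this module from constructing it.
    #[derive(Clone, Copy, Debug)]
    pub struct Initialized {
        _private: (),
    }

    /// Performs the one-time setup and hands out the witness.
    pub fn init() -> Initialized {
        READY.store(true, Ordering::SeqCst);
        Initialized { _private: () }
    }

    /// Relies on the witness: callers need not be re-checked at run time.
    pub fn do_work(_proof: Initialized) -> &'static str {
        // A forged token (e.g. from `mem::zeroed`) would be a *valid* value
        // but would break the invariant this function depends on.
        assert!(READY.load(Ordering::SeqCst), "witness was forged");
        "work done"
    }
}
```

Once `init()` has returned a token, copying it freely is fine. The problem is only with producing the first one out of thin air.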

RustyYato · December 20, 2019, 8:49pm

I think @Lokathor was assuming you already got access to one, in which case you can make as many Copy zero-sized values as you want.

Because of this,

matt1985 · December 20, 2019, 8:59pm

Just for the record: I have a zero-sized type which is Copy but is unsafe to construct, because it has a safety invariant that can't be upheld in the type system, namely that FieldPathSet<_, UniquePaths> represents a set of disjoint fields.

That said, if there is already a soundly constructed instance of the specific FieldPathSet<T, UniquePaths> type, it would be sound to construct another with std::mem::zeroed().
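A simplified sketch of that situation follows. The definitions below are stand-ins written for this example and are not the real library's API; only the names `FieldPathSet` and `UniquePaths` come from the post above.

```rust
use std::marker::PhantomData;
use std::mem;

/// Stand-in marker meaning "all paths in the set are disjoint".
struct UniquePaths;

/// Stand-in zero-sized set of field paths, described entirely by its types.
struct FieldPathSet<T, U> {
    _marker: PhantomData<(T, U)>,
}

// Manual impls so that `Copy` does not require `T: Copy` or `U: Copy`.
impl<T, U> Clone for FieldPathSet<T, U> {
    fn clone(&self) -> Self {
        *self
    }
}
impl<T, U> Copy for FieldPathSet<T, U> {}

impl<T, U> FieldPathSet<T, U> {
    /// # Safety
    /// The caller must guarantee the invariant implied by `U`
    /// (for `UniquePaths`: the paths in `T` are disjoint).
    unsafe fn new() -> Self {
        FieldPathSet { _marker: PhantomData }
    }
}

/// Given one soundly constructed instance, conjures another of the same type.
fn duplicate<T, U>(_existing: FieldPathSet<T, U>) -> FieldPathSet<T, U> {
    // SAFETY: the type is zero-sized, and `_existing` shows the invariant for
    // this exact `T` and `U` was already established by whoever built it.
    unsafe { mem::zeroed() }
}
```

The first instance has to go through the `unsafe` constructor. After that, `duplicate` is no more powerful than the `Copy` impl already is.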
