Is synthesizing zero-sized values safe?

We've deprecated mem::uninitialized, but I'm not sure what the recommended way to synthesize zero-sized types is.

If a type is known to be zero-sized (e.g. a Fn type with no capture) is it safe to initialize an instance of it with mem::zeroed or mem::uninitialized?

cc @RalfJung

I mean, semantically, I would hope not, especially if you're using a ZST as some kind of scoped witness type. I think it's about as bad as forging a raw pointer.

You probably need to qualify what kind of thing you want to pull out of the aether better than "any zero-sized type".

Of course, for arbitrary types, it's not necessarily sound, as arbitrary types can add arbitrary safety invariants (not validity).

Even without mem::zeroed/uninitialized, you can always mem::transmute any other ZST. The layouts of all ZSTs are necessarily the same.

It's valid to synthesize them. zeroed works, or you can do other things like NonNull::dangling().as_ptr().read() or mem::transmute(()) or ...
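A minimal sketch of the three approaches just listed, assuming a hypothetical `Marker` type that is zero-sized, inhabited, and carries no safety invariant of its own:

```rust
use std::mem;
use std::ptr::NonNull;

// Hypothetical ZST used only for illustration; it has no safety invariant.
#[derive(Debug, PartialEq)]
struct Marker;

/// Conjures a `Marker` in three different ways. Each is valid only because
/// `Marker` is zero-sized and inhabited.
fn conjure_three() -> [Marker; 3] {
    assert_eq!(mem::size_of::<Marker>(), 0);
    unsafe {
        // All-zero bytes (there are none) form a valid `Marker`.
        let a: Marker = mem::zeroed();
        // Reading a ZST through a dangling but well-aligned pointer.
        let b: Marker = NonNull::<Marker>::dangling().as_ptr().read();
        // `()` and `Marker` both have size 0, so the transmute is accepted.
        let c: Marker = mem::transmute::<(), Marker>(());
        [a, b, c]
    }
}
```

All three still need `unsafe`, which matches the validity versus safety distinction drawn here.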

It's not safe to synthesize them, as they might be used as tokens for a semaphore or similar.
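To illustrate the token concern, here is a sketch with a hypothetical `Pool` and `Permit` (neither comes from a real crate). The forged permit causes no undefined behavior by itself, but it breaks the invariant the module relies on:

```rust
use std::mem;

mod pool {
    /// Hypothetical token: holding one is meant to prove `acquire` succeeded.
    /// The private field prevents construction outside this module.
    pub struct Permit(());

    pub struct Pool {
        available: usize,
    }

    impl Pool {
        pub fn new(available: usize) -> Self {
            Pool { available }
        }

        /// Hands out a permit only while some are left.
        pub fn acquire(&mut self) -> Option<Permit> {
            if self.available == 0 {
                return None;
            }
            self.available -= 1;
            Some(Permit(()))
        }
    }
}

use pool::{Permit, Pool};

/// Returns whether the honest route succeeded, plus the size of `Permit`.
fn forge_permit() -> (bool, usize) {
    let mut pool = Pool::new(0);
    // The honest route fails: the pool has no permits to give.
    let honest = pool.acquire().is_some();
    // The forged route "succeeds" anyway, bypassing the pool entirely.
    let _forged: Permit = unsafe { mem::zeroed() };
    (honest, mem::size_of::<Permit>())
}
```

Any `unsafe` code elsewhere that trusts a `Permit` as proof of acquisition would now be relying on a false premise.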

EDIT: the reply after this one makes the good point that −∞-sized types appear to be 0-sized, and those are not valid to synthesize.

enum Void {} is zero-sized (according to size_of).
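A quick check of that claim, using a local `Void` definition:

```rust
use std::mem;

// Uninhabited: no value of this type can ever exist.
#[allow(dead_code)]
enum Void {}

/// Reports the size the compiler assigns to the uninhabited type.
fn void_size() -> usize {
    mem::size_of::<Void>()
}
```

So a `size_of::<T>() == 0` check alone cannot tell an inhabited ZST from an uninhabited type. Conjuring a `Void` with `mem::zeroed` would be undefined behavior, because no valid value of that type exists.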

Yeah, I should have asked whether it's UB to operate on them, not whether it's unsafe. It should definitely be unsafe to deal with zero-sized tokens and such.

I don't have to deal with never-sized types here, since the crate I'm writing does have access to values of this type before it kicks off the series of events that lead to the value eventually being conjured. The only reason I can't pass that specific value down is that closures aren't serializable and this is cross-process.

The whole approach in the mitosis crate is broken, because there is no guarantee that the closure code exists and is at the same address in the new process (e.g. it could be from a dynamically loaded library that could not be loaded or loaded at an arbitrary address, and the "ASLR mitigation" doesn't work unless you know which library the code is from).

What you should do is simply pass the closure to call in the child to the init function and not to the spawn function, and only let the user specify a serializable message at the spawn point.

If you really insist on putting the code at the spawn point, then define a macro and have it generate a C++ global constructor that registers a bare fn associated with a GUID with a central register, and send the GUID to the child process. It will still not work in the case of libraries that are not loaded, but it will fail gracefully rather than starting to execute code at an arbitrary address.

Uh, that's why the actual pointer being shipped over is a function pointer that I control, not an arbitrary function pointer being created by the user. The user can only pass down zero sized closures and fn item types, which get wrapped in a monomorphized fn that I convert to a pointer. The ASLR mitigation is operating on pointers from the same library.
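A rough sketch of that wrapping scheme, not the crate's actual code. The name `run_func` appears later in this thread; `erase`, `answer`, and the `u32` return type are assumptions made up for the example:

```rust
use std::mem;

/// Monomorphized once per `F`: conjures the zero-sized callable and runs it.
fn run_func<F: Fn() -> u32>() -> u32 {
    assert_eq!(mem::size_of::<F>(), 0, "only zero-sized callables are supported");
    // Assumption: a value of `F` existed at the spawn site, so `F` is inhabited.
    let f: F = unsafe { mem::zeroed() };
    f()
}

/// Takes the callable by value only to infer `F`, then erases it to a plain
/// function pointer that the library controls.
fn erase<F: Fn() -> u32>(f: F) -> fn() -> u32 {
    assert_eq!(mem::size_of::<F>(), 0, "only zero-sized callables are supported");
    // The value itself carries no data; only its type matters from here on.
    mem::forget(f);
    run_func::<F>
}

/// A plain fn item, which is also zero-sized.
fn answer() -> u32 {
    42
}
```

For example, `erase(answer)()` and `erase(|| 7)()` both go through `run_func`. Note that a closure capturing only zero-sized values is itself zero-sized, so the size check does not rule out captured tokens or guards.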

This is what I was trying to say in https://github.com/Manishearth/mitosis/issues/5

Passing the closure to init defeats the purpose of this crate, this crate needs to be able to call spawn multiple times for different closures.

Separately, regarding your crate...

I wonder what it would take to make it possible to ship across a capturing, serializable closure (FnOnce() -> R), i.e., having a way to decompose a closure into its captures and its body, and to implement traits (like serialization) on the closure captures... I seem to recall a discussion about this on the order of 6-12 months ago??

Sorry, I got confused a bit, but I think it may still potentially be broken, because the run_func() code is going to be generated during monomorphization in the crate calling spawn, whereas init() is generated in the mitosis crate, and I think that could end up in a different dynamic library (unless Cargo is prevented from ever doing so, which may be the case already?).

A macro still seems better, since it makes it possible to statically assert that the closure has no data by trying to convert it to a bare fn (passing a bare fn doesn't work, because then run_func is no longer monomorphized, and the bare fn could be in any library, which is what I was mistakenly thinking about).
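The static check described here can be sketched with a hypothetical `no_capture!` macro. It shows only the compile-time assertion, not the monomorphized wrapper, and the `u32` return type is an assumption for the example:

```rust
/// Hypothetical macro: accepts only expressions that coerce to `fn() -> u32`.
/// For closures, that means closures that capture nothing.
macro_rules! no_capture {
    ($f:expr) => {{
        // A capturing closure fails to compile here with a type mismatch.
        let f: fn() -> u32 = $f;
        f
    }};
}

fn example() -> u32 {
    let f = no_capture!(|| 1 + 2);
    f()
}
```

Something like `let x = 1; no_capture!(move || x)` is rejected at compile time, because only non-capturing closures coerce to function pointers.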

Cargo doesn't do rust dylibs, so this won't be an issue.

Pretty sure it does do Rust dylibs, if Cargo.toml contains crate-type = ["dylib"]. mitosis presumably doesn't have that, but I think it could still break if someone makes an intermediate crate that is a dylib, depends on mitosis, and re-exports its symbols.

It allows you to create dylibs, but not depend on them from rust code.

No, you can depend on them, if they're dylib rather than cdylib. (I tested it.)

Huh. Might need to do something else then

To answer your question directly:

  • If the ZST is Copy you can safely make as many of them out of thin air as you like.
  • If the ZST is not Copy then you must make them in whatever way the crate that designed that type allows you to make them (and only in that way).

I don't think that's even true, in general. What if the semantics are that the ZST is a token proving that a one-time initialization has happened, and you can hand it out like candy to anything that needs to know about the initialization?

But this is about safety: regarding validity, I think that point is already addressed.

I think @Lokathor was assuming you already got access to one, in which case you can make as many Copy zero-sized values as you want.

Because of this,

For the record, I have a zero-sized type which is Copy but is unsafe to construct, because it has a safety invariant that can't be upheld in the type system, which is that the FieldPathSet<_,UniquePaths> represents a set of disjoint fields.

That said, if there is already a soundly constructed instance of the specific FieldPathSet<T,UniquePaths> type, it would be sound to construct with std::mem::zeroed().
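A sketch of that pattern with a hypothetical `UniqueSet<T>` standing in for FieldPathSet<T,UniquePaths>; the real type's API is not shown in this thread and may differ:

```rust
use std::marker::PhantomData;
use std::mem;

/// Hypothetical `Copy` ZST whose invariant (the paths described by `T` are
/// disjoint) cannot be checked by the type system.
#[derive(Clone, Copy)]
struct UniqueSet<T>(PhantomData<T>);

impl<T> UniqueSet<T> {
    /// # Safety
    /// The caller must guarantee that the paths described by `T` are disjoint.
    unsafe fn new_unchecked() -> Self {
        UniqueSet(PhantomData)
    }
}

/// Given one soundly constructed instance as proof, conjuring another of the
/// exact same type adds no new claim, so this can be a safe function.
fn duplicate<T>(_proof: &UniqueSet<T>) -> UniqueSet<T> {
    // The type is zero-sized and inhabited, and the invariant is already
    // vouched for by whoever created `_proof`.
    unsafe { mem::zeroed() }
}
```

The first instance still has to come from `unsafe` code that actually checked the invariant. Being `Copy` only means existing instances can be duplicated freely.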
