Synchronized FFI access to POSIX environment variable functions

How about cribbing off of POSIX and call them std::env::set_env and std::env::unset_env?

1 Like

If they are already in std::env why not just set and unset?

1 Like

set_ffi_env and unset_ffi_env?

1 Like

:+1: for set and unset.

1 Like

Sounds decent to me. Idiomatic usage would presumably use std::env; and call env::set/env::unset.

  • std::env::set & std::env::unset
  • std::env::set_env & std::env::unset_env
  • something else

0 voters

1 Like

Question: If in the future POSIX has safe get/set functions, would Rust's stdlib be unable to safely use them because we guarantee a shadow environment?

I think Rust should document the effects of what happens. For example, something like (pay not attention to the actual function names used here, the jist of the doc is more important):

mod env {

// Safe functions. Guaranteed to be visible to:
// - other code using these APIs
// - subprocesses executed with `std::process::Command`
pub fn var_set(var: &OsStr, value: Option<&OsStr>);
pub fn var_get(var: &OsStr) -> Option<&OsStr>;
pub fn var_snapshot() -> Environment; // a snapshot of the environment for iteration

// Unsafe functions. Guaranteed to be visible to:
// - other code using available platform environment access (limited to platform restrictions such as different C runtime usage within the same process)
// - other code using these APIs
fn platform_var_get(var: &OsStr) -> Option<&OsStr>;
// Long list of reasons this can go badly.
unsafe fn platform_var_set(var: &OsStr, value: Option<&OsStr>) -> EnvironmentPlatformResult;

// APIs to sync between the above…

}
5 Likes

The problem with set/unset is that std::env is not just about environment variables.

This module contains functions to inspect various aspects such as environment variables, process arguments, the current directory, and various other important directories.

3 Likes

I don't think we should offer a Rust-private environment at all.

7 Likes

That's fine. My point is more that the docs should be about the effects the APIs have. Whether they use the C library directly, use a shadow environment, batch updates via some RCU-like mechanism, poke around for the right C runtime to use for the caller's library, or using SIGSTOP on all other threads to guarantee it being race free are all valid (not saying that these are good ideas) implementations as long as the documented effects occur. If we don't want to provide some of these APIs, the docs for what are provided should be accurate, but also not overly constraining in their guarantees.

Unfortunately, this doesn't help with compiler reordering (though that is admittedly rare).

From a portability perspective, I suppose a libc would theoretically be in a position to do this. Normally you would have to worry that the main executable, or another library, might overwrite this segfault handler with its own handler. But a libc could intercept signal and sigaction calls and wrap the user's handler with its own.

However... this would require a ton of added complexity, given that the libc implementations I've seen (on Linux and macOS) do not currently wrap signal handlers in any way, instead just passing the user's function pointer directly to the kernel. And there are probably some weird binaries that, while linking libc, also call syscalls directly, which would still be able to stomp on libc's signal handler. All in all, I suspect this is too big of a hack to be adopted by common libcs, let alone universally adopted.

(Though I do wish there was a portable API for registering segfault handlers for specific memory and/or code regions, for unrelated use cases!)

Anyway, zackw's suggestion to make writing to environ UB when multithreaded is probably more viable. In theory, a libc could harden this by having environ point to a page that gets remapped to PROT_NONE as soon as multiple threads are started, so that instead of UB you get a guaranteed crash. Though I have no idea whether this would be considered worth the cost of an extra page of memory and an extra syscall.

Or perhaps there could be some sort of deprecation and/or hardening mechanism where "newly built" programs do not get access to environ at all. But I don't think there's much precedent for libcs distinguishing "new" programs from "old" in order to make source-breaking changes to the new ones – except for major, unavoidable breaking changes like changing off_t and time_t to 64 bits, where you had flags like -D_FILE_OFFSET_BITS=64 that projects had to manually add to their build systems (very intrusive). My knowledge of the history here is somewhat limited though.

Glibc heavily uses symbol versions for exactly this purpose: How the GNU C Library handles backward compatibility | Red Hat Developer Musl on the other hand doesn't use them and needed to rename the symbols for

I believe.

because of reset_threads_spawned, you may have an issue. If you don't otherwise synchronize-with a thread that reads the environment and then terminates, and then set the environment via try_set_var, that's a data race.

That's why reset_threads_spawned is unsafe. The caller has to ensure that no other threads are running before calling it. Idealy you do any actual environment manipulation before you spawn any threads, but if you really have no other choice (or the platform supports quering if the program has multiple threads), we provide this function as an escape hatch.

There can be no other threads running, but unless you synchronize the read, so that it actually happens-before the write... Welcome to the C++ Memory Model that Rust incorporated.

If a thread was spawned before the read the C++ memory model will guarantee that threads_spawned() will return true on the same thread. On any other thread there is a memory fence between the write of the global and the start of the new thread.

As for all other threads exiting before you call reset_threads_spawned, there has to be synchronization anyway to know that all other threads have exited in the first place, so that shouldn't cause an issue for a weak memory model either.

Checking if a thread has joined has to incorporate either a load with acquire ordering or some kind of barrier, ensuring that no writes can be reordered before this point. (as any code after the join shouldn't need synchronization anymore. It is only safe to call reset_threads_spawned if you ensure no other threads are running, so that's all the synchronization that's required.

The only reason an atomic is used is because threads might be spawned from other threads, so simultaneous writes to it might occur.

I wrote up a draft spec for C library changes to allow safe muitithreaded access to the environment. I'll turn it into a proper blog post and invite feedback from a wider community in about a week, but I thought I would circulate it here first. Best way to comment is probably email (zack at owlfolio dot org).

https://research.owlfolio.org/scratchpad/threadsafe-env-v0.md

@josh @comex @cuviper @jhpratt @bill_myers @RalfJung

11 Likes

I don't see how we can deprecate set_var without a replacement based on a Rust-private environment. What alternative do you propose we offer to existing code that uses set_var, only cares about its effects inside Rust, and has no way of ensuring that no other thread runs? It is already bad enough that we cannot offer an alternative to code that does care about the outside-Rust effects, but here our hands are bound -- C does not support that case either. But anecdotally a lot of Rust code that mutates the environment is fine with this only affecting other Rust code (and processes spawned from Rust code), so I think it is important that we offer a reasonable migration path to that code.

This is not really something that a user crate can offer since the crate setting the variable and the crate reading it (or spawning a process) might not be the same one.

If we don't offer such a migration path, my concern is that many crates will throw up their hands and just call the unsafe function with a comment explaining that yes there might be other threads but what are they going to do about it? There is no alternative and they have to silence the deprecation warning somehow. (Or they just allow(deprecated) which is no better.)

3 Likes

[quote="RalfJung, post:179, topic:15475"] I don't see how we can deprecate set_var without a replacement based on a Rust-private environment. [/quote]

My suggestion would be along the lines of

  • the existing set_var becomes an unsafe function
  • there is a new API which wraps the new C-library API I wrote up, and which is only available on platforms that support it

This accurately reflects the reality that environment manipulation is (thread-)unsafe on platforms that don't have the new API.

Since std::process::Command has a comprehensive set of methods for tinkering with the environment for individual subprocesses, I am inclined to think that code that only cares about the effect of set_var inside Rust is vanishingly rare and does not need to be worried about.

1 Like