Synchronized FFI access to POSIX environment variable functions

OK I tried to collate the discussion options so far...does this seem accurate?

Rust: setenv and threads options

try_setenv: Add a new API which errors if threaded

See try-setenv-rs. Basically we introduce pub fn try_set_env_var<K: AsRef<OsStr>, V: AsRef<OsStr>>(key: K, value: V) -> Result<(), Error>.

In this path we could choose to deprecate set_env (but only on Unix?), or not.

Pros:

  • Does not break existing working code
  • The dynamic checking makes it very likely that any race conditions would be detected at runtime

Cons:

  • Is known to fail today for apps using e.g. [tokio::main] which spawn threads; these apps would need to stop using the macro, or async runtimes like tokio would need to be refactored to only spawn threads when a future is run.
  • A lot of potential churn in the ecosystem to change to try_set_env().unwrap().

Change POSIX.1 standard to have internal locking and add getenv_r

First, see this comment from zackw.

(And get patches into glibc, musl, bionic, etc.)

Pros:

  • Fixes the problem for all languages, including plain old C

Cons:

  • See linked issue above that notes that C programs can already access environ and the 3rd argument to main directly outside of an API
  • We would have to wait a long time before this propagates far enough that we could rely on it
  • Some people on standards body already opposed this and took the "just don't setenv() with threads" stance.
  • Only fully mitigates the problem if everything in the process uses getenv_r.

Only operate on Rust-specific copy of the environment

This is what Go does: https://cs.opensource.google/go/go/+/master:src/syscall/env_unix.go;l=39?q=Setenv&ss=go%2Fgo

Pros:

  • Does not break existing working code.
  • Can be implemented without special platform capabilities (such as "check if there are other threads running") or introducing new requirements for the Rust runtime as a whole (such as informing it when C code spawns a thread).
  • Gives the ecosystem the opportunity to slowly migrate each use of set_env to one of the two new functions.
  • If the two new operations are not sufficient for some code, we will hear about that and can consider adding more options.

Cons:

  • Will actively break any code that is today calling set_env() but directly invoking e.g. fork()+execve() and relying on set_env() to propagate into the child process. For example, a container runtime in Rust might use set_var() but directly call Linux clone3() etc. These would all need to be fixed to call e.g. libc::setenv().
  • A lot of churn since every use of set_env needs to be changed. But honestly I don't think this can be avoided, those all need to be audited.
  • Having such a shadow environment could cause confusion. But Rust is not alone here, and the shadow-env-mutating variant of set_env could have a suitable name to make this clear ( set_env_rust , for example, with set_env_process for the unsafe one).
  • Provides no safe way to mutate the process environment. But I am not sure if we have consensus that this is even possible on all POSIX platforms, so possibly it makes sense to delegate a safe try_set_env_process to a user crate for those platforms where checking the existence of threads is possible. [/quote]

Note that because Go has a runtime, it already doesn't work to call fork() directly. But Rust is a systems programming language and should support it.

Mark get_var/set_var unsafe and just have everyone try to deal with it

Pros:

  • Raises visibility of the problem to all the crate authors who aren't reading this thread
  • Maybe some crates figure out how to avoid setenv

Cons:

  • This doesn't really help knowing that we've fixed the problem
5 Likes