Pre-RFC: deprecate env::set_var

eefriedman · October 26, 2015, 3:07am

env::set_var seems extremely dangerous: it might technically be “safe” in some restrictive sense, but it isn’t really safe in practice if you’re dealing with non-Rust code… there isn’t any reasonable way to ensure that environment-dependent functions from the C library aren’t called at the same time as set_var. (Even the standard library itself screws this up: it calls execvp without taking the internal lock.)

See https://doc.rust-lang.org/nightly/std/env/fn.set_var.html for more details on why set_var is unsafe.

Given that there isn’t really any good reason to modify the environment, I think env::set_var should just be deprecated. Is there something I’m missing?

bluss · October 26, 2015, 11:41am

Good proposal, but to be strictly safe it needs to be more than deprecated(*). glibc, which Rust uses on linux, has the attitude that setenv must not be called in a multithreaded program (Rust issue link).

(*) set_var could be changed to do nothing, or it could panic if called in a multithreaded program.

larsberg · October 26, 2015, 12:13pm

Surveying Servo and all of its dependencies, set_env is used for the following things:

Build scripts, to set up vars for subsequent tools
Unit tests, to change config params in libraries (e.g., timezone info)
Turn on RUST_LOG (is there an API for this?)

tomaka · October 26, 2015, 12:30pm

I have a std::env::set_var("RUST_BACKTRACE", "1"); at the start of all my programs :-/

alexcrichton · October 26, 2015, 5:20pm

I don't think it would be that tenable to deprecate without replacement, it's an operation expected of many standard libraries (especially when interoperating with other code as you've mentioned). We should of course attempt to make it as safe as possible and fixup any holes, but there are just inherent caveats with environment variables which doesn't necessarily mean that Rust shouldn't expose it.

This sounds like a bug! Not necessarily a reason to remove the function outright...

bluss · October 26, 2015, 6:35pm

Modifying the environment seems to be fundamentally incompatible with multithreading, though, we can’t just will this to be safe.

cuviper · October 26, 2015, 7:37pm

Whatever is decided, env::remove_var should follow suit since unsetenv is equally unsafe.

That's in fn child_after_fork(), which is a rather special situation, no?

eefriedman · October 26, 2015, 7:55pm

Well, it's special in the sense that the access to the environment doesn't happen in the original process... but that doesn't really help in terms of thread safety.

Okay... maybe we can come up with some sort of proposal to deprecate it, and replace it with a function marked unsafe?

cuviper · October 26, 2015, 7:59pm

What threads? This is a newly forked process that's about to exec -- there's nothing else going on.

cuviper · October 26, 2015, 8:14pm

Also, any mutex which happened to be locked by any thread when the fork was called, will remain locked forever in the new process thanks to COW (unless it was in shared memory). So if you wanted ENV_LOCK just in case, you’d have to lock it across the whole fork. But I think it’s totally unnecessary.

bluss · October 26, 2015, 8:15pm

It sounds like the env lock can cause a deadlock on fork + set_var, unless it’s set up to unlock itself on fork (pthread_atfork).

eefriedman · October 26, 2015, 8:42pm

There's a newly forked process that's about to exec... and that process's environment is corrupted because it was forked in the middle of a setenv call.

Right, we need to lock it across the fork. (The new process will never try to access ENV_LOCK, so it doesn't matter what state it's in when the process is forked.)

cuviper · October 26, 2015, 8:49pm

Ok, point taken. The exec is sort of an afterthought to the problem, that fork may snapshot the environment during another thread's modification.

alexcrichton · October 26, 2015, 10:00pm

This has been discussed before and I personally haven't changed opinions since then. These functions are safe to call from Rust, and they're documented as being a hazard if you're mixing Rust and C, and I think that's a pretty ok compromise to make. I don't think having an unsafe set_var is to palatable and I don't think it would prevent anything like this deal with set_var + fork.

eefriedman · October 27, 2015, 12:35am

Sort of... the problem is that being "safe" is infectious; everything marked safe has to be safe to use together. If I'm publishing a crate, I want to be able to say "my crate plus the standard library is safe: you can't break the safety rules without an unsafe block", and we want to ensure that this is reasonable.

For example, suppose I'm the author of the time crate, and I want it to be safe combined with the standard library. https://doc.rust-lang.org/time/time/fn.at.html uses localtime_r to convert a time to the local timezone. I have to mark time::at unsafe; otherwise, you can write "safe" programs which segfautt.

Or suppose I want to write a safe crate which wraps GTK; the initialization function has to be unsafe because the user has to promise they won't call set_var at the same time.

Basically, I think making set_var safe draws the line for safety at the wrong place; being able to write safe wrappers for C code seems more valuable than being able to modify the C environment in safe code.

(Another possibility to get around this mess would be to build a copy of the environment in the Rust standard library: we could have a set_var function which is threadsafe, but doesn't modify the C environment. Probably more confusing than useful, though.)

Oh, and error_string in the standard library also needs to lock the environment. The number of places we screw this up in the standard library is a good indication that other libraries will run into trouble, especially given that it's impossible to fix code outside the standard library to take the Rust environment lock.

RalfJung · October 27, 2015, 9:26am

Given that there isn't really any good reason to modify the environment,

Eh, wait, that sounds wrong to me. When interoperating with foreign libraries, or executables (that are fork'ed and exec'ed), it is sometimes necessary to be able to modify the environment. At least on unixoid systems, the environment is a crucial part of the interface provided by the operating system. It's API may be very badly designed, but still Rust has to expose it. So, marking set_env unsafe would be fine, but deprecation without replacement would not.

arielb1 · October 27, 2015, 10:45am

You can use execve to create an executable with a new environment.

RalfJung · October 27, 2015, 11:35am

Ah, good point. That doesn’t help for shared libraries, though.

alexcrichton · October 27, 2015, 4:48pm

@eefriedman

My feeling is that saying that set_var is unsafe isn’t necessarily the right line either, however. We can just include as part of the “unsafe contract” that you carefully touch environment variables, but in terms of foreign code there’s no reason that set_var is more unsafe than var (as foreign code could be both getting and setting environment variables).

I’d be fine exposing a way to lock the environment manually, it seems like a reasonable thing that may be required to expose robust bindings in some cases.

eefriedman · October 30, 2015, 2:07am

Hmm... what would that look like? Maybe something like this?

/// Lock the C environment.
///
/// Safe Rust code cannot write to the C environment for the lifetime of
/// the returned EnvLockReadGuard.
///
/// Multiple EnvLockReadGuard instances can be active at the same time.
///
/// Any code which calls a function which might read the C environment
/// (including a lot of functions in the C standard library which might
/// not be obvious) should hold this guard while calling those
/// functions.
///
/// The guard can be held for an indefinite period of time, which might
/// be necessary for complicated libraries (like GUI libraries) which
/// spawn threads.
fn lock_environment() -> EnvLockReadGuard;

/// Guard the environment from writes.
///
/// EnvLockReadGuard meets the bounds `'static + Send`.
struct EnvLockReadGuard(...);

EnvGuard needs to be something like a RwLockReadGuard: if, for example, someone has a program which uses both some GTK and the time crate, both crates can correctly lock the environment without causing a deadlock. Not sure if Send is necessary; it would be convenient in some situations, but probably painful to implement.

Not sure if the corresponding write API is necessary; it's probably rare to run into C libraries which unavoidably mutate the environment.

If EnvGuard doesn't correspond to a real lock on WIndows, there might be subtle portability issues, but there's probably no good way around that.

Topic		Replies	Views
Function to hold lock on execution environment (for FFI) libs	26	2786	April 4, 2021
Synchronized FFI access to POSIX environment variable functions Unsafe Code Guidelines	284	17554	December 22, 2024
Thread-safe environment variable mutation	2	1364	February 20, 2023
Why should std::env::var panic? libs	35	3613	September 28, 2021
PSA: RUST_LOG for rustc got renamed to RUSTC_LOG compiler	5	817	August 11, 2019

Pre-RFC: deprecate env::set_var

Related topics