Lack of API mutating args at std::process::Command

cehteh · June 22, 2021, 4:34pm

std::process::Command has no way to mutate the supplied arguments when one wants to execute a process again. Sometimes (especially in test suites) this is required.

The other attributes (environment variables, current_dir) have APIs for mutating them.

Therefore I'd like to propose/discuss what could be done for the command's args.

Traditionally, program arguments are an array of strings in which argv[0] is slightly special: it (usually) contains the program name. It can still be changed, but it is not part of the arguments handled by the argument parser. std::process::Command already initializes it for you.

As a bare minimum for mutating argv, I'd propose:

pub fn args_reset()

Shall reset the argument vector to the state it was in when Command::new() created it. That is: setting its length to 1, keeping argv[0] intact.

Further APIs (to be discussed) may be:

pub fn arg_set(index: usize, value: &str) -> Result<(), Error>

Replaces the argument at the given index and returns an error when the index is out of range. This can be used to change argv[0].

pub fn arg_pop() -> Result<(), Error>
// and/or
pub fn args_pop(n: usize) -> Result<(), Error>

Drops the last element / the last n elements and returns an Error when fewer elements are available. (Discussion: shall argv[0] be preserved?)

pub fn args_new<I, S>(&mut self, args: I) -> &mut Command where
    I: IntoIterator<Item = S>,
    S: AsRef<OsStr>,

Sets a new list of arguments, like args_reset() followed by args(). Leaves argv[0] untouched.
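
The operations proposed above can be approximated today outside of std. The following is a minimal sketch under stated assumptions: `ReusableCommand` and all of its methods are hypothetical names, not an existing std API. It keeps the program and the arguments in its own collections and builds a fresh std::process::Command on demand. Unlike the proposal, this sketch leaves argv[0] fixed and reports an out-of-range index with `None` instead of an error type.

```rust
use std::ffi::{OsStr, OsString};
use std::process::Command;

/// Hypothetical helper (not part of std): stores the program and its
/// arguments so the arguments can be edited between runs.
pub struct ReusableCommand {
    program: OsString,
    args: Vec<OsString>, // argv[1..]; argv[0] is derived from `program`
}

impl ReusableCommand {
    pub fn new<S: AsRef<OsStr>>(program: S) -> Self {
        Self {
            program: program.as_ref().to_os_string(),
            args: Vec::new(),
        }
    }

    /// Appends one argument, mirroring `Command::arg`.
    pub fn arg<S: AsRef<OsStr>>(&mut self, arg: S) -> &mut Self {
        self.args.push(arg.as_ref().to_os_string());
        self
    }

    /// Counterpart of the proposed `args_reset()`: drops everything after argv[0].
    pub fn args_reset(&mut self) -> &mut Self {
        self.args.clear();
        self
    }

    /// Counterpart of the proposed `arg_set()`. Indices count like argv, so
    /// index 1 is the first argument. Index 0 and out-of-range indices are
    /// rejected with `None` in this sketch.
    pub fn arg_set<S: AsRef<OsStr>>(&mut self, index: usize, value: S) -> Option<&mut Self> {
        let slot = self.args.get_mut(index.checked_sub(1)?)?;
        *slot = value.as_ref().to_os_string();
        Some(self)
    }

    /// Builds a fresh `std::process::Command` from the current state.
    pub fn build(&self) -> Command {
        let mut cmd = Command::new(&self.program);
        cmd.args(&self.args);
        cmd
    }
}
```

A test suite could call `arg_set(1, file)` followed by `build().status()` once per input file.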

bjorn3 · June 22, 2021, 4:40pm

Can you recreate the Command from scratch?
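
One way to read this suggestion, as a rough sketch: put the shared setup into a small factory function and build a new Command per run. The names `base_command` and `commands_for`, the program "dosomething", and the flag and variable used here are placeholders for illustration only.

```rust
use std::process::Command;

/// Builds the part of the command that stays the same for every run.
/// "dosomething", "--verbose" and LC_ALL are placeholder values.
fn base_command() -> Command {
    let mut cmd = Command::new("dosomething");
    cmd.arg("--verbose").env("LC_ALL", "C");
    cmd
}

/// Returns one freshly built command per file name.
fn commands_for(files: &[&str]) -> Vec<Command> {
    files
        .iter()
        .map(|file| {
            let mut cmd = base_command();
            cmd.arg(file);
            cmd
        })
        .collect()
}
```

Each returned Command can then be spawned independently, at the cost of repeating the setup for every run.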

bascule · June 22, 2021, 7:07pm

It seems like if std::process::Command impl'd Clone, you could clone it just prior to adding the arguments, and then clone it again if you want to change them. Unfortunately it doesn't.
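
In current Rust, a partial copy can be assembled by hand from the getters on Command (get_program, get_args, get_envs, get_current_dir). The sketch below is only an approximation of Clone: `clone_command` is a hypothetical helper, and it does not carry over Stdio settings or a previous env_clear().

```rust
use std::process::Command;

/// Copies the parts of `src` that are reachable through getters.
/// Stdio configuration and a previous `env_clear()` are NOT copied.
fn clone_command(src: &Command) -> Command {
    let mut dst = Command::new(src.get_program());
    dst.args(src.get_args());
    for (key, value) in src.get_envs() {
        match value {
            // Variable explicitly set on the source command.
            Some(value) => dst.env(key, value),
            // Variable explicitly removed on the source command.
            None => dst.env_remove(key),
        };
    }
    if let Some(dir) = src.get_current_dir() {
        dst.current_dir(dir);
    }
    dst
}
```

Calling it just before the varying arguments are added gives a reusable template, which is roughly the workflow described above.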

cehteh · June 22, 2021, 7:56pm

Sure, the Command can be recreated from scratch; there are workarounds for this issue. Implementing 'Clone' would do the job too.

The point here is that 'Command' is already a builder and can be used for multiple instances with one of the execution-triggering methods. The other attributes (oops, as I am writing this I haven't checked for Stdio) do have APIs to refine them before firing off another execution; only 'args' does not. I'd call this an inconsistency in the API.

IMO it would be valuable to add this. To my understanding it won't cost much and won't break anything; it just makes things more consistent and easier to use.

kornel · June 22, 2021, 10:43pm

clear_args() seems like the simplest thing that could work. More than that goes onto a slippery slope of "why not <insert any random Vec method> too?"

cehteh · June 23, 2021, 12:22pm

That's why I put the other APIs up for discussion. A 'reset' function is the bare minimum required (I'd suggest 'reset' instead of 'clear', because it should preserve argv[0]).

Additionally, I'd opt for the 'set' function that replaces an existing component by index (dropping or returning the old one), because it is common to prepare a command once and then run it over different filenames, for example.

kornel · June 23, 2021, 4:57pm

argv[0] is specified using Command::new(), and can't be set using .arg(), so I think it's fine to present an API as if the "args" meant argv[1..] only, so clear() clears [1..].
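
The getters on Command in current Rust follow the same split: get_program() returns what was passed to Command::new(), and get_args() yields only the arguments added through arg()/args(). A small illustration, where `split_argv` is a hypothetical helper name:

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Returns the program and the explicitly added arguments separately.
/// The program is not included in the argument list.
fn split_argv(cmd: &Command) -> (&OsStr, Vec<&OsStr>) {
    (cmd.get_program(), cmd.get_args().collect())
}
```

For a command built as Command::new("ls") with arg("-l"), the first element is "ls" and the vector contains only "-l".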

Set by index would raise the question of why not implement Index and command[n] = arg. And if there's set, why not remove, and if there's remove, why not insert, retain, and so on.

cehteh · June 23, 2021, 6:19pm

The answer to that question is that 'args' is not implemented as a single Vec, nor are all the methods Vec offers needed in this case.

My aim is a usable, ergonomic API. Take a look at 'env', which has env_clear() and env_remove().

Having an 'args_clear()' that clears [1..] is OK. Sometimes one wants to change argv[0], but that is unrelated to this topic; if the need arises, it can be discussed again under another topic.

Looking at env, values can be updated with the env() and envs() methods, since that is an associative store. For args it is very common to want to update an argument at a certain position. I think this really deserves an API, not for efficiency (which is irrelevant when we are talking about spawning processes), but for ergonomic reasons, with something like:

let mut command = Command::new("dosomething");
command.arg("FILE");
// arg_set() is the API proposed in this thread; it does not exist in std.
for file in &somefiles { command.arg_set(1, file)?.spawn()?; }

There are certain use cases where one wants to remove or insert more arguments, but supporting that may only fatten the API; in that case it is OK to clear and rebuild the argv from scratch.

cehteh · June 23, 2021, 6:31pm

Just an observation:

arg_set() should return a Result<Command, Error>

Which surprises me now, since the env() functions don't return a Result but a Command. Yet there are certain ways in which setting an environment variable could fail (illegal characters in its name). Is this intentional or a defect?

chrisd · June 23, 2021, 6:40pm

Any errors will be returned when you call spawn, etc.

I assume they aren't tested any earlier because it would have made the builder less ergonomic back before we had the ? operator.

cehteh · June 23, 2021, 6:57pm

Actually, errors are not handled in that case (testing on Linux here):

fn main() {
    use std::process::Command;

    let out = Command::new("bash")
        .env("FOO=BAR", "BAZ")
        .args(["-c", "/bin/echo $FOO"])
        .output()
        .unwrap()
        .stdout;
    println!("Surprise: {}", String::from_utf8(out).unwrap());
}

This results in "BAR=BAZ", as if one did .env("FOO", "BAR=BAZ"), which isn't unexpected when you know what the underlying representation in libc looks like.
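
The effect can be reproduced without spawning anything. On Unix-like systems the environment is conventionally a list of KEY=VALUE strings, and a reader splits each entry at the first '='. The two helper functions below are hypothetical and only model that convention; they are not how std is implemented.

```rust
/// Joins key and value the way a conventional environment block stores them.
fn join_env_entry(key: &str, value: &str) -> String {
    format!("{key}={value}")
}

/// Splits a KEY=VALUE entry at the first '=', as a reader of the
/// environment block would. Returns `None` if there is no '='.
fn split_env_entry(entry: &str) -> Option<(&str, &str)> {
    entry.split_once('=')
}
```

Joining ("FOO=BAR", "BAZ") and joining ("FOO", "BAR=BAZ") produce the same string, "FOO=BAR=BAZ", so the child process cannot tell the two apart.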

chrisd · June 23, 2021, 7:02pm

Ah, interesting! I assumed it would do some testing when building the variables but only a null check is done.

cehteh · June 23, 2021, 7:12pm

Now I am puzzled about how to fix that. Other OSes would certainly put other restrictions on the characters an env var name can hold (and possibly different rules on the value as well). Changing the return type into a Result<> would be a breaking change, a no-go for the stdlib. Validating the env before spawning seems to be costly (YMMV). Possibly one could store the env validation state in Command, mark it dirty every time the env is mutated, and validate it at spawn time when dirty, setting it to Ok or Error and perhaps returning the Error.
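
Until something like that exists in std, a caller can run a check before spawning. The following is a sketch under assumptions: `EnvKeyError`, `validate_env_key` and `validate_envs` are hypothetical names, and the rules (no empty name, no '=', no NUL) are a conservative guess for Unix-like systems, not the rules of any particular OS.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum EnvKeyError {
    Empty,
    ContainsEquals,
    ContainsNul,
}

/// Assumed rules: a name must be non-empty and contain neither '=' nor NUL.
fn validate_env_key(key: &OsStr) -> Result<(), EnvKeyError> {
    // Lossy conversion is enough here: '=' and NUL survive it unchanged.
    let key = key.to_string_lossy();
    if key.is_empty() {
        Err(EnvKeyError::Empty)
    } else if key.contains('=') {
        Err(EnvKeyError::ContainsEquals)
    } else if key.contains('\0') {
        Err(EnvKeyError::ContainsNul)
    } else {
        Ok(())
    }
}

/// Checks every variable explicitly set or removed on `cmd`.
fn validate_envs(cmd: &Command) -> Result<(), EnvKeyError> {
    cmd.get_envs().try_for_each(|(key, _)| validate_env_key(key))
}
```

Calling validate_envs(&cmd) right before spawn() would reject the .env("FOO=BAR", "BAZ") case from the earlier example.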

mathstuf · June 23, 2021, 7:29pm

See this thread for more on env validation.

chrisd · June 23, 2021, 8:05pm

Huh, Unix Rust seems to use a saw_nul variable as a way to signal an error.

cehteh · June 23, 2021, 9:23pm

I think the Command::env() handling should follow whatever is finalized there (PR). But I have no idea how to deal with a breaking change in Rust's stdlib (a feature?). Anyway, this is going off-topic here (the 'args' thing). Shall we open a new thread on this forum about the 'env' issue, or file a ticket straight away about a possible defect? I am the new kid on the block here and don't know the Rust process for these things.

mathstuf · June 23, 2021, 9:40pm

I think just mentioning in that thread that Command::env also cares should be enough. That's easier than splitting up the tasks associated with whatever comes out of it.

CDirkx · June 24, 2021, 12:54pm

The Command env methods are implemented internally using the private construct CommandEnv (rust/process.rs at 456a03227e3c81a51631f87ec80cac301e5fa6d7 · rust-lang/rust · GitHub), which has methods like clear, set, remove. An alternative would be exposing CommandEnv and introducing an equivalent CommandArg (possibly with a clearer name like CommandEnvStore?).

tspiteri · June 24, 2021, 2:03pm

For environment variables, remove makes sense because the variables are like a map, with key and value, and the variables do not rely on order. But arguments may depend on order: an argument can have an effect on the next arguments, so removing a single argument can be much more complicated than removing an environment variable.

In summary, while there are both env_clear() and env_remove(key), and it looks harmless to have a similar arg_clear(), a method arg_remove(index) looks very tricky and error prone.

tspiteri · June 24, 2021, 2:08pm

Also, env_clear and env_remove are necessary because Command::new inherits the current process environment, while there is no such thing for arguments. So even adding arg_clear is not really equivalent to env_clear.
