std::process on Windows escapes quotes in raw string literals, which causes problems when chaining commands

Example playground link (includes a matching Linux command for comparison): https://play.rust-lang.org/?gist=5e265e70d23f4c8ba762db85ee2898e5&version=stable&mode=debug&edition=2015

  • Linux version works correctly passing just "
  • Windows version escapes the inner quotes as \"

an example command where this causes problems

let test: String = simple_run_command::run("cmd",     &["/c", r#"powershell -command "Get-WmiObject -class Win32_Product""#], "");

To verify that it’s escaping the quotes, compare the output of these commands in cmd.exe with the output of the command above:

powershell -command \"Get-WmiObject -class Win32_Product\"
powershell -command "Get-WmiObject -class Win32_Product"

I posted here instead of GitHub for two reasons. Firstly, to verify that this is actually a bug and not me using it incorrectly. Secondly, I don’t actually know where the GitHub repository for this is.

Here’s a simplified example (it tries to open notepad with a bad path). You can check the arguments it received with Microsoft’s Process Explorer.

use std::process::{Command, Stdio};
use std::{thread, time};


fn main() {
    let mut child = Command::new("notepad")
        .args(&[r#"test "something in quotes" "#])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .expect("failed to execute child");
    let check_every = time::Duration::from_millis(10);
    loop {
        match child.try_wait() {
            Ok(Some(_status)) => {break;},  // finished running
            Ok(None) => {}                  // still running
            Err(e) => {panic!("error attempting to wait: {}", e)},
        }
        thread::sleep(check_every);
    }

}

Just adding a cross-reference to your post on the users forum:

Thanks. I linked to this thread from there, but not back, since this was intended to be the main post.

There’s some contention as to whether this is a bug, but the core of the issue is that there doesn’t seem to be any way in Rust to call

some.exe "a multi-word argument"

It will only send it as

some.exe \"a multi-word argument\"

This creates problems with programs that use a different escaping scheme. The problem is hidden (or, more likely, worked around) in Rust-to-Rust programs, since Rust’s argument parser invisibly removes the \". Unfortunately, many older CLI programs on Windows aren’t that smart, and we need to be able to send raw unescaped quotes as arguments when calling other programs.

This only happens on Windows. It may be due to issues caused by the CommandLineToArgvW function of the Windows API.

You’re right that Rust doesn’t support commands with quoting scheme other than CommandLineToArgvW, but the example you’ve given is incorrect. Command::new("some.exe").arg("a multi-word argument") runs some.exe "a multi-word argument". In fact, Command is incapable of producing your second example. It can do

some.exe "\"a multi-word argument\""

The outer " is a crucial difference: the syntax differs between and inside arguments, and an unescaped " switches between the two modes.
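To make the two modes concrete, here is a minimal sketch of the CommandLineToArgvW-style quoting described above. The function name quote_arg is hypothetical and this is not the standard library's code; it only follows the commonly documented rules (backslashes are doubled only when they precede a quote, and an embedded quote gets one extra backslash).

```rust
use std::iter::repeat;

/// Hypothetical helper: quote one argument so that a
/// CommandLineToArgvW-style parser reads it back unchanged.
fn quote_arg(arg: &str) -> String {
    // Arguments without whitespace or quotes can be passed as they are.
    let needs_quotes =
        arg.is_empty() || arg.contains(|c: char| c == ' ' || c == '\t' || c == '"');
    if !needs_quotes {
        return arg.to_string();
    }

    let mut out = String::from("\"");
    let mut backslashes = 0; // length of the current run of backslashes
    for c in arg.chars() {
        match c {
            '\\' => backslashes += 1,
            '"' => {
                // n backslashes before a quote become 2n + 1: the run is
                // already in `out`, so add n more plus one to escape the quote.
                out.extend(repeat('\\').take(backslashes + 1));
                backslashes = 0;
            }
            _ => backslashes = 0,
        }
        out.push(c);
    }
    // Trailing backslashes are doubled so they do not escape the closing quote.
    out.extend(repeat('\\').take(backslashes));
    out.push('"');
    out
}

fn main() {
    println!("{}", quote_arg("a multi-word argument"));
    println!("{}", quote_arg("\"a multi-word argument\""));
}
```

The second call yields the "\"a multi-word argument\"" form shown above: the outer quotes delimit the argument, and the inner ones are escaped data.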

Given how messy the syntax of some legacy Windows commands can be, it may be useful to extend Command to support arbitrary syntax.

I think the simplest would be to add a method such as:

fn arguments_escaped(&mut self, the_command_line: &str) -> &mut Self

that takes the whole command line except the command name itself. So:

Command::new("foo").arguments_escaped("\\   args \" bar \" \"\"\" baz")

would execute command + space + arguments_escaped value verbatim, so GetCommandLineW in the executed command would be:

foo \   args " bar " """ baz

I’ve named it _escaped, meaning already escaped, because the caller is responsible for escaping arguments using some command-specific syntax (the name could be raw_args or anything else :bike:). The syntax is arbitrary, as interpreted by the command being executed, so the Rust stdlib can’t possibly know the syntax for all commands. The user of Command would have to take that responsibility.

It’s not quite as bad as the Unix equivalent of passing a whole shell command as a string, because by default no dangerous shell syntax is interpreted. It could be made dangerous if used as Command::new("cmd").arguments_escaped(format!("/c command {}", args)), but then it’s the equivalent of Command::new("bash").arg("-c").arg(format!("command {}", args)), which is a bad idea as well.

Linux, and probably other platforms, don’t even have a concept of passing a raw, unparsed command string from the shell, so this method couldn’t be portable. It could be OK if it were an extension trait in a Windows-specific corner of the stdlib.

Alternatives:

  • Add .raw_arg(str)/.escaped_arg(str) that just appends a string to the command line (perhaps delimited by spaces). The upside is that it looks more like regular usage of Command, but the downside is that .raw_arg(one).raw_arg(two) could set the expectation of passing two arguments, while its actual meaning is impossible to define.

  • Add .arg_with_syntax(str, ArgSyntax::DoesNotSupportNestedQuotes), .arg_with_syntax(str, ArgSyntax::ThisIsTheLastArgAndEverythingFromHereIsVerbatim) with quoting/sanitisation/mangling options for variously broken ad-hoc parsers.

  • Recognize the command being executed (e.g. dir, notepad, cmd /c, echo), and choose a different argument serialization syntax appropriate for whatever ad-hoc argument parser that command uses. The upside is that .arg() would magically work as intended; the downside is endless whack-a-mole with an unlimited set of broken parsers. Also, recognizing commands is unreliable (e.g. if the executable gets renamed).
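For later readers: the first alternative is close to what the standard library eventually shipped as std::os::windows::process::CommandExt::raw_arg (stable since Rust 1.62), which appends text to the command line without quoting or escaping. Below is a minimal sketch of how the foo example from this post might look with it. The build helper is hypothetical, and the non-Windows branch exists only so the snippet compiles everywhere.

```rust
use std::process::Command;

/// Hypothetical helper: build a command whose argument tail is passed verbatim.
#[cfg(windows)]
fn build(program: &str, raw_tail: &str) -> Command {
    use std::os::windows::process::CommandExt;
    let mut cmd = Command::new(program);
    // raw_arg appends the text as-is; the caller owns the escaping.
    cmd.raw_arg(raw_tail);
    cmd
}

/// Other platforms have no raw command line, so the tail is ignored here.
#[cfg(not(windows))]
fn build(program: &str, _raw_tail: &str) -> Command {
    Command::new(program)
}

fn main() {
    // Same argument text as the arguments_escaped example above.
    let tail = "\\   args \" bar \" \"\"\" baz";
    let cmd = build("foo", tail);
    println!("{:?}", cmd);
    // On Windows, cmd.spawn() would start foo with that tail appended verbatim.
}
```

How the program name itself is rendered on the final command line is still decided by the standard library; only the appended tail is left untouched.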


Agreed. It's unfortunate that so many legacy Windows CLI programs parse arguments arbitrarily, each with its own implementation, but to live peacefully with them we need some way to send very precise unescaped strings to them.

I do think the current implementation probably still makes sense as a default, since it simplifies the process for those who aren't deeply versed in the (sometimes bizarre) world that is Microsoft systems programming. But we do need a way for me as the programmer to say "I accept the responsibility of understanding how the argument parsing for the program I'm calling works, and I want to send a very precise string exactly as I need".


That could be added to CommandExt. You'd have to figure out the semantics for mixing this with plain Command::arg and Command::args, but if you can come up with a good design, it would be a nice feature.


Can std::os::windows::process::CommandExt actually be extended? I don’t see anything to keep me from implementing it for a custom struct (on Windows only) in a completely-useless-but-would-be-broken manner.

Ugh, that’s unfortunate…

@kornel I thought that the usual way of escaping quotes in Windows is to use two quotes? At least this is what I usually use and what is supported by most commands.

The syntax is bizarre. Quote doubling seems to work for a single doubled quote only by accident. The syntax tries to be everything for everyone, and it gets progressively weirder around edge cases. If you’re expecting straightforward quote doubling to escape safely, you may be surprised:

"a""b" => a"b
"a""""b" => a"b
"a"""""b" => a""b
"a""""""b" => a""b

That’s because """ is interpreted as a quoted quote character (same as any "x").
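The four results above can be reproduced with a toy model. This is only a sketch under one assumption: inside a quoted section, a doubled quote emits one literal quote and also leaves quoted mode. It ignores backslashes and argument splitting entirely, so it is not a reimplementation of CommandLineToArgvW, and the name parse_quotes_only is made up for illustration.

```rust
/// Toy model of the quote handling only (no backslash rules, no splitting).
fn parse_quotes_only(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::new();
    let mut in_quotes = false;
    let mut i = 0;
    while i < chars.len() {
        if chars[i] == '"' {
            if in_quotes && chars.get(i + 1) == Some(&'"') {
                // Assumed rule: "" inside quotes is a literal quote
                // that also ends the quoted section.
                out.push('"');
                in_quotes = false;
                i += 1; // skip the second quote of the pair
            } else {
                // A lone quote just toggles quoted mode.
                in_quotes = !in_quotes;
            }
        } else {
            out.push(chars[i]);
        }
        i += 1;
    }
    out
}

fn main() {
    for input in [r#""a""b""#, r#""a""""b""#, r#""a"""""b""#, r#""a""""""b""#] {
        println!("{} => {}", input, parse_quotes_only(input));
    }
}
```

Under this model every third quote in a run is the one that survives, which is why four quotes give the same result as two.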


Whoa, that’s crazy. Still, it might be a better solution for the simple cases.

It might also be possible to pass them through environment variables, if we’re using something that will expand them when calling a process.

Here’s a quick example of it working within cmd.exe. If Rust runs programs through a Windows API, there’s a chance that whatever mechanism it uses to execute files supports the same thing.

rem set /p test=type something:
set test="c:\temp\"quote test".txt"
notepad %test%
cmd /c start "" notepad.exe %test%

Notes:

  • “rem” is cmd’s comment indicator
  • cmd /c start "" xyz abc might be a way of calling this if we can’t do it through the API, but that would be super janky
  • you can check that it’s actually passing the quotes, not just the literal %test%, by using Process Explorer from Microsoft (link in one of the posts above)
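A rough, untested sketch of the same idea from Rust: put the raw text in an environment variable and let cmd expand it. It assumes that %test% contains nothing Command would quote (so it reaches cmd unchanged) and that cmd /c expands it as in the batch example above. The helper name notepad_via_cmd is hypothetical.

```rust
use std::process::Command;

/// Hypothetical helper: let cmd.exe expand %test% into the raw argument text.
fn notepad_via_cmd(raw_argument: &str) -> Command {
    let mut cmd = Command::new("cmd");
    // The raw text travels in the environment, where no argument quoting applies.
    cmd.env("test", raw_argument)
        // "%test%" has no spaces or quotes, so Command should pass it unquoted.
        .args(["/c", "notepad", "%test%"]);
    cmd
}

fn main() {
    let cmd = notepad_via_cmd(r#""c:\temp\"quote test".txt""#);
    // Printing the command works on any platform; spawning only makes sense on Windows.
    println!("{:?}", cmd);
    // On Windows: notepad_via_cmd(...).spawn().expect("failed to start cmd");
}
```

As with the batch version, Process Explorer is the way to confirm what notepad actually received.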

Here’s a test example using a cmd /c + env workaround. I’m leaving it here for anyone who needs a temporary workaround until we work out a long-term answer.

NOTE: I have reworked this to use powershell because cmd complains about network paths. This was more difficult than I had anticipated, so I’ve put the updated version here for people who aren’t as versed in powershell’s eccentricities.

//mod windows_runner;


fn main() {
    let quote_test: &str = r#" one" two"" three""" four"""" five""""" "#;
    {   // to show you can collect stdout
        let stdout_test: String = windows_runner::run("write-host", quote_test, ""); 
        println!("{}",stdout_test);
    }
    {   // to check it passes stdin correctly, also shows you can call without argument
        let stdin_test: String = windows_runner::run("nslookup",r#""#,"google.com\nexit\n");
        println!("{}",stdin_test);
    }
    {   // to check (with procexp) that the arguments actually pass exactly as given (including without surrounding quotes), though it does add one extra space between program and arguments
        windows_runner::run("notepad", quote_test, "");
    }
}


mod windows_runner{
    use std::process::{Command, Stdio};
    use std::io::Write;
    pub fn run (program:&str,arguments:&str,stdin:&str) -> String /*(String,String)*/ {
        let launcher = "powershell.exe";
        let build_string: String;
        {
            if arguments.trim() == "" { // no arguments (powershell gets confused if you try to execute a program with an empty array as the argument set)
                build_string = format!(r#"& '{}'"#,program);
            }
            else {
                // split on single spaces and rejoin as a powershell array literal
                let arguments_reformatted = arguments.split(' ').collect::<Vec<&str>>().join("','");
                build_string = format!(r#"& '{}' @('{}')"#,program,arguments_reformatted); // powershell digests: & 'pro gram' @('argument1','argument2') => "pro gram" argument1 argument2
            }
        }
        let launch_command: &[String] = &[build_string];

        let mut child = Command::new(launcher)
            .args(launch_command)
            .stdout(Stdio::piped())
            .stdin(Stdio::piped()) // disable this if you want the user to be able to speak with the child instead of doing it yourself
            /*.stderr(Stdio::piped())*/ // if you want to collect stderr instead of displaying to user
            .spawn()
            .expect("failed to run child program");

        {   // send stdin, disable this if you want the user to be able to speak with the child instead of doing it yourself
            // take() moves the handle out, so it is closed at the end of this block and the child sees end-of-input
            let mut stdin_handle = child.stdin.take().expect("Failed to get stdin");
            stdin_handle.write_all(stdin.as_bytes()).expect("Failed to write to stdin");
        }

        // wait_with_output reads stdout while it waits, so a child that fills the pipe cannot stall;
        // polling try_wait first could hang, because nothing would be draining the pipe
        let output = child
            .wait_with_output()
            .expect("failed to wait on child");
        let stdout: String = String::from_utf8_lossy(&output.stdout).to_string();

        
        /*{ // if you want to collect stderr instead of displaying to user
            let stderr: String = String::from_utf8_lossy(&output.stderr).to_string();
            (stdout,stderr)
        }*/

        stdout
    }
}

I'm pretty sure the current policy is that they can be arbitrarily extended and that you're never supposed to implement them on your own types. They're only a temporary crutch until we have a better platform specific lint system.

Meh. If we’re worried about people implementing CommandExt, then just add an empty-bodied default implementation:

trait CommandExt {
    fn raw_args<T: Into<OsString>>(&mut self, _arg: T) {
        unimplemented!();
    }
}
impl CommandExt for Command {
    fn raw_args<T: Into<OsString>>(&mut self, arg: T) {
        self.raw_args = arg.into();
    }
}

I don’t see why not. As long as there is a default implementation (which could even be unimplemented!()), it should be fine.
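A self-contained sketch of why the default body matters, using stand-in types rather than the real std ones (MyCommand, Downstream and the tail field are all hypothetical): a type that implemented the trait before the method existed keeps compiling, because it inherits the default.

```rust
use std::ffi::OsString;

// Stand-in for the real extension trait; the default body keeps old impls compiling.
trait CommandExt {
    fn raw_args<T: Into<OsString>>(&mut self, _arg: T) {
        unimplemented!();
    }
}

// Stand-in for Command, storing the raw argument tail.
struct MyCommand {
    tail: OsString,
}

impl CommandExt for MyCommand {
    fn raw_args<T: Into<OsString>>(&mut self, arg: T) {
        self.tail = arg.into();
    }
}

// A downstream type that implemented the trait before raw_args was added.
struct Downstream;
impl CommandExt for Downstream {} // still compiles thanks to the default method

fn main() {
    let mut cmd = MyCommand { tail: OsString::new() };
    cmd.raw_args("/c echo hi");
    println!("{:?}", cmd.tail);
    let _unused = Downstream; // calling raw_args on this would panic at runtime
}
```

The trade-off is that the breakage moves from compile time to run time for any downstream implementor that actually calls the new method.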
