Pre-RFC: Support declaring varargs functions and processing a va_list


#1

A colleague of mine was trying to use Rust to write a debugging library that intercepted specific C function calls and interposed tracing. Some of those C function calls used variadic arguments (varargs). Rust supports declaring external variadic functions with extern "C", but does not support declaring such functions with a body. While it might be possible to dissect a va_list once obtained, it isn’t currently possible to write the variadic function itself in Rust and obtain a va_list from the "..." in its argument list. And even once you have the va_list, you’d have to write your own custom assembly routines to extract the arguments, even though LLVM already has built-in support for that on every target platform (e.g. the llvm.va_start and llvm.va_end intrinsics, along with the va_arg instruction).

I think it makes sense to have built-in compiler support for this, based on the corresponding LLVM intrinsics.

I’d like to propose a minimal RFC that allows declaring a variadic extern "C" function in Rust (with a body), and calling (unsafe) intrinsics from within such a function to obtain and process a native Rust type VaList that corresponds to va_list.

That could look something like this:

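#[no_mangle]
pub unsafe extern "C" fn func(fixed_arg: SomeType, ...) {
    // Proposed intrinsic: yields a VaList for this function's variadic arguments.
    let mut args = std::intrinsics::va_start();
    // `next` is a method on VaList, so it is called with `.`, not `::`.
    let arg1 = args.next::<i32>();
    let arg2: *const c_char = args.next();
}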

The VaList type would provide a single method, next, which can return any type that implements a specific marker trait; the types supported by LLVM’s intrinsic should implement that marker trait. The type would implement Drop (to call the equivalent of va_end), and would not implement Copy, Sync, or Send. It could potentially implement Clone (based on the intrinsic for va_copy).
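To make the shape of that type concrete, here is a rough model that compiles on stable Rust. It is only a sketch: it does not touch a real va_list, it reads from a plain vector of 64-bit "slots", and the names VaArg, from_slot, mock_start, and the ended flag are all made up for illustration. It shows the three properties described above: next is gated by a marker trait, Drop stands in for va_end, and the type is not Copy, Send, or Sync.

```rust
use std::cell::Cell;
use std::os::raw::c_char;
use std::rc::Rc;

/// Hypothetical marker trait: only types implementing it can be read out.
/// It is `unsafe` because an implementor promises the type fits in one slot.
pub unsafe trait VaArg: Copy {
    fn from_slot(slot: u64) -> Self;
}

unsafe impl VaArg for i32 {
    fn from_slot(slot: u64) -> Self { slot as i32 }
}
unsafe impl VaArg for i64 {
    fn from_slot(slot: u64) -> Self { slot as i64 }
}
unsafe impl VaArg for *const c_char {
    fn from_slot(slot: u64) -> Self { slot as usize as *const c_char }
}

/// Stand-in for the proposed VaList. It derives neither Copy nor Clone,
/// and the Rc field keeps it from being Send or Sync.
pub struct VaList {
    slots: Vec<u64>,       // mock storage, NOT a real va_list
    pos: usize,
    ended: Rc<Cell<bool>>, // lets the demo observe that Drop ran
}

impl VaList {
    /// Hypothetical replacement for the proposed va_start intrinsic.
    pub fn mock_start(slots: Vec<u64>, ended: Rc<Cell<bool>>) -> VaList {
        VaList { slots, pos: 0, ended }
    }

    /// Unsafe like the proposed method: the caller must name the right type.
    /// (This mock panics if it runs past the end; a real va_list would not.)
    pub unsafe fn next<T: VaArg>(&mut self) -> T {
        let slot = self.slots[self.pos];
        self.pos += 1;
        T::from_slot(slot)
    }
}

impl Drop for VaList {
    /// Plays the role of va_end.
    fn drop(&mut self) {
        self.ended.set(true);
    }
}

/// Reads two arguments, then reports whether the list was "ended" on drop.
pub fn demo() -> (i32, i64, bool) {
    let ended = Rc::new(Cell::new(false));
    let (a, b) = {
        let mut args = VaList::mock_start(vec![7, 9_000_000_000], ended.clone());
        unsafe { (args.next::<i32>(), args.next::<i64>()) }
    };
    (a, b, ended.get())
}
```

Calling demo() reads an i32 and an i64 in order, and the flag is set once args goes out of scope, which is the point at which the real type would call the equivalent of va_end.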

Note that the caller could pass the VaList down the call stack, but could not allow it to propagate up past the variadic function itself. Since all the intrinsics are unsafe anyway, that doesn’t seem excessively unreasonable, though it might be nice to enforce that somehow if we have a straightforward means of doing so.

(I could imagine alternative syntactic sugars that desugar to the above, such as naming the ... in the argument list rather than calling an intrinsic, but that seems excessively magic.)

Does this seem reasonable?


#2

This would also make it simpler to gradually port over old C codebases to Rust. On a related note, it would be useful for giving Corrode a way to handle varargs.


#3

Sounds reasonable!
This is a small library addition, and the interface can match C.

If VaList has a lifetime parameter, maybe there’s a way for intrinsics::va_start to set it to the “current function lifetime” or “current block lifetime”, magically or not. Intrinsics can do magic if really necessary (e.g. transmute).
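For what it's worth, the non-magic version of this idea can be modelled on stable Rust by having the list borrow from something that lives in the variadic function's frame. This is only a sketch under that assumption: Frame, va_start as a method, and next_u64 are invented names, and the "arguments" are a plain vector rather than a real va_list.

```rust
/// Hypothetical stand-in for the variadic function's stack frame.
pub struct Frame {
    slots: Vec<u64>,
}

/// The lifetime 'a ties the list to the frame it was started in.
pub struct VaList<'a> {
    rest: &'a [u64],
}

impl Frame {
    /// Models va_start: the returned list cannot outlive `self`.
    pub fn va_start(&self) -> VaList<'_> {
        VaList { rest: &self.slots }
    }
}

impl<'a> VaList<'a> {
    /// Pops the next mock argument, or None when the list is exhausted.
    pub fn next_u64(&mut self) -> Option<u64> {
        let (first, rest) = self.rest.split_first()?;
        self.rest = rest;
        Some(*first)
    }
}

/// Passing the list DOWN the call stack is allowed.
fn sum_down_the_stack(mut args: VaList<'_>) -> u64 {
    let mut total = 0;
    while let Some(value) = args.next_u64() {
        total += value;
    }
    total
}

/// Plays the role of the variadic function itself.
pub fn variadic_like(slots: Vec<u64>) -> u64 {
    let frame = Frame { slots };
    let args = frame.va_start();
    // Returning `args` from here instead would be rejected by the borrow
    // checker, because it borrows `frame`, which dies with this function.
    sum_down_the_stack(args)
}
```

With a real intrinsic there is no Frame value to borrow from, so the compiler would have to supply the equivalent lifetime itself, which is presumably where the "magic" would come in.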


#4

Thanks for your review and support!

If possible, that does seem like it would reduce the likelihood of error. unsafe code could of course break that, but if we can make the common case less error-prone, we should.

Given your expertise in the internals of the compiler, if this seems feasible to you, what terminology would you suggest I use in the RFC to describe the desired type of intrinsics::va_start?


#5

I don’t know, unfortunately. It’s better to ask @arielb1 or @eddyb.
(I’m not even entirely sure this is doable.)


#6

Unfortunately it’s not quite that simple, see this eerily well-timed thread on llvm-dev. As with other ABI related things, the abstraction in IR is imperfect. On some targets, it’s all you need, but on other targets (especially in complex cases) the frontend has to generate target specific IR. So the implementation will be somewhat (though I can’t tell how much) more complicated than just exposing the intrinsics.


#7

Maybe make it into an argument?

pub unsafe extern "C" fn func<'a>(fixed_arg: SomeType, ...varargs: VaList<'a>) { … }

This might be slightly harder to implement, but I think not much; could be wrong.


#8

Yes, I would love to see some kind of varargs support in Rust for use in ABI-compatible replacements for existing C code, such as what Corrode generates. There are very few ABI-relevant areas where Rust has no equivalent to C, but this is one of the two that show up a lot. (The other, and the most common, is bitfields in structs.)


#9

@rkruppe Thanks for the clarification.

That thread seems to suggest that one possible solution would be to fix the intrinsics to always work, and that seems like the right answer. We could also get it to work in more cases if we initially limit the types extractable from the VaList; as a first pass, supporting i32, u32, i64, u64, isize, usize, and raw pointers (*const T and *mut T) seems sufficient.


#10

RFC now posted, at https://github.com/rust-lang/rfcs/pull/2137 .