True C++ interop built into the language

FD: I am a member of the C++ committee, but I speak for myself here.

I'm not familiar with CppFront, but I don't think Val is subject to this as there are definitely different semantics that (I feel) the authors wanted here (not having to "fight" a borrow checker for one). I suspect that Carbon may have some other goals that Rust doesn't cover as well (especially given the sheer amount of C++ Google has). Maybe better C++ interop would have helped, but I doubt it.

Headers are a mess because it's not clear what should be "available", and transitive includes just pollute this space. Modules have better isolation, but the model for building them is complicated (I also work on CMake and am implementing C++ module compilation support).

The non-destructive move semantics C++ works with are going to be hard to deal with. There'd have to be some kind of CxxCell that can act like Copy as far as implicitness goes, but instead does a Clone and leaves a husk of a type behind while also (ideally) becoming invalid for the borrow checker for usage, but still calling Drop at the end of the scope.
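
To make the "husk" idea concrete, here is a minimal sketch in stable Rust. Everything in it is hypothetical: CxxString stands in for a C++ type, the cpp_move method and the moved_from flag are illustration names, and the runtime check in get only approximates what the borrow checker would ideally reject at compile time.

```rust
use std::cell::Cell;

thread_local! {
    // Counts destructor runs on this thread (illustration only).
    static DROPS: Cell<u32> = Cell::new(0);
}

/// Hypothetical stand-in for a C++ object with a non-destructive move.
struct CxxString {
    data: String,
}

impl Drop for CxxString {
    fn drop(&mut self) {
        // Runs for live values and for moved-from husks alike.
        DROPS.with(|d| d.set(d.get() + 1));
    }
}

/// Hypothetical wrapper that remembers whether its value was moved from.
struct CxxCell {
    value: CxxString,
    moved_from: bool,
}

impl CxxCell {
    fn new(s: &str) -> Self {
        CxxCell { value: CxxString { data: s.to_owned() }, moved_from: false }
    }

    /// Mimics a C++ move constructor: steals the contents and leaves a husk.
    fn cpp_move(&mut self) -> CxxCell {
        let stolen = std::mem::take(&mut self.value.data);
        self.moved_from = true;
        CxxCell { value: CxxString { data: stolen }, moved_from: false }
    }

    /// Runtime stand-in for a compile-time "use after move" error.
    fn get(&self) -> Option<&str> {
        if self.moved_from { None } else { Some(&self.value.data) }
    }
}

/// Number of destructor runs observed so far on this thread.
fn drops() -> u32 {
    DROPS.with(|d| d.get())
}
```

Moving out of a CxxCell leaves the source in scope, so both the husk and the new value get dropped when the scope ends, which is the behaviour C++ expects.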

Instantiating templates will be a problem. With SFINAE and <type_traits>, one has to answer things like:

  • std::is_trivial<NonZeroU32>: not default constructible, but feels trivial otherwise
  • std::is_copy_constructible<Vec<T>>: Clone satisfies this, but feels dirty

before one can even know what APIs can work in places. Lookup is also a minefield for Rust's syntax, I feel. How do I call a constructor, given that it doesn't have a name? Do we just bind T::__ctor or something?


I think a good place to start would be to try and answer <type_traits> questions for Rust types. And look at what traits various C++ types might be able to implement. This feels like the core part of being able to get Rust and C++ APIs to talk to each other. Starting with Pin sounds hard, but is also probably where most of the difficulties will arise too (especially with C++ move operations).
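
One way to picture "answering <type_traits> questions for Rust types" is a trait of associated constants that gets filled in per type. This is only a sketch: the CxxTypeTraits trait and its constant names are made up, and the answers for each type are judgment calls rather than anything the language defines.

```rust
use std::mem::needs_drop;
use std::num::NonZeroU32;

/// Hypothetical trait answering a few <type_traits>-style questions.
trait CxxTypeTraits {
    const DEFAULT_CONSTRUCTIBLE: bool;
    const COPY_CONSTRUCTIBLE: bool;
    const TRIVIALLY_COPYABLE: bool;
    /// Rough analogue of std::is_trivial under this sketch's assumptions.
    const TRIVIAL: bool = Self::DEFAULT_CONSTRUCTIBLE && Self::TRIVIALLY_COPYABLE;
}

impl CxxTypeTraits for u32 {
    const DEFAULT_CONSTRUCTIBLE: bool = true;
    const COPY_CONSTRUCTIBLE: bool = true;
    const TRIVIALLY_COPYABLE: bool = true;
}

// NonZeroU32 is Copy but has no Default, so it is "trivial otherwise".
impl CxxTypeTraits for NonZeroU32 {
    const DEFAULT_CONSTRUCTIBLE: bool = false;
    const COPY_CONSTRUCTIBLE: bool = true;
    const TRIVIALLY_COPYABLE: bool = true;
}

// Assumption: Clone is allowed to stand in for a C++ copy constructor.
impl<T: Clone> CxxTypeTraits for Vec<T> {
    const DEFAULT_CONSTRUCTIBLE: bool = true;
    const COPY_CONSTRUCTIBLE: bool = true;
    const TRIVIALLY_COPYABLE: bool = false;
}

/// needs_drop is documented as a conservative hint, so this is only an
/// approximation of std::is_trivially_destructible.
fn trivially_destructible<T>() -> bool {
    !needs_drop::<T>()
}
```

Under these assumptions NonZeroU32 reports TRIVIALLY_COPYABLE but not TRIVIAL, which matches the "feels trivial otherwise" observation.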

The second place to look for roadblocks would be using overloaded operators from C++ types in Rust. The third would be how to differentiate int, long, and long long overloads from the Rust side between the two that are the same size on the platform (going via the c_* aliases would be very verbose).
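
The int / long / long long problem comes from the c_* names being plain type aliases: where two of them resolve to the same Rust integer, trait impls for both would collide. A sketch of one possible workaround uses distinct newtypes. The names CInt, CLong, CLongLong, Overload and call_f are all hypothetical, and the verbosity complaint above still applies.

```rust
use std::os::raw::{c_int, c_long, c_longlong};

// Hypothetical newtypes that keep the C++ overload identity distinct even
// when c_long and c_longlong are the same Rust type on this platform.
#[repr(transparent)]
struct CInt(c_int);
#[repr(transparent)]
struct CLong(c_long);
#[repr(transparent)]
struct CLongLong(c_longlong);

/// Hypothetical trait: one impl per C++ overload of some function f.
trait Overload {
    /// Returns the overload that was selected and the value passed to it.
    fn call(self) -> (&'static str, i64);
}

impl Overload for CInt {
    fn call(self) -> (&'static str, i64) {
        ("f(int)", i64::from(self.0))
    }
}

impl Overload for CLong {
    fn call(self) -> (&'static str, i64) {
        ("f(long)", i64::from(self.0))
    }
}

impl Overload for CLongLong {
    fn call(self) -> (&'static str, i64) {
        ("f(long long)", i64::from(self.0))
    }
}

/// Dispatches on the wrapper type rather than on the integer width.
fn call_f<T: Overload>(arg: T) -> (&'static str, i64) {
    arg.call()
}
```

The wrappers are repr(transparent), so they have the same size and layout as the integers they wrap.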

Regarding headers vs modules, I can see reasons for both. Supporting headers would also help C interop. Supporting modules is probably more ideal since they're closer to Rust modules.

Regarding move semantics, it appears there may be some solutions for that already as mentioned above. I don't think we should try to preserve the implicitness of various C++ operations (including copy, move, reference) on the Rust side, though. In order to pass something to a move constructor, you'd have to explicitly call thing.cpp_move() or something along those lines.

I agree that templates are going to be the most difficult part of this. But I also think templates are the best argument for having this capability in the compiler. Being able to use Rust types in C++ templates doesn't sound that useful to me, since you can't really do anything with them on the C++ side anyway. Maybe we could provide an opaque wrapper for passing Rust types through C++ code.
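
A rough sketch of what such an opaque wrapper could look like, assuming the C++ side only ever stores and forwards a void pointer. OpaqueRust and cpp_passthrough are hypothetical names, and the extern "C" function is written in Rust here purely as a stand-in for real C++ code.

```rust
use std::ffi::c_void;

/// Hypothetical opaque handle: C++ sees only a pointer it cannot inspect.
#[repr(transparent)]
struct OpaqueRust(*mut c_void);

impl OpaqueRust {
    /// Boxes the value and erases its type. The value leaks unless
    /// into_inner is called later.
    fn new<T>(value: T) -> Self {
        OpaqueRust(Box::into_raw(Box::new(value)) as *mut c_void)
    }

    /// Recovers the value.
    ///
    /// Safety: T must be exactly the type that was passed to new, and the
    /// handle must not have been consumed already.
    unsafe fn into_inner<T>(self) -> T {
        unsafe { *Box::from_raw(self.0 as *mut T) }
    }
}

/// Stand-in for a C++ function that just forwards the handle.
extern "C" fn cpp_passthrough(handle: *mut c_void) -> *mut c_void {
    handle
}
```

Nothing here lets C++ call methods on the Rust value; it only survives the round trip, which is all the "opaque" version promises.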

Note that adding this to the Rust language basically means saying "if you want to implement a Rust compiler, please also implement a C++ compiler (or bolt an existing one onto it)". Given that there are 4 (mature) C++ implementations, two of which are closed, this seems like a steep ask.

I feel like there is still space to be explored on the library side before we tack on the C++ standard as an addendum to the Rust language.

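For overloading, I think we can use feature(fn_traits), which is nightly-only and also needs feature(unboxed_closures) for the "rust-call" ABI:

// Nightly only: #![feature(fn_traits, unboxed_closures)] at the crate root.
// One wrapper type per C++ member function; each overload becomes one FnOnce impl.
struct CppFoo_run<'a>(&'a CppFoo);
impl FnOnce<(u16,)> for CppFoo_run<'_> {
    type Output = u32;
    extern "rust-call" fn call_once(self, b: (u16,)) -> Self::Output {
        // ...
    }
}

impl CppFoo {
    // Picks the overload from the argument tuple type.
    // Recent nightlies may also require `Args: std::marker::Tuple` (feature(tuple_trait)).
    fn run<'a, Args>(&'a self, args: Args)
    -> <CppFoo_run<'a> as FnOnce<Args>>::Output
    where CppFoo_run<'a>: FnOnce<Args>
    {
        CppFoo_run(self).call_once(args)
    }
}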

For constructors, we could have a separate CppConstructor trait that looks very similar to the above. Then you could call the constructor like so:

CppConstructor::call::<CppFoo>((...))
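
For comparison, the same overload-by-argument-tuple idea can be sketched on stable Rust with an ordinary trait instead of the Fn traits. The construct method, the cpp_new helper and the value field are hypothetical, and CppFoo is given made-up constructors just so the example has something to build.

```rust
/// Hypothetical trait: one impl per C++ constructor overload, keyed by the
/// argument tuple type.
trait CppConstructor<Args>: Sized {
    fn construct(args: Args) -> Self;
}

/// Stand-in for a bound C++ class.
#[derive(Debug, PartialEq)]
struct CppFoo {
    value: u32,
}

// CppFoo()
impl CppConstructor<()> for CppFoo {
    fn construct(_: ()) -> Self {
        CppFoo { value: 0 }
    }
}

// CppFoo(uint16_t)
impl CppConstructor<(u16,)> for CppFoo {
    fn construct((a,): (u16,)) -> Self {
        CppFoo { value: u32::from(a) }
    }
}

// CppFoo(uint16_t, uint16_t)
impl CppConstructor<(u16, u16)> for CppFoo {
    fn construct((a, b): (u16, u16)) -> Self {
        CppFoo { value: u32::from(a) + u32::from(b) }
    }
}

/// Lets the call site name only the target type; the overload is inferred
/// from the tuple that is passed in.
fn cpp_new<T, Args>(args: Args) -> T
where
    T: CppConstructor<Args>,
{
    T::construct(args)
}
```

A call then reads cpp_new::<CppFoo, _>((7u16,)), which is close to the CppConstructor::call form above, at the cost of the extra tuple parentheses.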

Yeah I certainly understand the gravity of what I'm proposing. I know it's a huge ask. I'm glad there's discussion happening though.

For most things (and for everything in C, probably) I don't think adding it into the language has much benefit over generating bindings with something like bindgen or cxx. The one important exception is templates. For templates, the binding generator needs to know the set of types that will be used. And I'm thinking that adding one generic extension to Rust could allow implementing the rest as a crate, with the added benefit of allowing similar bindings to other languages, plus some other features: I'd call it proc traits.

Proc traits would work similarly to proc macros, but they would be called after type analysis and given full type information. In exchange, they would be restricted to generating implementations of traits that were already declared (so they can't invalidate the type analysis). Something like

struct AnExternalTemplate<T> where … { … }
#[generate(cpp)]
impl<T> AnExternalTemplateImpl<T> for AnExternalTemplate<T> where <…>;

It would have to be called with the full set of types the generic impl should be instantiated for. Then

  • cpp/autocpp could emit these declarations (though without concepts it would still need some help with the constraints) and then add explicit instantiations to the C++ compiler input for the types that the Rust compiler asks for implementations of.
  • The same thing could be used for generating bindings to .NET, Swift, Julia or any other language that has generics.
  • And I think it would also make some pure Rust derive cases simpler or more flexible, since the code generation would know the definitions of the types of members and arguments, not just their names.
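
Since proc traits don't exist, the closest thing to an example is a hand-written version of what such a generator might emit once the compiler reports that the impl is needed for i32 and f64. Everything below is hypothetical: the items field, the total method, and the total_i32 / total_f64 functions, which stand in for explicit C++ template instantiations that a real generator would bind to.

```rust
/// The external template as seen from Rust (hypothetical layout).
struct AnExternalTemplate<T> {
    items: Vec<T>,
}

/// Declared up front by the user, so type analysis never depends on
/// generated code.
trait AnExternalTemplateImpl<T> {
    fn total(&self) -> T;
}

// Stand-ins for the explicit C++ instantiations the generator would bind.
fn total_i32(items: &[i32]) -> i32 {
    items.iter().sum()
}

fn total_f64(items: &[f64]) -> f64 {
    items.iter().sum()
}

// What the generator might emit: one concrete impl per requested type.
impl AnExternalTemplateImpl<i32> for AnExternalTemplate<i32> {
    fn total(&self) -> i32 {
        total_i32(&self.items)
    }
}

impl AnExternalTemplateImpl<f64> for AnExternalTemplate<f64> {
    fn total(&self) -> f64 {
        total_f64(&self.items)
    }
}
```

The point of the sketch is the split: the trait and its signatures exist before code generation, and only the bodies are filled in per concrete type.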

I thought about this some time ago when I saw the C++ metaclass proposal. It is quite a bit more flexible than Rust's proc macros exactly because it gets the parsed code with resolved types rather than just the tokens proc macros get. C++'s single-pass compilation allows a metaclass to add anything to the declaration and have it be usable from that point on, whereas with Rust's multi-pass compilation we can't add new items. But I think generating function bodies that we already promised the compiler would exist should be possible.

I like the idea of exploring this direction, but note that for this to work, the Rust impl will still have to declare the trait's associated types and associated constants, since those participate in the type and trait solving.

Yes, associated types would have to be given earlier.

I think for the common cases in C++, the binding generator should be able to do it when defining the trait. But it would still need this plugin to find out which types the trait is actually used with, so it could request the appropriate explicit instantiations from the C++ compiler and bind the resulting mangled functions.

And if some complex cases don't get covered, I think it's not that big of an issue. The modules that make sense to call cross-language are things like implementations of data formats (e.g. zip) and network protocols (http, tls, database etc.), and those tend to have simple interfaces relying on dynamic dispatch and just simple templates like collections. The heavy template magic (like many things in boost) would be a poor fit for Rust anyway, and there it makes more sense to build a proper native abstraction instead.

So I think even with that limitation, it would go a long way towards improving the C++ integration without tying the Rust compiler to the specifics of any C++ compiler, and would also help bindings to other languages (Swift? .NET?)
