True C++ interop built into the language

pitaj · July 19, 2023, 4:03pm

There are a number of emerging experimental "C++ replacement" languages, such as Carbon, CppFront, and Val.

The fundamental benefit that these languages have when compared to Rust is true interop with C++ (no need to go through C).
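For readers unfamiliar with what "going through C" involves, here is a minimal sketch of the pattern. It is written entirely in Rust so it can be run on its own, and `Counter` and the `counter_*` functions are hypothetical names. In a real project the struct and the function bodies would live on the C++ side, and Rust would see only the flat C-ABI functions.

```rust
// Hypothetical stand-in for a C++ class. `repr(C)` keeps the layout
// C-compatible; real bindings usually treat the type as opaque.
#[repr(C)]
pub struct Counter {
    value: u32,
}

// The "C layer": constructor, method, and destructor each become a free
// function with the C ABI that takes or returns a raw pointer.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: 0 }))
}

/// # Safety
/// `this` must come from `counter_new` and must not have been freed.
pub unsafe extern "C" fn counter_increment(this: *mut Counter) -> u32 {
    let counter = unsafe { &mut *this };
    counter.value += 1;
    counter.value
}

/// # Safety
/// `this` must come from `counter_new` and must not be used afterwards.
pub unsafe extern "C" fn counter_free(this: *mut Counter) {
    drop(unsafe { Box::from_raw(this) });
}

// What the calling side looks like: manual lifetime management and
// `unsafe` at every call, with no methods or destructors in sight.
pub fn count_to(n: u32) -> u32 {
    let handle = counter_new();
    let mut last = 0;
    for _ in 0..n {
        last = unsafe { counter_increment(handle) };
    }
    unsafe { counter_free(handle) };
    last
}

fn main() {
    println!("counted to {}", count_to(3));
}
```

Everything that makes the type pleasant to use in C++ (methods, RAII, templates) has to be flattened into this shape by hand or by a generator.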

Don't get me wrong, competition is healthy. However, part of me thinks that these languages only exist because Rust - C++ interop is so painful.

Adoption of Rust would likely be much greater if the language had even unsafe first-class interop with C++. I think the project should consider prioritizing this. It would have benefits beyond just integrating Rust into existing code-bases (like browsers). I can see it having benefits for C interop as well.

I have no ideas for how this would work, but I think the ideal usage would be the following:

ability to "import" a C++ header file or module directly
and use the types within it as if they are Rust types
including calling methods, constructing objects, etc

HjVT · July 19, 2023, 4:09pm

Hot take: that would be bad, actually, as that would encourage use of unsafe c and c++ libraries without safe wrappers.

pitaj · July 19, 2023, 4:23pm

Not necessarily, just make all usage of a C++ type unsafe. Still makes building a safe wrapper around it easier.

djc · July 19, 2023, 5:41pm

What about the existing cxx and autocxx crates would be better if they got language integration?

kornel · July 19, 2023, 8:14pm

This is a huge ask. It's not a matter of just parsing a C++ header. C++ types use C++ features, many of which are messy and have no Rust equivalents.

C++ templates that use SFINAE depend on C++-specific method names and C++ syntax being supported by the types. Rust types don't have that, and it'd require creating a parallel universe for Rust types to look like C++ types.

Rust doesn't have move constructors. Just passing a C++ type from one Rust function to another would require supporting additional semantics that Rust has already promised never to have. Breaking that promise risks causing bugs in unsafe code and/or adding a new level of complexity to Rust's generics (a ubiquitous ?Move, like ?Sized). Conversely, Rust doesn't guarantee that types will support the "moved-from" dummy state that C++ uses in its version of moves.
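A small sketch of the Rust side of that contrast, using only the standard library (`Tracked`, `pass_along`, and `move_demo` are made-up names for illustration). A move runs no user code, the source becomes statically unusable rather than being left in a "moved-from" state, and the destructor runs exactly once.

```rust
use std::cell::Cell;

// A value that counts how many times its destructor runs.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
    data: Vec<u8>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Passing by value is a plain bitwise move; there is no hook where a
// "move constructor" could run.
fn pass_along(t: Tracked<'_>) -> Tracked<'_> {
    t
}

// Returns (heap buffer unchanged by the move, number of destructor runs).
pub fn move_demo() -> (bool, u32) {
    let drops = Cell::new(0);
    let same_buffer;
    {
        let a = Tracked { drops: &drops, data: vec![1, 2, 3] };
        let before = a.data.as_ptr();
        let b = pass_along(a);
        // `a` cannot be used here: the compiler rejects any access, so no
        // "moved-from" object exists that would need a valid dummy state.
        same_buffer = before == b.data.as_ptr();
    } // `b` is dropped here, the only destructor call.
    (same_buffer, drops.get())
}

fn main() {
    println!("{:?}", move_demo());
}
```

A C++ type whose move constructor has to fix up internal pointers cannot be relocated this way, which is where the mismatch comes from.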

pitaj · July 19, 2023, 8:42pm

Absolutely it is, and I hope I didn't make it sound like I thought it would be simple or easy to accomplish. In fact I think it would be incredibly difficult, and I doubt it will ever actually happen.

But I do think that the benefits to Rust and to existing C++ projects would be enormous.

pitaj · July 19, 2023, 8:49pm

To be honest, I have not used either. But in my mind, the hypothetical C++ interop in Rust would entirely replace the need for cxx. autocxx could still have some role in generating bindings like bindgen currently does for C.

djc · July 19, 2023, 9:50pm

cxx seems to be (haven’t used it, since I don’t need it) the best/most Rust-friendly way to integrate C++ and Rust. If you haven’t even tried it, why are you here to complain that it’s too hard to integrate code written in these languages?

pitaj · July 19, 2023, 10:12pm

I'm not here to complain, I'm here to share my thoughts on the strategy that the Rust project should take. I think C++ interop should have first party support at the language level because it would be good for Rust, not because I personally have a problem with existing tools.

HeroicKatora · July 19, 2023, 10:28pm

Parsing C++ requires implementing full C++ constexpr evaluation¹. No chance that happens. It would essentially require rustc being a full interpreter for another language.

¹To explain more, it is well-known to be undecidable but this is an understatement. The reason for this trivia knowledge is that one must distinguish type and non-type names. Yet in particular when those names appear after path resolution, path resolution may require template instantiation and disambiguation rules of those, disambiguation relies on parameter equality checks, those template parameters can be arbitrarily computed from any constexpr computation, hence one must implement the operational semantics to properly distinguish type from non-type names. It also makes it clear this task of trying to parse C++ is not going to have an end.

Or by words that showcase the extent of awareness of the C++ specifications of this problem:

The disambiguation is purely syntactic; -- found in N4860, draft for c++20

Addendum: And to make matters worse, the operational semantics are of course full of implementation-defined details. This is not unlikely to be used either: one can 'query' what layout the compiler chose to give certain types. So parsing C++ ends up having to implement each compiler's operational semantics instead of a singular one, and the user has to somehow specify which C++ host they meant. Of course, repr(C) has a highly related problem at scale and in practice.

jhpratt · July 19, 2023, 10:30pm

Generally speaking, proposals exist to solve problems. If there's no problem, the proposal is bound to go nowhere.

comex · July 20, 2023, 8:00am

No need to be snide. Rust-C++ interop would solve many people's problems, as demonstrated by the number of people doing this already with tools such as bindgen, cxx, autocxx, etc., despite significant limitations and awkwardness. The amount of problem-solving may not be worth the cost, but that's a different question.

Personally I'm closely watching Swift, which just recently came out with a beta form of C++ interoperability based on Clang. In theory it could be more powerful than bindgen-like tools due to using a forked version of Clang; it could properly integrate the two type systems rather than just having one compiler spit out output for the other. But in practice, the documentation mentions some limitations that make it feel more like a built-in version of bindgen. You apparently can't arbitrarily instantiate C++ templates from the Swift side (definitely not in generic Swift code, and not even with concrete types if the template wasn't already instantiated with those types on the C++ side). You can access Swift types from C++, but only because Swift can generate an actual C++ header, not because the C++ compiler can query Swift's type system.

That said, the fact that Swift's C++ interop is built-in and (eventually) well-supported may end up mattering more than how many type system features it supports. Swift also benefits from a better 'impedance match' with C++ than Rust, in the specific but important respect that it doesn't require types to be movable by memcpy.

But Rust's interop story is slowly improving as well; both cxx and autocxx are 'only' a few years old, for instance.

And then there are the other languages @pitaj mentioned, which might support the deeper kind of interop I mentioned, but I'll believe it when I see it.

matklad · July 20, 2023, 9:53am

Isn’t there another important angle here, aliasing model? My understanding is that C++’s default pointer behavior, “shared and mutable”, just doesn’t map onto Rust. Like, if you have an std::vector<T>& xs on C++ side, there’s no reasonable way to expose that to Rust? & and &mut would be wrong, and you can’t even wrap RefCell around it either. In contrast, I think Swift classes already have “usual” reference semantics.

programmerjake · July 20, 2023, 3:56pm

for shared and mutable, &Cell<T> gives correct semantics (except it probably needs pinning), Rust just needs a way to get a &Cell<FieldTy> from a &Cell<StructTy> which imho would be very useful in Rust by itself anyway

kpreid · July 20, 2023, 4:06pm

&Cell<T> gives correct semantics

Not in the presence of threads. C++ code is allowed to modify data through a shared pointer from another thread, but Cell is explicitly !Sync.
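To make the single-threaded case concrete, here is a sketch using only stable standard-library APIs (`Cell::from_mut` and `Cell::as_slice_of_cells`); the function names are invented for the example. It shows two aliasing `&Cell` references mutating the same value, and the one projection std already offers, from a cell of a slice to a slice of cells.

```rust
use std::cell::Cell;

// Both parameters may alias the same value, much like two C++ references.
fn bump(a: &Cell<i32>, b: &Cell<i32>) -> i32 {
    a.set(a.get() + 1);
    b.set(b.get() + 10);
    a.get()
}

pub fn aliased_update() -> i32 {
    let mut x = 0;
    let cell = Cell::from_mut(&mut x);
    // Passing the same cell twice is fine; `&mut` would be rejected here.
    bump(cell, cell)
}

pub fn slice_projection() -> [i32; 3] {
    let mut values = [1, 2, 3];
    let whole: &Cell<[i32]> = Cell::from_mut(&mut values[..]);
    // std already allows projecting a cell of a slice into per-element cells.
    let parts: &[Cell<i32>] = whole.as_slice_of_cells();
    parts[0].set(10);
    parts[2].set(parts[0].get() + parts[1].get());
    values
}

fn main() {
    println!("{}", aliased_update());
    println!("{:?}", slice_projection());
    // Sharing `cell` with another thread does not compile: `Cell<i32>` is
    // not `Sync`, so `&Cell<i32>` is not `Send`.
}
```

The thread restriction in the last comment is exactly the gap described above: the compiler refuses the cross-thread sharing that C++ code is free to do.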

a way to get a &Cell<FieldTy> from a &Cell<StructTy> which imho would be very useful in Rust by itself anyway

Note that this would have to be restricted to structs only; it's unsound for an enum because you could write a different variant to the Cell<StructTy> and invalidate the field reference. That's reasonable but it's a little weird, since most things in the language don't care what kind of containing struct a place belongs to, and that type of invalidation is usually taken care of by having an active borrow of the parent.

It could be implemented by a derive macro on the struct that generates projection functions (much like pin-project).
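For a sense of what one such projection function might look like when written by hand, here is a sketch for a single plain struct. `Point` and `project_x` are hypothetical, and the safety argument in the comment is my assumption rather than something the language guarantees for arbitrary types; it is not a general-purpose macro.

```rust
use std::cell::Cell;
use std::ptr::addr_of_mut;

#[derive(Clone, Copy)]
#[repr(C)]
pub struct Point {
    pub x: u32,
    pub y: u32,
}

// Hand-written projection from a cell of the struct to a cell of one field.
pub fn project_x(p: &Cell<Point>) -> &Cell<u32> {
    // SAFETY (assumed for this example): `Cell<T>` has the same in-memory
    // layout as `T`, `Point` is a plain struct so the `x` field always
    // exists, and every `u32` written to it leaves `Point` valid.
    unsafe { &*(addr_of_mut!((*p.as_ptr()).x) as *const Cell<u32>) }
}

pub fn projection_demo() -> (u32, u32, u32) {
    let p = Cell::new(Point { x: 1, y: 2 });
    let x = project_x(&p);
    x.set(40); // a write through the field view...
    let seen = p.get(); // ...is visible through the whole-struct view
    p.set(Point { x: 7, y: 8 }); // and the other way around
    (seen.x, seen.y, x.get())
}

fn main() {
    println!("{:?}", projection_demo());
}
```

The same trick on an enum would be unsound for the reason given above: replacing the whole value with a different variant would leave the field cell pointing at something that is no longer that field.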

HjVT · July 20, 2023, 5:13pm

I think a userspace macro for Cell projection is impossible, because there is no way to know if a type uses niche optimisations.

programmerjake · July 20, 2023, 6:39pm

hmm, yeah, i forgot about that...maybe just have the Rust view of any C++ type use &CppType:

#[repr(C++)]
struct CppType {
    // all fields that aren't themselves C++ types are wrapped in UnsafeSyncCell
    a: UnsafeSyncCell<u32>,
    b: UnsafeSyncCell<f64>,
    c: SomeOtherCppType,
}
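UnsafeSyncCell is not a standard-library type, so here is one way it could be sketched on stable Rust. I am assuming it is meant to behave like an UnsafeCell that is also Sync, and I am using repr(C) as a stand-in for the hypothetical repr(C++). The struct is trimmed to two fields, and `cross_thread_write` is an invented demo function.

```rust
use std::cell::UnsafeCell;

// Assumed definition: an `UnsafeCell` that may be shared across threads.
#[repr(transparent)]
pub struct UnsafeSyncCell<T>(UnsafeCell<T>);

// SAFETY: nothing is checked here. Every access through `get` must be
// synchronised by the caller, as it would be in C++.
unsafe impl<T: Sync> Sync for UnsafeSyncCell<T> {}

impl<T> UnsafeSyncCell<T> {
    pub const fn new(value: T) -> Self {
        Self(UnsafeCell::new(value))
    }

    pub fn get(&self) -> *mut T {
        self.0.get()
    }
}

// `repr(C)` stands in for the hypothetical `repr(C++)`.
#[repr(C)]
pub struct CppType {
    a: UnsafeSyncCell<u32>,
    b: UnsafeSyncCell<f64>,
}

pub fn cross_thread_write() -> (u32, f64) {
    let value = CppType {
        a: UnsafeSyncCell::new(1),
        b: UnsafeSyncCell::new(2.5),
    };
    let shared = &value;
    // Another thread mutates through a shared reference. The scope joins
    // the thread before returning, so the read below does not race.
    std::thread::scope(|s| {
        s.spawn(|| unsafe {
            *shared.a.get() += 41;
        });
    });
    unsafe { (*shared.a.get(), *shared.b.get()) }
}

fn main() {
    println!("{:?}", cross_thread_write());
}
```

Every field access being `unsafe` matches the earlier suggestion in this thread that all use of a C++ type would be unsafe and left to a safe wrapper.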

dlight · July 21, 2023, 5:23am

I know this is about baking these semantics into the language, but the moveit crate does a good job of representing C++ moves (while recognizing that they are not the same thing as Rust moves). It's so good that the autocxx crate uses it.

We could imagine lifting some or all of those facilities into stdlib proper, as a kind of complement to Pin. It's useful for self-referential values, even outside of the context of C++ interop.
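As a reminder of what Pin alone already covers, here is a sketch of a self-referential value built with only the standard library. It does not use the moveit API, and `SelfRef` is a made-up type. Pin keeps the value at a fixed address, but it offers no way to construct the value in place or to run code when it is relocated, which is the gap a C++-style move would have to fill.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

pub struct SelfRef {
    text: String,
    // Points at `text` above once initialised; null only inside `new`.
    text_ptr: *const String,
    // Opts out of `Unpin`, so a pinned value cannot be moved in safe code.
    _pin: PhantomPinned,
}

impl SelfRef {
    pub fn new(text: &str) -> Pin<Box<SelfRef>> {
        let mut boxed = Box::pin(SelfRef {
            text: text.to_owned(),
            text_ptr: std::ptr::null(),
            _pin: PhantomPinned,
        });
        let ptr: *const String = &boxed.text;
        // SAFETY: we only assign a field and never move the value out.
        unsafe {
            boxed.as_mut().get_unchecked_mut().text_ptr = ptr;
        }
        boxed
    }

    pub fn text_via_pointer(&self) -> &str {
        // SAFETY: values are only created pinned by `new`, so `text` has
        // not moved since `text_ptr` was set.
        unsafe { &*self.text_ptr }
    }
}

pub fn pinned_demo() -> bool {
    let a = SelfRef::new("hello");
    // This moves the Box (a pointer); the heap allocation stays put.
    let moved_handle = a;
    moved_handle.text_via_pointer() == "hello"
        && std::ptr::eq(moved_handle.text_ptr, &moved_handle.text)
}

fn main() {
    println!("{}", pinned_demo());
}
```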

SkiFire13 · July 21, 2023, 6:18am

TBH they are also made by the same authors (Google), probably with the goal of using it in autocxx from the start.

CAD97 · July 21, 2023, 8:53am

Neither are Google projects, they're projects where Google owns the code (because it was made by a Google employee, presumably at least partially on Google funded time)
