Interfacing D to Legacy C++ Code: A summary of a competing language's capabilities

Walter Bright at NWCPP: Interfacing D to Legacy C++ Code.

D is a systems language with a similar target user group to Rust. The general position on why they’re working on interfacing to C++ is best summed up with this line: “Because we keep hearing ‘I want to use D, but I’m trapped by my C++ code!’. … Having a C interface isn’t good enough.” Rust is, more than likely, going to hit the same complaints, so what D does in this space seems very relevant to Rust. A quick summary:

  • All efforts are geared toward just linking to C++ code: i.e. matching C++'s name mangling, layout and calling conventions.
  • D has to special-case its support on a per-compiler basis. Has extern (C++).
  • Only a few “magic” types such as a __c_long struct which has special mangling: long in C++ has a unique mangling, but no D equivalent. This is a byproduct of D having integral types which are similar to Rust’s: they have defined sizes.
  • Namespaces handled by extension to extern syntax: extern (C++, N.M) = “C++ declaration in the N::M namespace”.
  • Struct layout: already matches, due to common C ancestry.
  • Struct static member functions: fine.
  • Virtual functions: extern (C++) on a class, or inherit from IUnknown for COM classes.
  • Value/reference type duality for UDTs in C++: you have to pick one, as D is either/or (struct vs. class distinction).
  • Multiple inheritance: nope (actually, MI is OK if it’s interfaces)
  • C++ templates: oh dear. D treats this purely as a linking problem: replicate the templated declarations D-side (D templates are a superset of C++ templates), replicate the symbol’s mangled name, then link against existing code. Existing code doesn’t exist on the C++ side? What are you trying to call, then?
  • Linking against STL-using code is the goal. Proof-of-concept: successfully links against a call using std::vector<T>. It’s… a little scary, but it works.
  • Unsolved problem: const. This is because const in D is transitive, and it causes type names to mangle differently.
  • Unsolved problem: catching exceptions. Possible solution is to restrict to just the exception hierarchy used by STL. Audience comment: “People who throw ints deserve what they get.”
  • Project to build a C++ -> D interface generator using Clang exists.
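
To make the long and namespace points above a bit more concrete, here is a rough Rust sketch of Itanium C++ ABI mangling (the scheme GCC and Clang use; MSVC is entirely different) for the simplest possible case: free functions with builtin parameters. CxxType and mangle are made-up names for illustration only, and the sketch ignores substitutions, the std abbreviation, templates and everything else that makes real mangling hard.

```rust
/// A handful of C++ builtin parameter types (illustrative subset only).
#[derive(Clone, Copy)]
enum CxxType {
    Int,
    Unsigned,
    Long,
    LongLong,
}

impl CxxType {
    /// Single-letter Itanium ABI code for the builtin type.
    fn code(self) -> char {
        match self {
            CxxType::Int => 'i',
            CxxType::Unsigned => 'j',
            CxxType::Long => 'l',
            CxxType::LongLong => 'x',
        }
    }
}

/// Mangle a free function given its qualified path (namespaces, then the
/// function name) and its parameter types. ASCII identifiers assumed.
fn mangle(path: &[&str], params: &[CxxType]) -> String {
    let nested = path.len() > 1;
    let mut out = String::from("_Z");
    if nested {
        out.push('N'); // opens a nested (namespaced) name
    }
    for part in path {
        // Each component is length-prefixed: "foo" becomes "3foo".
        out.push_str(&part.len().to_string());
        out.push_str(part);
    }
    if nested {
        out.push('E'); // closes the nested name
    }
    if params.is_empty() {
        out.push('v'); // an empty parameter list is encoded as void
    } else {
        for p in params {
            out.push(p.code());
        }
    }
    out
}
```

With this, foo(long) comes out as _Z3fool while foo(long long) comes out as _Z3foox, even on targets where both are 64 bits wide, which is why a fixed-size integer type alone can't select the right symbol. N::M::foo() comes out as _ZN1N1M3fooEv.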

Personal thoughts for Rust:

  • This could be a killer feature for D. As far as I know, no language (which doesn’t just translate to C++) supports C++ interfacing in any non-insane manner. It would significantly reduce the barrier to incrementally replacing C++ with D code. I believe that if Rust wants to compete in this space, matching D’s capabilities here would be of considerable value.
  • Rust is not as well-off as D in some areas: D has the advantage of its template system actually being a superset of C++'s. It also has conventional OO features. Finally, it has a several-year head start.
  • Rust has two advantages I can think of: it doesn’t have the struct/class split that D does, and it has non-transitive *mut and *const (at least, I don’t think they’re transitive in Rust).
  • This isn’t a question of all-or-nothing. Clearly, interfacing with any possible C++ code is just complete lunacy. D has thus far demonstrated that there’s a decently-sized proportion of C++ code that can be bound to without needing to integrate a full C++ compiler.
  • With the Servo team looking to start adding components written in Rust into Firefox, this might be an excellent opportunity to look at exactly which features would be most useful.
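
On the *mut / *const point above: a small sketch suggesting that constness on Rust raw pointers is indeed per level rather than transitive. The function names are invented for the example; the parameter type *const *mut i32 is meant to line up with C++'s int* const*.

```rust
/// Stand-in for a C++ function taking `int* const*`: the outer pointer is
/// const, but the `int` it ultimately leads to is still writable.
///
/// Safety: `outer` and the pointer it holds must both be valid.
unsafe fn bump_through(outer: *const *mut i32) {
    unsafe {
        let inner: *mut i32 = *outer; // reading through the const level
        *inner += 1; // writing through the mut level
    }
}

/// Safe wrapper that builds the two pointer levels from a plain reference.
fn bump(value: &mut i32) {
    let inner: *mut i32 = value;
    let outer: *const *mut i32 = &inner;
    unsafe { bump_through(outer) };
}
```

Calling bump on a variable holding 41 leaves it at 42, so the const on the outer level did not propagate inward. That is the C++ behaviour too, which is what makes such signatures expressible without fighting a transitive const.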

Video index:

  • 00:30: C is the Lingua Franca.
  • 01:00: C Interop is trivial.
  • 01:55: C++ Interop? (HAHAHAHA).
  • 03:00: Problems with the above: name mangling, templates, SFINAE, namespaces, overloading, argument-dependent lookup…
  • 04:05: …overloading (again), virtual functions, exceptions, Koenig lookup, operator overloading, const.
  • 04:40: You’d have to build a whole C++ front end into the language.
  • 05:10: Or maybe not…
  • 05:20: You don’t have to compile C++, just have to link to it.
  • 06:05: D doesn’t have an analog of everything C++ has, so if we can be a bit plastic on both sides…
  • 06:30: Example: extern "C++" unsigned foo(char*& p);
  • 06:55: D/C++ basic type “equivalences” (sizeof long LOL).
  • 09:05: What about: extern "C++" void foo(long x);? (name mangling hates you).
  • 10:10: D’s solution to long (struct wrapper with forwarding alias + evil compiler mangling magic).
  • 13:30 ish: how D deals with no standard C++ ABI/name mangling (lots of compiler-specific tables and reverse engineering and madness).
  • 15:25: Unsolved problem: const (D’s const is transitive, C++'s isn’t).
  • 16:55: Struct layout matches C++ (already doing the same thing as C).
  • 18:15: Struct member functions: the same.
  • 18:25: Polymorphism (virtual functions) (object and vtable layouts different).
  • 20:05: D supports COM interfaces (just inherit from magic IUnknown).
  • 21:40: extern (C++) class C { ... } in D.
  • 22:05: Multiple Inheritance (run away screaming).
  • 22:40: Value or reference type? (structs/classes are different in D, but not in C++; pick one, can’t be both).
  • 24:15: C++ Namespaces.
  • 25:10: D name spaces.
  • 25:35: Extend C++ Declaration (add namespace as extern (C++, N.M) { void foo(); }).
  • 26:45: C++ Templates: SFINAE, partial ordering, dependent lookup, point of instantiation, primary template, template templates.
  • 27:25: (Psychotic screaming)
  • 27:55: Ignore All That: it’s just a name mangling problem. (D templates are a superset of C++ templates, so just fix mangling).
  • 28:25: Example. (Note: need an instantiation on the C++ side to link against).
  • 29:30: Template functions, too (with example and a doggie!).
  • 30:25: Now It’s Time To Justify My Existence!
  • 31:10: Interface to STL! (std::vector<T>)
  • 32:05: Example code.
  • 32:45: D-side code: vector.
  • 34:30: More D: allocator (so effort, much oww, such proof of concept, many works).
  • 37:00: Question: what about function objects, algorithms, etc.? “Probably can be done.”
  • 37:50: Question: if STL updates, do you have to update? “Yes. FML.”
  • 38:10: Question: can you explain the various parts of the example implementation? (Does so).
  • 40:35: Question: (inaudible) “Why bother with [allocator] the D side? Because of the default argument and mangling.”
  • 42:25: Question: (inaudible) “If there’s no object code on the C++ code, you can’t link against it (RE: macros I think)”.
  • 43:45: Question: (inaudible) “Yeah, it does!” (think it’s about mangling template default arguments).
  • 45:35: Biggest remaining problem: catching C++ exceptions (C++ throws by value, D throws by reference; no idea how to solve that).
  • 47:55: TLDR, no, wait, Question: (inaudible) “Andrei says we have to support exceptions. okay.jpg”.
  • 50:00: Question: is the problem just that you can’t catch exceptions that the C++ code throws but doesn’t catch? “D -> C++ -> D -> C++ throws; what happens? How does it work? You can’t explain that!”
  • 51:15: Question: (inaudible) “Don’t know yet.”
  • 51:25: Question: if you restricted exceptions to subclasses of exception, would that make it easier? “Yes.” Would that be sufficient? “Don’t know yet.” People who throw ints deserve what they get. (Restricting to STL’s exceptions might be sufficient and doable.)
  • 52:50: Question: (inaudible, something about exceptions unwinding through different contexts) “Question is, who gets main?” (something about C <-> C++) “Doable, but you have to be careful due to GC.”
  • 55:10: TL… Question: (inaudible) “What I’ve shown works.” (more) “Because we keep hearing ‘I want to use D, but I’m trapped by my C++ code!’.” (presumably asking about motivation for this) “Having a C interface isn’t good enough.”
  • 57:00: Someone working on “Calypso” which is based on Clang that will output a D interface for C++ code.
  • 58:10: Question: (inaudible) “Yes. If you’re interfacing with C++, those templates must have been instantiated on the C++ side.” (hands waving) “Yes, no…” (gist: that code relies on having a vector<float> ctor on the C++ side; if you don’t need the C++ bits, just write it in D. Perhaps asking about using STL containers in D without C++ code).
  • 60:00: Question: templates on C++ side that bind to D templates? “C++ code that’s sufficiently D-like (i.e. the DMD compiler) can be translated to D.”
  • 62:05: Question: (inaudible) “Well, who cares about iostreams?”
  • 63:00: Question: (inaudible) “D doesn’t support multiple inheritance, unless they’re interfaces.”
  • 63:45: Question: (inaudible, IO streams) “Didn’t like it then, don’t like it now.”
  • 64:30: Question: (inaudible) “There are multiple vtables for SI with interfaces. … It works with those, and the layout matches.”
  • 66:45: Comment: I use some multiple inheritance in C++, but all but one has no data. … Even in IO streams, it’s really only multiple inheritance of interfaces. “I’m not doing it.”
  • 67:55: Comment: (inaudible, about MI/interfaces) “It’s [MI] kind of an obsolete issue.”
  • 69:15: Question: (inaudible) “How do you debug a mixed D/C++ program? … Visual Studio just thinks the D code is C++ code, so you can debug it. Same with GDB, except it now has some debug support for D constructs.”
  • 71:10: Question: (inaudible) “Link-time optimisations. Unpopular: that’s the wrong way to build a compiler/linker system. Weds the two together. Unreasonable, unless Microsoft writes a D compiler. … Right way to do it: hand compiler all the source to your program, not one-file-at-a-time like C++. More practical and easier.”
  • 74:05: Question: (inaudible, something about compilation speed of D c.f. C++) “… when I designed D, I avoided things that make C++ slow to compile. It’s much faster. If you still run out of memory, you can hand the compiler a subset of source files.”
  • 76:05: TLDR: Can get pretty far, need to be flexible on both ends, interfaces to STL are not portable, requires non-trivial expertise.
  • 76:10: It’ll never be 100%, but it’s tractable, infinitely better than C wrappers, no longer locked into existing C++ code.
  • 76:15: Question: (inaudible) “It’s [D development] all volunteer work.”
  • 77:40: Question: do you expect a standard ABI for C++ any time soon? “I have no idea. I don’t care about it. The ABI is the C++ compiler you’re trying to be compatible with.”
  • 78:35: Question: what VCS do you use? “Git and GitHub.”
  • 81:20: The End.
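
For the struct layout point (16:55), the Rust counterpart is #[repr(C)], which requests C-compatible field ordering and padding. Below is a minimal sketch with an invented Record type; the numbers in the note after it assume a typical target where u32 is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

/// Laid out as the equivalent C struct
/// `struct Record { uint8_t tag; uint32_t value; uint16_t flag; };`
/// would be. Without `repr(C)` Rust is free to reorder the fields.
#[repr(C)]
struct Record {
    tag: u8,
    value: u32,
    flag: u16,
}

/// Byte offsets of the three fields, measured from the start of the struct.
fn field_offsets(r: &Record) -> (usize, usize, usize) {
    let base = r as *const Record as usize;
    (
        &r.tag as *const u8 as usize - base,
        &r.value as *const u32 as usize - base,
        &r.flag as *const u16 as usize - base,
    )
}

/// Total size and alignment of the struct.
fn record_layout() -> (usize, usize) {
    (size_of::<Record>(), align_of::<Record>())
}
```

On such a target the offsets are 0, 4 and 8, with a total size of 12 and alignment of 4: three bytes of padding after tag and two at the end, exactly as a C or C++ compiler would lay out the same standard-layout struct.
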

Thank you for this detailed summary! I need to study this carefully.

C++ FFI has come up in the past, with specific mention of D. For the past discussion, see RFC issue #602.

Templates are going to be pretty hard, because C++ templates (like D’s, Nim’s, etc.) are all ad-hoc rather than strongly typed (based on typeclasses) as in Rust. But you might be able to do the same trick whereby you convert templates into generics and then specify how to do the name mangling, at least for a “well-typed” subset of templates. I dunno how much existing C++ code could actually be made to fit into Rust’s trait system.
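
A very rough sketch of what that trick might look like for the most trivial case: a trait marks the "well-typed" subset of types and carries the mangling fragment, and a generic function builds the symbol of an existing C++ instantiation. Everything here (CxxScalar, instantiation_symbol, the mapping of i32 to int) is an assumption made up for illustration, it only covers Itanium-style mangling of template <class T> void name(T), and the instantiation still has to exist on the C++ side.

```rust
/// Types allowed as the template argument, with their Itanium ABI codes.
/// Assumes i32 = int, f32 = float, f64 = double on the target.
trait CxxScalar {
    const CODE: &'static str;
}

impl CxxScalar for i32 {
    const CODE: &'static str = "i";
}

impl CxxScalar for f32 {
    const CODE: &'static str = "f";
}

impl CxxScalar for f64 {
    const CODE: &'static str = "d";
}

/// Symbol name for an instantiation of `template <class T> void name(T)`.
///
/// Layout: `_Z`, length-prefixed name, `I <template args> E`, then the
/// return type (`v`) and the parameter, written as `T_`, i.e. "the first
/// template parameter".
fn instantiation_symbol<T: CxxScalar>(name: &str) -> String {
    format!("_Z{}{}I{}EvT_", name.len(), name, T::CODE)
}
```

So instantiation_symbol::<i32>("process") gives _Z7processIiEvT_, and the trait bound is what restricts callers to types the binding knows how to mangle. Anything involving SFINAE, partial specialisation and so on obviously falls outside a scheme this simple.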


See also: cxx2rust

I haven’t watched the video yet, so the following might be redundant with what’s in there, but I thought a bit recently about how this might best be done.

  • Edit: Just watched some of the video. As far as I can tell, the D solution relies on some separate C++ file doing specializations and manually translating the C++ declarations into D; I don’t like that at all. I think it would be best to use libclang to import C++ headers automatically (including generating code for inlines if necessary), with tweaks; see below.

  • Since C++ templates allow so much crap compared to Rust, but are a relatively small portion of code in practice, I think it would be best to have the user specify up front which template specializations they want to use, have libclang monomorphize them, change angle brackets to underscores (or have the user specify an alias), and have that be the end of it. The only real alternative would be hooking into the compiler during type checking to generate the specializations then, but that sounds awful.

  • Once you get rid of templates, there is no need for direct Rust compiler bindings. The binding generator could just spit out externs corresponding to the mangled versions of all accessible functions and methods, plus inline(always) Rust wrappers to call them with a more desirable syntax. You could do this either with a pure external tool, or a compiler plugin.

  • In the same file as the specializations, the user could also provide binding-like annotations, e.g. while the default Rust translation of a C++ method would presumably be unsafe and take *mut self, one could mark it as safe and taking a borrowed self. Maybe try to guess this based on basic analysis of the implementations (I think you could get pretty far with that, considering the sheer amount of basic getter/setter type stuff in most C++).

  • Overloaded functions would also be good to annotate; functions with the same number of arguments could be exposed to Rust with traits (eww), but not different numbers. This is not Rust’s fault; overloading sucks. It just causes trouble with bindings. One possibility would be to simply append the number of arguments to overloaded functions’ names, OpenGL style, and do the rest with traits; another would be to expect the user to provide saner names using the annotation file; another would be to make every binding take a single tuple argument, and impl traits on that. Defaults are similar, though I really wish Rust had native support for those…

  • Classes would be handled by brute force - every class is a trait, and every concrete class is a struct with explicit impls for its trait and those of its superclasses.

  • You couldn’t implement a subclass directly in Rust because there would be no way to express a good C++ API directly in Rust, but the binding generator could work in reverse too. Perhaps the annotation file could contain arbitrary C++ code, which would mostly be passed directly to the compiler, but with an added construct like

      virtual void foo() override = impl;
    
    which would generate a Rust trait, and always_inline C++ code to call into it. Or something. Would be nice to not do two virtual calls, but I don't think a code generator would be able to easily tell whether Rust actually implemented a function with the right signature...
    
  • Some stuff would still be ugly. No way around it: idiomatic C++ and Rust look very different. But it would still be miles better than having to manually create C bindings; much easier to have Rust burrow deep into a codebase and slowly take over from inside. ;p

  • Incidentally, I’ve been using Swift a bit recently. swiftc binds to C and Objective-C, not C++, but it’s somewhat comparable - Objective-C code doesn’t really feel “right” from a Swift perspective (no structs, no generics, can sometimes require awkward syntax to deal with enums, out pointers, etc.), but Swift benefits enormously from the compatibility. And it feels fairly magical to be able to just type C declarations/inlines into a .h file and instantly use them from Swift. (While I’ve used rust-bindgen on various occasions, I think I get this feeling with Swift more than Rust, in large part because calls to foreign functions are particularly easy to screw up, and the IDE helps get the types right. Just saying. On the bright side for Rust, when you do screw up, swiftc error messages are awful. ;p)

  • As a fairly simple request, it would be nice to have a way to link LLVM IR files directly into a Rust rlib. In addition to C++, I was also interested in the idea of Swift bindings - I’ll probably never get to that project, but both languages can generate LLVM IR, and bindings would be much improved if Rust could easily do LTO with that IR.
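
To illustrate the "single tuple argument plus traits" idea for overloads from the list above, here is a minimal sketch. It assumes three hypothetical C++ overloads, void draw(int), void draw(int, int) and void draw(double); instead of actually linking against them, each impl just reports the Itanium-mangled symbol a generated wrapper would call. DrawArgs and draw_symbol are invented names.

```rust
/// One impl per C++ overload; the tuple type selects the overload.
trait DrawArgs {
    /// Mangled name of the C++ overload this argument tuple maps to.
    fn symbol(&self) -> &'static str;
}

// void draw(int)
impl DrawArgs for (i32,) {
    fn symbol(&self) -> &'static str {
        "_Z4drawi"
    }
}

// void draw(int, int)
impl DrawArgs for (i32, i32) {
    fn symbol(&self) -> &'static str {
        "_Z4drawii"
    }
}

// void draw(double)
impl DrawArgs for (f64,) {
    fn symbol(&self) -> &'static str {
        "_Z4drawd"
    }
}

/// The single Rust-side entry point. A real binding would call the extern
/// function behind the symbol rather than return its name.
fn draw_symbol<A: DrawArgs>(args: A) -> &'static str {
    args.symbol()
}
```

Here draw_symbol((1i32,)) resolves to _Z4drawi and draw_symbol((1i32, 2i32)) to _Z4drawii, so both differing arity and differing types are handled by ordinary trait resolution, at the cost of the slightly awkward tuple call syntax.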


I think to handle C++ class inheritance it might make sense to wait until we’ve figured out whatever we’re going to do with “virtual struct” (i.e. the inheritance proposal). The issue keeps coming up and it’s worthwhile to consider at least some measure of C++ compatibility as a possible goal.


I’d argue that it should be considered as part of what’s to be done about inheritance, along with COM compatibility, though I suspect that’s already the idea.

Walter focuses on interfacing on the link level, because he outright dismisses having a C++ compiler in his D compiler. However, Rust is using LLVM already. To me using clang as an optional dependency for interfacing with C++ (perhaps on LLVM IR level?) doesn’t sound too bad.


D's current prototype is

... a fork of LDC [the LLVM D compiler] which allows you to directly import/include C++ header files and use the declarations from within D.

What's hard is bridging to C++ without becoming a superset of C++ semantics.

Achieving this would provide a path to freedom for programmers stuck with C++ libraries, even if it only worked for the core C++ features.

