Request for an RFC: pimpl for Rust

Nitpick: The pimpl pattern per se is suboptimal and is basically a hack to work around a limitation of C++. Specifically:

  • It's suboptimal from a performance perspective because there's an unnecessary double indirection. The caller of next_token passes a pointer to lexer as this, and the lexer in turn contains a pointer to the lexer_impl.
  • It's suboptimal from an ergonomics perspective because you need to define two structs and write a wrapper for each method.
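To make both costs concrete, here is a rough sketch of what a hand-written pimpl might look like if transliterated into Rust. All of the names (Token, LexerImpl, Lexer) and the whitespace-splitting "lexing" are placeholders I made up for illustration, not anything from the original proposal.

```rust
// Hypothetical names throughout; the "lexing" is just whitespace splitting.

#[derive(Debug, PartialEq)]
pub enum Token {
    Word(String),
}

// The "impl" struct: holds the real state and the real logic.
struct LexerImpl {
    words: Vec<String>,
    pos: usize,
}

impl LexerImpl {
    fn next_token(&mut self) -> Option<Token> {
        let word = self.words.get(self.pos)?.clone();
        self.pos += 1;
        Some(Token::Word(word))
    }
}

// The public wrapper: its layout is a single pointer, whatever LexerImpl contains.
pub struct Lexer {
    inner: Box<LexerImpl>,
}

impl Lexer {
    pub fn new(src: &str) -> Lexer {
        let words = src.split_whitespace().map(String::from).collect();
        Lexer { inner: Box::new(LexerImpl { words, pos: 0 }) }
    }

    // Forwarding wrapper: `self` points to Lexer, which in turn points to LexerImpl.
    pub fn next_token(&mut self) -> Option<Token> {
        self.inner.next_token()
    }
}
```

Two structs, one forwarding method per real method, and two pointer hops to reach the state when a caller holds a &mut Lexer.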

The limitation in question is that C++ requires you to put a class's methods and its fields (including private ones) in the same class definition. In C, there are no methods, but you can write:

struct lexer;
struct lexer *lexer_new(void);
token lexer_next_token(struct lexer *lexer);

This uses only one struct and no wrapper functions. And there's no double indirection: the caller passes a pointer directly to the implementation struct.

Users of the header don't see the fields of struct lexer. They don't know its size, so they can't allocate lexers on the stack or evaluate sizeof(struct lexer). But they can still reason about pointers to struct lexer, and call functions like lexer_new that do the allocation for them. This is somewhat similar to the concept of an unsized type in Rust.
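For comparison, here is a minimal sketch of how the same single-pointer API could be emulated in Rust today with an opaque handle and raw pointers. The names mirror the C header above, but LexerState, lexer_free, and the word-splitting logic are my own assumptions for illustration; tokens are plain Strings to keep it short.

```rust
// Hypothetical sketch: an opaque handle type plus free functions, as in the C header.

/// Opaque handle seen by clients. It carries no data; it only exists so that
/// `*mut Lexer` is a distinct pointer type.
#[repr(C)]
pub struct Lexer {
    _opaque: [u8; 0],
}

/// The real state, private to the defining module.
struct LexerState {
    words: Vec<String>,
    pos: usize,
}

/// Allocates the state on the heap and hands out a pointer to it directly.
pub fn lexer_new(src: &str) -> *mut Lexer {
    let state = LexerState {
        words: src.split_whitespace().map(String::from).collect(),
        pos: 0,
    };
    Box::into_raw(Box::new(state)) as *mut Lexer
}

/// Returns the next whitespace-separated word, or None at end of input.
///
/// # Safety
/// `lexer` must come from `lexer_new` and must not have been freed.
pub unsafe fn lexer_next_token(lexer: *mut Lexer) -> Option<String> {
    // The pointer really points at a LexerState, so cast it back.
    let state = unsafe { &mut *(lexer as *mut LexerState) };
    let word = state.words.get(state.pos)?.clone();
    state.pos += 1;
    Some(word)
}

/// Frees a lexer created by `lexer_new`.
///
/// # Safety
/// `lexer` must come from `lexer_new` and must not be used afterwards.
pub unsafe fn lexer_free(lexer: *mut Lexer) {
    drop(unsafe { Box::from_raw(lexer as *mut LexerState) });
}
```

This gets the single indirection, but at the price of unsafe code and manual freeing. And unlike the C version, nothing stops a client from evaluating size_of::<Lexer>(); it just gets 0, which is not the real size.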

Translating this into Rust, perhaps you could write something like

#[publicly_unsized]
struct Lexer {
    // fields...
}

For code outside the crate, the compiler would not treat Lexer as implementing Sized, so you could not call size_of::<Lexer>() or use Lexer as a generic type argument where Sized is required. You might also be forbidden from passing or returning Lexer by value, as is the case for the C equivalent, although it might be possible to allow it as part of unsized_locals (it would translate to a dynamically-sized stack allocation).
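The restrictions would presumably look like the ones that already apply to built-in unsized types. As a rough illustration using str as a stand-in for a #[publicly_unsized] Lexer (the helper names byte_len, static_size, and demo are made up):

```rust
use std::mem::{size_of, size_of_val};

// Accepts unsized types only because of the explicit `?Sized` opt-out.
pub fn byte_len<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}

// Requires `T: Sized` implicitly, as every unannotated type parameter does.
pub fn static_size<T>() -> usize {
    size_of::<T>()
}

pub fn demo() -> (usize, usize) {
    let s: &str = "lexer";
    // Fine: `str` is only ever handled behind a reference.
    let dynamic = byte_len(s);
    // Fine: `u32` is Sized.
    let fixed = static_size::<u32>();
    // Neither of these would compile, because `str` does not implement `Sized`:
    // let bad = static_size::<str>();
    // let also_bad: str = *s;
    (dynamic, fixed)
}
```

Client code would have to opt in with ?Sized wherever it wanted to be generic over such a type, and would only ever hold it behind a reference, Box, or similar pointer.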

In return, this would guarantee that code generation in client crates does not depend on the layout of Lexer (except for the offsets of public fields, if any are present), so private fields could be arbitrarily added, removed, or changed without requiring client crates to be recompiled.

This would be a superset of the functionality proposed in the original post. If, say, you wanted to expose a sized type so client crates wouldn't have to care about unsized types, you could still have a wrapper struct containing a pointer to an inner "impl" struct; you would just mark the inner struct as #[publicly_unsized].

Another alternative

We could just bite the bullet and have the equivalent of header files. Something like:

pub extern struct Lexer;

impl Lexer {
    pub fn new(src: String) -> Box<Lexer>;
    pub(crate) fn next_token(&mut self) -> Option<Token>;
}

...where the layout of Lexer, and the implementations of the methods, would be in a separate file.

That would maximize the programmer's control over the compilation firewall, as well as the compiler's ability to perform separate compilation. But... supposedly Rust doesn't need header files. :wink:
