Consider shipping libclang with Rust toolchain?

The issue I see with shipping libclang with the Rust toolchain is the instability of the C interface: upgrading your rustc might break your compilation because the shipped libclang was upgraded too and your bindgen doesn't support it. The big difference from an Ubuntu upgrade that updates libclang and breaks the build is that the latter falls under the responsibility of the user.

The optimal solution would be to rewrite the C/C++ parser in Rust but of course that's a huge amount of work, will introduce bugs, etc.

The libclang C interface is considered to be fairly stable, so I'm not sure upgrades will actually be a major issue.

However, the C++ libtooling API that I would like to use is unstable (requiring maybe 3-5 #ifdef changes per major version), so it would necessitate monitoring the toolchain and preparing for upgrades as they happen. This doesn't affect users using bindgen as a CLI tool, since it would be statically linked and not change until rebuilt, usually when being upgraded. It might be more of an issue for library users, as the library is rebuilt regularly. We could prefer system libraries if present and fall back to the toolchain libs only if necessary, which would be strictly better than the current situation.
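To make the "prefer system libraries, fall back to the toolchain" idea concrete, here is a minimal sketch in Rust. Everything in it is hypothetical: `pick_libclang`, the `Source` enum, and the directory layout are illustration only, not how bindgen or clang-sys actually locate libclang. The existence check is passed in as a closure so the lookup order can be exercised without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Where a libclang candidate was found (hypothetical, for illustration).
#[derive(Debug, PartialEq)]
enum Source {
    System,
    Toolchain,
}

/// Return the first existing `lib_name`, trying every system directory in
/// order and only then the directory shipped with the Rust toolchain.
/// `exists` is injected so callers can use `Path::exists` or a fake.
fn pick_libclang(
    system_dirs: &[PathBuf],
    toolchain_dir: &Path,
    lib_name: &str,
    exists: impl Fn(&Path) -> bool,
) -> Option<(Source, PathBuf)> {
    // System installations win, so existing setups keep their behaviour.
    for dir in system_dirs {
        let candidate = dir.join(lib_name);
        if exists(&candidate) {
            return Some((Source::System, candidate));
        }
    }
    // Fall back to the toolchain copy only if nothing else was found.
    let candidate = toolchain_dir.join(lib_name);
    if exists(&candidate) {
        return Some((Source::Toolchain, candidate));
    }
    None
}
```

With a real filesystem the last argument would simply be `|p| p.exists()`. The point of the ordering is that users with a working system libclang see no change, while users without one get the bundled copy instead of a build failure.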

The optimal solution would be to rewrite the C/C++ parser in Rust but of course that's a huge amount of work, will introduce bugs, etc.

Yep, this would be a herculean task as the frontend parser for C/C++ is insanely complex and a moving target. Getting to exact compatibility with existing compilers is a necessity, which took Clang something like 10 years with a large development team. Unfortunately there's only a handful of options if you want to parse real-world C/C++, and clang is certainly the friendliest.

Unless I'm mistaken, bindgen supports every version of libclang from 6.0 onwards, which is a pretty wide range of support. So, at the very least bindgen doesn't view the API as too unstable. Additionally, if libclang is shipped with the Rust toolchain, bindgen can start to specifically support the versions of libclang shipped with the Rust toolchain.

I'm basing the 6.0 claim on bindgen's activation of the clang_6_0 feature in clang-sys: https://lib.rs/crates/bindgen

As bindgen increases its MSRV regularly, it doesn't need that much backward compatibility. In other words, on older rustc versions bindgen wouldn't compile anyway. The main case I'm worried about is forward compatibility: an old bindgen used with a new libclang/rustc combination and failing because of the libclang.
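The forward-compatibility worry can be sketched as a version gate. This is a hypothetical illustration, not something bindgen is known to do: `parse_clang_major`, `check_compat`, and the `Compat` enum are invented names, and the code assumes the version string has the usual "clang version X.Y.Z" shape, possibly with a vendor prefix.

```rust
/// Extract the major version from a string such as "clang version 17.0.6".
/// Assumes the number follows the literal text "version ".
fn parse_clang_major(version: &str) -> Option<u32> {
    let rest = version.split("version ").nth(1)?;
    let major: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    major.parse().ok()
}

/// Result of comparing a libclang version against what a tool was tested with.
#[derive(Debug, PartialEq)]
enum Compat {
    Supported,
    TooOld,
    /// The forward-compatibility case: libclang is newer than anything tested.
    NewerThanTested,
    Unknown,
}

/// Classify `version` against an inclusive range of tested major versions.
fn check_compat(version: &str, min_major: u32, max_tested_major: u32) -> Compat {
    match parse_clang_major(version) {
        None => Compat::Unknown,
        Some(m) if m < min_major => Compat::TooOld,
        Some(m) if m > max_tested_major => Compat::NewerThanTested,
        Some(_) => Compat::Supported,
    }
}
```

An old tool could warn on `NewerThanTested` rather than fail outright, which would make the breakage visible instead of surprising. Whether a newer libclang actually breaks anything depends on how stable the C interface really is.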

Since I made my comment above, I've seen this statement in the LLVM docs (link):

LibClang is a stable high level C interface to clang.

But I'm still not sure what "stable" means in this context. The issue is what happens if they interpret it differently, or abandon it.

As nice as this sounds in theory, I can't see it actually being useful for projects like Firefox or Chromium, which care so much about the compiler version they use that they build and distribute their own toolchains. If you're using bindgen, then presumably you need it to generate correct output, meaning that the way it interprets the C code must exactly match the way the C compiler used by the rest of your project interprets it. I'm not sure how you'd square that with a libclang shipped with Rust without making those users build their own custom Rust toolchain whose bundled libclang matches the version of clang they want. (Last I knew, Firefox was using stock upstream Rust binaries.)

It would certainly be nice if clang and libclang weren't treated as separate things, and if it were easier to use whatever version of clang you already compile with as a library as well, but that feels like more of a question for the LLVM project.
