Solve `std::os::raw::c_void`

The types std::os::raw::c_void and libc::c_void are incompatible. (They are incompatible by being different types). The other std::os::raw types don’t have this problem since they are just type aliases.

It’s a minor issue that savvy users will brush away with a simple `as *const _` cast or similar, but since the c_void type(s) will be used extensively in the Rust ecosystem, including for cross-crate interoperability, hopefully a solution can be found.
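
To make the incompatibility concrete, here is a minimal sketch. The modules `std_like` and `libc_like` are hypothetical stand-ins for two crates that each define their own `c_void`; they are not the real definitions. It shows that the two types are distinct and that a pointer cast papers over the mismatch.

```rust
use std::any::TypeId;

// Hypothetical stand-in for std::os::raw: defines its own opaque c_void.
mod std_like {
    #[allow(non_camel_case_types, dead_code)]
    #[repr(C)]
    pub struct c_void {
        _private: [u8; 0],
    }
}

// Hypothetical stand-in for the libc crate: a second, separate definition.
mod libc_like {
    #[allow(non_camel_case_types, dead_code)]
    #[repr(C)]
    pub struct c_void {
        _private: [u8; 0],
    }
}

// A function from a library built against the "libc" definition.
fn takes_libc_ptr(p: *const libc_like::c_void) -> usize {
    p as usize
}

// A caller holding the "std" definition has to cast.
fn pass_through_cast(p: *const std_like::c_void) -> usize {
    // `takes_libc_ptr(p)` would be rejected with a mismatched-types error.
    takes_libc_ptr(p as *const _)
}

// The two definitions are different types as far as the compiler is concerned.
fn same_type() -> bool {
    TypeId::of::<std_like::c_void>() == TypeId::of::<libc_like::c_void>()
}

fn main() {
    let x = 7u32;
    let p = &x as *const u32 as *const std_like::c_void;
    assert_eq!(pass_through_cast(p), p as usize);
    assert!(!same_type());
}
```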

There’s already a rust issue for this: #31536

A user showed up with a real library incompatibility due to this yesterday. IRC log from #rust

Alternatives for Solution

  1. Deprecate std::os::raw::c_void. It is actively harmful to the ecosystem, which is serious enough that deprecation of a stable feature should be on the table.
  2. Move the definition of c_void to libcore. Both libstd and libc reexport the same type, so they are compatible again.
  3. Reexport libc::c_void as std::os::raw::c_void
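
Options 2 and 3 both rely on re-exports producing one shared type. A minimal sketch of that idea, with `core_like`, `std_like` and `libc_like` as hypothetical stand-in modules rather than the real crates:

```rust
use std::any::TypeId;

// Hypothetical stand-in for libcore: the single definition of c_void.
mod core_like {
    #[allow(non_camel_case_types, dead_code)]
    #[repr(C)]
    pub struct c_void {
        _private: [u8; 0],
    }
}

// Both "std" and "libc" re-export the same item instead of defining their own.
mod std_like {
    pub use super::core_like::c_void;
}
mod libc_like {
    pub use super::core_like::c_void;
}

// A function from a library written against the "libc" path.
fn takes_libc_ptr(p: *const libc_like::c_void) -> usize {
    p as usize
}

// A caller using the "std" path needs no cast: both paths name one type.
fn no_cast_needed(p: *const std_like::c_void) -> usize {
    takes_libc_ptr(p)
}

fn same_type() -> bool {
    TypeId::of::<std_like::c_void>() == TypeId::of::<libc_like::c_void>()
}

fn main() {
    let x = 1u8;
    let p = &x as *const u8 as *const std_like::c_void;
    assert_eq!(no_cast_needed(p), p as usize);
    assert!(same_type());
}
```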

Risks of deprecation

  1. Leaves the issue unsolved in a sense, since the type still exists
  2. Deprecation of stable features leaves a wart

Risks of reexport from libcore to std and libc

  1. Requires a version bump of libc. However, libc 0.1’s c_void is already incompatible with libc 0.2’s c_void, and so on for each major version. libc will bump its version at some point anyway.

Risks of reexport from libc to std

  1. ?

Risks if left as status quo

  1. libstd contributes to ecosystem friction, not unification

I actively switched several libraries (including gl) from libc::c_void to std::os::raw::c_void after the libcpocalypse.

As a reminder, the libcpocalypse was exactly that: some libraries were using *const libc_version_0.1::c_void in their signatures, while others were using *const libc_version_0.2::c_void.

Please don't do this. winapi takes advantage of this by using std::os::raw::c_void as its c_void, which means that the RawHandle of std's AsRawHandle is the same type as winapi's HANDLE. Even if I do eventually make winapi #![no_std] in a future major version, I'll still provide a cargo feature to enable this std compatibility.

I think the whole std::os::raw module should also be available in libcore.

C compatibility is an important part of Rust, and even has its own syntax (extern "C"). It’s unfortunate that it’s currently not really possible to talk to C APIs without adding a dependency on either std or libc.

I'll repeat what I said in #31536:

I think there is a good case for moving these types into libcore to support FFI with freestanding C code which does not use a libc. This is mainly relevant for embedded and/or kernel code, which will be using no_std and will not be able to use the libc crate.

This will also help with removing libstd's dependency on libc on Windows, since most remaining uses of libc in libstd/sys/windows are just the basic C types.

The libs team discussed this issue during triage, and the conclusion that we reached was that we’d actually just prefer to jettison these entirely from the standard library. Along those lines, we are toying with the idea of deprecating std::os::raw entirely in favor of just using the libc crate.

We’ve already accepted RFC 1415 which deprecated all of std::os::$platform::raw, so this would in theory just be an extension of that. Note that deprecation here is just that, deprecation, no removal would ever take place until a theoretical Rust 2.0.

There are currently no public interfaces in the standard library that continue to export these types. After RFC 1415 all OS extension traits were switched over to concrete u64 etc types rather than returning some C equivalent. In that sense, these types are really only available for consumers, not for the standard library itself.

We do realize, however, that there was a recent shift in parts of the ecosystem away from libc and towards the types in the standard library, and this would essentially be reversing that. The libs team, however, believes that FFI really should be done through libc, the crate dedicated to this purpose. The standard library in theory wants to never tie itself to C in any way, shape, or form to allow for a maximal number of reimplementations on top of whatever underlying runtime/kernel/etc.

A change such as this would of course require an RFC, but we wanted to try testing the waters here first. What do others think of a strategy like this? Is it perhaps too radical?

This sounds good, however I still think there is a case for making C types available without linking to the C standard library. This could be done by splitting off the types of libc into a separate libctypes crate which doesn’t link to the C library. These types would then be re-exported from the libc crate to maintain API compatibility. With this setup, libctypes becomes the central source of “truth” for all C-related types.

I think it is important to not conflate the system ABI with libc.

The C ABI is the most widely used ABI on most platforms, even for programs not written in C. ABI means more than just calling conventions. The FFI abstractions (in particular the c_* types and zero-terminated c_char[]) are necessary to communicate with other parts of the system such as libraries, or the kernel.

libc on the other hand is just a bunch of utility functions wrapping OS functionality.

For example, a future Linux-syscall based std will still rely on the FFI abstraction, but not on libc.

I don’t really have an opinion about whether FFI should be in std or a separate crate, but it should definitely not go into libc.

Please, no!

The standard library should expose all types that allow libraries to communicate with each other. If you look at any library’s public API, it is made of String, Vec, Iterator, etc. and almost never exposes types from a third-party library. In fact exposing types from third-party libraries in a public API has been the major cause of library breakages since Rust 1.0. If c_void gets removed, that will leave a huge hole in this domain.

As it was said in one of the issues, the Rust language already has first-class support for C through the extern keyword and through the improper_ctypes lint. I don’t see why c_void would be removed in the name of purity when these other mechanisms exist in the core language.

Thank you libs team.

I think deprecating all the types is going to be hard to sell to the community. Deprecating just c_void is easier, since we are solving a real bug.

I would be happier with a deprecation of c_void than no action at all. Stable deprecations suck, but this is exactly where they should be used, when the API item is actively confusing or harming users.

But, I would prefer the move to libcore solution. It breaks no stable rust and it’s a pragmatic solution. I think it’s a better engineering solution.

Will C type aliases be mis-defined for some platform, and then changed again in a later version? I think that this will happen somewhere in the long history of Rust. Would a libc crate do the fix in a minor version, or would it be a major version bump? I think a major version bump is unreasonable in an ecosystem integration crate; in fact, the version bump causes more disruption through version disparity than a single breaking bug fix would cause through breakage. It’s a hard choice for libc.

It is then much better if the basic types are defined in libcore and tied to the Rust version, instead of a crate version. libcore and Rust are privileged to be participating in builds in only one unique version, unlike crates, so we have no conflicts among types defined by them.

The solution could also be more minimal. libcore could provide a single void type that is intended for c_void use, without providing the rest of the type aliases.

I’m definitely opposed to removing the basic C types from std. There is a difference between the C ABI and the C standard library + friends. Someone can interface with C code and write entire programs without any interaction with the C standard library, Windows being one such example. std::os::platform::raw is specific to interfacing with libc stuff. std::os::raw applies to any interaction with C, even that which doesn’t touch libc.

If std::os::raw::c_void were deprecated, what about RawHandle from AsRawHandle since that too is a *mut std::os::raw::c_void? It definitely can’t be changed to be a pointer to some other type, as that would break a lot of code out in the wild that uses the handle from AsRawHandle in other winapi functions without casting it. If you do end up making the poor decision of deprecating the std::os::raw types, I’d sooner define them myself in winapi than depend on libc, resulting in yet another incompatible c_void.

I agree with the above posters. The C types should be in libcore so that libcore-based, non-libc-based libraries can use the FFI correctly, and so that they work correctly and uniformly in libstd-based and libcore-based programs. Nobody should have to depend on the libc crate unless they actually use libc (call a function or whatnot in libc).

Thanks for the comments everyone! It sounds like the main objection to deprecation is that there's a use case for using these type definitions without using the libc crate in the sense that it's bringing in some C runtime library. Along those lines I'd like to address a few points:

  • I don't think that std::os::raw has anything to do with ABI. I don't think the C ABI defines what it means for a function to accept a value of type int, but rather what happens when a function receives a 32-bit integer or 64-bit integer. In that sense these types truly are only in existence for compatibility with the C language itself. FFI with any other language (or even C!) could use u32 or other similarly concretely sized types just fine.

  • It sounds like there's no technical reason why there can't be some crate on crates.io providing the fundamental C types for various platforms. Conventionally there's a question as to whether they belong in the standard library or not, but technically they can go anywhere. This library would likely be tagged with #![no_std], of course.


I think this is a bit mis-representative of the current ecosystem and is also missing a few critical points. It's not an antipattern to export another crate's types in your API. For example all *-sys crates likely export libc types in their API. Additionally, many crates provide an implementation of rustc-serialize traits as another example.

The crux here is that there are some dependencies which are widely shared throughout the ecosystem for types to pass between API layers. The standard library is currently at the core here (as there's only one), but other very core crates like libc and rustc-serialize serve this purpose as well.

Essentially, this is just the same issue that @bluss opened this thread for. There is no canonical definition for the core C types, and there is obviously a desire to have one. Whether or not this is in the standard library or in an external crate I believe does not pose a conventions or technical issue either way. If the os::raw module is deprecated, it will most certainly be done in favor of a shared and canonical definition of these types.


@bluss

I agree that only deprecating c_void is likely easier to swallow, but I would personally like to be more aggressive here. We've already been bitten with wrong definitions of C types in the standard library, and in some sense there is no definition of C types. We fundamentally cannot provide one definition of c_char, for example, on ARM due to compilers implementing flags like -fsigned-char.
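
As an illustration of why c_char is awkward to pin down, the sketch below avoids assuming a signedness. The helper names are hypothetical; the only fact relied on is that `std::os::raw::c_char` is an alias for either `i8` or `u8`, depending on the target.

```rust
use std::os::raw::c_char;

// True when this target's c_char alias is the signed 8-bit type.
fn c_char_is_signed() -> bool {
    c_char::MIN != 0
}

// Converting with `as` works for either signedness, so code written
// this way does not need to know which alias is in effect.
fn byte_to_c_char(b: u8) -> c_char {
    b as c_char
}

fn c_char_to_byte(c: c_char) -> u8 {
    c as u8
}

fn main() {
    println!("c_char is signed on this target: {}", c_char_is_signed());
    // The byte value survives a round trip regardless of signedness.
    assert_eq!(c_char_to_byte(byte_to_c_char(0xFF)), 0xFF);
}
```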

These sorts of questions, in my opinion, are simply just best handled in an external crate which is more flexible than the standard library. I disagree that tying these to some version of Rust is a benefit because you're guaranteed "one unique libcore". To me that's actually a downside because it means that iteration time is slower and it's impossible to duplicate, whereas in some situations having multiple libc versions can be legitimate.


It's true that is an interesting question. Similarly we return c_char from the CString type (now that I think about it). It may be the case that this forces our hand one way or another, but I would prefer to select where we want to end up and then we can figure out solutions for problems like this.

I do think it's an antipattern. The fact that many crates do it wrong doesn't convince me that it's right.

The problem is that right now if I'm not mistaken there is only one type available in core/stdlib that is suitable for representing raw pointers to objects of undefined size, and that's c_void. See rust/src/libstd/os/raw.rs at 8d2d2be6c61c17da8027a72da91f87a0e2487f74 · rust-lang/rust · GitHub.

If I understand correctly *const () is bad, while *const u8 would work (I think) but lets you think that the data is a u8.
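
For context, this is the kind of use being described: an opaque "user data" pointer whose pointee type the receiving API does not know. A minimal sketch, where `add_one`, `call_with` and `bump` are hypothetical names and the callback is implemented in Rust rather than C so the example stays self-contained:

```rust
use std::os::raw::c_void;

// A callback with a C-style signature: it receives an opaque pointer
// and must know out of band what it really points to.
extern "C" fn add_one(user_data: *mut c_void) {
    // SAFETY: the caller promises `user_data` points to a live i32
    // that nothing else is accessing during this call.
    let n = unsafe { &mut *(user_data as *mut i32) };
    *n += 1;
}

// Stand-in for an API that only forwards the opaque pointer.
fn call_with(cb: extern "C" fn(*mut c_void), data: *mut c_void) {
    cb(data)
}

fn bump(mut value: i32) -> i32 {
    // *mut c_void says "pointer to something" without suggesting a byte buffer.
    call_with(add_one, &mut value as *mut i32 as *mut c_void);
    value
}

fn main() {
    assert_eq!(bump(41), 42);
}
```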

This problem is not related to C, but to ABIs in general.

There are technical reasons. Every time there is a type or trait that is used for interoperating between crates, all interoperators need to use the same major crate version.

We are struggling with this in the ecosystem right now, I'm sure we will come up with a solution. Please don't wave this concern away. (And this concerns me in the crates I develop both when I depend on traits and when I make such available. In fact, most of my experience has nothing to do with libc at all.)

Example situation:

  1. My Library1 wants to allow serializing its types, so it depends on serde = 0.6 (optionally, probably)
  2. Serde 0.7 is released
  3. Library2 integrates with serde = 0.7
  4. MyApp wants to use both Library1 and Library2
  5. Library1 needs to upgrade its serde version

Should Library1 use a major version bump when it upgrades an optional supporting dependency? If you say yes, the above scenario repeats itself recursively with the replacement of serde with Library1. All dependencies on Library1 now have interoperability problems. Library1 0.1's traits are not the same as Library1 0.2's traits! Those that use Library1 for interoperation (traits maybe), now have their subecosystem split into the parts that use Library1 0.1 and Library1 0.2.

This is an example why I think @tomaka is rightly very careful with non-standard types and traits. It requires a lot of tinkering to make sure everyone is on the same version.

Any crate that exposes types or traits used for interoperation should version bump as seldom as possible to limit this.

For example, if num were to release a 0.2 version soon, we would have interoperability problems between those using num 0.2's Float trait and those old useful crates that are stuck on num 0.1's Float -- even if the trait is only ever implemented by f32, f64!

Edit: My post is maybe not so on topic for the libc decision, but it explains what problems I see in the rust ecosystem at this point. These arguments elaborated there deal with the concerns of large crates. If you write a "FancyRng" crate, your single concern is Rngs, and it's very logical to version bump to follow the rand crate's changes if you need to.

If you have a big enough crate, you want to integrate with many small moving parts. I think serde is a good example, since it's widely used, in quick development, and it's almost never the core part of a major user. It's a supporting feature. Using libc is of course the same way.

If we respond by bumping the major version whenever one of these integration crates changes (libc, serde, num, whatever they may be), then we are just amplifying the problem by ensuring there are even more separate major versions of crates.

From x86_64 SysV abi page 12, Figure 3.1: Scalar Types:

type     | c    | sizeof | alignment (bytes) | AMD64 Architecture
...
integral | int  | 4      | 4                 | signed fourbyte
...

In short: what int means is part of the ABI. Of course, just being part of the ABI doesn't draw an immediate conclusion (perhaps we don't need core to know about the whole ABI), but:

Not without manually encoding the ABI themselves (by using some #[cfg] to pick the appropriate fixed size type). I suppose that could happen in an external crate, as long as libc is not that crate (as has been noted previously, the ABI exists independently of the library).
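
A rough sketch of what "manually encoding the ABI" with #[cfg] could look like. The alias names `my_c_int` and `my_c_long` are hypothetical, and the rule used for long is a simplification that matches common targets (64-bit non-Windows versus everything else); a real crate would need more cases.

```rust
use std::mem::size_of;

// Hypothetical alias: int is 32 bits on the common targets considered here.
#[allow(non_camel_case_types)]
pub type my_c_int = i32;

// Simplified assumption: long is 64 bits on 64-bit non-Windows targets...
#[cfg(all(target_pointer_width = "64", not(windows)))]
#[allow(non_camel_case_types)]
pub type my_c_long = i64;

// ...and 32 bits everywhere else (32-bit targets and Windows).
#[cfg(not(all(target_pointer_width = "64", not(windows))))]
#[allow(non_camel_case_types)]
pub type my_c_long = i32;

fn long_bytes() -> usize {
    size_of::<my_c_long>()
}

fn main() {
    println!("my_c_int:  {} bytes", size_of::<my_c_int>());
    println!("my_c_long: {} bytes", long_bytes());
}
```

Every crate that does this by hand has to get the same table right, which is the duplication a single shared definition would avoid.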

The problem I'd see with an external crate is that we have some things in std that need to know parts of the ABI (CString, due to the unfortunate business of c_char). As a result, if we go the external crate route std still needs to encode some part of the ABI, so we'd have part of the ABI encoded in one place (std) and other parts (or perhaps a complete duplicate) encoded in another. This bit isn't relevant to resolving the c_void issue, though.

For that I would prefer something in core to lower the amount of convention necessary to implement bindings to C correctly and composably. And given that choice, combined with the above (and a desire to have 1 source of truth), migrating all the c abi types into core would be preferable.

Correct. And while int is fairly straightforward on most platforms, for example long int definitely has very different definitions on many different platforms. Also part of the ABI: repr(C) alignment.
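
To illustrate the repr(C) point, here is a small sketch that inspects the layout of a hypothetical `Pair` struct. The offset is computed from the platform's own alignment of u32 rather than hard-coded, since the concrete numbers are exactly what varies between ABIs.

```rust
use std::mem::{align_of, size_of};

// With repr(C), fields are laid out in declaration order with C-style padding.
#[allow(dead_code)]
#[repr(C)]
struct Pair {
    tag: u8,
    value: u32,
}

// Byte offset of `value` from the start of the struct.
fn offset_of_value() -> usize {
    let p = Pair { tag: 0, value: 0 };
    let base = &p as *const Pair as usize;
    let field = &p.value as *const u32 as usize;
    field - base
}

fn main() {
    // `value` is pushed out to the next multiple of u32's alignment.
    assert_eq!(offset_of_value(), align_of::<u32>());
    // The struct takes the alignment of its most-aligned field.
    assert_eq!(align_of::<Pair>(), align_of::<u32>());
    println!("size = {}, align = {}", size_of::<Pair>(), align_of::<Pair>());
}
```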

@bluss, @tomaka

What I think you're touching on is the idea of public/private dependencies that @wycats and I have been intending to write an RFC about for some time now. There's a notion of a public dependency, one which is exported through your API, and a private dependency, one which is purely internal. Obviously if you change an API you need a new major version, and consequently it should be obvious that if you upgrade a major version of a public dependency that you just changed your API and therefore need a new major version as well. Private dependencies can be upgraded/duplicated whenever as they're not part of your public interface.

This is not currently implemented in Cargo but it's the essential principle that the community needs to follow. We hope to provide tools to help with this, but that's how I believe we should be framing this discussion of moving types to libc. So along those lines...

The point I'm getting at is that this is not wrong at all so long as it's done properly. This is essentially having a public dependency which is a serious API commitment on behalf of a crate, and it's just unfortunately not clear today that this is happening.

Making a blanket statement that essentially all dependencies should be private dependencies is unfortunately a non-starter for an ecosystem.

Otherwise yes, the problem is that there are multiple definitions of c_void, and my claim is that it doesn't matter whether it's in core, std, or libc so long as it's only in one. If we have to make a choice then the libs team currently prefers libc as the location for c_void and we'd deprecate the definition in libstd.

This is correct! Following this logic, however, we should put everything in the standard library which we certainly don't want to do. The real conclusion to draw here is that public dependencies need to be very stable very quickly. Due to the need for all crates to agree on versions, upgrading does indeed cause a lot of breakage (see the points about public dependencies above).

This is why the libs team invested in the libc crate early on after 1.0 and have now migrated the crate to the first official rust-lang crate. This crate is the foundation for any and all FFI and needs to be stable essentially immediately, which is what we've now done.

The rest of your comment and thoughts are indeed correct, but in the interest of staying focused on libc I don't think we have to worry much. We've vetted a design for libc 1.0, we're just waiting for the right moment to release 1.0, and we most certainly understand the cost of a 2.0.

I think it'd be useful to shy away from "problems at large" in the ecosystem today and continue to remain focused on libc and where c_void is going to be located. Otherwise we may derail ourselves easily!

As far as I know these cases are very few and far between. I would much rather invest effort and cognitive thought into rectifying this situation rather than view it as a reason for why these types should be pushed into core, no less.

What you say is true, but in practice what I saw happen in the ecosystem made me lose all hopes that things would be done properly.

In my opinion the only proper way out of this situation is to have some kind of tool to automatically detect whether semver is broken between two versions. Otherwise semver is going to break.

I would love this! I suspect there are more ways to break semver than I realize, even when I'm being very cautious, and it would be extremely valuable to have a tool compare this automatically.