usize is not size_t
Brief
Currently, while Rust does not explicitly guarantee compatibility between usize and size_t, some Rust FFI code assumes it. This affects ABIs such as w65 and CHERI, where the size type differs in size from the pointer-sized type. To support Rust development targeting these platforms, this Pre-RFC seeks to explicitly declare that the compatibility is not guaranteed.
Background
In Rust, usize is an integer type which is compatible with (the same size as) thin raw pointer types. The standard library docs describe it as:
[A primitive for which the size] is how many bytes it takes to reference any location in memory. For example, on a 32 bit target, this is 4 bytes and on a 64 bit target, this is 8 bytes.
The Unsafe Code Guidelines notes that
The isize and usize types are pointer-sized signed and unsigned integers. They have the same layout as the pointer types for which the pointee is Sized
The Unsafe Code Guidelines also notably states that usize and isize are compatible with uintptr_t and intptr_t, respectively, as defined in C.
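As a rough illustration of the pointer-sized property (a sketch only, not something the guidelines themselves provide; the helper name usize_is_pointer_sized is made up for this post), the relationship can be checked in ordinary Rust on the targets supported today:

```rust
use std::mem::size_of;

// Compile-time checks: usize and isize have the same size as a thin raw pointer.
// This is the property described above; it says nothing about C's size_t.
const _: () = assert!(size_of::<usize>() == size_of::<*const u8>());
const _: () = assert!(size_of::<isize>() == size_of::<*const u8>());

/// Hypothetical helper: the same check, evaluated at run time.
pub fn usize_is_pointer_sized() -> bool {
    size_of::<usize>() == size_of::<*const u8>()
}
```

Note that nothing in this check mentions size_t or ptrdiff_t, which is exactly the gap this Pre-RFC is about.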
This has interesting implications, most notably for FFI. This Pre-RFC proposes to formally declare that the C types size_t and ptrdiff_t are not necessarily compatible with usize and isize, respectively; that is, the platform C ABI is not required to make size_t and ptrdiff_t compatible with uintptr_t and intptr_t in order to support Rust.
This has been discussed on Zulip (archive), and briefly in the ABI discussion for w65, which demonstrates potential issues where compatibility is assumed.
Where this Applies
On most platforms, including all that Rust currently supports, size_t and uintptr_t are the same type, as are ptrdiff_t and intptr_t. However, as Rust evolves to support new platforms, some may not make such a guarantee.
For example, the w65 ABI defined by the SNES-Dev Project (which is designed to allow users to develop SNES homebrew in modern languages, and intends to eventually include Rust) defines size_t as a typedef for unsigned int and uintptr_t as a typedef for unsigned long. These types notably have different sizes (2 bytes for size_t and 4 bytes for uintptr_t). As a result, under this ABI, the usize type is not compatible with size_t (indeed, the two types use entirely different parameter and return value conventions: the former is passed in a hardware register, and the latter in a specially designated memory location). This was initially noted in the ABI discussion for w65.
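To make the mismatch concrete, here is a small model that can be run on any host. The aliases w65_size_t and w65_uintptr_t and the helper to_w65_size_t are hypothetical names chosen for illustration; only the widths (2 and 4 bytes) come from the w65 description above.

```rust
use std::convert::TryFrom;

// Hypothetical stand-ins for the w65 C types, using the widths given above.
#[allow(non_camel_case_types)]
pub type w65_size_t = u16; // unsigned int: 2 bytes on w65
#[allow(non_camel_case_types)]
pub type w65_uintptr_t = u32; // unsigned long: 4 bytes on w65

/// Converts a pointer-sized value to the size type, returning None
/// instead of silently truncating when the value does not fit.
pub fn to_w65_size_t(value: w65_uintptr_t) -> Option<w65_size_t> {
    w65_size_t::try_from(value).ok()
}
```

Under this model, a value such as 0x1_0000 fits in the pointer-sized type but not in the size type, so the checked conversion returns None; a plain as cast would wrap it to 0 instead.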
As another example, the CHERI platform, which stores machine-level provenance information in pointer values, has a similar issue: uintptr_t, which stores a capability, is 128 bits wide, and size_t, which does not, is 64 bits wide.
Why this is a problem
FFI code currently assumes that usize is "the size type" and is compatible with size_t, despite nothing explicitly guaranteeing it.
In the libc crate's API definition, the size_t type is defined as identical to usize. Additionally, bindgen has a flag for this, --size_t-is-usize, though rust-bindgen issue #1671 ("size_t vs usize") notes this exact issue, and it appears to be the rationale for the flag not being the default.
While I am not specifically aware of other crates that have this issue, I wouldn't find it hard to believe that it is relied upon elsewhere, or that something in the future may come to rely upon it.
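One way for FFI code to avoid baking in the assumption is to spell the C type with its own alias and convert explicitly at the boundary. The following is only a sketch: the helper to_size_t and the function sum_bytes are hypothetical names, and the alias is set to usize purely on the assumption that the code is compiled for one of today's hosts, where the two types coincide.

```rust
use std::convert::TryFrom;

// Assumption: on the host this is compiled for, C's size_t has the same
// width as usize. A target such as w65 would change only this alias.
#[allow(non_camel_case_types)]
pub type size_t = usize;

/// Hypothetical helper: checked conversion from a Rust length to the C size type.
pub fn to_size_t(len: usize) -> Option<size_t> {
    size_t::try_from(len).ok()
}

/// Hypothetical C-callable function that sums `len` bytes starting at `data`.
///
/// # Safety
/// `data` must point to at least `len` readable bytes.
pub unsafe extern "C" fn sum_bytes(data: *const u8, len: size_t) -> u32 {
    // Convert size_t to usize explicitly instead of assuming they are the same type.
    let len = match usize::try_from(len).ok() {
        Some(n) => n,
        None => return 0,
    };
    // SAFETY: the caller guarantees `data` is valid for `len` bytes.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    bytes.iter().map(|&b| u32::from(b)).sum()
}
```

Because both conversions are checked, the call sites would not need to change if the alias were later narrowed to a 2-byte type; only lengths that do not fit would be rejected.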
Note: While this can be considered a de facto breaking change, there are a couple reasons it can be justified:
- No existing platform that Rust supports has this issue; only new platforms would.
- FFI code typically needs to be tailored for the specific platform in other ways, so I find it less likely that some old code wouldn't otherwise have issues when ported to a new architecture.
- In the case of w65, which is primarily a freestanding architecture with only freestanding targets, much of its FFI will be specific to the platform, and thus written specifically for the platform, with the ability to account for this ABI difference. I cannot speak as to whether or not the same reasoning applies to CHERI.
Alternatives
There are a number of alternatives to consider:
- Rust could do nothing explicitly, leaving things as they are. This is potentially the most dangerous option, as Rust ABIs for platforms where the types are not compatible may emerge and come into use while FFI code continues to rely on the assumption.
- Rust could explicitly declare that usize and size_t are compatible. This would turn a de facto guarantee into an official one. Platforms would then need to ensure this is the case where possible. Where it would be impossible (the ABI already exists and is stable) or infeasible, those platforms simply could not be supported by Rust. I would prefer this option not be taken, as it would likely rule out Rust support for the w65 platform (making size_t 4 bytes may impermissibly penalize both the ABI and codegen involving the type).
- A hybrid approach could be taken, where hosted targets (ones that run with the benefit of an operating system and, in particular, have std available, rather than just core and/or alloc) guarantee the compatibility, and freestanding targets explicitly do not.