There has been a lot of discussion about Rust’s integral types, and in particular a lot of questions about what type to use for integer fallback and what to call the “pointer-sized” integer type. We (the core team) have been reading these threads and have also done a lot of internal experimentation, and we believe we’ve come to a final decision on the fate of integers in Rust. The purpose of this post is to clarify that decision and explain our rationale.
Integral types. As it does today, Rust will continue to offer a wide variety of fixed-size integral types (u32, etc) as well as two integral types whose size varies depending on the architecture (int and uint). Both int and uint are always defined to have the same number of bits as a pointer on the target platform (we assume a flat memory space).
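The pointer-size property can be checked directly. This is a minimal sketch, assuming a current Rust toolchain, where the pointer-sized types are spelled isize and usize rather than int and uint; the function name is hypothetical and only for illustration.

```rust
use std::mem::size_of;

/// Returns true when both pointer-sized integer types have the same
/// width as a thin raw pointer on the compilation target.
/// (`isize`/`usize` stand in for the post's `int`/`uint`.)
fn pointer_sized_ints_match_pointer() -> bool {
    let pointer_bytes = size_of::<*const u8>();
    size_of::<usize>() == pointer_bytes && size_of::<isize>() == pointer_bytes
}
```

The fixed-size types, by contrast, report the same size on every target: size_of::<u32>() is always 4.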
Guidelines and fallback. To save memory and ensure consistent behavior across platforms, users are encouraged to use fixed-size types where possible. However, it frequently happens that you have integers that are tied to the size of memory: for example, indices into an array or the size of an allocation. In these cases, uint and int are an excellent choice, though it may make sense to use smaller, fixed-size types if you know that the length of the array (or size of the allocation, etc) is limited. In accordance with these guidelines, we are accepting RFC 452, which says that integer literals whose type is not otherwise constrained will fall back to i32. As part of the stabilization process, we plan to examine integer types appearing in libstd APIs to ensure conformance with this guideline.
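As a rough illustration of the fallback rule and the guideline, the sketch below measures an unconstrained literal and uses a pointer-sized index alongside fixed-size element types. It assumes a current toolchain (usize in place of uint); both function names are hypothetical.

```rust
use std::mem::size_of_val;

/// Size in bytes of an integer literal whose type is not otherwise
/// constrained. Nothing pins down the type of `x`, so it falls back to i32.
fn unconstrained_literal_size() -> usize {
    let x = 42;
    size_of_val(&x)
}

/// Sums the first `end` elements. The index is tied to the size of memory,
/// so it is pointer-sized; the elements and the total use fixed-size types.
/// Panics if `end` is greater than `values.len()`.
fn sum_up_to(values: &[u8], end: usize) -> u32 {
    values[..end].iter().map(|&v| u32::from(v)).sum()
}
```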
Overflow. Our plan for handling overflow is to adopt a variation of RFC 146. Roughly speaking, the idea is that after every integer operation, there is a debug_assert! inserted by the compiler that checks for overflow. Because this is a debug assertion, it will typically be compiled out when performing optimizations. Since overflow cannot cause crashes or data races (outside of unsafe code), we can skip these checks without endangering Rust’s core value proposition of safe systems programming. Whenever checks are disabled, overflow will yield an undefined value (this is distinct from, and much more limited than, the “undefined behavior” you get in C).
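Conceptually, the inserted check behaves something like the hand-written function below. This is only a sketch of the idea, not the compiler's actual lowering: the function name is hypothetical, and returning the wrapped result when the assertion is compiled out is just one possible choice of value.

```rust
/// Hand-written model of an addition followed by a debug-only overflow check.
fn add_with_debug_check(a: i32, b: i32) -> i32 {
    // `overflowing_add` returns the wrapped result plus an overflow flag.
    let (result, overflowed) = a.overflowing_add(b);
    // Compiled out when debug assertions are disabled, which is typically
    // the case in optimized builds.
    debug_assert!(!overflowed, "integer overflow in addition");
    result
}
```

With debug assertions enabled, add_with_debug_check(i32::MAX, 1) panics; with them disabled, the call returns without any check.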
This design aims to balance the benefits of overflow detection with the performance cost of checking for overflow on every integer operation. Also, by ensuring that every integer operation is checked for overflow in debug builds, we should be able to avoid people relying on the behavior of overflow, giving ourselves room to ratchet up the safety checking in the future (as well as possibly adding other ways to control when checks are compiled in).
For those cases where overflow is actually desired, such as hash computations, we will provide an explicit WrappedInt type that can be used to request wrapping semantics. This has the advantage of clarifying to the reader that overflow is an expected part of the calculation. Finally, the CheckedInt type (which exists today) can be used to guarantee that overflow checks are performed, if desired. In the future, we may also provide more nuanced means of enabling overflow checks for normal integers (such as scoped attributes).
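To make the two styles concrete, here is a sketch that uses facilities available in the current standard library rather than the WrappedInt and CheckedInt names above: std::num::Wrapping for an FNV-1a hash where overflow is expected, and checked_add for a sum where overflow must be reported. Treating these as stand-ins for the types described in this post is an assumption, and the function names are hypothetical.

```rust
use std::num::Wrapping;

/// 32-bit FNV-1a hash. The multiplication is expected to overflow, and
/// `Wrapping` makes that intent explicit to the reader.
fn fnv1a(bytes: &[u8]) -> u32 {
    let mut hash = Wrapping(2166136261u32); // FNV offset basis
    let prime = Wrapping(16777619u32); // FNV prime
    for &b in bytes {
        hash ^= Wrapping(u32::from(b));
        hash *= prime;
    }
    hash.0
}

/// Sums the values, returning None as soon as an addition would overflow.
fn checked_total(values: &[i32]) -> Option<i32> {
    values.iter().try_fold(0i32, |acc, &v| acc.checked_add(v))
}
```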
Frequently Asked Questions
How are these details different from the status quo?
Why not use the name int for 32-bit values?
Having int be an alias for i32 (and uint an alias for u32) would create two names for the same type; moreover, the names i32 and u32 clearly communicate the size of the type. This does mean that, on 64-bit platforms, we differ from many C compilers, which use the name int to refer to 32-bit values; this can be a hazard when writing FFI declarations, and for this reason we have lints that warn about the use of int and uint types in such cases.
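For FFI signatures, one way to sidestep the mismatch is to spell C's int with the platform alias from the standard library instead of a pointer-sized type. The sketch below assumes a current toolchain, where that alias is std::os::raw::c_int and the pointer-sized type is isize; the function names are hypothetical.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

/// A function exported with the C ABI. `c_int` matches the width of the
/// platform's C `int`, whatever the pointer width happens to be.
extern "C" fn add_c_ints(a: c_int, b: c_int) -> c_int {
    a.wrapping_add(b)
}

/// True when the pointer-sized `isize` is wider than C's `int` on this
/// target, which is exactly the situation in which confusing the two
/// types in a declaration is a hazard.
fn isize_wider_than_c_int() -> bool {
    size_of::<isize>() > size_of::<c_int>()
}
```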
What about renaming int and uint to something more explicit?
There have been numerous requests to rename the int and uint types. The primary concern is that the current names suggest that these types ought to be a user’s “default” choice, when in fact a pointer-sized integer is often larger than is necessary (not to mention that it will cause program semantics to vary depending on the target). We spent quite a lot of time deliberating this point and exploring alternative names.
Ultimately, however, we have chosen to leave things as they are. Given that changing the name of the type int would affect literally every Rust program ever written, the bar for making such a change, particularly at this point in the release cycle, is quite high. There seem to be several strong arguments in favor of the status quo:
- We believe that adjusting the guidelines and tutorial can be equally effective in helping people to select the correct type. In addition, using i32 as the choice for integral fallback, in particular, can help to convey the notion that i32 is a good “default” choice for integers.
- All of the alternate names also had serious drawbacks. For example, the type iptr strongly suggests that the value is a pointer cast to an integer. Similar concerns apply to offset and other suggestions. Ultimately, pointer-sized integers are used for many purposes (indices, sizes, offsets, pointers), and we don’t want to favor one of those uses over the others in the name. We did produce alternate branches using the names imem/umem as well as isize/usize, but found that the result was fairly unappealing and seemed to create an unnecessary barrier to entry for newcomers. Ultimately, whatever benefits those names might offer, they don’t seem to outweigh the cost of migration and unfamiliarity.
If integers silently overflow in optimized builds, won’t this mask bugs in shipping code?
While integer overflow cannot lead to crashes or data races, it is of course true that if we don’t check for overflow, it may lead to other sorts of bugs. This is unfortunate, but we have to balance the real-world performance factors with the possibility of bugs. Also, we reserve the right to make checking stricter in the future. Finally, if a strange bug is encountered, a developer will at least quickly notice the overflow if they attempt to reproduce it on a debug build.
Can’t overflow sometimes cause crashes?
There are corner cases in which incorrect codegen can cause LLVM optimizations to drop bounds checks, leading to segfaults. We consider any such case to be a bug in rustc. Also, one should take extra care in unsafe code that is doing pointer arithmetic or bypassing bounds checks, as overflow can violate invariants that you expected to hold. Of course, the fact that unsafe code can cause crashes is nothing new; that’s why it’s labeled as unsafe.
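One defensive pattern for unsafe code is to do the size or offset arithmetic with explicitly checked operations before any raw pointer is touched, so the invariant does not depend on whether debug checks are compiled in. A minimal sketch with hypothetical function names, assuming a current toolchain:

```rust
/// Computes `count * elem_size` in bytes, returning None if the
/// multiplication would overflow a pointer-sized integer.
fn checked_byte_len(count: usize, elem_size: usize) -> Option<usize> {
    count.checked_mul(elem_size)
}

/// Reads element `index` through a raw pointer, but only after the
/// index has been validated against the slice length.
fn read_at(values: &[u64], index: usize) -> Option<u64> {
    if index >= values.len() {
        return None;
    }
    // SAFETY: `index < values.len()`, so the offset stays inside the slice.
    Some(unsafe { *values.as_ptr().add(index) })
}
```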
Aren’t you concerned about de facto lock-in for overflow?
There is certainly a concern that we will not be able to make overflow stricter without breaking code in the wild. We believe that having overflow checking be mandatory in debug builds should at least mitigate that risk, as intentional uses of overflow should be detected long before the code ever comes into widespread use (and redirected to the suitable wrapper type).