Is two’s complement still guaranteed, e.g. how do signed shifts work now?
How does unary minus work on unsigned types? Does let a = -1u still wrap, or does it now panic or give an unspecified result, like let a = 0u - 1u? Or can unary minus no longer be applied to unsigned types at all?
What optimization opportunities do unspecified values (overflow/underflow results) provide for a compiler?
Also, unsigned wraps -1 + 1 -> 0 and 0 - 1 -> -1 are used quite often in C in scenarios like:
// Reverse loop
for (size_t i = size - 1; i != (size_t)-1; --i) { /**/ }
// Loop counter
size_t count = -1;
while (/**/) {
    ++count; // Always incremented
    /* Some code with breaks and continues */
}
It will be a pedagogical problem to teach people not to rely on them.
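For comparison, here is a minimal sketch of how the two C idioms above might be written without implicit wraparound. It assumes present-day Rust (the usize name and the wrapping_*/checked_* methods, none of which appear in this thread), and the helper names reverse_indices and wrapping_counter are made up for illustration.

```rust
/// Indices size-1 down to 0. The range is empty when size == 0,
/// so no unsigned wraparound is needed to terminate the loop.
fn reverse_indices(size: usize) -> Vec<usize> {
    (0..size).rev().collect()
}

/// The C `count = -1; ++count` idiom, with the wrap spelled out explicitly.
fn wrapping_counter(iterations: usize) -> usize {
    let mut count = usize::MAX; // same bit pattern as (size_t)-1 in C
    for _ in 0..iterations {
        count = count.wrapping_add(1); // the first increment wraps to 0
    }
    count
}

fn main() {
    assert_eq!(reverse_indices(3), vec![2, 1, 0]);
    assert!(reverse_indices(0).is_empty());

    assert_eq!(wrapping_counter(0), usize::MAX);
    assert_eq!(wrapping_counter(3), 2);

    // Explicitly wrapping and explicitly checked forms of `0 - 1`.
    assert_eq!(0u32.wrapping_sub(1), u32::MAX);
    assert_eq!(0u32.checked_sub(1), None);

    // Two's complement view of -1, and an arithmetic (sign-preserving) right shift.
    assert_eq!((-1i32) as u32, u32::MAX);
    assert_eq!((-8i32) >> 1, -4);

    println!("ok");
}
```

The point of the sketch is only that the wrap becomes visible at the call site, so a reader can tell it is intended.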
This is surely the last chance to give Rust's integer type a proper name that reflects what it is (unlike in other languages), and to avoid reproducing the same mistake.
Far less ambiguous than int/uint, especially for imem/umem.
The unfamiliarity is one of the reasons for the renaming. The int in Rust is not like the int in C or other languages. Beginners (and not only beginners) will surely fall into that trap.
Moreover, as already said, this is a unique opportunity to fix integer misuse in the current code base (as found in stdlib, Servo and a lot of other current code).
As seen in RFC 464, the vast majority of interested people are in favor of this breaking change.
There are things in the standard library that currently use int or uint for no good reason. For example, enum_set::CLike or the exp argument of Int::pow.
Is there a plan to do a pass on the libraries to change these to fixed-size types?
I’ve often done things a certain way just because the standard library did something similar. It should provide a good example.
I'm not entirely sure. The name int is too tempting, and education cannot completely change people's mindset. I believe there would be endless questions of "why does this not work?"
I thought the name came from a precedent (intptr_t) in C? We've generally followed precedent when naming things, although I have a preference for imem.
I think using int with a different meaning from other languages creates a much higher barrier than using some explicit name.
It doesn’t invoke undefined behaviour itself. It produces an undef result, which can easily cause undefined behaviour because the value can vary between each read. For example, using it for indexing an array can result in a different value for the underlying pointer arithmetic than the value that was bounds checked.
That’s just semantics: in any sane definition of Rust, the function
fn saturating_divide(d: uint, s: uint) -> uint {
    match s {
        0 => uint::MAX,
        _ => d / s
    }
}
should never panic, but if we have LLVM undef-s, you could easily get a divide-by-zero. While this isn’t undefined behaviour by LLVM semantics, memcpy-ing a carefully crafted ROP payload into the stack isn’t undefined behaviour by x86 semantics either, but it is not desirably safe behaviour for a high-level language.
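As a runnable sketch of the expectation described above, here is the same function on a present-day toolchain. The usize type and the checked_div method are assumptions relative to this thread, which uses uint, and saturating_divide_checked is a hypothetical second spelling added for illustration.

```rust
/// Direct translation: the zero arm must guard the division,
/// which only holds if `s` has one stable value across both uses.
fn saturating_divide(d: usize, s: usize) -> usize {
    match s {
        0 => usize::MAX,
        _ => d / s,
    }
}

/// The same behaviour using checked_div, which returns None for a zero divisor.
fn saturating_divide_checked(d: usize, s: usize) -> usize {
    d.checked_div(s).unwrap_or(usize::MAX)
}

fn main() {
    assert_eq!(saturating_divide(10, 2), 5);
    assert_eq!(saturating_divide(10, 0), usize::MAX);

    // Both spellings agree, including on the zero divisor.
    for s in 0..5 {
        assert_eq!(saturating_divide(7, s), saturating_divide_checked(7, s));
    }
    println!("ok");
}
```

Both versions rely on s being an ordinary, well-defined value. If s were an LLVM undef, the comparison against 0 and the division could observe different values, which is the hazard discussed here.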
I know, that’s why I reported these bugs as soundness issues. It could be fixed without adding a branch that can’t generally be optimized out. LLVM just needs to add an instruction attribute providing a way of having it return an unspecified value instead of an undefined value.
It’s a wrong name too since a word is not always the pointer size.
IMO iptr/uptr are the best names (with maybe index as an alias of iptr), but any name other than int would be better.
I think this looks great, but another +1 to renaming int/uint. As others
have said, these are too tempting as default types (I personally still use
them when I don’t want to bother thinking about sizes). Yes, it’s less
intuitive to new users to rename them. But the types themselves are
unintuitive - we shouldn’t make them look like they’re not. Making things
less approachable is not an inherently bad thing, because the questions it
forces users to ask are worth asking (and answering). In reality, there
really isn’t a catchall default-appropriate integer type, and we shouldn’t
pretend there is when it’s misleading to do so.
As for breaking literally all rust code, that’s what pre-1.0 is
specifically for: not worrying about backward-incompatible breakages when
there’s a valid reason to break things. I contend that here there is.
fail! to panic! and namespaced enums both landed recently, both broke the world, and both seemed less necessary to me than renaming int. Add me as one more person asking to please not use breaking the world as an excuse not to do the right thing.
IMHO, falling back to i32/u32 without renaming int/uint will give the wrong impression that int/uint are synonyms of i32/u32, because that’s what some popular languages do. (In practice that means at least C/C++ on 32/64 bit systems, and C#.) And yes, int/uint are too “default looking” anyway.
I don’t see how giving wrong impressions is beginner friendly. If it is different, it should look so.
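To make the distinction concrete, here is a small sketch that compares the sizes involved. It assumes present-day Rust, where the pointer-sized types are spelled isize/usize and an unconstrained integer literal falls back to i32; neither spelling had been settled in this thread. The helper name pointer_sized_ints_match_pointers is made up for illustration.

```rust
use std::mem::{size_of, size_of_val};

/// True when the pointer-sized integer types have the same size as a thin pointer.
fn pointer_sized_ints_match_pointers() -> bool {
    size_of::<usize>() == size_of::<*const u8>()
        && size_of::<isize>() == size_of::<*const u8>()
}

fn main() {
    assert!(pointer_sized_ints_match_pointers());

    // i32 is 4 bytes on every target, unlike the pointer-sized types.
    assert_eq!(size_of::<i32>(), 4);

    // An integer literal with no suffix and no other constraint is inferred as i32.
    let x = 1;
    assert_eq!(size_of_val(&x), 4);

    println!("usize is {} bytes on this target", size_of::<usize>());
}
```

On a 64-bit target the last line reports 8 bytes, so the two kinds of integer are visibly not synonyms there.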
Another +1 to renaming. FWIW, I now prefer imem/umem.
I agree with OP that iptr/iaddr/ioffset/index etc are all too specific. C/C++ have specialized types like intptr_t/offset_t/size_t, even though some may be aliases of each other. But Rust has only two arch-dependent integer types, and they are used in many different contexts, so we should use names that are generic enough, but still different from int/uint.
If we must rename int/uint, I’d prefer iptr/uptr or ix/ux (here x stands for a non-fixed bit width, which may be 8/16/32/64 depending on the platform; compare how we named i8/i16/i32/i64 and u8/u16/u32/u64).
I want to thank the core team for giving these integer proposals in-depth consideration, experimenting with names, and deciding! I'm not thrilled with the int/uint names, but such is life. (Would the names ix/ux or i/u have fared better?)
On the topic of breaking changes, note that this was not a risk-reduction choice. (It's more risky than having to revisit all uses of int/uint to find those that should become WrappingInt or int32.)
@arielb1 could you please add background for those of us who're unfamiliar with LLVM and x86 internals? ROP = Raster Operations Pipeline?