We do have the FFI lint for this reason.
+1 for Design 2. The size of the integers should not change just because the code is compiled for a different architecture.
The entire discussion here is framed in the wrong way. The `int` and `uint` types are not convenient "machine size" types. They do not correspond to the general-purpose register size (word size). These types are pointer-sized, which is different from register size on some architectures. There are other platform-dependent integer types that are useful… and the `int`/`uint` naming is only sensible for word size at best, which is not pointer size.
If we're keeping the `int` type as an alias for `i32`, we might as well do the C# thing and rename the sized integers accordingly instead of having multiple names, for maximum intuitiveness for people coming from languages with 32-bit `int` types.
Few languages have a 32-bit type called `int`. That's not how the type is defined in C, only how it's implemented on a subset of platforms; the only guarantee is that it's at least 16 bits. It's the `long` type that is guaranteed to be at least 32 bits, not `int`.
So how are `offset` and the built-in indexing going to be defined? Rust has clear use cases for pointer-sized integer types, and making it a separate type on different platforms would greatly hurt portability. The default would be that code simply doesn't work on other architectures.
I don't think any of the four proposals here makes sense. The framing draws distinctions between choices that aren't mutually exclusive and misses the most prominent viewpoints on this issue.
The first design implies that `int` is a default, but there is no current default in the language. The `i32` type is being added as a fallback for inference, which is the closest thing to a default. It claims that there's an opportunity to provide a good decision / guide design but fails to substantiate that claim. The usual design suggestion is to use the integer type that's large enough for the use case; while you can provide guidelines for some common cases, there is no sane "default" choice.
The second proposal simply creates duplication in the language. The listed pros and cons make no sense at all because programmers need to be aware of the definition of the type regardless of the name. Choosing an integer out of a hat and hoping that it doesn't overflow is ridiculous.
The third proposal makes some of the same mistakes and misses the point of having pointer-size or register-size integer types (not the same!) entirely. Again, the integer type needs to be chosen based on the needs of the use case. You're not going to catch most of these issues simply by running the test suite; that's not a typical integer overflow bug. They are going to occur in the rare edge cases you didn't think about when choosing the size (picking an arbitrary type doesn't work).
The RFC proposing that the pointer-size integer types - which we do have, and do need - be renamed to an appropriate name was what the community was behind, and this is a poor substitute for that. It doesn't address all of the concerns and is full of inaccurate claims.
I vote for option #1. I don't quite see how not having an `int` type would be a showstopper for people new to Rust. For instance, there's no `float` type, only `f32` and `f64`.
In particular, in the "Pros" for option #2 I read "This design encourages people to use 32-bit integers when they don't have a better idea in mind."

Is that really a "pro", though? As you mention in option #4, unless you're using bigints you probably can't ignore the width of the type. So I'd say that might just be encouraging sloppy programming.
Maybe I'm biased because I'm mostly doing low-level programming these days, but are there really cases where some of you write code in C, C++, Rust, or whatever and don't care about the width of the integer type? I don't have a concrete use case for the "good enough" integer, and what does "good enough" even mean? If you tell the user (especially those coming from languages like Python, which use bigints for basic integer types) that the Rust `int` is good enough, you're giving them a gun to shoot themselves in the foot.
I've made a quick, unscientific survey of the C and C++ code lying on my hard drive (both mine and third-party code). The only uses of `int` I see are:
- code assuming it's at least a certain size (C guarantees it's at least 16 bits; I've seen a bunch of `assert(sizeof(int) == 4)` as well), but in this case you could easily use a `u16` or `u32`; they're probably just using `int` because it looks nicer or doesn't require including `stdint.h` or other headers
- iterating through an array (and that's arguably dangerous if you aren't making sure the size of the array fits in the `int`)
- signaling errors in return values, but that's not really a use case for Rust, and it's still about assuming a minimal range for `int`
When doing maths with loosely constrained ranges I usually end up using floating point, not integers (or bigints if they're available). And when I need to do arithmetic with integers I'm very careful about not overflowing. The soon-to-be-added debug checks for overflow would help with that, though.
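For illustration, this is roughly what careful overflow handling looks like with the standard library's `checked_add` (the explicit counterpart to the debug-build checks mentioned above); the values here are just an example:

```rust
fn main() {
    let a: u8 = 200;

    // checked_add returns None on overflow instead of wrapping silently.
    assert_eq!(a.checked_add(100), None);
    assert_eq!(a.checked_add(50), Some(250));

    // A plain `a + 100` would panic in a debug build (the overflow checks
    // discussed above) and wrap around in a release build.
}
```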
Perhaps less importantly, making `int` an alias for `i32` would happen to match the C `int` type on x86 and amd64, but that wouldn't be true on all other architectures. If support for one such architecture (where the C `int` is not 32-bit) is added at some point, then existing broken FFI code that uses the Rust `int` type to match the C `int` type would break. Again, not really a major concern at that point, but I thought it was worth considering.
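The portable way to match C's `int` is a dedicated FFI alias rather than any fixed-width Rust type; a minimal sketch using `std::os::raw::c_int` (the alias Rust eventually shipped for exactly this), declaring libc's real `abs` function:

```rust
use std::os::raw::c_int;

extern "C" {
    // libc's `abs` takes and returns a C `int`; using `c_int` keeps the
    // signature correct even on targets where C's `int` is not 32 bits.
    fn abs(x: c_int) -> c_int;
}

fn main() {
    let r = unsafe { abs(-7) };
    assert_eq!(r, 7);
}
```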
However, I'm in favour of adding some form of integer coercion to limit the use of casts, although maybe simply allowing slices/arrays to be indexed by any unsigned integer type would be enough? That seems a bit more conservative than allowing coercion in the general case. Implicit type conversion is one thing that I really don't like about C, although I suppose it's not so bad if you only allow it to a bigger type.
But then, as you mention, that would make coercing to and from the `isize` type change from architecture to architecture, and that sounds pretty nasty to me. If I understand correctly, that would make this code build on amd64 but not on 32-bit architectures:

```rust
fn foo(index: u64) -> T {
    some_slice[index]
}
```

I'm not really sure I like the sound of that.
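A portable alternative is to keep the conversion explicit and fallible; a sketch using `try_into` (the function signature here is illustrative, not from the proposal):

```rust
use std::convert::TryInto;

// Illustrative stand-in for the `foo` above: the u64 index is converted
// to usize explicitly, so on a 32-bit target an out-of-range index fails
// loudly instead of being silently truncated by a coercion.
fn foo(slice: &[i32], index: u64) -> i32 {
    let i: usize = index.try_into().expect("index does not fit in usize");
    slice[i]
}

fn main() {
    let v = [10, 20, 30];
    assert_eq!(foo(&v, 2), 30);
}
```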
It's funny that the person claiming that my comments are "technically sloppy" is so clueless about an issue where they're presenting themselves as an authority. I suggest reading the in-depth discussion on this issue and the well-written (unlike this noise) RFC by Jerry Morrison. Starting a whole new discussion thread when you have little grasp of the problem area isn't helping anything.
There is actually a similar suggestion in this comment thread about introducing multi-dispatched integer indexing on the core data structures, which may fix the ergonomics problem without introducing coercion.
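A sketch of what such multi-dispatched indexing could look like using the `Index` trait; the `Buf` wrapper and the set of accepted index types here are hypothetical, not the actual suggestion:

```rust
use std::ops::Index;

// Hypothetical wrapper that accepts several unsigned index types directly,
// so call sites need no casts.
struct Buf(Vec<u8>);

impl Index<u32> for Buf {
    type Output = u8;
    fn index(&self, i: u32) -> &u8 {
        // u32 -> usize is lossless on 32- and 64-bit targets.
        &self.0[i as usize]
    }
}

impl Index<usize> for Buf {
    type Output = u8;
    fn index(&self, i: usize) -> &u8 {
        &self.0[i]
    }
}

fn main() {
    let b = Buf(vec![1, 2, 3]);
    // Both index types dispatch to the matching impl.
    assert_eq!(b[1u32], 2);
    assert_eq!(b[2usize], 3);
}
```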
Yeah, I agree it is better than adding general coercions, especially when "u64 -> usize" may or may not work depending on the architecture.
The term "default" here is not, as you say, a technical "part" of the language. Nonetheless, it is common for people to have a "go to" integer type that they pick first. This is what the default is, and Yehuda's point (I think) is that people will pick `int` or `uint` whatever we say, so we might as well align `int`/`uint` with what we think the best overall choice would be. (I think integral fallback is mostly a red herring, since it really only applies in small one-off programs or other random integers floating about, typically 0 or 1.)
Also, a point of clarification. By "register-sized integer types", I presume you mean "fastest size"? I mean, on an x64 system there are at least some registers with every possible width, so the term is not particularly precise. In any case, it is certainly true that none of the listed proposals included variable-size integer types except for pointer-sized (and they all assume a flat address space as well). This is no accident. Speaking for myself, I think we need to keep the zoo of integer size types to a minimum, and options like "fastest integer size" don't carry their weight. Every machine-dependent type carries overflow and portability hazards along with it, and if you really, really care about using the "fastest" possible type, it's easy enough to define your own aliases with a `#[cfg]` switch.
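A minimal sketch of such a user-defined alias; the name `FastUint` and the widths chosen are hypothetical, using pointer width as a stand-in for "fastest" in the spirit of C's `uint_fast32_t`:

```rust
// Hypothetical "fast" unsigned alias selected per target via #[cfg].
#[cfg(target_pointer_width = "64")]
pub type FastUint = u64;
#[cfg(not(target_pointer_width = "64"))]
pub type FastUint = u32;

fn main() {
    // Whichever branch was compiled in, the alias is at least 32 bits wide.
    assert!(std::mem::size_of::<FastUint>() >= 4);
}
```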
While I think the idea of parametrizing the core data structures over their index type has potential, it is not a panacea. This indexing would have to spread very far; for example, iterators would also need to be parametrized so that calls to `enumerate` know what type to yield. In general, I think we should be wary of using type parameters to address every problem. I think permitting coercions from smaller to bigger integer types seems useful and harmless in any case, and might go a long way toward improving ergonomics (though I know it's only half the problem).
+1 to this gentleman and his reasoning.
(Except for integer coercions. The problem can be solved with existing multidispatch traits without adding another complexity to the language.)
From my C++ experience, `int` as a "default integer type" is rarely used in professional code, is discouraged by guidelines, and isn't really needed. `size_t`/`ptrdiff_t` and fixed-width types are used instead, and Rust already has them all (modulo renaming). Even adding `int` as a simple alias for `i32` (a feature aimed solely at beginners) creates more problems than it solves, and beginners will have to relearn things in the end.
This is the exact bias that is good and needed, Rust is supposed to be a low-level language after all : )
In general, this restart looks more like an attempt to disregard the arguments and the consensus from the previous discussions, than something constructive.
Nonetheless, it is common for people to have a "go to" integer type that they pick first. This is what the default is, and Yehuda's point (I think) is that people will pick int or uint, whatever we say, so we might as well align int/uint with what we think the best overall choice would be.
I see this repeated by several people but I still don't get it, so I'm going to repeat myself until I get an answer: why would you encourage people not to care about the integer width if you're not using bignums? And even if you are, what's your rationale for choosing 32 bits as the right default? I still don't get the motivation; can we get some concrete examples of what this mythical "good enough" integer would be used for?
I think permitting coercions from small to bigger integer types seems useful and harmless in any case, and might go a long way towards improving ergonomics (though I know it's only half the problem).
I gave an example of a problematic cast in my post above: coercing u64 to usize wouldn't work on 32-bit architectures. Neither would u32 to usize on 16-bit architectures. That would be an easy way to write non-portable code.
This is actually a fairly good point. It limits one of the implicit pros of having int be an alias for 32 bit -- we'd still want to lint ints out of FFIs, though it would certainly mitigate the harm of being sloppy in practice.
Who said anything about encouraging it? We're just recognizing reality.
I do not feel that 32 bits is a good choice, though I find some of the arguments made in favor of 32 bits somewhat persuasive. My feeling has been that pointer-sized is actually a pretty good choice. Anecdotes are not data, etc., but looking briefly through my code, I see a fair number of uses of `uint` where I haven't thought deeply about the range of values they will take on. Almost invariably, they are counters, either for recursion depth or indices of some kind. For these cases, the size of the address space is a safe upper bound. So, for the way that I write code, `uint` is a safe "go to" choice. Now, in practice, I doubt most of those values will exceed 32 bits, so I think one could argue that `u32` would have served as well, and given me smaller data structures to boot (though I doubt that this size difference would be measurable in most cases).
Who said anything about encouraging it? We're just recognizing reality. [...] Now, in practice, I doubt most of those values will exceed 32 bits, so I think one could argue that u32 would have served as well, [...]
So, would you say that having `uint` named that way encouraged you to use it instead of `u32`?
Ah, I see your point. You're arguing for "no type named uint". Fair enough. Yes, I think perhaps having `uint` encouraged me somewhat. On the other hand, in writing C++ code (where using a naked `int` or `unsigned` is very gauche), I've certainly seen that `int32_t` and `uint32_t` become the reflexive "go to" choice instead, and it certainly happens that those types are used where (imo) a wider type might have been a safer choice.
Also, in my comment, I didn't mean to imply that I think I should have used `u32` in those cases (though from your quoting it looks like that's how it sounded). I think `uint`/`usize`/whatever was probably the right call. I'd say it's good to substitute smaller integral types where the domain allows, but it's only worthwhile where it will make a big difference in memory usage or performance (premature optimization and all that). I guess that one of the things that is unclear to me is how frequently this is the case. Microbenchmarks are (typically) not very representative. I know that it's been argued (e.g., by Valloric, on this thread) that larger integer types are a kind of hidden tax that has a bigger effect than we realize.
I think I overlooked the "Rust on 16-bit" use case. With that considered, keeping `int`/`uint` would definitely be the wrong way to go.
Existing code is abusing `int` and `uint` as "word"-sized types. For example, the pure-Rust big integer types were done this way. A larger hardware type means more work can be done in one instruction in cases like that. The fact that the language itself refers to pointer size as `target_word_size` is a strong indication that there's a lack of understanding.
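To illustrate the pattern being described, here is a sketch of a bignum limb chosen explicitly per target instead of leaning on `uint` being "word size"; the `Limb` alias, the width choices, and `add_limbs` are all hypothetical:

```rust
// Hypothetical bignum "limb" type picked explicitly per target, rather than
// assuming uint/usize matches the machine word.
#[cfg(target_pointer_width = "64")]
type Limb = u64;
#[cfg(not(target_pointer_width = "64"))]
type Limb = u32;

// Add two little-endian limb slices of equal length, returning the sum
// limbs plus the final carry. Wider limbs mean fewer loop iterations,
// which is why bignum code wants the largest hardware type available.
fn add_limbs(a: &[Limb], b: &[Limb]) -> (Vec<Limb>, bool) {
    let mut carry = false;
    let mut out = Vec::with_capacity(a.len());
    for (&x, &y) in a.iter().zip(b) {
        let (s1, c1) = x.overflowing_add(y);
        let (s2, c2) = s1.overflowing_add(carry as Limb);
        out.push(s2);
        carry = c1 || c2;
    }
    (out, carry)
}

fn main() {
    // MAX + 1 wraps one limb to 0 and carries out of the slice.
    let (sum, carry) = add_limbs(&[Limb::MAX], &[1]);
    assert_eq!(sum, vec![0]);
    assert!(carry);
}
```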