I believe it would be a huge mistake not to vary the availability of auto-conversions depending on the target platform. Auto-widening is safe, and auto-narrowing is not. If something was written for a 32-bit platform, it should work with no modifications on a 64-bit platform. But if somebody wanted to make it work on a 16-bit platform, then any conversion from an i32 to a 16-bit isize should fail unless somebody explicitly goes through, audits those conversions, and adds explicit ones where it is safe to do so.
Not all code can be expected to work on all platforms, and this is exactly how code should fail when it was written in a way that won't be safe for a given platform's word size.
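As an illustration of the widening/narrowing asymmetry (using conversion traits from today's standard library, which postdate this discussion): widening gets an infallible `From` conversion, while narrowing only gets the fallible `TryFrom`. A minimal sketch:

```rust
use std::convert::TryFrom;

fn main() {
    let x: i32 = 70_000;

    // Widening is always lossless, so an infallible conversion exists.
    let wide: i64 = i64::from(x);
    assert_eq!(wide, 70_000);

    // Narrowing can lose information, so the conversion is fallible:
    // 70_000 does not fit in an i16, and we get an Err instead of a
    // silently truncated value.
    assert!(i16::try_from(x).is_err());
    assert_eq!(i16::try_from(1_000i32), Ok(1_000i16));
}
```

This is the same policy argued for above, expressed in the type system: the safe direction is implicit-ish (one cheap call), while the unsafe direction forces the author to handle failure.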
This has been an amazing discussion. Thanks everybody for helping dig deep here, and especially @valloric for the "laughably small" phrasing, which was so clarifying.
I've posted a comment on the current open RFC suggesting how to move forward on the clear consensus for Design 1: we need to choose a name. Issues of conventions and possibly widening/ergonomics considerations will be tackled in follow-up RFCs.
I would like to propose a slight variant of the int/uint design. Here are the details.
On the AMD64/Intel 64 architecture, only the lower 48 bits of a 64-bit address are actually used for memory addressing; the upper 16 bits are just sign-extended. We could apply a similar design to int/uint: int/uint would consume 64 bits of space inside a struct, but only the lower 32 bits would be used for computation on a 32-bit architecture. When int/uint values are used for local variables and parameters, only 32-bit values are used on a 32-bit architecture, so there is no performance penalty in most cases.
As a related reference: the JVM uses 32-bit fields/locals to store 8-bit and 16-bit values. The physical representation and the logical semantics differ, and I don't remember that ever causing any real issue.
With this design - 64-bit storage size with intptr/uintptr logic - we can keep the same struct layout on both 32-bit and 64-bit systems with almost no performance penalty. We may not even need to rename the types.
@wora, I'm not familiar with the specifics, but this seems quite architecture-specific, right? Rust is intended to be used on a large variety of architectures, including non-x86/x64 ones and/or 16-bit ones (where int/uint would be 16-bit). So is this design widely applicable?
I'm in favor of Design 1 (No int Type), and of removing the u/i suffixes.
A name like imem/umem is unfamiliar to newcomers, and that's a good thing!
It's not like other languages and shouldn't look like them. An uncommon name would encourage people to read the docs or, at least, not to use the type (which is better than misusing it).
About the default integer type (if any): if users don't want to bother choosing an integer's maximum value, they should get a (bullet-proof) BigInt.
So, for now, I prefer to have no implicit type at all. When a performant/fast BigInt (for values ≤ 64 bits) comes along, it would not be too late (nor a breaking change) to pick it as the default integer type.
All these int/uint RFCs and discuss threads are interesting and seem important for Rust users, but they are heavy to follow. The documentation should summarize the arguments and experiences from all these threads, to enlighten newcomers who want to understand the integer type problems and the reasoning in detail.
This is a pretty strong argument indeed. For most people (Java / C# / VB / …), a simple integer is 32-bit. And since i32 is the fallback type in Rust too, any other choice would be really confusing to me.
I support #1 too, since there is no good default (it really depends on the programmer's priorities), so we should let the programmer choose: 64-bit for reduced overflow risk, or 32-bit (or even lower) for performance. Rust is a language designed for systems programming, so its users should know the difference. Even in Java, the size of the types is one of the first things you have to learn if you don't want to make mistakes or write poorly performing code.
(cross-posted from the RFC; we're going with Design 1 with names isize/usize)
The core team met this morning to coordinate on the alpha release and to finalize a decision here. The discussion on the RFC thread, on discuss, and elsewhere has been vigorous and dug deep; thanks everyone who participated!
Summarizing the discussion
I won't try to fully summarize everything that has been said, but single out the most salient points:
@Valloric laid out a vivid description of Google's guidelines: use i32 for numbers "laughably smaller" than 4 billion; otherwise use i64. We will need to tweak these guidelines to also cover our pointer-sized types, but they greatly clarified that having a "default" int type is problematic.
The discussion on the RFC thread has covered the pros/cons of various renamings of int/uint. This is a difficult question at the intersection of clarity, learnability, and friendliness to newcomers.
Of course, many many other interesting points were made along the way, and I encourage those interested in the topic to read back through the threads for more.
The decision
In the end we chose isize/usize, partly due to arguments summarized by @1fish2: keeping the i/u prefixes makes it easier to understand that they are part of Rust's family of integer types and follow the same rules. Similarly, @iopq's point that "having isize/usize seems like ONE additional type with a prefix" makes a compelling argument against size/offset, which appear to say that the two types have a different relationship than our other prefixed types.
On the other hand, seeing usize in the context of slice indexing or as the return type of the len function is unlikely to cause much surprise for newcomers. A type like umem, by contrast, is likely to raise eyebrows. Since "size" is general enough to refer both to the size of the address space and to the size of a container and its indexes (which are, of course, closely related), and is reasonably intuitive, we feel it is the best choice.
The plan
Here's what needs to happen next:
@CloudiDust, can you please update the RFC a final time, giving isize/usize as the "Detailed design" and leaving the others as alternatives? Once that's done, I will merge the RFC.
We will introduce isize/usize before the alpha release, and deprecate int/uint.
I will very soon post an RFC proposing formal conventions around integers, based partly on @Valloric's comment above. These do not need to be approved before alpha, but should be approved ASAP.
During the alpha cycle, we will revisit all uses of int/uint and change to a specific integer type based on the conventions above. This is one of the few places where we anticipate changes to #[stable] APIs during alpha. (The broader story about the alpha cycle will be detailed in the upcoming alpha announcement.)
By the beta release, we should be ready to remove int/uint.
Separately, we should consider ergonomic improvements to ease the pain of having no âdefaultâ integer type. See this thread for early discussion of a number of ideas. These changes are backwards-compatible and can be made more slowly over time.
I strongly agree - just having something like "Type int not found, did you mean i32?" should be enough to get beginners, or people "expecting int to just work", nicely on their way. If the compiler gives them the exact answer they're looking for, that should be rather friction-free.
Perhaps take it even further and give a long detailed message?
Something like:
The 'int' type has been deprecated. You will have to explicitly specify how large you want the integer type to be, for example:
- i16: A 16-bit integer type that can fit numbers up to 32,767, or 65,535 for the unsigned version.
- i32: A 32-bit integer type that can fit numbers up to 2,147,483,647, or 4,294,967,295 for the unsigned version. This should be your default choice unless you need larger numbers.
- i64: A 64-bit integer type that can fit numbers up to 9,223,372,036,854,775,807, or 18,446,744,073,709,551,615 for the unsigned version.
- isize: Either i32 or i64, depending on your system's pointer size.
Seems a bit overly verbose? Explaining how big 16-64 bit integers are is probably not the best thing to shove into an error message that you want people to read.
I think the options presented here are lacking. The size of the type matters less if you are more likely to write code that is actually size-agnostic. If I understand the current language constructs in this area correctly, it is about as difficult as in C (or Java, without resorting to the Math.…Exact() methods) to write size-agnostic code. For Rust-only code, this does not have memory safety implications, but when interfacing with unsafe code (or writing low-level memory-management code), it does.
Programmers really want to write code like this:
if a + b < c * d {
    // Only executed if the inequality holds over the integers.
}
More language support is needed to actually cover the overflow cases, but I expect that this has a decent chance to help with settling the int/uint issue.
Design 4 goes in that direction, but one would have to avoid those explicit casts because they imply truncation.
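As a concrete (non-normative) illustration of how the overflow cases can be covered without truncating casts: in Rust as it exists today, widening the operands before doing the arithmetic makes the comparison exact, since i64 can hold any sum or product of two i32 values. A sketch, assuming i32 inputs:

```rust
// Sketch: evaluate `a + b < c * d` as if over the mathematical
// integers, by widening i32 operands to i64 first. Neither the
// sum nor the product of two i32 values can overflow an i64.
fn lt_over_integers(a: i32, b: i32, c: i32, d: i32) -> bool {
    let lhs = i64::from(a) + i64::from(b);
    let rhs = i64::from(c) * i64::from(d);
    lhs < rhs
}

fn main() {
    // In plain i32 arithmetic, `i32::MAX + 1` would overflow; the
    // widened comparison still gives the mathematically correct answer.
    assert!(lt_over_integers(i32::MAX, 1, 2, i32::MAX));
    assert!(!lt_over_integers(i32::MAX, 1, 1, 1));
}
```

Note that the widening conversions here are lossless `From` conversions, not the truncating casts the comment above warns against; for types wider than i64 one would need checked arithmetic (`checked_add`/`checked_mul`) instead.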
I am in favor of the status quo, which is, as I understand it:
No default int type. Users use explicitly sized integers 100% of the time. Overflow errors are their fault, and are architecture independent.
Code correctness does not depend on architecture, unless your correctness depends on pointer size, which is again your fault, since you used a pointer-sized int for something you probably didn't want to be pointer-sized. (In the docs we should remind people who are new to systems programming that pointer size can be very small or very big depending on the architecture.)
Some fixed-size int type(s?) are getting used in various parts of the standard library, as appropriate. So sometimes you need to cast explicitly, which is as it should be. If I chose a mismatched integer type that results in a lot of conversions, I ~want to know about it. The tendency of users to avoid typing will naturally result in avoided integer conversions.
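To make the kind of explicit cast this implies concrete (the example values here are mine, not from the thread), a small sketch:

```rust
fn main() {
    let v = vec![10, 20, 30];

    // `len` and indexing use the pointer-sized unsigned type, so an
    // index computed in a fixed-size type needs an explicit cast.
    let i: u32 = 1;
    assert_eq!(v[i as usize], 20);

    // Going the other way, storing a length in a fixed-size type also
    // requires a visible (and, in general, potentially lossy) conversion.
    let n = v.len() as u64;
    assert_eq!(n, 3);
}
```

Each `as` cast is a visible marker of an integer-type boundary, which is exactly the friction being described: annoying enough to notice, and therefore a nudge toward picking consistent types.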
So, my understanding is that the status quo is a well-thought-out good one, and this discussion was already decided about a month(?) ago. (Please do correct me if I said something wrong.)
If there are people who ~really want to re-open the discussion on ints, it should probably be a new thread, as the context for most of the (almost 200!) comments in this thread was made obsolete by the removal of int/uint.