Restarting the `int/uint` Discussion

Between the designs? I like Design 1, followed by (depending on how much I’ve had to drink) aliasing int to BigInt (Design 4), which works but has poorer performance that people eventually discover (and are thereby taught to think harder), or aliasing int to i8, which has hilarious consequences that people learn from.

Let’s say Design 1, then.

With respect to i64 vs i32, I think not having a system-supplied default is best. Example code can certainly go one way, but anyone using Rust for systems-like code is going to need to decide for themselves. Once they have, it is super easy to typedef int yourself if you have trouble remembering which one you use.
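
For concreteness, the typedef escape hatch is one line (a sketch in today's syntax; the alias name `Int` is purely illustrative):

```rust
// A crate decides once what it means by "integer" and aliases it;
// `Int` is an illustrative name, nothing language-sanctioned.
type Int = i64;

fn sum(values: &[Int]) -> Int {
    values.iter().sum()
}

fn main() {
    assert_eq!(sum(&[1, 2, 3]), 6);
}
```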

I personally prefer u64, and unsigneds generally, but have some spots where I use u32 due to the TLB savings. And, I don’t appear to have negative numbers anywhere in my code (seriously, no ints that I can see, just uints).

I might be answering the wrong question, sorry. :smiley:

2 Likes

That's actually an excellent argument. No matter what we alias int to, a good Rust style guide would recommend against its use (i32/i64 have the bit size right in the name). This reminds me of something a Very Senior™ engineer once told me, which is (paraphrasing) "any language design decision that leads to a style guide entry could probably have been done better."

9 Likes

Most of a typical program’s code is plumbing, which accounts for little of the running time but contains most of the bugs (low-level bugs especially). The security risk is always there, but the performance cost typically isn’t.

I'm increasingly coming around to this, personally. It seems pretty clear that there are domain-specific tradeoffs here. If you're doing IO-bound work, u64 is often a reasonable choice; the perf hit won't matter. If you're constrained by memory bandwidth, you need to be very aware of these sizes in at least some of your code for perf reasons. (And of course in principle you should always be thinking about overflow.)

Having no language-sanctioned default will also give us the freedom to explore Rust's use-cases and common patterns in a wide range of programs while we evolve guidelines. Guidelines and the ecosystem can evolve, but committing to int now means that if the common default ever changes, you have to fight against what the language itself is telling you.

My main hesitation about this is, without recommending a default, crates from different sources may use incompatible sizes, leading to a lot of casting when you mix them. Implicit widening would help, but may not be enough.
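
To make the friction concrete, here's a sketch with two hypothetical crates disagreeing on widths (the names are made up for illustration):

```rust
// Imagine crate A hands out u32 lengths while crate B expects u64 offsets.
fn handle_offset(_offset: u64) { /* stand-in for hypothetical crate B */ }

fn main() {
    let len_from_a: u32 = 1024; // stand-in for hypothetical crate A

    // Today the widening must be spelled out at every boundary,
    // either with a cast...
    handle_offset(len_from_a as u64);
    // ...or with the lossless (and safer) From conversion:
    handle_offset(u64::from(len_from_a));
}
```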

And of course there's the original worry of the post, that people not thinking clearly about overflow will hear "use i32, it's faster" and do so blindly. But again, that could be mitigated later by introducing a different default if it becomes a widespread problem.

2 Likes

Implicit widening combined with an explicit build-time failure on smaller architectures is the correct solution.
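
For what it's worth, the build-time half of that can be spelled in today's Rust with `compile_error!` (a sketch; the message is mine):

```rust
// Refuse to build on non-64-bit targets, forcing a deliberate porting
// decision instead of silently narrowed integers.
#[cfg(not(target_pointer_width = "64"))]
compile_error!("this crate assumes 64-bit targets; audit integer widths before porting");
```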

1 Like

This attitude bothers me a little bit. Systems programmers are not magically perfect programmers. The whole reason we're using Rust is to get help when doing low-level programming, and one of Rust's biggest opportunities is to enable a whole new generation of people to learn systems programming by helping them with ownership.

So part of the question here is, can we provide some guidance for when people (beginners and experts) inevitably screw up, to help ameliorate the damage? Can we do so without undue cost on everyone else?

The answer isn't totally clear to me, which is why not having a default for now seems like the best choice. For example, it may actually be quite reasonable to use a fast BigInt much of the time when you don't want to bother thinking about size, and use specific ints only in crucial places. It depends on the kind of software you're writing.

4 Likes

I phrased this a little glibly, but I really think of it from a perspective similar to yours - what can we do to help? For me, making integer sizes explicit and in your face, and making the programmer always think about overflow, is something that helps and makes it easier to avoid bugs.

The really useful thing we could do is static and dynamic analyses to prevent overflow (in much the same way that the borrow checker, RefCell, etc. help prevent memory errors), but I think that is not on the table at the moment. Other than that, I think being as explicit as possible is the best we can do.
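
A slice of the dynamic side does exist in current Rust: debug builds trap on overflow, and the `checked_*`/`wrapping_*` method families make the intent explicit at each call site (a minimal sketch):

```rust
fn main() {
    let a: i32 = i32::MAX;

    // Returns None instead of silently wrapping:
    assert_eq!(a.checked_add(1), None);

    // Two's-complement wrap-around, explicitly requested:
    assert_eq!(a.wrapping_add(1), i32::MIN);
}
```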

I’m not 100% sure what you mean by no default. I agree that we shouldn’t have a default, which to me means: no int type, and no recommending a particular type in tutorials etc. without considering the width.

I agree it is probably possible to use BigInts a lot more, but I don’t think we can make them the default in any way without really turning our back on a lot of systems programming. We could (and should) do as much as we can to make them as ergonomic to use as built-in integers, though. That (and the promise of no overflows) should make them a much easier choice in more situations.

2 Likes

There’s one point I want to make a little bit more crisply.

Throughout this discussion, I’ve talked about users who “don’t want to care” and just want an int that “works”. I think that isn’t the best explanation for what I have in mind.

Often, when I start working with an integer, I simply don’t know yet how big the integer will become. I could, of course, stop what I’m doing and think about it and try to figure it out, but I usually pick a really big integer size and restrict the size later if I have a better sense of how big the integer might become. In this sense, i32 isn’t much different than i16 or i8. If I have the cycles to think through it, I try to figure out whether the number I’m working with can fit into a smaller integer.

This is, in my mind, the use-case for a default integer. Not a sloppy “I don’t know anything about sizes” type, but a “I don’t know how big this will be, so let’s go with something reasonable” type.
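
That workflow can even stay checked when the narrowing happens, e.g. with `try_from` (a sketch; the value and the target width are arbitrary):

```rust
fn main() {
    // Phase 1: no idea how big this gets yet, so go wide.
    let total: i64 = 40_000;

    // Phase 2: once the bound is understood, narrow it and let the
    // conversion fail loudly if the assumption turns out wrong.
    let narrowed = i32::try_from(total).expect("total no longer fits in i32");
    println!("{}", narrowed);
}
```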

I think if your experience is that 32-bit overflows are unicorns, the answer you’ll give to that question is i32. If you’ve experienced gigabytes of data loss due to corruption caused by 32-bit overflow, you’ll tend towards i64.

Also, if your program is IO-bound, you’ll still get a lot out of the very strict memory management in Rust (making the costs of Rust generally worth it). You still might not care very much how big your integers are, and will be more willing to go with a suitably large integer if you aren’t yet sure how big they’ll get.

If your program is CPU-bound and integer-heavy, you probably care a lot more about integer sizes, and will be willing to spend more time up front carefully thinking about precisely how much space you actually need, and therefore will find the defaults far less useful.

With all of that said, that mostly argues for revisiting this question once we have reasonably fast BigNums, which are possibly the right answer to the question: “I don’t yet know how big this will become but I need an integer now”.

2 Likes

I mainly mean not having a type called int -- not committing to a default in the language itself. As far as the guidelines, I think this discussion has revealed that it's a nuanced topic and I suspect the guidelines would reflect that. But they can also evolve over time, unlike the definition of int.

I absolutely agree about ergonomics (and perf as well); BigInts should have first-class support in Rust.

You may also be right that sanctioning them as a default in any sense would be a bad move in terms of appeal; I don't really know. But if we sanction no language default now, we at least leave the door open.

3 Likes

While a good amount of the integers used in a program are bounded by some fraction of the available address space, some aren’t, and this would inject subtle bugs. That is made more dangerous because developers are more likely than real users (and, more significantly, attackers) to compile for 64-bit and to use smaller datasets. C’s traditional integers are semi-deprecated for a reason.

The trouble with BigInts is that, irrespective of their speed, all structs containing them become non-Copy, which Rust doesn’t handle prettily (because you need lots of calls to clone). u64/i64 should be hard enough to overflow in practice – it should be the default integer for code that isn’t performance-critical. If one is chasing performance problems, bounding integers and using smaller types is the way to go, hopefully after the bounds’ correctness has been checked thoroughly, at least at runtime.

3 Likes

The main problem with BigNums isn’t that they are slow, but rather that Rust makes them ugly to use, as they require calls to clone everywhere, which won’t be going away anytime soon.
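
To illustrate (a sketch using the num-bigint crate as a stand-in bignum, assumed to be listed as a dependency):

```rust
use num_bigint::BigInt;

fn double_i64(x: i64) -> i64 {
    x + x // i64 is Copy: reusing x is free and silent
}

fn double_big(x: &BigInt) -> BigInt {
    x + x // operators on &BigInt exist, so borrowing helps...
}

fn main() {
    let a = BigInt::from(7);
    let d = double_big(&a);
    let e = d.clone() + &a; // ...but any by-value reuse needs an explicit clone
    println!("{} {} {}", double_i64(7), d, e);
}
```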

Design 1

There is absolutely no reason to continue the legacy of the “int” type name. We might as well keep calling i16s shorts, i64s longs and, for fun, let’s call i8s petites.

Programmers new to the language will quickly and easily adapt to the more straightforward naming. Let’s continue on the path of making the costs of using one type over another obvious. Instead of int, go ahead and add an “i0”, “iM”, or “iWhatever” type. Make it a bit more awkward to use so that it is obvious exactly what is going on.

Rust is going down the track of making abstractions and hidden costs obvious. Let’s continue this ride. Why hide the true identity of a type behind an alias when the straightforward method is so much more powerful?

To echo @loonyphoenix, I came away with this exact same understanding after reading the guides/examples. As a beginner, option 1 seems to be the best choice. Steering beginners toward safe practices from the start is what I expect from Rust. And while I doubt it can be strictly enforced like the borrow checker, I would prefer to be guided in the right direction rather than just sweeping the underlying "problem" under the rug (option 2).

1 Like

That is an excellent point; no matter how nice we make pure BigInt computation, the fact is that they're still non-Copy data which is a pretty fundamental difference from e.g. i64, and has (IMO) unavoidable ergonomic consequences.

Actually, this isn't quite right: you could imagine marking certain types as "implicit Clone". But that would be a massive departure from Rust's current design and conventions.

This means that there's no point in future-proofing int for a faster BigNum.

Yes, I agree.

Please no "implicit Clone." We have that in C++, and expensive copies you can't easily see are still a big problem. Far less so now with C++11 and move ctors, but avoiding hidden expensive copies is still a big part of C++ cruftiness.

Knowing that deep copies only happen explicitly after a .clone() call is a massive benefit of Rust over C++. Every time I mention this to other experienced C++ devs their eyes light up.
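
The difference is easy to show (a minimal sketch):

```rust
fn consume(v: Vec<u8>) {
    let _ = v.len();
}

fn main() {
    let data = vec![0u8; 1024];

    consume(data.clone()); // the deep copy is visible at the call site
    consume(data);         // this is a move: no hidden copy happens

    // consume(data);      // and use-after-move won't even compile
}
```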

‘int’ as an alias for ‘i64’ falls on the wrong side of both reasons the name ‘int’ is desirable - it is neither a true mathematical integer nor is it familiar to users coming from other languages. It also still assumes there’s something magical about the name ‘int’ that’s more desirable than ‘i64’, but what is that magical thing? It’s the same to type, it’s less explicit, it carries historical baggage, and it’s redundant with an existing type, creating confusion about which of the two you should pick. If you think the convention should be 64-bit until proven otherwise, there’s no reason that convention couldn’t apply equally to the name ‘i64’ as to ‘int’.

What is the actual concrete advantage of aliasing?

If this line of reasoning proves to be flat-out wrong, adding it as an alias can be done backwards-compatibly post 1.0 (if you keep ‘int’ reserved). There’s no reason to commit to this now.

1 Like

Yes, I agree, I was just saying it's possible to improve the ergonomics this way, but it'd be a huge change to Rust, and by no means a clearly desirable one.

The Guide uses int simply because that is what the integer-literal fallback used to be, before the fallback was removed. It has always been pending this decision for an update.