Bigint as a default fallback for integer type?


#1

I believe this is correct default. This avoids overflow errors, but also lets people declare their i32 types if they need them.

The default type for division results should be rational (I mean represented by two bigints in a ratio to be exact). That way you can represent invalid values (1000/0) as error values without panicking. When they are printed they could be “not a number” or whatever you want to do with that.

That means that you can do any simple arithmetic without overflowing or getting weird results. This would prevent more logic errors for people who are writing application code. If you want to write fast code with exact sizes like f64, you’re welcome to do that. If you want to do square roots, you can get your result in some f64 type or whatever you want.

But the default fallback type should really be fool-proof. You don’t want space shuttles crashing because of integer overflows. It costs you almost nothing when the numbers are small. You get correct results when they are big. Seems like a win-win. And if you need to overflow for hashing, again, you have your i32/i64 types.

What do you guys think?


#2

This violates the principal of fail fast. It’s better to panic when the error occurs than to accumulate error and panic (or, worse, do something wrong) at some later point. If you want to avoid dividing by zero, just check.

  1. You’d always need to check for overflows and check to see if your big int is really a big int (at least three extra instructions per math operation by my count).
  2. You’d either need 128bit wide numbers (64 bits for a small number and 64 bits for the pointer to the rest) or a 64bit wide number and a lot of tagged pointer magic.

You should take a look at this related RFC: https://github.com/rust-lang/rfcs/pull/146


#3

Yes, you’d do a jump on overflow flag. The numbers by default would be 31 bit ints with a 1 bit tag. Pointers end with 0, so the least signicant digit being a 1 is a tag bit.

When you do any number operations you’d shift do the corresponding arithmetic and shift back and fill a 1. This is super cheap because it’s done in registers. Or if it’s a pointer obviously you’d do a more expensive operation (which is a good thing because it saves you at runtime, otherwise you’d crash).

You can’t really check at compile time if you’re going to overflow because you don’t know what your inputs are most of the time. But when you need a bignum and you overflow and your program crashes - that’s already on the user’s computer. It’s too late. Your shuttle already crashed.

Also, the benefit of fractional numbers is that they can represent currency correctly. Safeway just had a price advertised of three items for $5. When I bought exactly three items, I got charged $5.01. This is possibly false advertisement. It wasn’t that the unit price was $1.67, the ad clearly stated I get 3 items for $5

and who knows how many people implement things dealing with currencies in FLOATS… I think the default type should be a rational number that allows you to store $96.73 as 9673/1000 internally instead of a binary floating point number (again, if you want 96.73f you can have it)


#4

I agree with the sentiment - people should use BigInts more - but making them the default might be taking it too far.

Notably BigInts allocate memory. How is that memory managed? Options are (a) like Box, in which case unlike the built-in integer types they are not Copy, (b) with tracing garbage collection, in which case they incur tracing garbage collection (an easy choice in Haskell, less so in Rust), or © they use reference counting and have a compiler special-case to let them be Copy nonetheless, like the old Gc.


#5

That’s a good point. Then the default type of an integer literal would be 'static i32 if it fits, and Integer if it doesn’t fit. If you need to use a dynamic integer, everything uses Integer, including arithmetic operations (unless you want to do some at compilation time)

This would mirror String and str


#6

IEEE 754 floats do that, and it’s in my opinion often the right thing to do (e.g. if you want to calculate the relative frequency of two types of event when you might sometimes have no events), you just have to have some special “not-a-number” value. Of course whether 1/0 should be considered infinity or not-a-number and how to handle “x > y” when one might not be a number (or indeed infinity, where the sign is not always clear) is debatable.

I don’t this should happen by default (often enough when I do a/b with integers I want this to round down), but easy ways to use such numeric representations could be useful. I can also imagine one might want some way of ensuring that all literals are considered bigints or some such and not ints.

There will always be bugs and lazy approximations. BigInts and fractions cannot assure correct arithmetic in any case, but even if they could this kind of thing is not sufficient rational to force them on everyone.


#7

They do in the case of addition/multiplication/division/subtraction. You only need floating point for irrational constants like pi or taking roots, so something that you would find in a math library.


#8

Rational arithmetic can very easily have linear space blowup – e.g. the k-th harmonic number is more than k bits long, which can easily become an obscure leak – you can accidentally get 10000-bit denominators, and then arithmetic becomes super slow.


#9

If you need fast, nobody is stopping you from using floating point numbers. In the case of 0.1 + 0.2 you don’t really worry about performance. But floating point numbers are a premature optimization, since 0.1 + 0.2 doesn’t give you 0.3 which is surprising and unintuitive.


#10

Assuming the starting conditions are correct, assuming one never has infinite sequences, assuming one either spends a lot of time proving that run-time and memory use have certain bounds or that one has unbounded memory and doesn’t care if in some cases operations take years to complete, assuming that 0 is never used except as a numerator… wait, were you not trying to make an absolute statement?

Jokes aside, that’s a very limited set of operations you’ve allowed and you’ve already made the trade-off of allowing unexpectedly long run-times. I doubt you’re going to win many people over like that.

Anyway, you’re ignoring a common use case: integers as array indices. I might well want to do something like this: event_counts[seconds/60] += 1; — and in this case I definitely do not want a fraction.

You might better argue that all monetary units should be stored in units of 1/60th ¢ or similar, so that common fractions can be represented exactly (e.g. 5/3 € = 10000 units). Obviously it’s not a perfect solution, but lets face it there isn’t one — and at least this doesn’t cause any major slowdowns or behavioural changes to other uses of numbers.


#11

then you can do floor or introduce \ or // for integer division like many higher level languages

honestly event_counts[floor(seconds/60)] += 1; looks completely reasonable considering that most of the time 5/2 should be 2.5 and only in SOME cases you want to round it (and I’d rather have the method of rounding be explicit and not implicit floor() because it’s i32 division)

5/2 = 2 and 5.0/2 = 2.5 is brain damage from C, nobody expects math to work this way except people who are familiar with programming languages that do this.

If your divisor is 0, then that’s great! You can easily propagate the error as 1/0 and do nothing with it. When the value is printed it will just say not a number and that’s it. What do you do when you do division of 2i by 0i? The program panics.


#12

Hmm, that computer numbers act like the natural or real numbers known to mathematics has always been an illusion with known limitations. Perhaps you’re right though that deepening the illusion is a good idea (in principle, at least). Now you just need to find a way of doing that which doesn’t cause too much breakage, slowdown, or irritation from developers expecting different behaviour.


#13

Insert a bigint library inside your binary just because you forgot a suffix? Overkill.


#14

If you’re not telling the compiler how wide your variable is, then why should you have any expectation of it being a certain size like 32 bits? In a few years even mobile will switch to 64 bits and this choice would be wrong as a default.

I think the compiler should be free to give you any size of integer it wants if you don’t care to specify. Maybe it can prove that let i = 1 never grows beyond 32 bits and it can just make it 32 bits because it’s immutable anyway.

Now if you have a number that you can’t predict the value of (user inputted or calculated based on something at runtime), why would it be wrong to assume that the safest size is bignum? I mean you could just turn the feature off by saying #[no_bignums] or something and it will make you put a suffix on every number. Or maybe it will just fall back to i32/f64 as defaults.


#15

32 bit is a sensible default for 64-bit machines too, see the integer fallback to i32 RFC.