Subscripts and sizes should be signed (redux)

You mean modular arithmetic? Not really. That would be Wrapping<u64>, not u64.

I don't think this makes sense. p-adic integers are an extension of rational numbers. For instance, in 2-adic integers, 1/3 = ...0101011. This doesn't have much to do with u64.

u64 represents natural numbers, with a resource limit.

1 Like

The distinction between Wrapping<u64> and u64 is only salient in debug mode, but point taken. Anyway, for p-adics, I wrote that it's useful to think of u64 that way sometimes and I stand by that.
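A minimal sketch of that distinction (the wrapping behavior is guaranteed for `Wrapping<u64>` in every build mode, while plain `u64` addition panics on overflow only in debug builds):

```rust
use std::num::Wrapping;

fn main() {
    // Wrapping<u64> states the modular-arithmetic intent in the type:
    let a = Wrapping(u64::MAX) + Wrapping(1);
    assert_eq!(a.0, 0); // wraps in both debug and release builds

    // Plain u64 addition panics on overflow in debug builds only;
    // checked_add surfaces the overflow explicitly in both:
    assert_eq!(u64::MAX.checked_add(1), None);
}
```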

1 Like

There’s also the important pragmatic difference that the former actually reflects the programmer’s intention (to do modular arithmetic), whereas in the latter case it doesn’t really matter what happens on overflow because any value returned is erroneous and the program almost certainly has just entered an invalid or at least unexpected state. Even though behavior on overflow is well-defined, it is not a salient part of the program’s semantics. That is, unless the programmer has intentionally turned overflow checks off also in debug mode to get ergonomic wrapping integers…

Yeah, I think Rust developers recognize it as one of the mistakes that got baked into the language. BTW it's not only about experimental architectures: the Elbrus architecture has had "protected mode" (quite similar to the CHERI experiment) for quite some time, and it has even helped to find a bug in tar. Unfortunately, there is a lot of software that cannot run properly in protected mode (and it also hurts performance a bit), so it's rarely used in practice.

4 Likes

Usually if you do that with an index, then what you want is i > 0. If you aren't quite sure what the range is, and it so happens that i == 0 sometimes and you do that by accident without realizing what you have done, a panic is helpful to debug the problem.
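A small sketch of that pattern: iterate backwards with an explicit `i > 0` guard, and note that an accidental decrement at `i == 0` would be caught rather than silently producing a huge index (`checked_sub` is used below to show the failure in a build-mode-independent way):

```rust
fn main() {
    let v = [1, 2, 3];
    let mut i = v.len();
    // Iterate backwards with an explicit i > 0 guard, decrementing inside:
    while i > 0 {
        i -= 1;
        let _ = v[i];
    }
    // An accidental i - 1 at i == 0 would panic in debug builds;
    // checked_sub makes the same failure visible in any build mode:
    assert_eq!(i.checked_sub(1), None);
}
```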

Yes, in some scenarios you may want to do that deliberately and natural numbers aren't sufficient for some intermediate calculations that eventually come back out as natural numbers. That's not the common scenario.

In those rare cases where that's what you want, it's not a big problem to convert to whatever you need explicitly. I'd even say it has value for explicitly documenting what you're doing. Maybe you want isize. But maybe you want something else, such as Wrapping<usize>, or i128, or Rational etc.

A similar scenario is when you're solving 3rd degree polynomial equations in real numbers. You have to move to Complex numbers in your intermediate calculations. Well, so you convert. That doesn't require that you use complex numbers everywhere by default.
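A sketch of the "convert explicitly" approach for indices (the cyclic-shift use case is an illustrative assumption, not from the thread): the intermediate value dips below zero, so the arithmetic is done in `isize` and converted back to `usize` at the end.

```rust
fn main() {
    // An intermediate value goes below zero, so the calculation is done
    // in isize and converted back to usize at the end:
    let len: usize = 10;
    let i: usize = 2;
    let shift: isize = -3; // e.g. a cyclic shift to the left
    let j = (i as isize + shift).rem_euclid(len as isize) as usize;
    assert_eq!(j, 9);
}
```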

5 Likes

This is inaccurate. Slice indices can validly be as large as usize::MAX just fine. The isize::MAX limit is to the maximum size of any single allocation. Thus slices of zero-sized types may have a length greater than isize::MAX with no issues.

Furthermore, the actual upper limit on valid slice length is lower than isize::MAX when the element type is larger than a single byte, and on many architectures the actual limit on allocation size is multiple orders of (binary) magnitude below isize::MAX, so even byte slices can't validly be that long. (Try using an array type that big, and rustc will complain that the type is too big for the architecture.)
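A contrived demonstration of the zero-sized-type case (it needs `unsafe`, but is valid per the documented safety contract of `slice::from_raw_parts`: the total size in bytes, here 0, must not exceed isize::MAX):

```rust
use std::ptr::NonNull;

fn main() {
    // For a zero-sized element type the slice occupies 0 bytes, so the
    // isize::MAX byte limit is trivially satisfied even at len == usize::MAX.
    let p: NonNull<()> = NonNull::dangling();
    let s: &[()] = unsafe { std::slice::from_raw_parts(p.as_ptr(), usize::MAX) };
    assert_eq!(s.len(), usize::MAX);
    assert_eq!(s[usize::MAX - 1], ()); // a valid index near usize::MAX
}
```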

1 Like

Isn't this whole topic kind of pointless? None of this can be changed now without breaking backward compatibility if I understand things correctly.

And even if an edition could be used for this (which I doubt, how would indexing a type from another crate compiled with a different edition work?), it would be a massive change that wouldn't be worth it.

6 Likes

In math, the result of subtracting two natural numbers is an integer. In Rust, the result of subtracting two natural numbers is a natural number, or quite likely a program crash.

If the goal was to represent the mathematical abstraction in the language, it was done wrong.

I can continue:

"RAII is a bad concept because it's not a big problem to delete whatever you need explicitly. I'd even say it has value for explicitly documenting what you're doing."

That is absurd. Of course I think RAII is a very good concept, but with "it has value for explicitly documenting" you can justify any absurd claim.

No, there are different versions of each arithmetic operation, defined in different domains.

A more commonly seen example is the square root: you can define it only for x >= 0 and leave sqrt(negative) undefined (and that's how f32::sqrt does it), or you can define it for negative numbers also, or you can define it for all complex numbers, or even for whole matrices of numbers.

Similarly, when dealing with arithmetic operations on natural numbers, even in pure math, one commonly defines a subtraction operation that is only defined in the domain a >= b since that is the inverse of natural number addition. Extending it to give you negative numbers only really is useful if you also extend the domain to make it work on all pairs of integers.
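In Rust terms, `checked_sub` behaves like that partial subtraction on naturals: defined when `a >= b`, undefined (`None`) otherwise.

```rust
fn main() {
    // checked_sub is the partial subtraction described above:
    // defined when a >= b, None otherwise.
    assert_eq!(5u64.checked_sub(3), Some(2));
    assert_eq!(3u64.checked_sub(5), None);
}
```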

3 Likes

I think there is value in discussing whether decisions already made make sense. It's useful for how you advertise a feature: "it's a good feature" vs "we're stuck with it only for legacy reasons". You can also change documentation and give advice -- for instance, OP suggested advising users to use isize as the default integral type. One could also slowly phase out standard arrays and slices etc in favor of some alternative type. C++ has basically done this over time, replacing C-style arrays with std::array.

Ok, nice catch, I put it badly. So mathematical natural numbers are a bad type for a programming language, because nobody expects that subtracting two numbers can crash the program.

You missed the "rare" part, and skipped the part where I said that the panic is desired and useful in most cases.

The difference between negative indices and RAII is that with RAII you want to release the resources by default (if you use RAII, releasing the resources eventually is what you normally want), whereas for indexing you don't typically want the index to go negative by default.

2 Likes

What does "rare" mean? How often does a Rust program crash because of signed numbers? Infinitely more rarely. Panicking is bad when the logic genuinely works fine with negative numbers.

By the same absurd logic: "without RAII you have to think very carefully about your architecture, so a non-RAII language makes your program better."

Whereas unsigned numbers are bad, because I don't typically want a program to crash in simple arithmetic logic.

Is there any code out there that uses giant slices of zero-sized types?

This has come up before in this thread. Panicking is not necessarily what you want. With signed indices you could write something like x.get(i - 1).unwrap_or_default() when you need this operation. What makes you think this scenario is uncommon? Some people use & and count_ones() all the time; other people probably go years without using them. Some Rust programmers probably exclusively use iterators and never use index notation.

I agree that it's not a big problem because you could write x.get(i.wrapping_sub(1)).unwrap_or_default(). But it is a little annoying. For the kind of code I write small negative numbers come up a lot more than integers near 2^64. Both of these are illegal indices most of the time and I'd rather my index type have the former rather than the latter.
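A runnable version of that workaround (note that `get` returns `Option<&T>`, so a `.copied()` is needed before `unwrap_or_default()`):

```rust
fn main() {
    let x = [10u32, 20, 30];

    // i - 1 would panic in debug builds when i == 0; wrapping_sub(1)
    // yields usize::MAX instead, get() returns None, and we fall back
    // to the default. (get returns Option<&T>, hence the .copied().)
    let i = 0usize;
    assert_eq!(x.get(i.wrapping_sub(1)).copied().unwrap_or_default(), 0);

    let i = 2usize;
    assert_eq!(x.get(i.wrapping_sub(1)).copied().unwrap_or_default(), 20);
}
```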

2 Likes

There is a path forward though. A lot of the breakage is around type inference, we need this code to compile:

let a = [1, 2, 3];
let i = 1; // what is the type of `i`? (It needs to be `usize`)
a[i];

If there were a way to do

impl<T, I> ops::Index<I> for [T]
-     where I: SliceIndex<[T]>,
+     where I: Into<SliceIndex<[T]>> ≋ default usize,
{ ... }

Then this discussion becomes more reasonable.

1 Like

It's obviously less common than the scenario where you don't want to use negative indices in a Vec or a str. The question isn't "are negative indices more common than numbers near 2^64", the question is "are negative indices more commonly valid, or are they more commonly a bug".

For your example, I don't know your exact use case, but let's say you want to have a data structure that can be indexed infinitely to the left and to the right, with a signed index (say isize), and starts out with Default values everywhere, kind of like the tape in Turing machines.

What you can do in that scenario is implement this abstraction as a new type, struct InfiniteTape, using VecDeque internally, and implement Index<isize> and IndexMut<isize> on it. You will need one or two type conversions for this in your whole program, rather than having to repeat this logic in many places.

3 Likes

I don't buy this at all. In fact, I don't even buy the premise of the title of this thread.

Conceptually array[-1] is incoherent. That e.g. Python uses it for indexing from the end is a convenience, but doesn't change that basic fact.

But even if it were coherent: Rust is stable. It will not change this, because moving from usize to isize would e.g. mean that if I created a Vec of length between isize::MAX and usize::MAX, that would suddenly break.

This is arguing against a straw man. They’ve already pointed out that they think RAII is a good thing.

If the discussion is going to go anywhere productive, you need to try to understand other people’s point of view instead of arguing against exaggerated, “absurd” versions of it.

6 Likes

I am not arguing that x[-1] should be legal. I am arguing that the type that represents indices should include the value -1 and that is more useful than usize::MAX. I'm not sure what it is you think is "obvious." The get example is just one example. For another, you can't represent the difference between two array indices with usize. I want to do more with array indices than use them as arguments to Index::index.
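To make the index-difference point concrete, a small sketch: `usize` cannot represent `i - j` when `j > i`, so you either convert to a signed type or make the failure explicit.

```rust
fn main() {
    let (i, j) = (3usize, 7usize);
    // usize can't represent i - j here; convert to a signed type first:
    let diff = i as isize - j as isize;
    assert_eq!(diff, -4);
    // Or keep usize and make the failure explicit:
    assert_eq!(i.checked_sub(j), None);
}
```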