Subscripts and sizes should be signed

It is a rather difficult task. If we collect the arguments made for unsigned indices, I can divide them into these categories:

  1. Myths, such as "signed indices require 2 cmp instructions" or the "compile-time invariant" argument.
  2. Theoretical concepts and ideology, such as "it is better to bound the possible values at the type-system level". My practice tells me that in this context it is wrong.
  3. Interop with unsigned code. For example, the history with pointer.add and pointer.offset: offset (with its isize argument) is not usable directly when you have an unsigned number.
  4. Rare cases where we compile for 16-bit targets and 32k elements are not enough.

Most of the comments are "I have no trouble with usize indices, and it is too hard to change legacy code". But if Rust had been isize-indexed from the beginning, the same people would have had no trouble with isize indices either.
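
To make points 1 and 3 concrete, here is a minimal sketch. The helper names in_bounds and read_both are made up for illustration, and the check assumes that len never exceeds isize::MAX. A negative index reinterpreted as usize becomes a very large value, so a single unsigned comparison rejects both negative and too-large indices. The second function puts add (usize argument) next to offset (isize argument).

```rust
/// Hypothetical helper: bounds check for a signed index using one comparison.
/// Assumes `len <= isize::MAX`, so a negative `i` cast to usize exceeds `len`.
fn in_bounds(i: isize, len: usize) -> bool {
    (i as usize) < len
}

/// Reads element `n` twice: once via `add` (usize) and once via `offset` (isize).
fn read_both(data: &[i32], n: usize) -> (i32, i32) {
    assert!(n < data.len());
    let base = data.as_ptr();
    // SAFETY: `n` was checked against the slice length above.
    unsafe { (*base.add(n), *base.offset(n as isize)) }
}

fn main() {
    assert!(in_bounds(3, 4));
    assert!(!in_bounds(4, 4));
    assert!(!in_bounds(-1, 4));
    assert_eq!(read_both(&[10, 20, 30, 40], 2), (30, 30));
}
```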
2 Likes

Or we can make vectors, points, and translators all different types. The translation representing a point is then T = sqrt(p * origin), while the direction vector would be d = p - origin.

Signed indices imply signed length and signed capacity. So what does a length of -5 and capacity of -22 mean?

It means that the object is broken.

This seems wrong to me. Some previous comments show that center-based indexing, where index 0 means the center of the array, makes sense in some contexts. In that case you clearly can index with a -n/2..n/2 range, but the length is still never negative. Furthermore, structures such as HashMap are indexed with keys of arbitrary types, with no implication for size or capacity. Moreover, a Vec is not going to be indexed at usize::MAX any more than it would be at -1. The normal range of indexing lies in both isize and usize, so both are reasonable choices.

I believe the arguments for negative indexing are actually arguments for some cursor type over arrays. For example, an array with the cursor at the center gives the center-based indexing above, and a cursor at -1 gives a 1-indexed vector, which lets you port code from MATLAB, Maple, or a similar language without re-indexing everything. But this can be done in a crate, and to be included in std, that crate would have to be very successful.
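
A rough sketch of what such a cursor type might look like. Everything here (the name Offset, the constructors centered and one_based, the field origin) is hypothetical and not taken from any existing crate. The index is signed, while the underlying slice keeps its ordinary usize length.

```rust
use std::ops::Index;

/// Hypothetical view over a slice with a movable origin.
/// Physical position = logical index + `origin`.
struct Offset<'a, T> {
    data: &'a [T],
    origin: isize,
}

impl<'a, T> Offset<'a, T> {
    /// Logical index 0 refers to the middle element.
    fn centered(data: &'a [T]) -> Self {
        Offset { data, origin: (data.len() / 2) as isize }
    }

    /// Logical index 1 refers to the first element (MATLAB style).
    fn one_based(data: &'a [T]) -> Self {
        Offset { data, origin: -1 }
    }

    /// Checked access: returns None outside the underlying slice.
    fn get(&self, i: isize) -> Option<&T> {
        let pos = i.checked_add(self.origin)?;
        if pos < 0 {
            return None;
        }
        self.data.get(pos as usize)
    }
}

impl<'a, T> Index<isize> for Offset<'a, T> {
    type Output = T;
    fn index(&self, i: isize) -> &T {
        self.get(i).expect("index out of range")
    }
}

fn main() {
    let v = [10, 20, 30, 40, 50];
    let c = Offset::centered(&v[..]);
    assert_eq!(c[0], 30);
    assert_eq!(c[-2], 10);
    assert_eq!(c[2], 50);

    let m = Offset::one_based(&v[..]);
    assert_eq!(m[1], 10);
    assert_eq!(m.get(0), None);
}
```

The signed arithmetic stays inside the wrapper, so callers never write a conversion themselves.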

2 Likes

I would avoid using length math as much as possible, instead relying on pattern matching.

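    // Example input; any slice works here.
    let elems: &[i32] = &[1, 2, 3, 4];
    match elems {
        // Binds everything except the last element, with no length arithmetic.
        [start @ .., _] => {
            println!("The first {} elements are: {:?}", start.len(), start);
        }
        // Empty slice: there is no last element to split off.
        _ => println!("none"),
    }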
4 Likes

If you had a signed index, then a signed length and capacity would be good too, because otherwise you need to write more ugly conversions. I don't know why you like conversion boilerplate. Just to be unlike C++?

I say that signed indexing does not imply signed length or capacity. Perhaps it could be a good decision, but I have not seen strong arguments either way. I have not seen performance tests here, nor a crater-like analysis showing how many projects would be positively or negatively affected. If you are so interested in having this, you should make a crate with a well-designed vector, slice, cursor, or whatever you want. Then, if that design turns out to be convincing, std could be adapted.

I feel like this thread has been going around in circles for quite some time and I think this is the crux of the issue: You, @T4r4sB, have been writing code that needs a lot of conversions because Rust uses unsigned indexing (to first order).

Almost no one else in this thread writes code like that.

I cannot remember the last time I, personally, needed to do any sort of type conversion in the context of indexing. Not in Rust and not in C or C++ either (N.B. I have written a whole lot more C-family than Rust). I use usize / size_t index variables like the language wants me to and everything Just Works.

(There is one specific case that came up recently where I needed to write *(uintptr_t *)((char *)ptr - sizeof(uintptr_t)), but that was low level wizardry—an allocator implementation—and I expect to have to jump through some hoops for that sort of thing.)

Point being, you, @T4r4sB, are writing an unusual kind of code and you haven't shown us enough of it for us to understand why it makes most sense for it to be written the way you want to write it. The fragments you've shown are not sufficient, we need to get a sense of the overall design. Can you please show us the entire program? A link to a public VCS repo would be ideal.

5 Likes

It's because you cannot compare isize and usize in Rust with < and == and such. So if you want the obvious while i < len loops to compile, then the index and the length need to have the same signedness, thus the implication.
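
A small illustration of this, with a made-up function count_below. Rust has no implicit integer conversions, so the mixed comparison needs an explicit cast on one side.

```rust
/// Hypothetical example: counts indices in 0..len that are below a signed limit.
fn count_below(len: usize, limit: isize) -> usize {
    let mut i: usize = 0;
    let mut count = 0;
    // `i < len` compiles because both sides are usize.
    while i < len {
        // `i < limit` would be rejected with a mismatched-types error
        // (usize vs isize), so one side is converted explicitly.
        if (i as isize) < limit {
            count += 1;
        }
        i += 1;
    }
    count
}

fn main() {
    assert_eq!(count_below(5, 3), 3);
    assert_eq!(count_below(5, -1), 0);
}
```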

1 Like

So in your case it does not matter which type is used for indexing? I also see nobody who really needs usize rather than isize, while in my cases isize is much more useful. Sorry, I cannot share the project, but it is a case where I need to do math with indices and sizes. So there was no real reason to make the index usize, only idiom and false theoretical claims like "compile-time invariant". We need to remember this thread and save it for a future big redesign of Rust.

I have been following this thread loosely, and I have a question. What is your desired outcome of this thread? Is it what you stated originally?

If so, this simply isn't going to happen.

  1. You could potentially add impl Index<isize>, but then you need to handle the ambiguity (what is the type of 0 in x[0]?). That is a problem in and of itself that, to my knowledge, no one has cared enough about to attempt.
  2. This is a breaking change.
  3. Trait implementations cannot currently be deprecated. Unlike (1), this is something that is desired but no one has yet implemented it.
  4. This is a breaking change.

Setting aside what I believe is correct, (1) and (3) could theoretically happen if the requisite functionality in the compiler were implemented. You will likely have an uphill battle to fight if you want it to happen, but it would be within Rust's semver promises if it were to be done. (2) and (4), however, are breaking changes that would cause monumental disruption across the ecosystem, and as such have zero chance of being accepted. Not close to zero, but actually zero. If you're arguing for Rust 2.0, you will need something far more important than this. The mere fact that people disagree over whether this is something that's desired should be indicative to you that there is not the appetite to go to Rust 2.0.
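
The ambiguity in point (1) can be sketched on a user-defined type. Buf is a hypothetical wrapper invented for this example; the real Vec and slice impls are structured differently, so this only illustrates the inference problem, not how std would be changed.

```rust
use std::ops::Index;

/// Hypothetical wrapper that accepts both usize and isize indices.
struct Buf(Vec<i32>);

impl Index<usize> for Buf {
    type Output = i32;
    fn index(&self, i: usize) -> &i32 {
        &self.0[i]
    }
}

impl Index<isize> for Buf {
    type Output = i32;
    fn index(&self, i: isize) -> &i32 {
        assert!(i >= 0, "negative index");
        &self.0[i as usize]
    }
}

fn first_and_last(b: &Buf) -> (i32, i32) {
    // With two candidate impls, an unsuffixed literal such as `b[0]` no
    // longer infers to a single index type; as far as I understand, it is
    // rejected, so the literals below carry explicit suffixes.
    let last = b.0.len() as isize - 1;
    (b[0usize], b[last])
}

fn main() {
    let b = Buf(vec![1, 2, 3]);
    assert_eq!(first_and_last(&b), (1, 3));
    assert_eq!(b[1isize], 2);
}
```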

11 Likes

There is a rule that if we can use any integer type in calculations, we use i32 by default. We can go deeper: if we have many options, try to use i32, and if we can't, try to use isize.
There could be potential problems only if [ ] had radically different semantics for usize and isize, but that would just be a badly implemented class.

Current Rust is 2.0, because 1.0 was the version with GC and managed pointers, wasn't it?
The index type is too small a reason to switch to Rust 3.0, but if there are ever many reasons to create Rust 3.0, we can add the index type to the wishlist.

No, current Rust is 1.65.

2 Likes

There is a rule that if we can use any integer type in calculations, we use i32 by default.

Yes, we use i32 as the default if there is no good reason to use anything else. Notice that arrays are indexed not by u32 but by usize.

We can go deeper: if we have many options, try to use i32, and if we can't, try to use isize.

I have never seen this particular rule. Why should you pick isize in particular? Your quantity is either a number or an index. If it is a number, I would use i32. If it is an index, I might as well just use usize for it, so why should I use isize? There are plenty of reasons

Current Rust is 2.0, because 1.0 was the version with GC and managed pointers, wasn't it?
The index type is too small a reason to switch to Rust 3.0, but if there are ever many reasons to create Rust 3.0, we can add the index type to the wishlist.

Rust is 1.x. What you are talking about was 0.x, which was never seen as a finished language.

The current consensus is that a major version bump on an already finished language is a disaster (see Python 3) and should be avoided unless absolutely necessary. This stability is also a promise on which a lot of the trust in the Rust project is based. The best you can realistically hope for is a new edition.

1 Like

Thanks, I don't know Rust's history that well, but I remember that some old versions had 4 kinds of pointers, which are deprecated now: The Rust Pointer Guide. Which version numbers did they have?

If we can use any integer type as an index, that is OK too. In my practice there was only one case where a 32-bit signed integer was not enough, and that was due to an error in an algorithm that generated too many output results. Anyway, I think i32 is not universal.

There are some languages that simply designate isize as the "default" integer type. AFAIK Swift does this. If you designate a 32-bit fixed-size integer type as the "default" integer type (and there are many good reasons to do so), you absolutely must have one other integer type for array indexing, otherwise your program will not be very portable. All the people who had to "port" their software to 64-bit a couple of years ago learned that the hard way. It gets even worse when someone tries to run your program on a 16-bit architecture. (They are rare nowadays, but Rust does support one of them, which is actually used.)
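
As a small sketch of what that split looks like in Rust: i32 stays the default for plain numbers, usize follows the target's pointer width, and the conversions between them are explicit and checked. The helper names to_index and len_as_i32 are made up for this example.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: turns a default-typed number into an index,
/// rejecting values that do not fit into usize (for example negatives).
fn to_index(i: i32) -> Option<usize> {
    usize::try_from(i).ok()
}

/// Hypothetical helper: the reverse direction, which fails for lengths
/// that exceed i32::MAX.
fn len_as_i32(len: usize) -> Option<i32> {
    i32::try_from(len).ok()
}

fn main() {
    // The width of usize depends on the compilation target.
    println!("usize is {} bits wide on this target", usize::BITS);

    let v = vec![1, 2, 3, 4, 5];
    assert_eq!(to_index(2).map(|i| v[i]), Some(3));
    assert_eq!(to_index(-1), None);
    assert_eq!(len_as_i32(v.len()), Some(5));
}
```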

1 Like

We can look at Java's experience. It uses signed types only, with an i32 array index. As far as I can see, Java still does not need unsigned types, but it does need an i64 index on modern 64-bit machines.

Java decided not to have unsigned integers in general.

1 Like

Java added unsigned integer arithmetic in Java 8, so now an int can represent either a signed or an unsigned value, and there is no type safety.

2 Likes