Extend range notation to allow the equivalent to matlab's "1:(end - 5)"

oli-obk · January 16, 2015, 1:28pm

In matlab you can say

my_array(1 : end - 5)

and it will give you all the elements in the array except for the last 5. In Rust this is a little more verbose, and is not usable to obtain mutable references (the latter is not a good reason, as this will be fixed in the future).

&mut my_array[0..(my_array.len() - 5)]
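For comparison, here is a minimal sketch of the explicit form wrapped in a helper. The name `all_but_last` is made up for illustration, and clamping with `saturating_sub` is an assumption about what should happen when fewer than `n` elements exist.

```rust
/// Returns a mutable view of everything except the last `n` elements.
/// `saturating_sub` clamps at zero, so dropping more elements than exist
/// yields an empty slice instead of an underflow panic.
fn all_but_last<T>(items: &mut [T], n: usize) -> &mut [T] {
    let end = items.len().saturating_sub(n);
    &mut items[..end]
}
```

Calling `all_but_last(&mut data, 5)` on an 8-element array yields a mutable slice of the first 3 elements.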

I have a few ideas how to allow such a notation.

The “simplest” one (which cannot work as far as I know, and no one will understand what’s going on, but feel free to prove me wrong):
```
 my_array[1..-5];
 my_array[99..-5]; // very odd thing, especially if compared to my_array[99..0]
```
Add a generic End helper type:
```
 my_array[1..End - 5]
```
This is like the matlab version. The actual type of stop will be derived through the - operator. Requires std::ops::Range to have two type parameters which are comparable. Although this would break consistency of the .. operator since 1..End would actually run to the end, instead of stopping one element before. Maybe this should only work with the (not yet existing for ranges) ... operator. Alternatively (once the inclusive range operator exists), both could be allowed, where 1..End would not include the last element and 1...End would.
Add a generic End newtype:
```
 my_array[1..End(5)]
```
Requires std::ops::Range to have two type parameters which are comparable. I am not sure how intuitively readable this is.
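A rough sketch of how the `End` ideas could be prototyped in library code today, without touching `std::ops::Range`. Everything here is hypothetical: `End`, `FromEnd` and the helper `slice_until` are illustration names, and the helper function stands in for the proposed `my_array[1..End - 5]` syntax.

```rust
use std::ops::Sub;

/// Marker standing for "one past the last element".
#[derive(Clone, Copy)]
struct End;

/// An offset counted back from the end; `End - 5` evaluates to `FromEnd(5)`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FromEnd(usize);

impl Sub<usize> for End {
    type Output = FromEnd;
    fn sub(self, back: usize) -> FromEnd {
        FromEnd(back)
    }
}

/// Hypothetical helper standing in for `s[start..End - n]`.
/// Returns `None` if the bounds do not describe a valid sub-slice.
fn slice_until<T>(s: &[T], start: usize, end: FromEnd) -> Option<&[T]> {
    let stop = s.len().checked_sub(end.0)?;
    s.get(start..stop)
}
```

With an 8-element array `a`, `slice_until(&a, 1, End - 5)` gives the same elements as `&a[1..3]`. The `End(5)` newtype variant corresponds to constructing `FromEnd(5)` directly.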

comments? ideas? reasons this is totally ridiculous?

DanielKeep · January 16, 2015, 2:45pm

D supports something along these lines.

Originally, it had magic support for the length identifier. If used within an index or slice expression, it would read the length property of the array. Later, the $ “operator” was introduced, which actually invoked the opLength method on the thing being sliced. So a[0..$] was rewritten to a[0..a.opLength()].

I feel like the Rustiest solution would be (all names bikesheddable) to first introduce a trait for accessing a thing’s length:

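trait L {
    type A;
    // Returns the length of the thing being indexed.
    fn len(&self) -> Self::A;
}

impl<T> L for [T] {
    type A = usize;
    // Calls the inherent slice method explicitly to avoid recursing into the trait method.
    fn len(&self) -> usize { <[T]>::len(self) }
}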

// ...

Next, introduce a length “operator”. Let’s just use $ for now (although I think # is a slightly nicer choice). Define it such that it walks up the expression tree to the first index expression, then is substituted with a call to L::len on the subject of said index expression.

The alternative to the above is to introduce, into the prelude, some kind of End marker object and define Index overloads that work with that. But that seems like a fairly finicky solution which will require more code for users.

mdinger · January 16, 2015, 11:07pm

Idea 1 is how Python works (search for “Indices may also be negative numbers”); the notation is really nice and probably really popular.

Here’s an old proposal to adopt Python like range syntax by a core developer. He states negative should count from the right as well. So you aren’t the first to want this.

oli-obk · January 20, 2015, 12:07pm

@mdinger: it’s only problematic due to the fact that -100..-42 is a valid range in rust, which would be rather ambiguous

mdinger · January 20, 2015, 3:15pm

@ker: I hadn’t noticed that. Thanks for pointing it out.

rkjnsn · January 20, 2015, 6:39pm

This doesn’t seem ambiguous, to me. v[-100..-42] would return a slice starting at the 100th element from the end and ending with the 43rd element from the end (since the range is exclusive). Similarly, v[-5] could be used to directly access the fifth element from the end.

oli-obk · January 21, 2015, 8:26am

and what would

for i in -100..-42 {
    println!("{}", i);
}

do in your opinion? (playpen does what i think it should: http://is.gd/sqgJhJ)

i still think using negative numbers would create more confusion than it’s worth. rust does lots of things explicitly, why not this, too?
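For reference, a small sketch of what that loop iterates over: `-100..-42` is an ordinary `Range<i32>` value, independent of any indexing. The helper name `range_values` is made up for illustration.

```rust
/// Collects the values produced by iterating a signed range.
fn range_values(r: std::ops::Range<i32>) -> Vec<i32> {
    r.collect()
}
```

`range_values(-100..-42)` contains 58 values, from -100 up to and including -43.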

kennytm · January 21, 2015, 11:00am

I don’t see any ambiguity between for i in -100..-47 and s[-100..-47]. One could simply implement Index as something like this (ignore the syntax details):

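impl<T> Index<Range<isize>> for [T] {
    type Output = [T];
    // Sketch only: coherence rules mean this impl would have to live in the standard library.
    fn index(&self, range: Range<isize>) -> &[T] {
        let len = self.len() as isize;
        // Negative bounds count back from the end.
        let start = if range.start < 0 { range.start + len } else { range.start };
        let end = if range.end < 0 { range.end + len } else { range.end };
        &self[start as usize..end as usize]
    }
}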

That said, I still prefer an explicit length operator to negative indices because it (1) is more explicit, (2) is more flexible, e.g. allowing s[..$/2], and (3) does not require polymorphic indexing yet.
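The negative-index resolution described in this post can be tried out as a free function, since user code cannot add an `Index` impl to `[T]`. This is a sketch under assumptions: `resolve` and `slice_signed` are hypothetical names, and returning `None` for out-of-range or reversed bounds is one possible policy, not something the post specifies.

```rust
use std::ops::Range;

/// Maps a possibly negative index onto `0..=len`; negative values count from the end.
fn resolve(index: isize, len: usize) -> Option<usize> {
    let resolved = if index < 0 {
        len.checked_sub(index.unsigned_abs())?
    } else {
        index as usize
    };
    if resolved <= len { Some(resolved) } else { None }
}

/// Hypothetical stand-in for `s[range]` with Python-style negative bounds.
fn slice_signed<T>(s: &[T], range: Range<isize>) -> Option<&[T]> {
    let start = resolve(range.start, s.len())?;
    let end = resolve(range.end, s.len())?;
    // `get` rejects ranges whose start lies past their end.
    s.get(start..end)
}
```

For a 200-element vector `v`, `slice_signed(&v, -100..-42)` covers the same elements as `&v[100..158]`.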

oli-obk · January 21, 2015, 11:29am

the length might not be known while the end of the range might be, that would speak against a length operator. also in non-indexing ranges the operator makes no sense.

kennytm · January 21, 2015, 12:16pm

Just define a

#[lang="end"]
trait End {
    type Output;
    fn end(&self) -> Self::Output;
}
impl<T> End for [T] {
    type Output = usize;
    fn end(&self) -> usize { self.len() }
}

and then desugar any $ inside an indexing context to a call to End::end():

g(s)[f($)] => { let ref _temp = g(s); _temp.index(f(_temp.end())) }

It is already explained in Daniel’s reply above.

Edit: Just to clarify, the $ above can be replaced by anything, e.g. “End”, in which case it would be the same as OP’s suggestion in syntax, except that the compiler will automatically convert it into a suitable number. We cannot use the symbol $ like D because this is already used for macros.
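The desugaring can be approximated by hand without compiler support. In this sketch the `End` trait is an ordinary trait rather than a lang item, and the closure passed to the hypothetical helper `index_with` plays the role of `f($)`.

```rust
/// Ordinary trait mirroring the `End` lang item proposed above.
trait End {
    type Output;
    fn end(&self) -> Self::Output;
}

impl<T> End for [T] {
    type Output = usize;
    fn end(&self) -> usize {
        self.len()
    }
}

/// Hand-written version of the desugaring: take the subject once,
/// ask it for its end, and pass that value to the index computation `f`.
fn index_with<T, F>(subject: &[T], f: F) -> &[T]
where
    F: FnOnce(usize) -> std::ops::Range<usize>,
{
    &subject[f(subject.end())]
}
```

`index_with(&s, |end| 1..end - 5)` then corresponds to the proposed `s[1..$ - 5]`, and `index_with(&s, |end| 0..end / 2)` to `s[..$/2]`.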

rkjnsn · January 21, 2015, 9:14pm

I’d expect it to behave exactly as it does. I’m not sure where the ambiguity is to which you referred. Given the concept that negative numbers count back from the end, having -100..-42 yield -100 through -43, having v[-100..-42] return a slice containing the 100th element from the end through the 43rd element from the end, having v[-100] return the individual element that is the 100th from the end, and having

for i in -100..-42 {
    println!("{}", v[i]);
}

give the same output as

for x in v[-100..-42].iter() {
    println!("{}", x);
}

all seem perfectly consistent.

comex · January 22, 2015, 12:15am

FWIW, I generally appreciate this functionality in Python, but it sometimes causes problems for me due to the possibility of getting negative integers by accident. For example, a lazy way of extracting a given substring plus all following text from a string would be s[s.find('sub'):] – this works fine if the substring is there, but if it’s not, find returns -1 and the expression silently extracts the last character. (I was ignorant to write code like that, since I should have used the index method, which does the same thing but throws an exception if the substring wasn’t found; but there are other cases too, just not as obvious.)
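For contrast, a minimal sketch of the same "substring plus everything after it" operation in Rust. `str::find` returns an `Option<usize>`, so a missing substring cannot be mistaken for a position. The helper name `tail_from` is made up for illustration.

```rust
/// Returns the text starting at the first occurrence of `needle`, if any.
/// A missing `needle` surfaces as `None` rather than as a sentinel index.
fn tail_from<'a>(haystack: &'a str, needle: &str) -> Option<&'a str> {
    haystack.find(needle).map(|start| &haystack[start..])
}
```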

In Rust, I’d be moderately afraid of large unsigned integers becoming negative after being truncated to a signed type for indexing, though that would depend on the design.
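A small sketch of the truncation concern, assuming an index that arrives as `u64` and is narrowed to `i32`. Both function names are hypothetical. An `as` cast wraps silently, while `TryFrom` reports values that do not fit.

```rust
use std::convert::TryFrom;

/// Narrows with `as`, which keeps only the low 32 bits and may turn
/// a large unsigned value into a negative one.
fn to_signed_lossy(index: u64) -> i32 {
    index as i32
}

/// Checked alternative: refuses values that do not fit in `i32`.
fn to_signed_checked(index: u64) -> Option<i32> {
    i32::try_from(index).ok()
}
```

With negative-means-from-the-end indexing, `to_signed_lossy(4_294_967_295)` would silently become -1, i.e. "the last element".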

phaux · February 7, 2015, 9:20am

The only problem with negative numbers meaning index-from-end would be the inconsistency when operating on a map with signed integer key type:

fn collections(v: Vec<&str>, m: HashMap<i32, &str>) {
    println!("{}", v[-1]); // the last element
    println!("{}", m[-1]); // element whose key is the number -1
}

The other option is to use a special type to describe an index:

enum Index {
    FromBeginning(usize),
    FromEnd(usize),
}

fn function(v: Vec<&str>) {
    v[FromEnd(0)] // last element
}

I’m in favor of using negative numbers unless someone can come up with a syntax sugar for specifying FromBeginning(n) and FromEnd(n). Maybe prefixing a number with # would turn it into a FromEnd variant?
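The enum approach can be prototyped with a lookup function. This is only a sketch: the enum is named `Pos` here (an assumption, to avoid clashing with `std::ops::Index`), `get_at` is a hypothetical helper, and `FromEnd(0)` is taken to mean the last element, as in the example above.

```rust
/// Hypothetical index type describing a position from either end.
#[derive(Clone, Copy, Debug)]
enum Pos {
    FromBeginning(usize),
    FromEnd(usize),
}

/// Looks up an element by `Pos`; returns `None` when out of bounds.
fn get_at<T>(items: &[T], pos: Pos) -> Option<&T> {
    match pos {
        Pos::FromBeginning(n) => items.get(n),
        Pos::FromEnd(n) => {
            // `FromEnd(0)` maps to index `len - 1`.
            let index = items.len().checked_sub(n)?.checked_sub(1)?;
            items.get(index)
        }
    }
}
```

Because the variant is spelled out, a map keyed by signed integers is unaffected: only sequence lookups go through `Pos`.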

bluss · February 7, 2015, 9:38am

Our vector doesn’t even allow sizes greater than isize::MAX if I understand correctly, which makes this all the more tempting. cc @Gankra

Rust as a language does already support the notation and first-class range values to make this happen, so it seems like an API question.

oli-obk · February 9, 2015, 7:31am

wow, i didn’t know that. not that it matters on 64bit machines. might be an issue for regular arrays on 8bit machines. But that could be fixed with a lint.

nagisa · February 9, 2015, 9:19pm

This is an implementation detail which will become a public API if we expose negative indexes.

bluss · February 9, 2015, 9:33pm

It's basically not Rust's choice, it comes from LLVM: (rust bug/discussion)

system · March 25, 2019, 8:23am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
