`...` vs `..=` for inclusive ranges

We’ve currently adopted .. as syntax for exclusive ranges and ... as the syntax for inclusive ranges. In the current implementation, .. is only legal in expressions and ... is only legal in patterns, but that’s expected to change. However, there are some very valid concerns about using the number of dots to distinguish these two cases:

  1. Hard to remember.
  2. We do the opposite of Ruby (because “more dots more numbers” seems strictly better, but still).
  3. Easy to make a typo.
  4. Easy to overlook when proof-reading code.

It is my opinion that inclusive ranges are a relatively rare thing. They are used primarily for match patterns (e.g., match ch { 'f' ... 'x' => ... }) and possibly some edge cases involving integer ranges (e.g., 0...uint::max). But in general they might make code cleaner on occasion so it seems like a fine thing to have.

Swift adopted ..< as the notation for an exclusive range. I don’t like this because I see exclusive ranges as the common case and that syntax is somewhat awkward. However, I find ..= for inclusive ranges to be reasonable – inclusive ranges are uncommon so the fact that it’s a bit strange doesn’t bug me so much. And it is quite clear (or at least obviously distinct from ..).
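
To make the two use cases above concrete, here is a minimal sketch written with the proposed ..= spelling (which current Rust toolchains accept). The function names classify and count_all_bytes are made up for illustration, and u8 stands in for the uint in the example above.

```rust
/// Classify a character using inclusive range patterns.
fn classify(ch: char) -> &'static str {
    match ch {
        'a'..='z' => "lowercase",
        'A'..='Z' => "uppercase",
        '0'..='9' => "digit",
        _ => "other",
    }
}

/// Count every value of a `u8`, including `u8::MAX` itself.
/// An exclusive `0..u8::MAX` would stop one short, at 254.
fn count_all_bytes() -> usize {
    (0..=u8::MAX).count()
}
```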

Normally I’m pretty opposed to arbitrary syntactic changes at this point, but given the fact that this syntax is relatively young and seems to have some relatively serious downsides beyond aesthetic preference, I’m inclined to consider a change. I’d like to get a feeling for what other people think, though.

(If I can ask a favor, please restrict your comments to the specific question at hand. I would prefer not to have this thread derailed with a generic discussion of whether .. or ... is preferable in other cases (e.g. patterns), nor the question of whether .. in patterns ought to destructure a range vs matching a range. Those seem like independent things to consider.)

Although it might be way too late for this, I’ll propose having only .. and having it represent an inclusive range. Using .. for inclusive ranges covers strictly more cases than any notation of exclusive ranges. For example:

  • the current (exclusive) syntax does not allow making an array that contains {i,u}*::MAX as its last element;
  • ∀x x..x makes zero sense as it has no elements whatsoever.

Otherwise I’m for ..= because it is easier to notice than ....

+1 for having syntax for inclusive ranges in expressions, instead of using an adaptor a..b.inclusive() (which I think is too long for its own good, and people will tend to write a..b+1 even if it overflows the upper limit).

And a..=b looks good IMO, and it’s easy to distinguish it from a..b when reading code.
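
A small sketch of the a..b+1 hazard mentioned above, assuming u8 bounds so the overflow is easy to hit. The helper names exclusive_upper and sum_through are hypothetical.

```rust
/// The exclusive upper bound that `a..b+1` would need.
/// Returns `None` when `b + 1` does not fit in the type.
fn exclusive_upper(b: u8) -> Option<u8> {
    b.checked_add(1)
}

/// Sum 0 through `b` inclusive, with no `+ 1` on the bound.
fn sum_through(b: u8) -> u32 {
    (0..=b).map(u32::from).sum()
}
```

With b = u8::MAX there is no representable exclusive bound at all, while the inclusive form still covers the whole range.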

I like this syntax. Better visibility is a good argument and I find the correspondence between “=” and “inclusion” quite intuitive.

Any ambiguity / operator precedence problems with binary, unary, nullary forms of this notation (begin ..= end, begin..=, ..=end, just ..=)? ..=end looks like an assignment to full (exclusive) range literal.
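
For reference, a sketch of how the binary and prefix forms behave on current Rust toolchains, where they are ordinary range expressions. As far as I know the suffix form (begin..=) and the bare ..= are not accepted there, so they are left out. The function names are only for illustration, and this does not settle the precedence question.

```rust
use std::ops::{RangeInclusive, RangeToInclusive};

/// Binary form: both bounds present, end included.
fn binary_form() -> RangeInclusive<usize> {
    1..=3
}

/// Prefix form: only the (included) end is given.
fn prefix_form() -> RangeToInclusive<usize> {
    ..=3
}

/// Use both forms to slice the same data.
fn demo_slices(v: &[i32]) -> (&[i32], &[i32]) {
    (&v[binary_form()], &v[prefix_form()])
}
```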

I favour …= if only because it sets a precedent of >… and >.=, which I believe we will want in the long term for range queries (having both of these with a Range trait, we could support range queries more cleanly).

-1 for me.

  1. I disagree that it is hard to remember. Quite the opposite: the rule “more dots, more numbers” works just fine.
  2. Considering the syntax of a family of languages may be relevant, but I don’t think considering a single language is.
  3. I don’t think it is easier to make a typo than with +/- or </>, and IMO it is easier to notice with a fixed-width font.

Point 4 seems valid to me, but the equals sign is too disturbing for me. When I see a..=b, it’s hard to convince my brain I am not facing an assignment or a comparison.

I agree with the proposition that having distinct symbols is important. I disagree with #1, and don’t care about #2 (it’s so obviously the wrong behaviour, someone should be flogged for that). #3 I’m fairly unconvinced by. #4 is the one I think is the most important, and the best reason for doing this.

Summary of meandering: However, ..= feels ugly and semantically confusing: it looks like it should mean something else. Sadly, the only vaguely viable alternatives I can see are ..~ and ..>. I lean slightly toward “yes” to making a change of some sort, but hope that ..= is only used if nothing better can be found.

Disjointed meandering: …but I just can’t convince myself that ..= isn’t hideously ugly. That, and every time I look at it, my brain just refuses to associate it with “inclusive range”. My first reaction is “it’s like += for an exclusive range, wait, how does that make any sense?” My second reaction was “so it’s an assignment to a captured group, a new ..ident pattern?” I just can’t see anything other than a tenuous association between its appearance and its proposed semantics.

I tried briefly playing with some code, and I’m also not convinced that .. and ... are as easy to confuse as suggested. To be clear, I’m not saying they are clearly distinct… but I also don’t perceive it to be that bad.

What about ..~ instead? The ~ doesn’t have any extant meaning in Rust (any more), so it can’t be confused for something else… unless the programmer is coming from another C-derived language. 1..~10 meaning “from 1 to bitwise complement of 10, inclusive” is a bit of a stretch, though. I concede that it’s a bit of a wash.

The only other possibility I can see from staring at my keyboard, which has a kind of tangential precedent via Swift, would be ..> for inclusive, as an inclusive range goes “past the end” of an exclusive one.

I just can’t like the proposed syntax. I’m insufficiently convinced that 1..9 and 1...9 are similar enough to cause significant enough issues. But at the same time, I do agree that they’re a bit close for comfort. I also have to concede that aesthetics shouldn’t stand in the way of clear reading. If the consensus is that a change has to be made (let’s call me 60% agreed), I can only plead for a different syntax.

What if we changed to <.< for [a,b), >.< for (a,b) and >.> for (a,b]? We could call them the "shifty look" operators. <.> could be the "trying to look at a magic eye" operator. :3

Other languages (Ada, Matlab) have solved this issue by having only a single syntax for ranges (inclusive) and starting their indexing at 1. By no means do I support this; I like my zeros. But these languages have another feature that makes inclusive ranges actually usable.

In my opinion, that is a bias introduced by the fact that there is no last_idx() function and only a len() function. I find 0 ... vec.last_idx() much more intuitive than 0 .. vec.len() (I always get the feeling I'm running one over the length of the array).

Ok, back to objectivity: I did a search for [^\.]\.\.[^\.\}\]\)] in the rust repos and went through the first 25% of the results (skipping all tests and benchmarks which just had random numbers in there).

Places where inclusive and exclusive ranges would result in the same effort (and subjective readability)

libcollections\bits.rs:323
libcollections\dlist.rs:595
libcollections\dlist.rs:602
libcollections\ring_buf.rs:561 -> just a matter of how head/tail are used
libcollections\ring_buf.rs:584 -> just a matter of how head/tail are used
libcollections\str.rs:147 -> change iteration together with line 149
libcore\iter.rs:1517
libcore\iter.rs:1519

Places where inclusive ranges would be better

libcollections\btree\node.rs:1553
libcollections\ring_buf.rs:486 -> write as 1..len
libcollections\vec_map.rs:492
libcore\fmt\float.rs:245

Places where exclusive ranges are better

libcollections\bits.rs:323
libcollections\bits.rs:356
libcollections\bits.rs:663
libcollections\bits.rs:833
libcollections\btree\node.rs:1555
libcollections\btree\node.rs:1556
libcollections\string.rs:175, 192 -> that function could be prettier
libcollections\vec.rs:609
libcore\fmt\float.rs:182
libcore\fmt\float.rs:335
libcore\fmt\mod.rs:488
libcore\fmt\mod.rs:597
libcore\fmt\mod.rs:599 -> better would be repeat
libcore\fmt\mod.rs:605 -> better would be repeat
libcore\fmt\mod.rs:721
libcore\iter.rs:727
libcore\slice.rs:134
libcore\slice.rs:215
libcore\slice.rs:312
libcore\slice.rs:404
libcore\slice.rs:483
libcore\slice.rs:484
libcore\slice.rs:485
libcore\slice.rs:958
libcore\slice.rs:985
libcore\slice.rs:1181
libcore\slice.rs:1272
libcore\str\mod.rs:771
libcore\str\mod.rs:819
libcore\str\mod.rs:831
libcore\str\mod.rs:972
libcore\str\mod.rs:978
libfmt_macros\lib.rs:246
libfmt_macros\lib.rs:251
libfmt_macros\lib.rs:412

Why would you ever do &x[..0]? I'm sure there's a better way to create an empty slice with the proper lifetime (isn't there something like &x[]?).

libfmt_macros\lib.rs:291
libfmt_macros\lib.rs:400

So... overwhelming number of situations where exclusive slices are "better" (in the current way stuff is done idiomatically in rust). Most of these situations are x .. v.len().

My suggestion stands: stop using the length of anything to get an index, and instead have a function returning the index of the last item in a sequence. Then inclusive ranges can replace a lot of exclusive ones.

Maybe it’s the font here but it doesn’t look that strange with proper spacing.

match c {
    'a' ... 'z' | 'A' ... 'Z' | '0' ... '9' => { ... }
    'a'...'z'|'A'...'Z'|'0'...'9' => { ... }

    'a' ..= 'z' | 'A' ..= 'Z' | '0' ..= '9' => { ... }
    'a'..='z'|'A'..='Z'|'0'..='9' => { ... }
}

Or perhaps use some syntax other than dot-dot-something?

match c {
    'a' till 'z' | 'A' till 'Z' | '0' till '9' => { ... }
    'a'till'z'|'A'till'Z'|'0'till'9' => { ... }

    'a' ~~ 'z' | 'A' ~~ 'Z' | '0' ~~ '9' => { ... }
    'a'~~'z'|'A'~~'Z'|'0'~~'9' => { ... }

    'a' ~ 'z' | 'A' ~ 'Z' | '0' ~ '9' => { ... }
    'a'~'z'|'A'~'Z'|'0'~'9' => { ... }
}

The problem with “last index” is you need conditional logic to handle empty ranges. Similarly inclusive range iterators require an extra flag to know if you’ve yielded the “included” element. Exclusive ranges are simply more natural for what we use ranges for.
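
Both points can be sketched in a few lines. Everything here is hypothetical: last_idx is the helper suggested earlier in the thread (not an existing method), and InclusiveU8 is a hand-rolled iterator, not the standard library's implementation.

```rust
/// Hypothetical helper: index of the last element.
/// An empty slice has no last index, so callers must handle `None`.
fn last_idx<T>(items: &[T]) -> Option<usize> {
    items.len().checked_sub(1)
}

/// Hand-rolled inclusive range over `u8`.
/// `done` is the extra flag: once `last` has been yielded we cannot
/// signal completion by stepping past it, because `last` may be `u8::MAX`.
struct InclusiveU8 {
    next: u8,
    last: u8,
    done: bool,
}

fn inclusive(first: u8, last: u8) -> InclusiveU8 {
    // A range whose start is past its end is empty from the outset.
    InclusiveU8 { next: first, last, done: first > last }
}

impl Iterator for InclusiveU8 {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        if self.done {
            return None;
        }
        let value = self.next;
        if value == self.last {
            self.done = true; // the included end has now been yielded
        } else {
            self.next += 1; // safe: value < last <= u8::MAX
        }
        Some(value)
    }
}
```

An exclusive range needs neither the Option nor the flag: the empty case is just start == end, and iteration stops when the counter reaches the end.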

I would consider a keyword. I also considered -- (from Latex ;). I hadn’t considered ~~. To the extent there is any precedent, I would say that ~ has been used as a “regular expression match” operator in other languages.

I am sympathetic with your position here. I also agree that .. vs ... is not the end of the world (that's why I didn't object to it in the first place).

Right... although I'm sure those are not the common situations.

For integers a..b starts with a-1 (wrapping and stuff) and ends at b. It's just an implementation detail. For all other types I agree.

That's why I did my survey of rust code before posting...

Some bikeshedding. I hope this doesn't cause the same issues as unmatched square or round brackets; otherwise the Swift range won't work anyway. It might cause trouble if a space is added between the dots and the bracket...

(a, b) => a..b
[a, b) => a>..b
[a, b] => a>..<b
(a, b] => a..<b

Another use for inclusive ranges is when you’re doing things starting from 1 and ending with a number.

Think: factorial, counting things, numbering things, etc. Unfortunately mathematicians realized that counting from 0 makes sense fairly recently.

I don’t mind it being fairly verbose. That said, I don’t think ... would be a huge problem either. Now that I’ve heard the reasoning “one more dot so one more number”, I’m pretty sold on it.

The other alternatives are really ugly, though. If I had to ask myself “What don’t I hate immediately?” I would say:

  1. a...b
  2. a--b but this might be confusing since a-- looks like it’s being decremented
  3. a till b but why is a..b exclusive and a till b inclusive? I mean why not a until b and a till b then? But then the exclusive one is longer…

I’d rather just go with ../... unless someone comes up with something better than ..=

Here is Dijkstra’s argument for why you should have an exclusive range notation: http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html

note that things starting from 0 are much more common than things ending in int::MAX, starting ranges from -1 when you want to exclude the first element to ACTUALLY start with 0 is madness

if you want to match on values then you really want to do something like

match i {
    0..10 => foo(),
    10..20 => bar(),
    20..30 => baz(),
}

notice how if you had INCLUSIVE ranges you’d have to manually adjust this to

match i {
    0...9 => foo(),
    10...19 => bar(),
    20...29 => baz(),
}

it’s harder to see that we didn’t miss any values because you have to check that 9 is followed by 10, 19 is followed by 20, etc., whereas in the other example we can visually match 10 with 10 and 20 with 20, knowing that these are exclusive ranges. also, it looks like 0...9 is nine elements when it’s actually ten

so it’s a pretty good reason to prefer exclusive ranges in most cases but then you still need inclusive ranges for the other cases like 'a'...'z'
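
here is a rough, compilable sketch of the tiling argument. whether exclusive ranges are accepted directly as match patterns depends on the compiler version, so the exclusive variant below uses Range::contains instead; the bucket names are just the placeholders from the examples above.

```rust
/// Inclusive tiling: each arm's end must be one less than the next start.
fn bucket_inclusive(i: u32) -> Option<&'static str> {
    match i {
        0..=9 => Some("foo"),
        10..=19 => Some("bar"),
        20..=29 => Some("baz"),
        _ => None,
    }
}

/// Exclusive tiling: each range's end is literally the next range's start.
fn bucket_exclusive(i: u32) -> Option<&'static str> {
    if (0..10).contains(&i) {
        Some("foo")
    } else if (10..20).contains(&i) {
        Some("bar")
    } else if (20..30).contains(&i) {
        Some("baz")
    } else {
        None
    }
}
```

both functions cover the same values; the difference is only in how easy it is to check the boundaries by eye.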

I consider ~ to be a symbol for approximation so it could fit with exclusion (a .~ b not exactly to b). The theoretical most common case would lose the ...

(a, b) => a ~~ b
[a, b) => a .~ b // The common case
[a, b] => a .. b
(a, b] => a ~. b

I like the current .. and ... notation. The syntax 1..=10 doesn’t even look like a range literal to me.

What about ..^ meaning “up to”? That would require some weird xor overloading though.

I really liked this passage from the link:

Which really urges us to consider harder whether we really must have any extra syntax for ranges at all.

How about changing ranges to not be of type Range<X, X> but of type Range<Ending<X>, Ending<X>> where

enum Ending<T> {
    Inclusive(T),
    Exclusive(T),
}

Obviously writing stuff by hand would be more verbose, but most of the cases are things like len() or modifications thereof. In the few cases where you need to take care of inclusiveness or exclusiveness, verbosity definitely does not hurt.

let v = [99, 42, 33, 5, 88...];
// literals get converted to Inclusive(lit)
for i in 0..Exclusive(3) {
    // do stuff for the first 3 elements
}
for i in 3..v.end() {
    // v.end() returns Exclusive(v.len())
}

or would that just end up being the function-mess we had before range syntax? It would at least solve the “+1” issue, as it would increase clarity
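
To get a feel for it, here is a rough sketch of the idea that compiles today without any new syntax. It is only an approximation under my own assumptions: EndedRange, indices and sum_in are made-up names, the bounds are fixed to usize instead of a generic X, and the range is built as a plain struct rather than with a .. literal.

```rust
/// One end of a range, tagged with whether the bound itself is included.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Ending {
    Inclusive(usize),
    Exclusive(usize),
}

/// Hypothetical stand-in for the proposed Range<Ending<X>, Ending<X>>.
#[derive(Clone, Copy, Debug, PartialEq)]
struct EndedRange {
    start: Ending,
    end: Ending,
}

impl EndedRange {
    /// Lower to an ordinary half-open range of indices.
    /// Caveat: the `+ 1` steps overflow if a bound is `usize::MAX`.
    fn indices(self) -> std::ops::Range<usize> {
        let first = match self.start {
            Ending::Inclusive(s) => s,
            Ending::Exclusive(s) => s + 1,
        };
        let past_end = match self.end {
            Ending::Inclusive(e) => e + 1,
            Ending::Exclusive(e) => e,
        };
        first..past_end
    }
}

/// Sum the elements of `v` selected by `range`.
fn sum_in(v: &[i32], range: EndedRange) -> i32 {
    v[range.indices()].iter().sum()
}
```

The "+1" does not disappear, but it moves into one place instead of being repeated at every call site.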
