Range syntax is confusing

Regarding 0..256 as a range of numbers from 0 to 255.

That is an unfortunate choice of syntax. When used in casual discussion it will either need to be qualified every time or be just a little wrong every time. And it adds to the jargon and specialised usage: "oh, it's just how Rust defines ranges". Ouch. Not so fun times ahead if I start teaching Rust.

This forum software is painful: it is not clear which replies go with which comment. Another "ouch" moment. I feel like I'm walking across a grassy field covered in rakes left on the ground ready to slap me in the face any direction I move.

Do note that Python’s range generator is the same: range(0, 256) runs from 0 inclusive to 256 exclusive, and C#'s ranges work the same way. So there is precedent in other languages. Rust also provides the 0..=255 syntax, which includes both bounds.

14 Likes

It’s not “casual” though. The ellipsis would be, maybe. But the actual Rust syntax, as well as the syntax used throughout Ralf Jung’s blog post, is two dots.

Apart from this fact, basically every mainstream language defaults to half-open ranges, because their superior mathematical properties make them much more convenient and less error-prone to use than closed ranges. E.g. the length of begin..end is exactly end - begin, as opposed to end - begin + 1 for a closed interval; splitting a half-open interval into two disjoint intervals in the middle is as easy as begin..mid and mid..end, as opposed to having to manually add or subtract 1s at the appropriate places. This removes several opportunities for an off-by-one error to sneak in.
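To make the arithmetic above concrete, here is a minimal sketch. The helper names half_open_len and split_at are made up for illustration and are not part of the standard library, and the sketch assumes start <= mid <= end.

```rust
use std::ops::Range;

/// Length of a half-open range: simply end - start, no "+ 1".
/// (Hypothetical helper; assumes r.start <= r.end.)
fn half_open_len(r: &Range<usize>) -> usize {
    r.end - r.start
}

/// Split a half-open range at `mid` into two adjacent, disjoint halves.
/// (Hypothetical helper; assumes r.start <= mid <= r.end.)
fn split_at(r: &Range<usize>, mid: usize) -> (Range<usize>, Range<usize>) {
    (r.start..mid, mid..r.end)
}

fn main() {
    let r = 3..10;
    assert_eq!(half_open_len(&r), 7);
    assert_eq!(r.clone().count(), 7);

    // The same `mid` is the end of the left half and the start of the right half.
    let (left, right) = split_at(&r, 6);
    assert_eq!(left, 3..6);
    assert_eq!(right, 6..10);
    assert_eq!(half_open_len(&left) + half_open_len(&right), half_open_len(&r));

    // The closed range over the same numbers needs the "+ 1" for its length.
    assert_eq!((3..=9).count(), 9 - 3 + 1);
}
```

With closed ranges the same split would be 3..=5 and 6..=9, which is where the manual adding and subtracting of 1 comes in.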


Please, let’s not repaint the bike shed every few weeks.

22 Likes

The motivation, of course, comes from the fact that traditional mathematical ranges are usually [1..N], but C and similar languages use zero-origin indexing. Rather than having to always write [0..N-1], Rust and other languages extend the .. syntax to permit explicit specification of both an inclusive lower bound and an exclusive upper bound. Rust takes this further by permitting elision of either or both bounds when they coincide with the ends of the sequence being indexed. Thus a vector slice with subscript range [0..vec.len() - 1] inclusive can be expressed in Rust as vec[0..vec.len()] or vec[..vec.len()] or vec[0..] or simply vec[..].
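A small sketch of those equivalent spellings, in case it helps. The function name whole_slice_forms is invented for this example.

```rust
/// Returns the four spellings of "the whole slice" mentioned above.
/// (Hypothetical helper, for illustration only.)
fn whole_slice_forms(v: &[i32]) -> [&[i32]; 4] {
    [&v[0..v.len()], &v[..v.len()], &v[0..], &v[..]]
}

fn main() {
    let v = vec![10, 20, 30, 40];

    // All four forms select every element.
    for s in whole_slice_forms(&v) {
        assert_eq!(s, &v[..]);
    }

    // The inclusive spelling needs the explicit "- 1"
    // (and would panic on an empty vector, since 0 - 1 underflows).
    assert_eq!(&v[0..=v.len() - 1], &v[..]);
}
```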

3 Likes

Anecdata: in Kotlin, 0 .. 10 actually means inclusive range, and that still confuses me from time to time. Additionally, the much more common exclusive range is spelled in a more verbose way, 0 until 10, and that is also annoying.

Note, however, that inclusive ranges are more general than exclusive ones: 0u8..=255u8 exists, while 0u8..256u8 doesn’t.
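A quick sketch of that point: the inclusive form can cover every u8 value, while the widest half-open range you can write with u8 endpoints stops one short. The function name count_all_u8 is made up for the example.

```rust
/// Counts every value of type u8 using an inclusive range.
/// (Hypothetical helper, for illustration only.)
fn count_all_u8() -> usize {
    (0u8..=255u8).count()
}

fn main() {
    // All 256 values of u8 are reachable with the inclusive form.
    assert_eq!(count_all_u8(), 256);
    assert_eq!((u8::MIN..=u8::MAX).last(), Some(255));

    // The largest half-open range with u8 endpoints misses 255,
    // because the exclusive upper bound 256 does not fit in a u8.
    assert_eq!((0u8..255u8).count(), 255);
    assert_eq!((0u8..255u8).last(), Some(254));
}
```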

4 Likes

This is how Rust defines range syntax, and when I see it used in a Rust forum or a blog post about Rust I know what it does.

When someone uses this syntax in casual discussion and I don’t know the context I ask. Chances are they are not talking about Rust, and then the whole point here is moot.

Ouch. Not so fun times ahead if I start teaching Rust.

How is teaching this hard? I’ve taught this, and I never had to explain anything beyond "a..b is a half-open range, and a..=b is a closed range". There is nothing to understand or debate here - this is just syntax, doesn’t even deserve its own slide, at most a footnote the first time you use it. If someone wants to know why, you can point them at the RFCs, but that has never happened to me when teaching this. People just learn it, and move on to more important stuff.

16 Likes

Minor note to those here: this potential confusion point is worsened by a lot of software, including Discourse, because a literal low..high in the Markdown post source renders as

low…high

with the full ellipsis. Three dots were also used for inclusive ranges in Rust patterns, and Swift uses ... for closed and ..< for half-open ranges.

Context is important for interpreting range syntax. It doesn’t help that English is ambiguous: working 9-5 is half open (you’re done at five, not still working through all of five), a budget of $200-$300 is usually closed; I think the pattern is for discrete versus continuous measurement?

There’s a reason that half-open range syntax and off-by-one errors are such a huge deal in introductory coding classes. You have to choose one way or the other, and it’s a toss-up which one people will expect.

Even mathematical syntax of [0, 10) (or is it [0…10[, or…) requires introduction to a reader that hasn’t seen it before. If the context is clear, e.g. when talking about a programming language, use the clear range syntax and semantics. In any other case, and maybe even for introductory articles where it might be clear, introduce the range syntax with a footnote or sidenote or such to clarify what it means. There’s no perfect solution because everyone brings their own expectations.

10 Likes

They are [1, N] actually. No .. . At least that’s what I have seen. :wink:

OMG I had no idea. Why would the forum software (or markdown renderer in general) insert a dot?!?

3 Likes

It's one of many typographical tweaks that Discourse tries to do:

Please consider turning that off. It’s painful in a forum that regularly discusses code.

(Yes, people can avoid it by writing code blocks in backquotes, but sometimes people forget to do that in running text.)

5 Likes

I can’t, but @carols10cents or @erlend_sh can?

To be frank, I have never understood why the range syntax even exists.

At least to me, it feels like these well-known issues could have been avoided by only providing plain functions with good names instead of adding special syntax to the language.

Range syntax has multiple uses, including slicing and pattern matching.

7 Likes

A good case for more functions with good names, in my opinion.

I have to admit I do like the emdash part—as someone who regularly uses those dashes in text.

But turning two dots (..) into a three-dot ellipsis seems like a bad idea indeed in a forum where code is frequently the subject of discussion.

Ranges can also be used as match patterns, which is not (currently) possible with functions.
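For anyone who hasn't seen it, a minimal sketch of ranges in match patterns. The function classify and its categories are invented for illustration; only inclusive range patterns are used here, since those work on every reasonably recent compiler.

```rust
/// Classifies a byte using range patterns.
/// (Hypothetical example; the categories are arbitrary.)
fn classify(byte: u8) -> &'static str {
    match byte {
        0..=31 | 127 => "control",
        // Arms are tried in order, so digits are picked out
        // before the wider printable range below.
        b'0'..=b'9' => "digit",
        32..=126 => "other printable",
        128..=255 => "non-ASCII",
    }
}

fn main() {
    assert_eq!(classify(b'\n'), "control");
    assert_eq!(classify(b'7'), "digit");
    assert_eq!(classify(b'a'), "other printable");
    assert_eq!(classify(200), "non-ASCII");
}
```

Note that there is no catch-all arm: the compiler can see that the ranges together cover every u8 value.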

5 Likes

I would have no problem with Markdown replacing three adjacent periods ... with the unicode horizontal ellipsis character … (U+2026), but I have a big problem with Markdown rewriting two adjacent periods .. into either form of horizontal ellipsis.

9 Likes

You know, we might be the first group who’s actually complained about the .. -> … rewrite. Most people wouldn’t care, even actual programmers since .. isn’t valid syntax in a lot of languages. Not to mention, I’ve complained about --help being turned into –help.

3 Likes

FWIW, the most common mistake you could make is caught and receives a suggestion:

error: range endpoint is out of range for `u8`
 --> src/main.rs:2:13
  |
2 |     let x = 0..256u8;
  |             ^^^^^^^^ help: use an inclusive range instead: `0..=255u8`
  |
  = note: #[deny(overflowing_literals)] on by default

5 Likes

You could replace them with match guards.
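A rough sketch of what that replacement looks like, side by side with the range-pattern version. Both function names and the bucket labels are made up for the comparison.

```rust
/// Buckets a number using range patterns.
/// (Hypothetical example.)
fn bucket_with_ranges(n: u32) -> &'static str {
    match n {
        0..=9 => "small",
        10..=99 => "medium",
        _ => "large",
    }
}

/// The same buckets written with match guards instead of range patterns.
/// (Hypothetical example.)
fn bucket_with_guards(n: u32) -> &'static str {
    match n {
        x if x <= 9 => "small",
        x if x <= 99 => "medium",
        // Guards are not considered by the exhaustiveness check,
        // so a catch-all arm is required here.
        _ => "large",
    }
}

fn main() {
    for n in [0, 9, 10, 99, 100, 1_000] {
        assert_eq!(bucket_with_ranges(n), bucket_with_guards(n));
    }
}
```

One design difference worth noting: the compiler can check range patterns for exhaustiveness, whereas guarded arms are opaque to that check.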

I don’t think the special range syntax is pulling its weight here, or elsewhere – like in the examples here: https://doc.rust-lang.org/core/ops/struct.Range.html

assert_eq!(arr[ ..  ], [0,1,2,3,4]);
assert_eq!(arr[ .. 3], [0,1,2    ]);
assert_eq!(arr[ ..=3], [0,1,2,3  ]);
assert_eq!(arr[1..  ], [  1,2,3,4]);
assert_eq!(arr[1.. 3], [  1,2    ]);
assert_eq!(arr[1..=3], [  1,2,3  ]);

If I had projects where I was concerned that people would write code like this, I’d ban these constructs on day one; thankfully I haven’t had this issue so far.

Not strictly true. If the type you’re matching over has a nontrivial destructor (i.e. isn’t Copy) you can’t use match guards as that would require moving/consuming the value to test the guard. Example.