`Iterator::nth` is badly named

TL;DR: from the documentation of Iterator::nth itself: "nth(0) returns the first value, nth(1) the second, and so on". If this is not a foot-shooter, what is?

As we know, Rust consistently uses 0-based indexing in arrays, slices and iterators. While many beginners struggle with this at first, there are very good reasons to use 0-based indexing, and even better reasons to be consistent about it.

Having a method for skipping to an arbitrary element of an iterator is a good idea. And having this method use the same convention as other methods, 0-based indexing, is also good. But the name of this method should not give the impression that it uses a different convention (even if the documentation is clear about it).

Some people have already objected to me that developers are used to counting from zero (with which I agree), and that the first ("1th") element comes after the 0th element. I disagree with this last point, and apparently so do the designers of the standard library: see the documentation of Iterator::nth quoted above, and the fact that s.first() returns s[0], not s[1].
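For concreteness, here is a minimal sketch of the behaviour under discussion. It uses only standard-library calls; the helper names `nth_vs_first` and `nth_twice` are made up for this illustration.

```rust
/// Returns what `nth(0)`, `first()` and `[0]` give for the same slice.
/// The helper name is hypothetical; only the std calls are real.
fn nth_vs_first(s: &[char]) -> (Option<char>, Option<char>, char) {
    let by_nth = s.iter().copied().nth(0); // zero-based: 0 selects the first element
    let by_first = s.first().copied(); // ordinal name: "first"
    let by_index = s[0]; // zero-based indexing (panics on an empty slice)
    (by_nth, by_first, by_index)
}

/// `nth` also consumes the elements it skips, so repeated calls keep advancing.
fn nth_twice(s: &[char]) -> (Option<char>, Option<char>) {
    let mut it = s.iter().copied();
    (it.nth(0), it.nth(0))
}
```

All three expressions in `nth_vs_first` select the same element. `nth_twice` shows a second thing the name does not convey: calling `nth(0)` twice on the same iterator yields two different elements.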

My ideal solution would be to eventually deprecate nth and replace it with an identical method having a less confusing name, e.g. next_after.

3 Likes

It's called "zero-based indexing", and it's pervasive in most major programming languages. It's the same with slice/array indexing: a[0] returns the "first" element.

…the reason being? The point of 0-based indexing is that it's consistent with pointer arithmetic. Especially in a language that allows raw pointer manipulation, internal discrepancy between iterators' and pointers' convention would be a major footgun.

Likely Not Gonna Happen™. It's used all over the place.

2 Likes

With all due respect, did you read the whole of my post? I don't suggest that 0-based indexing is a bad thing, nor that discrepancy should be introduced. Quite the opposite.

As for the reasons why "I disagree with this", I believe I provided two arguments just after that sentence. Actually, as we agree that discrepancies are hurtful, I argue that there is a naming discrepancy between slice::first() (for example), and Iterator::nth() -- assuming that "first" is equivalent to "1th".

Finally:

I know it would be disruptive, and so there is no perfect solution at this point. My goal was to open the discussion and have the community weigh the pros and cons.

5 Likes

I surely did, and I'm sorry if I misrepresented you. I thought you were arguing against 0-based indexing (at least, but not necessarily only, in nth()) based on this part:

I agree that there is something not quite intuitive about .nth(i). I know the rule well, but it doesn't feel "native" to reason about it for some reason. I'm not sure it rises to the level that it is worth a deprecation (I'm generally against deprecating stable interfaces unless it is an actively harmful feature - harmful to safety of code). A do-over is expensive and we can't spend it on frivolous stuff.

One name I could imagine would be something like .next_index() or .next_by_index(), i.e. to explicitly say "index" to invoke that semantic.

7 Likes

The last sentence you quote was indeed ambiguous, apologies for that. I edited it to make it (hopefully) clearer.

If we're bikeshedding names, I like skip. But deprecating nth seems unlikely.

4 Likes

For what it's worth I've been thrown by this a few times myself. Back-compatibility wise we can't change that method, but something one-indexed would be nice to have.

My suggestion is not to change it, but to create an identical method with another name, then deprecate nth in favour of the other. Linters would then suggest to replace occurrences of nth by the other method, which would be a no-brainer change.

That is not what I was suggesting. I sympathize strongly with the argument made by @H2CO3 (and others) that discrepancy in this regard would be hurtful.

2 Likes

Could we perhaps improve this from the other side? The documentation could perhaps read:

nth(0) returns the zeroth value, nth(1) the first, and so on.

or even more clearly

For an Iterator which yields "a" then "b" then "c", nth(0) returns "a", nth(1) returns "b", and so on.

I think this is the really unfortunate one - this is the one I'd be trying to change (maybe to front/back or zeroth or nth(0)). Defining where "n" starts is relatively easy to do in prose - giving the context for what the word "first" means is more awkward, given it has a more natural meaning.

7 Likes

skip is already taken.

3 Likes

There is the added complication of "renaming" a trait method. Iterator is widely implemented. nth is optional to implement - but it's recommended to do so in certain situations. So what can libcore do, if it wants to add a differently named alias for this method?

From my quick view it seems like it would be in the user's interest to always call the new method. Iterator nth would default to calling the new method, and the new method would have the actual implementation. (Which way this goes needs to take into account that the method is introduced as unstable, and then stable users can't call or implement the new method.) Those that implement the method would default to implementing nth to give users the best coverage (this decision depends on what libcore decides for its default methods). None of this will be obvious to users or implementors; it can be documented, but it's a small complication.
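A rough sketch of that forwarding arrangement, on a stand-in trait rather than the real `Iterator`. Everything here is an assumption for illustration: `MiniIterator` and `Counter` are invented, and `next_by_index` is only one of the names floated in this thread, not an existing std method.

```rust
/// A stand-in for `Iterator`, reduced to the methods that matter here.
/// `next_by_index` is a hypothetical name from this thread, not a real std method.
trait MiniIterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;

    /// The new name carries the actual default implementation.
    fn next_by_index(&mut self, n: usize) -> Option<Self::Item> {
        for _ in 0..n {
            self.next()?; // stop early if the iterator runs out
        }
        self.next()
    }

    /// The old name forwards to the new one by default.
    fn nth(&mut self, n: usize) -> Option<Self::Item> {
        self.next_by_index(n)
    }
}

/// Counts upward from `start`; overrides only the new method with an O(1) jump.
struct Counter {
    start: u64,
}

impl MiniIterator for Counter {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        let v = self.start;
        self.start += 1;
        Some(v)
    }

    fn next_by_index(&mut self, n: usize) -> Option<u64> {
        self.start += n as u64; // skip n elements without stepping through them
        self.next()
    }
}
```

In this direction, callers of the old `nth` still get the fast path when an implementor overrides only `next_by_index`. An implementor that overrides only `nth` leaves `next_by_index` on the slow default, which is the kind of asymmetry that would have to be documented.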

4 Likes

So you could say that .skip(i).next() is an existing and more intuitive way to do the same thing as .nth(i), maybe? With underlying implementation differences but the same returned value.
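A small sketch of that equivalence, assuming fresh iterators over the same data (the helper name `nth_and_skip_next` is invented for the example):

```rust
/// Compares `nth(i)` with `skip(i).next()` on fresh iterators over the same data.
fn nth_and_skip_next(data: &[i32], i: usize) -> (Option<i32>, Option<i32>) {
    let via_nth = data.iter().copied().nth(i);
    let via_skip = data.iter().copied().skip(i).next();
    (via_nth, via_skip)
}
```

One practical difference: `skip` takes the iterator by value, so to keep using the same iterator afterwards you would write something like `it.by_ref().skip(i).next()`, whereas `nth` works through `&mut self` directly.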

4 Likes

I personally don't see nth as badly named. Or at the very least, not badly named enough to warrant the churn and complexity of a deprecation.

35 Likes

I agree with @burntsushi.

If this were so confusing it was leading to real-world bugs in people's code, I think it might be worth considering. But is it?

Personally I never even considered these names might be confusing for people. All of them did exactly what I expected, despite a debatable inconsistency between first() and nth(0).

If there is going to be any deprecation, I'd probably suggest renaming first() and last() to something like head()/tail().

5 Likes

To add to the potential churn, there are also nth_back in DoubleEndedIterator (nth_back(0) == last()), as well as slice::select_nth_unstable and its variants.
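A short sketch of how those related methods behave, to show they follow the same zero-based convention. The helper names `back_and_last` and `kth_smallest` are made up; the std methods are real.

```rust
/// Returns `(nth_back(0), last())` for the same data.
fn back_and_last(data: &[i32]) -> (Option<i32>, Option<i32>) {
    (
        data.iter().copied().nth_back(0), // zero-based, counted from the back
        data.iter().copied().last(),
    )
}

/// Returns the element that would sit at zero-based position `k` if `data` were sorted.
/// Panics if `k >= data.len()`, as `select_nth_unstable` does.
fn kth_smallest(mut data: Vec<i32>, k: usize) -> i32 {
    let (_, kth, _) = data.select_nth_unstable(k);
    *kth
}
```

So `kth_smallest(v, 0)` gives the minimum, just as `nth(0)` gives the first element.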

Some thoughts on why I don't see the naming as a problem:

Method naming uses the natural language meanings, where ordinals start from "first". That is, "first" is an English word for the element before all others in a sequence just like "last" is a word for the one after all others. Adding in "zeroth" as a possibility can only add confusion.

If I see a method named "forty_second", I expect it to return the element at index 41, because in zero-based indexing that is the forty-second element. Similarly, in "nth", the name invokes reference to the ordinal numbers ("first", "second", ...), but as a crucial difference the parameter is taken as a number, where it would be quite surprising to not use zero-based indexing.

One way I can understand nth is that it is mapping the parameter value of type usize to the ordinals, and the zero-based indexing is part of that mapping. In pseudo-code:

fn nth(&mut self, n: usize) -> T {
    const ORDINALS: [fn(&mut Self) -> T] = [first, second, third, ...];
    (ORDINALS[n])(self)
}
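For anyone who wants to play with that mental model, here is a compilable approximation. It is only a sketch: the table stops at "third", so unlike the real `nth` it returns `None` for any larger `n`, and the names `first`, `second`, `third` and `nth_via_ordinals` are invented for the illustration.

```rust
// Each "ordinal" advances the iterator and returns the element it names.
fn first<I: Iterator>(it: &mut I) -> Option<I::Item> {
    it.next()
}

fn second<I: Iterator>(it: &mut I) -> Option<I::Item> {
    it.next()?;
    it.next()
}

fn third<I: Iterator>(it: &mut I) -> Option<I::Item> {
    it.next()?;
    it.next()?;
    it.next()
}

/// Maps a zero-based `n` onto the table of ordinals: 0 -> first, 1 -> second, ...
/// Assumption: the table is finite here, so `n >= 3` simply gives `None`.
fn nth_via_ordinals<I: Iterator>(it: &mut I, n: usize) -> Option<I::Item> {
    let ordinals: [fn(&mut I) -> Option<I::Item>; 3] =
        [first::<I>, second::<I>, third::<I>];
    ordinals.get(n).and_then(|f| f(it))
}
```

The zero-based part lives entirely in the `ordinals.get(n)` lookup, which is the mapping described above.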

To clarify, the only issue I see here is that nth(4) looks like it could mean fourth(), but one already has to read v[4] as the fifth element.

8 Likes

Please don’t do this if you aren’t planning on deliberately confusing all functional programmers.

the first component (or the "head") of a list

the list consisting of all the components of a list except for its first (this is called the "tail" of the list.)

14 Likes

@bluss It is true that I thought of people using the method in their code, but I overlooked the issue of people implementing it for their types... :-/ Also, I overlooked the fact that similarly named methods exist elsewhere.

I absolutely agree. But this very "tension" that you describe so well is what makes it confusing.

But the big difference is that nth(4) literally reads like "fourth". The square brackets, on the other hand, are not intrinsically tied to the notion of ordinal number (e.g. in Rust as in other languages, you can also use them with other argument types, as in map["foo"] -- which clearly is not the "foo-th" element of map).

1 Like

Hint: There are two reply buttons near the bottom of the page: one is to reply to the previous (last) post (in the bottom right corner of each post), and one is to leave a general reply in the topic (at the bottom, in line with the buttons saying “Share”, “Bookmark”, “Flag”). You seem to have hit the wrong one and replied to my post.

1 Like

I have an FP background, and duly noted.

2 Likes