`Iterator::nth` is badly named

I definitely don't think nth is anywhere near bad enough to try to go through a migration-and-deprecation cycle.

That said, note that nightly does have Iterator::advance_by, which can sometimes be more readable, depending on what you're doing with nth. (nth is better if conceptually you're "indexing" the iterator, but historically nth has also been used as an in-place skip, and advance_by is much better at that. it.nth(i) is now more of a convenience wrapper around try { it.advance_by(i).ok()?; it.next()? }.)
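For anyone on stable, where advance_by isn't available, here is a rough sketch of the same relationship written out by hand. The names nth_by_hand and skip_in_place are made up for illustration and are not std APIs; the only std calls used are Iterator::next and Iterator::nth.

```rust
/// Hypothetical helper: behaves like `it.nth(i)`, spelled out as
/// "discard `i` items, then return the next one".
fn nth_by_hand<I: Iterator>(it: &mut I, i: usize) -> Option<I::Item> {
    for _ in 0..i {
        // Give up early if the iterator runs out while skipping.
        it.next()?;
    }
    it.next()
}

/// Hypothetical helper: using `nth` purely as an in-place skip.
/// Skipping `count` items means asking for item `count - 1` and
/// throwing it away, which is the awkward part.
fn skip_in_place<I: Iterator>(it: &mut I, count: usize) {
    if count > 0 {
        let _ = it.nth(count - 1);
    }
}
```

The `count - 1` in skip_in_place is the kind of noise that a dedicated "advance by n" method avoids.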

9 Likes

I'm firmly on the side of "nth is not the right name for this function, but it's not worth the churn needed to fix it."

4 Likes

I suppose your real contention is with this. But what you're really doing there is using 1-indexed terminology for technology that is 0-indexed, so there is a kind of conceptual mismatch. My personal solution has apparently (without overthinking it, or really thinking about it overtly at all) been to just see .nth(x) as analogous to (but asymptotically more expensive than) &array[x] and so in my mind there has never really been an issue in terms of terminology.
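To make that analogy concrete, here is a small sketch comparing the two lookups. The function name lookup_both_ways is invented for this example; slice::get and Iterator::nth are the real std methods.

```rust
/// Looks up zero-based position `x` twice: once through the slice,
/// once through an iterator over the same slice.
fn lookup_both_ways(items: &[char], x: usize) -> (Option<&char>, Option<&char>) {
    // Constant-time access by index.
    let by_index = items.get(x);
    // For a general iterator, this steps past `x` items first.
    let by_nth = items.iter().nth(x);
    (by_index, by_nth)
}
```

Both halves of the tuple agree for every x, including the out-of-range case, where both are None.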

It definitely won't be skip, as Iterator::skip() already exists.

Interesting perspective, since for me it explicitly prevents confusion by keeping things consistent.

While I see the point you're making, it's a bit of a straw man argument. If I see something like that in real world code then it had better be about the numerical analysis of said number, or a test of some kind. If not, it's more than likely overwhelmingly poorly named because it's not abstract enough (which is exactly how .nth() improves on that of course). And because it more than likely wouldn't happen in real code, I don't believe it to be a good idea to use it as a frame of reference. It's akin to saying "if only we had wormholes, interstellar travel would be easy now": Sure that might be true, but crucially we don't have any workable wormhole technology so saying that is a moot point.

Indeed. But the issue there is the English language, not the concept of 0-based indexing (or 1-based indexing for that matter). It's the English language that introduces the inconsistency in this context by not having the convention of starting with zeroth. But it does that in oh-so-many places, because English itself, as the descendant of a pidgin language, is internally fairly inconsistent...

I agree with others here that nth isn't a problematic name; nth(0) is the zeroth element.

Personally, in a zero-based context, I sometimes use intentionally "incorrect" words like "one-th" when describing constructs in prose or aloud, precisely to avoid having to say "first" to mean element 1. Or, I just say "item 1", "item 2", etc. (And similarly, nth(x) is "the x-th item".)

I do think that first is slightly more problematic, for this exact reason. I don't think it's worth a deprecation, but if we had it to name over again, we might have wanted to use a different name that isn't ordinal.

4 Likes

I grant you that any natural language, as a whole, contains inconsistencies. However, I'm convinced that we can, and should, strive to use an internally consistent subset of English in Rust's standard library (or any library in any programming language, for that matter).

I disagree, but at least I appreciate your own consistency w.r.t. this issue :wink:.

This looks like a two-kinds-of-people situation, with those who don't see a problem at all, and those who see a problem not as bad as the trouble it would be to fix it. I must say I was quite convinced by arguments of the latter...

2 Likes

A subset means essentially "leaving things out". How do you, by sole means of leaving things out of the English counting system, take a consistent and workable subset of a system that is based on starting the count at a number that is just wrong for this purpose?

Come on... we manage in this thread to have a conversation in English about 0-based indexing, so English is not as wrong for this as you seem to suggest.

A method name like Iterator::skip(n) does not cause any trouble in how to interpret n. A method name like slice::get(n) is neutral, as it does not strongly hint at either 0-based or 1-based indexing. Iterator::nth(n) creates confusion, as very well described by @queternic above, and therefore should have been left out, IMHO.

But again, I got convinced by others that this ship has sailed.

1 Like

Next thing we'll be discussing is how nth doesn't account for the possibility of an mth or kth or ith element. Really, nth(45) is weird because it doesn't spell “45th” directly but introduces some notion of a mysterious “n”, nth(n) is redundant (saying “n” twice) and nth(k) or nth(i) doesn't take the possibly existing but unrelated variable n into account. Also, the -th suffix only works with 70% of all numbers; for the remaining 30% it's grammatically incorrect.

And don't get started on how the word nth doesn't describe how it consumes items from an iterator and takes linear time in general.

Really, jokes aside, I find nth to be an appropriate name because if I want to have the kth or 42nd element of an iterator, I can use the suggestively named nth method (with nth(k) or nth(k-1) depending on whether k itself is zero-based, and nth(41)). I've got to read the docs anyway if I don't know the method, and everyone will expect zero-based indexing anyway (given that the argument type is usize, not NonZeroUsize).

By the way, I disagree with anyone who thinks that the "first" element of an iterator is anything but the first element, i.e. the next element returned by next.

13 Likes

The problem I see with "zeroth" is that then "first" becomes ambiguous. It is common in English text, and creating doubt on whether it actually means "second" in some context is a bad idea.

Nevertheless, I think it's relatively easy to grasp why nth(1) means the second element and is different from first(), as long as you know that indexing starts at 0. Iterator::{enumerate, position, rposition, nth} all use the same index-element mapping, reinforcing that convention for iterators.
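As a quick illustration of that shared mapping, here is a sketch showing that an index reported by position or enumerate can be fed straight back into nth. The helper names find_then_fetch and enumerate_agrees_with_nth are invented for this example.

```rust
/// Finds `target` with `position`, then fetches the same element
/// again with `nth`, using the index unchanged.
fn find_then_fetch(items: &[i32], target: i32) -> Option<i32> {
    let idx = items.iter().position(|&x| x == target)?;
    items.iter().copied().nth(idx)
}

/// Checks that every index handed out by `enumerate` leads `nth`
/// back to the same element.
fn enumerate_agrees_with_nth(items: &[i32]) -> bool {
    items
        .iter()
        .enumerate()
        .all(|(i, x)| items.iter().nth(i) == Some(x))
}
```

No plus-one or minus-one adjustment appears anywhere, which is the consistency being described.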

Maybe someone could make a logical error like the following:

let n = iter.clone().count();
// If there are four items and you take the fourth,
// that's the last one, right?
iter.nth(n)

which is essentially the same mistake as v[v.len()], but the connection to zero-based indexing is less obvious.
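Here is a self-contained sketch of that mistake next to the corrected call. The function name last_via_nth is hypothetical; it only exists to return both results side by side.

```rust
/// Returns (the result of the mistaken `nth(n)`, the result of `nth(n - 1)`),
/// where `n` is the number of items in the iterator.
fn last_via_nth<I: Iterator + Clone>(iter: I) -> (Option<I::Item>, Option<I::Item>) {
    let n = iter.clone().count();
    // Off by one: valid positions are 0..n, so position n never exists.
    let wrong = iter.clone().nth(n);
    // The last item sits at position n - 1 (if there is any item at all).
    let right = n.checked_sub(1).and_then(|last| iter.clone().nth(last));
    (wrong, right)
}
```

Unlike v[v.len()], the mistaken call does not panic. It quietly yields None, which may make the bug harder to notice.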

I still agree that nothing really needs fixing in this. One minor change to consider could be changing the name of the parameter from n to index.

fn nth(&mut self, index: usize) -> Option<Self::Item>

Reading just the signature would then immediately nudge towards the correct idea. IDEs may also show this, e.g. rust-analyzer fills in the parameter names by default when completing.

However, this would obfuscate the rationale for the name, and doesn't work at all for nth_back (which counts from the back, contrary to e.g. rposition).
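A small sketch of that difference, assuming a plain slice as the data source. The function name back_views is made up; nth_back and rposition are the real std methods.

```rust
/// Returns the item `k` steps from the back (via `nth_back`) together with
/// the index of the last item equal to `target` (via `rposition`).
fn back_views(items: &[char], k: usize, target: char) -> (Option<char>, Option<usize>) {
    // `nth_back(0)` is the last item, `nth_back(1)` the one before it, etc.
    let from_back = items.iter().copied().nth_back(k);
    // `rposition` searches from the back but reports a front-based index.
    let front_index = items.iter().rposition(|&c| c == target);
    (from_back, front_index)
}
```

For ['a', 'b', 'c', 'd'], the item 'c' is reached by nth_back(1), yet rposition reports it at index 2, so a parameter called index would mean different things in the two methods.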

3 Likes

It's slightly odd, but not terrible. 0-based indexing is so prevalent that I would be surprised if Rust had a 1-based indexing method anywhere, so I can easily assume .nth() uses 0-based indexing too. It also takes usize instead of NonZeroUsize.

It works nicely with std::env::args().nth(1), where both gotchas cancel out :slight_smile:
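A sketch of why that reads so naturally: item 0 of the argument list is the program name, so the "first" argument a user typed lives at position 1. The helper first_user_arg is hypothetical and takes any iterator of strings so it can be tried without a real command line.

```rust
/// Returns the first user-supplied argument from an argument list whose
/// item 0 is the program name, as is typical for `std::env::args()`.
fn first_user_arg<I: Iterator<Item = String>>(mut args: I) -> Option<String> {
    // Position 0 is the program name; position 1 is the first real argument.
    args.nth(1)
}
```

In a real program this would be called as first_user_arg(std::env::args()).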

6 Likes

Apart from that, it would be nice if Rust had a feature (even if libstd-private) to rename trait methods. Deprecate old & add new is highly problematic when default trait impls can be overloaded and call each other.

The other trait method I would like to rename is io::Read::read(), which is a footgun. It should be something like read_this_many_bytes_or_fewer().
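For readers who haven't hit this footgun, here is a sketch under stated assumptions: OneByteAtATime is a made-up reader that stands in for a slow pipe or socket by returning at most one byte per call. Read::read and Read::read_exact are the real std methods.

```rust
use std::io::{self, Read};

/// Hypothetical reader that hands out at most one byte per `read` call.
struct OneByteAtATime<'a>(&'a [u8]);

impl Read for OneByteAtATime<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() || self.0.is_empty() {
            return Ok(0);
        }
        buf[0] = self.0[0];
        self.0 = &self.0[1..];
        Ok(1)
    }
}

/// A single `read` call: it is allowed to fill only part of the buffer.
fn read_once(data: &[u8]) -> io::Result<(usize, [u8; 4])> {
    let mut buf = [0u8; 4];
    let n = OneByteAtATime(data).read(&mut buf)?;
    Ok((n, buf))
}

/// `read_exact` keeps calling `read` until the buffer is full,
/// or fails if the data runs out first.
fn read_fully(data: &[u8]) -> io::Result<[u8; 4]> {
    let mut buf = [0u8; 4];
    OneByteAtATime(data).read_exact(&mut buf)?;
    Ok(buf)
}
```

With four bytes of input, read_once reports that only one byte arrived while read_fully returns all four, which is the "this many bytes or fewer" behaviour the longer name would spell out.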

2 Likes

IMO, the only thing we need for this is some way of specifying which method uses which default implementation depending on which methods are user-implemented and which aren’t. Then (provided we even want to rename this one) nth and next_after_skipping_this_many (⇐ insert proper name here) would both have a default method implementation in terms of each other if either one is implemented manually, or in terms of what’s currently the default impl for nth already if neither is implemented manually. One could even go further and add a way to forbid implementing both manually. A feature like this would play along nicely with a way of explicitly specifying minimal complete sets of methods you need to implement.
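To show what today's Rust can and cannot express here, a rough sketch follows. NthLike, OldStyle and NewStyle are invented for illustration, and next_after_skipping is just the placeholder name from above. The two defaults forward to each other, so implementing either one is enough, but nothing stops a type from implementing neither, in which case a call would recurse forever.

```rust
/// Hypothetical trait, not part of std.
trait NthLike {
    type Item;

    /// Old name: by default forwards to the new name.
    fn nth(&mut self, n: usize) -> Option<Self::Item> {
        self.next_after_skipping(n)
    }

    /// Placeholder new name: by default forwards to the old name.
    fn next_after_skipping(&mut self, n: usize) -> Option<Self::Item> {
        self.nth(n)
    }
}

/// Counts upward from the stored value; written against the old name only.
struct OldStyle(usize);

impl NthLike for OldStyle {
    type Item = usize;
    fn nth(&mut self, n: usize) -> Option<usize> {
        self.0 += n;
        let value = self.0;
        self.0 += 1;
        Some(value)
    }
}

/// Same behaviour, written against the new name only.
struct NewStyle(usize);

impl NthLike for NewStyle {
    type Item = usize;
    fn next_after_skipping(&mut self, n: usize) -> Option<usize> {
        self.0 += n;
        let value = self.0;
        self.0 += 1;
        Some(value)
    }
}
```

Callers can use either name on either type. What is missing is a compiler-checked "implement at least one of these" rule and a fallback default for the case where neither is implemented.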

3 Likes

As an aside, I'd be very interested in seeing a poll on whether people think nth is confusing, and whether there's any correlation to

  • Being a native speaker of English or not
  • Length of time being a programmer

I've long since disassociated 1st and nth(0), but I don't know if that's because:

  • It's been ingrained in me through experience
  • I'm a native speaker and I don't super strongly associate "first" with "1" (first is more nuanced I guess? more independent of 1?)
  • It's just me.
5 Likes

Note that 0-based really is better. We may not have known that around 45 BC when making the Julian calendar, but now we do, and have for a long time. Note that the "first" minute of an hour is :00, as is the "first" second. We could have gone 2:58 → 2:59 → 2:60 → 3:01 → 3:02. But we didn't, because 1-based like that is worse.

The confusion isn't a programming-only thing, either. See, for example, how 1812 is in the 19th century despite not containing a "19" anywhere. And everyone who celebrated the "new millennium" when the year number went 1999→2000 is actually thinking 0-based, not 1-based, though they probably never realized that.

I quite like this example, too:

12 Likes

When I read "first", I think "there's nothing before that", just like "last" means "there's nothing after that". The notion of a "zeroth" element doesn't exist in English (or German, my native language). For example, before the first century AD came the first century BC.

The problems are that

  • The association between ordinal numbers (first, second, third) and cardinal numbers (... -1, 0, 1, 2, 3, ...) is usually 1-based in English and many other natural languages (because the number 0 didn't exist in Europe for a long time). Therefore people associate the number 1 with the first element.
  • The name of the nth method suggests that it accepts an ordinal number. However, ordinal numbers don't exist in Rust. If instead the function accepted an enum Ordinal { First, Second, Third, ... }, there would be no confusion.
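Rust has no ordinal type, but a rough approximation of the idea is possible with NonZeroUsize, where 1 stands for "first". The helper nth_ordinal below is purely hypothetical and only sketches what a one-based entry point might look like; it is not a proposal for std.

```rust
use std::num::NonZeroUsize;

/// Hypothetical one-based lookup: `ordinal` 1 means "first", 2 means "second".
/// Zero cannot be passed at all, because `NonZeroUsize` rules it out.
fn nth_ordinal<I: Iterator>(it: &mut I, ordinal: NonZeroUsize) -> Option<I::Item> {
    // Translate the one-based ordinal into the zero-based position `nth` expects.
    it.nth(ordinal.get() - 1)
}
```

The subtraction cannot underflow, since the argument is never zero. The price is that every call site has to construct a NonZeroUsize first.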
2 Likes

I'm wondering if it would help to adjust the documentation of the whole standard library to always start lists with 0, as a reminder that Rust is 0-based :smiley:

Second, implicit methods on primitive types are documented here. This can be a source of confusion for two reasons:

  1. While primitives are implemented by the compiler, the standard library implements methods directly on the primitive types (and it is the only library that does so), which are documented in the section on primitives.
  2. The standard library exports many modules with the same name as primitive types. These define additional items related to the primitive type, but not the all-important methods.
4 Likes

Natural numbers also need to include zero if we ever hope to have an identity element over addition. When writing random-access, offset-based algorithms, having to add at least 1 and then subtract 1 all the time would be useless noise.
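A tiny worked example of that noise, using a row-major grid as an assumed setting. Both function names are invented for illustration.

```rust
/// Zero-based: rows, columns and the resulting offset all start at 0.
fn offset_zero_based(row: usize, col: usize, width: usize) -> usize {
    row * width + col
}

/// One-based: rows, columns and the resulting position all start at 1,
/// so every term needs a -1 going in and a +1 coming out.
/// Passing 0 for `row` or `col` is invalid in this scheme.
fn position_one_based(row: usize, col: usize, width: usize) -> usize {
    (row - 1) * width + (col - 1) + 1
}
```

In a grid of width 4, the last cell of the third row is offset 11 in the first scheme and position 12 in the second. The arithmetic is the same, with extra adjustments in the one-based version.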

1 Like

Obligatory Dijkstra reference.

13 Likes

It's a tricky thing to get right. I would suggest naming the API at_index() and keeping the 0-indexed semantics.

For example: at_index(0) returns the first value.

1 Like

The problem with using the word "index" is that it implies you will get the same result when calling it several times.
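A short sketch of that difference, with made-up function names: asking a slice for index 0 twice gives the same answer, while calling nth(0) twice on one iterator does not, because each call consumes what it returns.

```rust
/// Calls `nth(0)` twice on the same iterator and returns both results.
fn nth_zero_twice(items: Vec<i32>) -> (Option<i32>, Option<i32>) {
    let mut it = items.into_iter();
    let first_call = it.nth(0);
    // The previous call consumed an item, so this one sees the next item.
    let second_call = it.nth(0);
    (first_call, second_call)
}

/// Looks up index 0 twice on the same slice and returns both results.
fn index_zero_twice(items: &[i32]) -> (Option<&i32>, Option<&i32>) {
    (items.get(0), items.get(0))
}
```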

8 Likes