TrustedLen + ExactSizeIterator doesn't mean you can trust the .len()

Fixing a bug here, where I wrongly relied on ExactSizeIterator::len being correct, I noticed that ExactSizeIterator + TrustedLen doesn't allow you to trust .len() being correct.

Consider the following example:

#![feature(trusted_len)]

/// This module is sound.
pub mod a {
    use std::iter::TrustedLen;
    use std::marker::PhantomData;
    pub struct EmptyIter<T> {
        _phantom: PhantomData<T>,
    }
    impl<T> EmptyIter<T> {
        pub const fn new() -> Self {
            Self {
                _phantom: PhantomData,
            }
        }
    }
    impl<T> Iterator for EmptyIter<T> {
        type Item = T;
        fn next(&mut self) -> Option<Self::Item> {
            None
        }
        fn size_hint(&self) -> (usize, Option<usize>) {
            (0, Some(0))
        }
    }
    unsafe impl<T> TrustedLen for EmptyIter<T> {}
}

/// This module is safe.
pub mod b {
    use super::a::*;
    pub struct Unit;
    impl ExactSizeIterator for EmptyIter<Unit> {
        fn len(&self) -> usize {
            1
        }
    }
}

use std::iter::TrustedLen;

fn test<I>(iter: I)
where
    I: Iterator<Item = b::Unit> + TrustedLen + ExactSizeIterator,
{
    let (a, b) = iter.size_hint(); // we can trust `iter.size_hint()`
    assert_eq!(a, 0);
    assert_eq!(b, Some(0));
    assert_eq!(iter.len(), 0, "but we cannot trust `iter.len()`");
}

fn main() {
    test(a::EmptyIter::<b::Unit>::new());
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 8.64s
     Running `target/debug/playground`
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `1`,
 right: `0`: but we cannot trust `iter.len()`', src/main.rs:49:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Module a is sound, module b is safe, yet iter.len() will return a wrong length.

I think the documentation is correct, but I still find this a bit surprising. Am I right that when I'm using TrustedLen, I cannot trust .len()? I guess I always have to use .size_hint()?

Maybe TrustedLen should be renamed to TrustedSizeHint or something like that? Or a more explicit warning in the documentation could be added. Currently it reads:

Trait std::iter::ExactSizeIterator

[…]

Note that this trait is a safe trait and as such does not and cannot guarantee that the returned length is correct. This means that unsafe code must not rely on the correctness of Iterator::size_hint. The unstable and unsafe TrustedLen trait gives this additional guarantee.

This might be technically correct, but I still find it a bit confusing.

4 Likes

This is an unresolved question:

  • Does TrustedLen pose any requirements on the ExactSizeIterator impl if one exists for the same type?

If I understand orphan rules right, then TrustedLen cannot enforce these requirements on generic types unless ExactSizeIterator is also implemented for all types.

How can module "a" in my example keep other modules from providing wrong implementations of ExactSizeIterator without module "a" implementing ExactSizeIterator for all EmptyIter<T> itself?

Therefore, I would conclude that either ExactSizeIterator must be a supertrait of TrustedLen, or TrustedLen cannot make you trust the .len().

1 Like

Browsing through the tracking issue @chrefr linked above, I feel like BoundedIterator might be a better name. The requirement that the lower and upper bound must be equal(ish) could be removed. This is the current requirement (which I think should be relaxed):

The iterator reports a size hint where it is either exact (lower bound is equal to upper bound), or the upper bound is None. The upper bound must only be None if the actual iterator length is larger than usize::MAX. In that case, the lower bound must be usize::MAX, resulting in an Iterator::size_hint() of (usize::MAX, None).

Then, a new unsafe trait TrustedLenIterator: ExactSizeIterator + BoundedIterator could be introduced which guarantees that lower bound and upper bound are equal(ish).

The only reason you can provide the ExactSizeIterator impl there is that you are in the same crate, which is the ultimate boundary to declare whether an API is sound or not. Orphan rules would prevent splitting modules a and b into two crates.

5 Likes

So apparently I'm understanding of orphan rules was wrong. You are right, and when I split the modules into two crates, I get:

error[E0117]: only traits defined in the current crate can be implemented for types defined outside of the crate
 --> src/main.rs:6:5
  |
6 |     impl ExactSizeIterator for EmptyIter<Unit> {
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^---------------
  |     |                          |
  |     |                          `a::EmptyIter` is not defined in the current crate
  |     impl doesn't use only types from inside the current crate
  |
  = note: define and implement a trait or new type instead

For more information about this error, try `rustc --explain E0117`.

I made the mistake of mixing up type parameters to traits (as in the orphan rules section in the reference) with type parameters to structs (as in my example). I thought I can implement ExactSizeIterator for EmptyIter<Unit> because Unit is a local type. But it was only because it was the same crate.

So the problem isn't really as bad as I thought.

Technically I think so.

It might be worth asking libs-api to make "and if ExactSizeIterator is implemented, it must also be correct" part of the safety condition for TrustedLen.

(It's a nightly trait, so that would be an allowed change. And I suspect that it's already the case for everything not being intentionally malicious.)

6 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.