TrustedLen + ExactSizeIterator doesn't mean you can trust the .len()

jbe · June 21, 2022, 9:40am

Fixing a bug here, where I wrongly relied on ExactSizeIterator::len being correct, I noticed that ExactSizeIterator + TrustedLen doesn't allow you to trust .len() being correct.

Consider the following example:

#![feature(trusted_len)]

/// This module is sound.
pub mod a {
    use std::iter::TrustedLen;
    use std::marker::PhantomData;
    pub struct EmptyIter<T> {
        _phantom: PhantomData<T>,
    }
    impl<T> EmptyIter<T> {
        pub const fn new() -> Self {
            Self {
                _phantom: PhantomData,
            }
        }
    }
    impl<T> Iterator for EmptyIter<T> {
        type Item = T;
        fn next(&mut self) -> Option<Self::Item> {
            None
        }
        fn size_hint(&self) -> (usize, Option<usize>) {
            (0, Some(0))
        }
    }
    unsafe impl<T> TrustedLen for EmptyIter<T> {}
}

/// This module is safe.
pub mod b {
    use super::a::*;
    pub struct Unit;
    impl ExactSizeIterator for EmptyIter<Unit> {
        fn len(&self) -> usize {
            1
        }
    }
}

use std::iter::TrustedLen;

fn test<I>(iter: I)
where
    I: Iterator<Item = b::Unit> + TrustedLen + ExactSizeIterator,
{
    let (a, b) = iter.size_hint(); // we can trust `iter.size_hint()`
    assert_eq!(a, 0);
    assert_eq!(b, Some(0));
    assert_eq!(iter.len(), 0, "but we cannot trust `iter.len()`");
}

fn main() {
    test(a::EmptyIter::<b::Unit>::new());
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 8.64s
     Running `target/debug/playground`
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `1`,
 right: `0`: but we cannot trust `iter.len()`', src/main.rs:49:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Module a is sound, module b is safe, yet iter.len() will return a wrong length.

I think the documentation is correct, but I still find this a bit surprising. Am I right that when I'm using TrustedLen, I cannot trust .len()? I guess I always have to use .size_hint()?

Maybe TrustedLen should be renamed to TrustedSizeHint or something like that? Or a more explicit warning in the documentation could be added. Currently it reads:

Trait std::iter::ExactSizeIterator

[…]

Note that this trait is a safe trait and as such does not and cannot guarantee that the returned length is correct. This means that unsafe code must not rely on the correctness of Iterator::size_hint. The unstable and unsafe TrustedLen trait gives this additional guarantee.

This might be technically correct, but I still find it a bit confusing.

chrefr · June 21, 2022, 10:15am

This is an unresolved question:

github.com/rust-lang/rust

Tracking issue for TrustedLen (`trusted_len`)

opened 11:20PM - 03 Nov 16 UTC

bluss

T-libs-api B-unstable C-tracking-issue A-iterators Libs-Tracked

Implemented in #37306 Open Questions: - - [ ] Naming: something with Itera…tor in the name? - - [ ] Naming: size vs len - - [ ] Does TrustedLen pose any requirements on the ExactSizeIterator impl if one exists for the same type? - - [ ] Does TrustedLen pose any requirements on the Clone impl if one exists for the same type? - - [ ] Check all the specific impls, including complicated ones like Flatten on arrays or FlatMap that returns arrays.

Does TrustedLen pose any requirements on the ExactSizeIterator impl if one exists for the same type?

jbe · June 21, 2022, 10:24am

If I understand orphan rules right, then TrustedLen cannot enforce these requirements on generic types unless ExactSizeIterator is also implemented for all types.

How can module "a" in my example keep other modules from providing wrong implementations of ExactSizeIterator without module "a" implementing ExactSizeIterator for all EmptyIter<T> itself?

Therefore, I would conclude that either ExactSizeIterator must be a supertrait of TrustedLen, or TrustedLen cannot make you trust ~~the~~ .len().

jbe · June 21, 2022, 10:57am

Browsing through the tracking issue @chrefr linked above, I feel like BoundedIterator might be a better name. The requirement that the lower and upper bound must be equal(ish) could be removed. This is the current requirement (which I think should be relaxed):

The iterator reports a size hint where it is either exact (lower bound is equal to upper bound), or the upper bound is None. The upper bound must only be None if the actual iterator length is larger than usize::MAX. In that case, the lower bound must be usize::MAX, resulting in an Iterator::size_hint() of (usize::MAX, None).

Then, a new unsafe trait TrustedLenIterator: ExactSizeIterator + BoundedIterator could be introduced which guarantees that lower bound and upper bound are equal(ish).

Nemo157 · June 21, 2022, 11:16am

The only reason you can provide the ExactSizeIterator impl there is that you are in the same crate, which is the ultimate boundary to declare whether an API is sound or not. Orphan rules would prevent splitting modules a and b into two crates.

jbe · June 21, 2022, 11:25am

So apparently I'm understanding of orphan rules was wrong. You are right, and when I split the modules into two crates, I get:

error[E0117]: only traits defined in the current crate can be implemented for types defined outside of the crate
 --> src/main.rs:6:5
  |
6 |     impl ExactSizeIterator for EmptyIter<Unit> {
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^---------------
  |     |                          |
  |     |                          `a::EmptyIter` is not defined in the current crate
  |     impl doesn't use only types from inside the current crate
  |
  = note: define and implement a trait or new type instead

For more information about this error, try `rustc --explain E0117`.

I made the mistake of mixing up type parameters to traits (as in the orphan rules section in the reference) with type parameters to structs (as in my example). I thought I can implement ExactSizeIterator for EmptyIter<Unit> because Unit is a local type. But it was only because it was the same crate.

So the problem isn't really as bad as I thought.

scottmcm · June 21, 2022, 4:56pm

Technically I think so.

It might be worth asking libs-api to make "and if ExactSizeIterator is implemented, it must also be correct" part of the safety condition for TrustedLen.

(It's a nightly trait, so that would be an allowed change. And I suspect that it's already the case for everything not being intentionally malicious.)

system · September 19, 2022, 4:57pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
`size_hint`, correctness, reproducibility and documentation libs	6	2225	March 25, 2019
`ExactSizeIterator` and combinators libs	5	549	September 24, 2022
The Trusted Iterator Length Problem	29	7275	March 25, 2019
Why is `std::vec::Drain` not `TrustedLen`? libs	2	685	February 23, 2020
`ExactSizeIterator` for `slice::Iter`? libs	7	214	September 4, 2024

TrustedLen + ExactSizeIterator doesn't mean you can trust the .len()

Trait std::iter::ExactSizeIterator

Related topics