Pre-RFC: take_some as a compromise between take_while and filter_map

Summary

This RFC would propose an iterator member function take_some<'a, T>(&'a mut iter: impl Iterator<Item = I>, f: impl Fn(I) -> Option<T>) -> TakeSome<'a, T> that takes elements from an iterator while the values returned by the function are Some, and leaves the iterator alone once the returned value is None.

Motivation

Current possibilities allow something like this:

let chars = "123 rest of the string".chars().into_iter();
let digits: Vec<u32> = chars.take_while(|c| c.is_digit(10)).map(|n| n.to_digit(10).unwrap()).collect();
assert_eq!(digits, vec![1, 2, 3]);

Alternatively, this could be implemented using .take_some like this:

let chars = "123 rest of the string".chars().into_iter();
let digits: Vec<u32> = chars.take_some(|n| n.to_digit(10)).collect();
assert_eq!(digits, vec![1, 2, 3]);

Maybe you should spend a little more effort in making sure your code example actually compiles and then also behaves as expected. (Neither is the case here.)


I’ll skip over the various compilation errors that need to be addressed here… the behavior is the main issue I want to point out. You’re going to see "postfix" left in the chars iterator, with the space character missing.

The way take_while determines the end of the section of the iterator it wants to take is by consuming items until one of them doesn’t match the predicate. Without some kind of peeking mechanism in the original iterator, there’s no way around it: that first non-matching item is consumed and gone from the original iterator. It is for this reason that a function like take_while should never be a &mut self method on Iterator, but should consume the iterator. The distinction is not all that big, since you can always use an Iterator by reference anyway, but it at least doesn’t encourage users to run straight into unexpected behavior.
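To make the consumed item visible, here is a minimal sketch using only std. The helper name split_with_take_while is made up for this illustration; it lends the iterator out via by_ref, runs take_while, and then collects whatever is still left.

```rust
/// Illustrative helper (name is hypothetical): splits off the leading ASCII
/// digits with `take_while` and returns (digits, what is left afterwards).
fn split_with_take_while(input: &str) -> (String, String) {
    let mut chars = input.chars();

    // `by_ref` lends the iterator out, so `chars` is still usable afterwards.
    let digits: String = chars
        .by_ref()
        .take_while(|c| c.is_ascii_digit())
        .collect();

    // `take_while` had to pull the first non-digit out of `chars` to learn
    // that the run of digits had ended, so that item is no longer in here.
    let rest: String = chars.collect();
    (digits, rest)
}

fn main() {
    let (digits, rest) = split_with_take_while("123 rest");
    assert_eq!(digits, "123");
    // The space between "123" and "rest" has been consumed.
    assert_eq!(rest, "rest");
}
```

The space is neither in digits nor in rest; it was handed to the predicate and then dropped.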

That being said, here’s the other “issue” with your proposal that I’m just noticing: this kind of method already exists! 🥳 (Except for the &mut self signature, but I’ve addressed that point above.) It’s called map_while.

let mut chars = "123 postfix".chars().into_iter();
let digits: Vec<u32> = chars.by_ref().map_while(|n| n.to_digit(10)).collect();
assert_eq!("postfix", chars.collect::<String>());
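If the first non-matching item really has to stay in the iterator, a peekable iterator is one way to get there. The following is only a sketch under that assumption: map_while_peeked is a hypothetical helper name, not a std API, and it collects eagerly into a Vec instead of returning a lazy adapter.

```rust
use std::iter::Peekable;

/// Hypothetical helper (not part of std): maps leading items with `f` until
/// `f` returns `None`, and leaves that rejected item in the iterator.
fn map_while_peeked<I, T>(iter: &mut Peekable<I>, f: impl Fn(&I::Item) -> Option<T>) -> Vec<T>
where
    I: Iterator,
{
    let mut out = Vec::new();
    // Look at the next item without consuming it.
    while let Some(mapped) = iter.peek().and_then(|item| f(item)) {
        out.push(mapped);
        // Consume the item only after `f` has accepted it.
        iter.next();
    }
    out
}

fn main() {
    let mut chars = "123 postfix".chars().peekable();
    let digits = map_while_peeked(&mut chars, |c| c.to_digit(10));
    assert_eq!(digits, vec![1, 2, 3]);
    // The space was only peeked at, so it is still there.
    assert_eq!(chars.collect::<String>(), " postfix");
}
```

The trade-off is that the caller has to hold a Peekable, which is exactly the peeking mechanism a plain Iterator doesn’t provide.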



Yes, sorry for that, I don't know what was going through my head. Had I checked it in the Rust Playground, I'd have seen that the behavior I was describing is not possible. I fixed the original post so that it compiles and includes the actual important part of the behavior, for anyone who finds this post in the future.

