Pseudo-RFC: Cursors (Reversible Iterators)


#1

We currently have three main kinds of Iterators:

  • Iterator: Linearly consumes a “range” irreversibly
  • DoubleEndedIterator: Linearly consumes a “range” irreversibly from both ends
  • RandomAccessIterator: Yields elements from a range by index without consuming them

Iterator is great for for loops, and RandomAccessIterator is great for abstracting over indexable ranges of values. As far as I can tell, DoubleEndedIterator is basically just handy for consuming ranges in more elaborate ways; it’s primary use-case as far as I know is reversing a range.

However the restriction of consuming ranges irreversibly is often artificial. For a move_iter it is of course necessary, but for a range or ref-based iter or mut_iter, it is often trivial to reverse back to where you came from.

Cursors

Cursors allow seeking back and forth in a range without logically consuming the values of the range. MutCursors of course, by extension, provide methods for mutating the range. However, their precise implementation can take many forms.

Proposal 1: Cursors Are Iterators (Of All Kinds)

The perhaps most obvious option is to add Cursors as traits that extend Iterators. Minimally, we can introduce the following:

/// An iterator that supports revisiting old values. In contrast to a normal 
/// `Iterator`,  a `Cursor` is guaranteed to always yield None once it 
/// has seeked past either end of its range. 
/// However, one can always seek back into range by calling `next()` 
/// or `prev()` when before or after the range, respectively. 
/// When out of range, calling `next()` or `prev()` does not take 
/// the head of the `Cursor` any further out of range than one "step".
///  Logically, a `Cursor`'s head always sits between two elements,
///  or on either end of the range. 
trait Cursor<A>: Iterator<A> {
    /// Advance the iterator backwards, and return the previous value. 
    /// Return `None` while the beginning is reached.
    fn prev(&mut self) -> Option<A>;
}

/// A double-ended `Cursor`. In order to preserve the semantics of 
/// a `DoubleEndedIterator`,  from the perspective of each head of 
/// the cursor, the other represents an "end" of the list, 
/// and will return None whenever they overlap. 
/// The heads of the cursor cannot seek past each other.
trait DoubleEndedCursor<A>: DoubledEndedIterator<A> + Cursor<A> {
    /// Advance the back head of the Iterator backwards, 
    /// and return the previous value
    fn prev_back(&mut self) -> Option<A>;
}

However one big usecase of Cursors is mutating, so we might also want:

/// A Cursor that can mutate its Range
trait MutableCursor<'a, A>: Cursor<&'a mut A> {
    /// Insert an element into the range between the next element of the cursor
    /// and itself
    fn insert(&mut self, elem: A);

    /// Remove the next element of the cursor, and return it or None if there
    /// is no next element
    fn pop(&mut self) -> Option<A>;
}

/// A DoubleEndedCursor that can mutate its Range
trait MutableDoubleEndedCursor<'a, A>: DoubleEndedCursor<&'a mut A> {
    /// Insert an element into the range between the next element of the 
    /// back cursor, and itself.
    fn insert_back(&mut self, elem: A);

    /// Remove the next element of the of the back cursor, and return it, 
    /// or None if there is no next element. If the cursors overlap, this
    /// does nothing.
    fn pop_back(&mut self) -> Option<A>;
}

Most notable is that MutableCursor<A> is actually a Cursor<&mut A>. This seems necessary to correctly handle the desired usecase: collections. I might have messed up the type annotation here. I’m just writing this all out in discourse untested.

The Mutable variants are derived from the ListInsertion trait provided by DList. Interesting potential extensions I left out for simplicity, and to leave the core proposal minimal:

  • peek, peek_prev, peek_back, and peek_prev_back are conceptually simple and handy
  • update: can be emulated with pop(); push(new), or just abusing the fact that &mut A is returned
  • jump_to_front and jump_to_back are worth considering, especially when you only have an iterator
  • seek_with(fn) (and various directional variants) might be valuable extensions, although iterator adapters might handle this

If you’re wondering about the direction of insertion/deletion, it’s because it makes using Cursors to push/pop_front/back more natural. The operations also don’t move the cursors relative to their respective ends, which is nice. However, this direction means the iterators directly interfere with each other when adjacent, which is tricky and potentially confusing. The other direction of insertion/removal inverts these issues. It’s a toss-up IMO.

Regardless, I think this is probably the strongest proposal.

Proposal 2: Cursors Are Iterators (But Not DoubleEnded)

Basically, take Proposal 1 and ditch the DoubleEnded stuff. This removes the biggest source of issues/confusion when working with Cursors, but of course makes them less powerful. It also means that all doubled-ended structures, which are the primary candidates for Cursors, (e.g. DList) have to provide something different for Cursors anyway which sucks. Honestly, this is probably a worst-of-both-worlds compromise between Proposal 1 and Proposal 3.

However, I believe it leaves some mental headroom to add more functions like insert_before and remove_before to get finer control of mutation, which is cool.

Proposal 3: Cursors Are Their Own Thing

Simply make Cursors a distinct concept from Iterators. I have a PR in for doing exactly this on DList, which is what sparked this discussion. In this context, we are free to deviate from any iterator semantics. For the most part, the API looks a lot like Proposal 1. However, In my PR, I used this to have cursors heads point to elements, rather than between them.

This allows us to cleanly insert_left or insert_right, and pop_left or pop_right without interfering with what the Cursor is pointing at. However this makes being “off the ends” more of a special case conceptually, and leaves some room for ambiguity when you e.g. insert_left of the front of the list. At the time of my writing the PR, it made sense to me to just push_front, but that breaks the invariant that insert_left makes an element appear before it. The alternative is to push_front and then shift the cursor right one step. Neither is great, in my opinion. This problem doesn’t occur with any version of Proposal 1 or 2, because either you can’t insert in the “wrong” direction, or you can and getting shifted is standard procedure.

This also introduces more methods on collections, and has substantial code duplication. If we want to use this to provide DoubleEndedCursors, we could also potentially create some elaborate n-ary cursor structure, which would allow complex manipulation of collections. You could make a DList into an efficient Treque, for instance.

Miscellanea

Just some random stuff that didn’t really fit anywhere else

Primary Use Case: Collections, Particularly DList

The main motivator of this was Iterators making no sense for DList. The only reason outside of being super-resource-spike sensitive to actually use a DList is to freely manipulate its internals once you have iterated to there. Consuming iterators make this impossible.

Vec and RingBuf are certainly perfectly appropriate candidates for the immutable Cursors. The Mutable ones can also work, although they aren’t exactly the best for the job.

Tree* and Hash* seem perfectly appropriate for the Immutable Cursors, but the Mutable ones don’t make much sense.

Anything that’s RandomAccess like Ranges, could probably honestly be made into a Cursor (though making this automatic or mandatory is probably ill-advised).

How Other Languages Do It

  • Java ListIterator: Similar to Proposal 2, because Java has completely different iterator semantics. However its direction of mutation is stateful in that it always interacts with the last element yielded. I’m not a fan of this, personally. It doesn’t add any power that I can think of, and adds mental overhead. I get it, though. It makes a kind of sense.
  • C++ STL: Iterators are effectively references that a collection can take as an argument to an insert/remove call. This is a non-starter in Rust.
  • Qt: continues to take the Kitchen Sink approach and offers both Java-style and C+±style iterators.

Adapters

I haven’t fully thought out which Iterator adapters are compatible with cursors and which aren’t. This may effect the usefulness of keeping them compatible with Iterators, although worst-case they just downgrade the Cursor to an equivalent Iterator and call it a day. However, if not, what exactly “skip(10)” means to a Cursor is unclear. Is the Cursor blocked from walking back? How does “take(10)” interact with mutatation?

What Are Your Thoughts?

I was tasked with posting this here to get community feedback on the issue, so please: feed us backs!


#2

This reminds me of Zippers in functional languages like Haskell. They allow you to basically travel to a spot in a data structure and make local edits quickly, before rebuilding the data structure itself in one fell swoop.

This might be of interest: http://www.haskell.org/haskellwiki/Zipper


#3

A while ago, I drafted (but never submitted) an RFC for iterator inputs, which was my way to make iterating back and forth easier, along with a number of other uses. This actually extends the language and the Iterator trait to allow inputting values into an iterator while it’s iterating. That would make iterating over a DList even easier by letting you keep the normal for syntax. Unfortunately, it’s backwards-incompatible (as it changes the definition of the Iterator trait and its next method) so I didn’t make an RFC PR for it—I’m pretty sure it’s too late at this stage to add such a feature to iterators in a backwards-incompatible fashion. (If anyone actually would like this, I could change the RFC to make it backwards-compatible (except the added keyword (although that could probably be changed too (believe me, even with these parentheses, I’m not a Lisp coder)))).

Regardless, this looks like a good idea—especially the MutableCursor trait—although it could be a bit annoying not to be able to use the normal for loop to go backwards as well as forwards. Could there be some way to get something like this to work:

for (dir, i) in [1i, 2, 3].iter_directional() {
    if i == 3 {
        *dir = Backwards;
    } else {
        *dir = Forwards;
    }
    println!("{}", i);
}
// 1 2 3 2 3 2 3 2 3 2 3 2 3...

#4

Not being able to for backwards definitely sucks, but honestly how common of a use-case is changing direction in the middle of a for loop? If you only need one direction, you can just .rev(). I’m definitely not a fan of setting a stateful variable for that, though.


In other news, I’ve gotten some feedback on this elsewhere, and overall sentiment seems positive for some version of Proposal 1, I think.

One interesting usecase pcwalton mentioned was elaborate tree traversal and manipulation. If this is the case, then cursors pointing between elements might not make as much sense. Pointing to elements seems more natural in a tree, to me. Although if we’re preserving iterator semantics, the cursor would presumably do something like an inorder traversal, which means your “next” and “previous” node might not actually be adjacent to you. Only a depth-first traversal really makes sense to me, in that case. But then elements get repeated many times in a traversal.


#5

Using a single linear iterator / cursor type with non-linear structures is unnatural. I think it’s better to have a special “tree cursor” for a tree structure.

    struct TreeCursor {
        fn next_sibling(&mut self) -> Option<Node>;
        fn prev_sibling(&mut self) -> Option<Node>;
        fn parent(&mut self) -> Option<Node>;
        fn first_child(&mut self) -> Option<Node>;
        fn last_child(&mut self) -> Option<Node>;
    }

I do support proposal 1 for linear structures like DList though.


#6

Note that Cursor<&mut A> is wrong if it inherits from Iterator – it would allow duplicating &mut A pointers by getting multiple pointers to the same thing.

For mutable traversal, it needs to be exclusive, you know this pattern: fn next<'a>(&'a mut self) -> Option<&'a mut A>

Mutable iterators are sound today because the iterator elements can only be visited exactly once. This is also why RandomAccessIterator can’t be used with &mut A.


#7

Oh! You know, I had assumed that the reference usually was somehow tied to the iterator as well. But looking at most MutItems impls, this is not the case. This is a very good point.


#8

This change looks quite useful. I personally prefer the first proposal. Maybe you should use remove instead of pop though as that is similar to what is used in c++ and because pop is also a method of DList. Could the the cursors also support some sort of method to temporarily mutably access the underlying DList?


#9

Here’s an idea I tested out:

pub trait Cursor<A, Direction> {
    fn go(&mut self, Direction) -> bool;
    fn focus<'a>(&'a self) -> Option<&'a A>;
}

It makes it look more like a Zipper, and it’s generic over the various forms of traversals that exist. For example Direction could be an enum of Next, Prev. Or it could be a tree direction of Up, Left, Right.

Anyway, it doesn’t seem viable :slight_smile: I’m afraid a go/focus split will force the cursor to do more than one conditional per loop iteration, and that would be bad. Further, the direciton enum idea doesn’t seem to fit with Rust – you’d want enum subclasses so that { Next, Prev } can be a subclass of just { Next }.


#10

This could be made really nice to work with if we have associated types.

pub trait Zipper<A> {
     type Direction;
     // false if the move is impossible
     fn go(&mut self, Direction) -> bool;

     // Access the value at this position
     fn focus(&self) -> Option<&A>;
     fn focus_mut(&mut self) -> Option<&mut A>;
}

I’m still not sure how this works for persistent data structures - you need a zip and corresponding unZip to make that work.

It’s also tricky because I’m unsure how to extend this to edit around the focus or insert/remove at the focus but I think it’s possible, that bit actually works quite well with persistent structures.


#11

How about fn insert(&self, Direction, A) which instead of moving in that direction, inserts an element between current focus and the element there.


#12

I’m not sure what the point of the trait is if it relies on a type-specific enum, that naturally couldn’t be used in a generic context? Like, given a generic Z: Zipper, how do I go in any direction?


#13

It would be nice if we had something like haskell’s ~ (Z: Zipper + Direction ~ ForwardBackwards).


#14

Do fixed-size containers have a sensible implementation of this insert?


#15

I really like the idea of a DoubleEndedCursor (or even DoubleEndedZipper?).

My personal use-case stems from trying to write an in-place partitioning algorithm for slices. My first attempt used indices and entirely safe code, but the indexing and bounds checking made it a fair bit slower than the C++ iterator-based implementation.

For my second attempt I tried to use a (mutable) DoubleEndedIterator, but it was a non-starter because for this algorithm you have to be able to access the value you’re currently at without going to the next value. So in the end I wrote it with unsafe code and raw pointers (and its performance is pretty comparable to the C++ version!).

Being able to write a version of that which is both safe and efficient would be awesome.


#16

We often just want restartable iterators, ala

pub trait RestartableIterator : Iterator {
    fn restart(&mut self);
}

We could pass IntoIterator<Iter: RestartableIterator> perhaps.

We cannot use IntoIterator alone of course because it must consume by value to support Iterator<Item = &mut T>, but we could add a trait like:

pub trait AsMutIterator : IntoIterator {
    fn iter_mut(&mut self) -> <Self as IntoIterator>::IntoIter;
}

Edit: Oops that second idea won’t quite work.


#17

Great idea, although I’m really not a fan of proposal 3. I’d be rather surprised if a cursor, which allows you to go forward (and backward) over a view of a collection wasn’t a proper Iterator. It would reduce their usefulness because they couldn’t be used in many places they otherwise could and it feels like an arbitrary restriction. Cursors being Iterators also avoids having to implement the provided methods twice in the standard library.


#18

How would map interact with Cursors? I’d imagine proposal 1 requires replacing the FnMut closure with an Fn closure:

impl<B, I: Cursor, F> Cursor for Map<I, F> where
    F: Fn(I::Item) -> B 
{
    fn prev(&mut self) -> Option<A> {
        self.iter.prev()
    }
}
impl<B, I: DoubleEndedCursor, F> DoubleEndedCursor for Map<I, F> where
    F: Fn(I::Item) -> B
{
    fn prev_back(&mut self) -> Option<A> {
        self.iter.prev_back()
    }
}

It sounds doable, but interacts poorly with interior mutability, like maybe fn types or const Fn is more the goal, but w lack traits for those. Also, algorithms accessing a data structure this way sound rather niche. It’s more common algorithms want to repeatedly do the same or reverse iteration over some view of some data structure. We might address this with closure that return iterators, like FnBorrow traits

After this, algorithm want more powerful data structure access with specific properties, like random access, or certain insertion characteristics, or whatever. We could again wrap such properties inside views, but exactly how gets tricky. We could discuss some hierarchy of traits for specific data structure properties.