The return type of the "Index" trait has a big restriction

Currently, the Index trait is defined as follows:

pub trait Index<Idx: ?Sized> {
    type Output: ?Sized;

    fn index(&self, index: Idx) -> &Self::Output;
}

This means the return type is always a reference. That makes it hard to implement Index for a type A where the operation A[x] can fail. In other words, if we want Output to be Result<T> or Option<T>, the return type becomes a reference to that Output, which raises a lifetime issue: there is no way to implement this unless A has an inner field of type Result<T> or Option<T>. Suppose we want to implement Index for a file, where a read might hit an I/O error, and we want the [x] operation on the file to surface that error.

For example:

use std::io::prelude::*;
use std::marker::PhantomData;
use std::ops::{Index, IndexMut};

struct FileReader<'a> {
    file: std::fs::File,
    buff: Vec<u8>,
    _marker: PhantomData<&'a u8>,
}
impl<'a> Index<std::ops::Range<usize>> for FileReader<'a> {
    type Output = std::io::Result<&'a [u8]>;
    fn index<'b>(&'b self, index: std::ops::Range<usize>) -> &'b Self::Output {
        unimplemented!()
    }
}
impl<'a> IndexMut<std::ops::Range<usize>> for FileReader<'a> {
    fn index_mut<'b>(&'b mut self, index: std::ops::Range<usize>) -> &'b mut Self::Output {
        let begin = index.start;
        // A Range is half-open, so its length is end - start.
        let size = index.end - index.start;
        self.buff.resize(size, b'\0');
        if let Err(e) = self.file.seek(std::io::SeekFrom::Start(begin as u64)) {
            return &mut Err(e); // error: reference to a temporary
        }
        match self.file.read(&mut self.buff[..]) {
            Ok(_) => &mut Ok(&self.buff[..]), // error: reference to a temporary
            Err(e) => &mut Err(e),            // error: reference to a temporary
        }
    }
}

In this case, there is no way to construct a Result<&[u8]> that outlives 'b unless Self has a field of type Result<&[u8]>.

Is there any way to let the user designate the return type of the index method, by declaring it as Self::Output rather than &Self::Output?


I don't think it would be possible to do in a backwards-compatible way. Anyway, why do you want to use Index to begin with? Just add a couple of methods which return whatever you want. If your worry is that you can't be generic over those methods, well, you can technically be generic over Index, but you get absolutely no semantic guarantees while doing it. Anything can implement Index in arbitrary ways. You would be better off making a specialized trait for your use case.
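To make the "just add a method" suggestion concrete, here is a minimal sketch. The names RangeReader and read_range are made up for illustration and are not from the original post. It is generic over any Read + Seek source, so it also works on an in-memory Cursor. The Result is returned by value, and only the slice inside it borrows from self.

```rust
use std::io::{self, Read, Seek, SeekFrom};
use std::ops::Range;

/// Hypothetical reader that serves byte ranges from any seekable source.
struct RangeReader<R> {
    source: R,
    buff: Vec<u8>,
}

impl<R: Read + Seek> RangeReader<R> {
    fn new(source: R) -> Self {
        RangeReader { source, buff: Vec::new() }
    }

    /// Reads `range` into the internal buffer and lends the bytes out.
    /// The `Result` itself is a fresh value; only the slice borrows `self`.
    fn read_range(&mut self, range: Range<usize>) -> io::Result<&[u8]> {
        let len = range.end.saturating_sub(range.start);
        self.buff.resize(len, 0);
        self.source.seek(SeekFrom::Start(range.start as u64))?;
        self.source.read_exact(&mut self.buff)?;
        Ok(&self.buff[..])
    }
}
```

The call site becomes reader.read_range(6..11)? instead of reader[6..11], which costs a few characters but lets the I/O error propagate normally.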

There could be, perhaps, some value in generalizing Index to return something like impl Deref, but I don't see any gain from returning Option or Result. The resulting trait would be unusable in all contexts where Index is used today, so would require some more specific trait bounds, making it effectively several distinct traits dumped into one.

No. The way the Index trait is designed requires the indexed object to already exist in self before index is called (with small exceptions, such as using the interior mutability of once_cell to initialize lazily).
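A rough sketch of that lazy-initialization exception, using the standard library's std::cell::OnceCell rather than the once_cell crate. LazySquares is a hypothetical type invented for this example. The value is computed on first access, but it gets stored inside self before the reference is handed out, which is why it fits the Index signature.

```rust
use std::cell::OnceCell;
use std::ops::Index;

/// Hypothetical table whose entries are computed on first access.
struct LazySquares {
    slots: Vec<OnceCell<u64>>,
}

impl LazySquares {
    fn new(len: usize) -> Self {
        LazySquares { slots: (0..len).map(|_| OnceCell::new()).collect() }
    }
}

impl Index<usize> for LazySquares {
    type Output = u64;

    fn index(&self, i: usize) -> &u64 {
        // Created on demand, but stored in `self` before being borrowed,
        // so the returned reference points at data that `self` owns.
        self.slots[i].get_or_init(|| (i as u64) * (i as u64))
    }
}
```

This only helps when the value can be cached in self. It does not help with returning a fresh Result or Option each time.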

The Index trait is not general-purpose syntax sugar for [], just a limited array-access operator (general-purpose sugar would need a different trait that doesn't exist yet). It won't let you return any newly-created object. Rust's references are temporary loans of existing already-stored data, not a general-purpose return-by-reference mechanism.

You're going to have to have a regular method that can return a non-loan.


The thing that makes Index special is that self[index] does not evaluate to a value like a method call does; instead, it evaluates to a place. (In C++ terms, these would be an rvalue and an lvalue, respectively.) To make the difference obvious: you can never write self.method() = 0;, but you can write self[index] = 0;. (This split requires mut access to demonstrate, but applies the same to shared access.)

self[index] desugars[1] to either *Index::index(&self, index) or *IndexMut::index_mut(&mut self, index) depending on how the place is used -- if it's used in a way that requires mut access, IndexMut is used; otherwise Index is used. This is why self.index(index) is equivalent to &self[index] rather than to self[index].
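The desugaring can be written out by hand on a plain slice. This is only a sketch; the function names are made up for the example, and the real desugaring also involves auto(de)ref, which is ignored here.

```rust
use std::ops::{Index, IndexMut};

// Reading a place: `v[i]` behaves like `*Index::index(v, i)`.
fn read_via_sugar(v: &[i32], i: usize) -> i32 {
    v[i]
}
fn read_via_trait(v: &[i32], i: usize) -> i32 {
    *Index::index(v, i)
}

// Writing a place: `v[i] = x` behaves like `*IndexMut::index_mut(v, i) = x`.
fn write_via_sugar(v: &mut [i32], i: usize, x: i32) {
    v[i] = x;
}
fn write_via_trait(v: &mut [i32], i: usize, x: i32) {
    *IndexMut::index_mut(v, i) = x;
}

// `&v[i]` and `Index::index(v, i)` are references to the same place.
fn same_place(v: &[i32], i: usize) -> bool {
    std::ptr::eq(&v[i], Index::index(v, i))
}
```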

In C++, this behavior comes from operator[] conventionally returning T&: an lvalue reference. Rust does not have an equivalent to C++'s lvalue reference, and imho this is a good thing, because Rust's references are just normal types/values that behave like any other type/value, whereas C++'s references behave distinctly differently from nonreference types[2].

Rust's a[b] syntax will always evaluate to a place, just like *a evaluates to a place despite Deref::deref returning a reference. Nobody's really asking for Deref to be able to return things other than a reference[3], so what makes Index any different?
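The same place behavior shows up with Deref. A small sketch, with a hypothetical Wrapper type: deref_mut returns a reference, yet *w can sit on the left of an assignment.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical smart-pointer-like wrapper around a single integer.
struct Wrapper {
    inner: i32,
}

impl Deref for Wrapper {
    type Target = i32;
    fn deref(&self) -> &i32 {
        &self.inner
    }
}

impl DerefMut for Wrapper {
    fn deref_mut(&mut self) -> &mut i32 {
        &mut self.inner
    }
}

fn assign_through_deref() -> i32 {
    let mut w = Wrapper { inner: 1 };
    *w = 5; // the place `*w`, reached through DerefMut
    *DerefMut::deref_mut(&mut w) += 1; // the same thing written out by hand
    w.inner
}
```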

Where Index[Mut] do fall short and could be improved is that the existing traits only cover two of the four ways that a place can be used. A place can be used by-ref (&place), by-ref-mut (&mut place), by-move ({place}), or as the lhs of an assignment (place =).
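For reference, here are the four uses of a place on a Vec<i32>, as a quick sketch. The by-move line only compiles here because i32 is Copy, and the assignment goes through IndexMut since there is no dedicated trait for it.

```rust
fn place_uses() -> (i32, i32, i32) {
    let mut v = vec![10, 20, 30];

    let by_ref: &i32 = &v[0]; // by-ref: uses Index::index
    let seen = *by_ref;

    let by_mut: &mut i32 = &mut v[1]; // by-ref-mut: uses IndexMut::index_mut
    *by_mut += 1;

    let moved: i32 = { v[1] }; // by-move: fine only because i32 is Copy

    v[2] = 0; // assignment: currently routed through IndexMut::index_mut

    (seen, moved, v[2])
}
```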

Additionally, it would be nice if indexing could return reference-wrapping types (like RefMut) -- and with some use of temporary lifetime extension, this is actually possible to do.

The most general that Index[Mut] may be in the future would probably look something like this:

trait Index<Ix: ?Sized> {
    type Output: ?Sized;
    type Ref<'a>: Deref<Target=Self::Output> = &'a Self::Output;
    fn index(&self, ix: Ix) -> Self::Ref<'_>;
}

trait IndexMut<Ix: ?Sized>: Index<Ix> {
    type RefMut<'a>: DerefMut<Target=Self::Output> = &'a mut Self::Output;
    fn index_mut(&mut self, ix: Ix) -> Self::RefMut<'_>;

    fn index_set(&mut self, ix: Ix, val: Self::Output)
    where
        Self::Output: Sized,
    {
        *self.index_mut(ix) = val;
    }
}

with the desugarings of

& self[index] as & *Index::index(&self, index),
&mut self[index] as &mut *IndexMut::index_mut(&mut self, index), and
self[index] = value as IndexMut::index_set(&mut self, index, value).
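The shared half of that shape can be approximated on stable Rust today with a generic associated type. This is just a sketch under assumptions: GuardIndex, guard_index, SharedVec and read_copy are hypothetical names, there is no default for Ref (associated type defaults are not stable), and the [] syntax is not hooked up, so the desugaring has to be written by hand.

```rust
use std::cell::{Ref, RefCell};
use std::ops::Deref;

/// Hypothetical stand-in for a generalized `Index`.
trait GuardIndex<Ix> {
    type Output: ?Sized;
    type Ref<'a>: Deref<Target = Self::Output>
    where
        Self: 'a;
    fn guard_index(&self, ix: Ix) -> Self::Ref<'_>;
}

/// A plain reference still works as the "guard".
impl<T> GuardIndex<usize> for Vec<T> {
    type Output = T;
    type Ref<'a> = &'a T where Self: 'a;
    fn guard_index(&self, ix: usize) -> Self::Ref<'_> {
        &self[ix]
    }
}

/// Hypothetical container that can only hand out `Ref` guards.
struct SharedVec<T> {
    items: RefCell<Vec<T>>,
}

impl<T> GuardIndex<usize> for SharedVec<T> {
    type Output = T;
    type Ref<'a> = Ref<'a, T> where Self: 'a;
    fn guard_index(&self, ix: usize) -> Self::Ref<'_> {
        Ref::map(self.items.borrow(), |v| &v[ix])
    }
}

/// Hand-written version of what `c[ix]` would desugar to.
fn read_copy<C: GuardIndex<usize, Output = i32>>(c: &C, ix: usize) -> i32 {
    let value = *c.guard_index(ix);
    value
}
```

Note that the guard is a real value with a destructor here, so how long it lives matters: while a Ref from SharedVec is alive, a mutable borrow of the same container fails.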

Note, however, that I do not think that index_set will ever happen nor that it is a good idea; overloading assignment is something Rust has rightly avoided so far and I think should absolutely continue to avoid[4]. Assignment should be a move, and a move should be a simple bitcopy; if you want behavior, use a method.

Note also that IndexMove is a very awkward trait, even when assuming the presence of something like &move references. I originally considered including a shape for IndexMove, but it's extremely unclear how such should even function. We already have "IndexMove" for all indexables where Output: Copy, as well as for [T; N] even when T isn't Copy, so whatever shape IndexMove takes would have to work for both of those. And I don't think that it's possible, unfortunately, to have an IndexMove which provides the correct semantics. (This is in contrast to DerefMove, which I absolutely think is something which can be made available to user types.)

Relaxing Index[Mut] in this way works because of temporary lifetime extension. Even if the reference wrapper is immediately dereferenced, Rust will hoist it to only drop at the end of scope if it's used -- this scope may be the end of the containing expression[5] or the containing block depending on if its borrow escapes the expression containing the temporary. Here's an example.
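A minimal sketch of both cases, using RefCell's Ref guard as the reference wrapper (the function names are made up for illustration). try_borrow_mut is used only as a probe for whether the guard is still alive.

```rust
use std::cell::RefCell;

/// The borrow does not escape: the `Ref` temporary dies at the end of the
/// containing statement.
fn guard_dropped_at_end_of_statement() -> bool {
    let cell = RefCell::new(1);
    let copy = *cell.borrow() + 1; // guard dropped at this `;`
    let free = cell.try_borrow_mut().is_ok();
    free && copy == 2
}

/// The borrow escapes into `r`: the `Ref` temporary is kept alive until the
/// end of the enclosing block, even though it was dereferenced immediately.
fn guard_extended_to_end_of_block() -> (bool, bool) {
    let cell = RefCell::new(1);
    let locked_inside;
    {
        let r: &i32 = &*cell.borrow(); // temporary guard is extended
        locked_inside = cell.try_borrow_mut().is_err();
        assert_eq!(*r, 1);
    } // guard dropped here
    let free_after = cell.try_borrow_mut().is_ok();
    (locked_inside, free_after)
}
```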


  1. Very rough desugar, and I'm skimming over auto(de)ref behavior a bit since it's not super relevant here. ↩︎

  2. Notably, you cannot nest C++ references, thus T&& being yet another different kind (rvalue reference) and the existence of std::reference_wrapper to lift from a reference kind to a normal type kind. ↩︎

  3. Although, now that I've said this, I'm basically guaranteed to find someone who unironically thinks this should happen. ↩︎

  4. The singular reason I could see something changing is for Cell. Cell::set is still just a simple bitcopy, but working with Cell'd data has a significant amount of extra red tape which doesn't need to exist, and which Rust would probably benefit from removing somehow. ↩︎

  5. There's an existing footgun here: used temporaries in the scrutinee of match or if or other block-like expressions live until after the expression's block, even if the temporary is no longer accessible from within the block. This doesn't get in the way most of the time since "non lexical lifetimes" (NLL) allows regular references' lifetimes (and many structs' lifetimes) to be ended after the last use "prematurely" before the end of scope, but this does not happen if the struct containing the lifetime has any drop glue (implements Drop or contains a type with drop glue)[6]. ↩︎

  6. There's an unstable opt-out called the "eyepatch" with #[may_dangle], with which a type like Box<T>, despite having a Drop implementation, declares that it does not use T in any way, so any lifetimes in T are allowed to be NLLd the same as if T were held directly on the stack. ↩︎
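The scrutinee footgun from footnote 5 can be sketched like this, again with RefCell as the guard-producing type (the function name is hypothetical). The arm only sees a copied i32, yet the Ref temporary is still alive while the arm runs.

```rust
use std::cell::RefCell;

fn scrutinee_guard_outlives_arm() -> (bool, bool) {
    let cell = RefCell::new(5);

    // The `Ref` created in the scrutinee lives until the whole `match`
    // statement ends, so a mutable borrow inside the arm fails.
    let locked_in_arm = match *cell.borrow() {
        _n => cell.try_borrow_mut().is_err(),
    };

    // After the statement, the guard is gone and the cell is free again.
    let free_after = cell.try_borrow_mut().is_ok();
    (locked_in_arm, free_after)
}
```

Copying the value out first (let n = *cell.borrow(); and then match n) avoids holding the guard across the arms.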


Because overloading the [] operator is convenient, especially when designing a library for users. Otherwise, users would have to call x.index(...) everywhere they would naturally write x[...].

operator[] in C++ does not restrict the return type. That means we can return anything we want: a reference, a pointer, or an object.

I would expect [...] to be able to behave the same as in C++. IMHO, the only difference is that Rust restricts the return type of the index family of methods to always be a reference.

I don't think that's a good idea either. I don't want to open the floodgates of allowing libraries to say "we did something fancy using existing syntax, but forget what you think they mean". See Boost.Spirit for an extreme example of overloading used…creatively. It uses it well, but I don't think it justifies the abuse that it allows outside of that. (Consider that Rust has said the same thing to pointer and/or uninitialized memory usage: it can enable quite fancy things, but it's just not worth the holes it makes in the common path.)

Why? C++ is C++ and Rust is Rust. Currently there is no reification of a "place", so there isn't anything that can actually be written in Rust today to allow such things to happen. Put another way: we could get this in Rust today, but x[idx] = … would be lost. This is not a suitable tradeoff due to the backwards-incompatibility it introduces into the language (as such, it is not something that is likely explained as an alternative in explicit terms).

The proper path for this feature (IMO) is to follow the existing "Place" RFC(s) which is the more likely way to get such behaviors to be possible.

