Can Epochs change the definition of a stdlib trait?

Recently I’ve been thinking about the Index trait, and how it’s actually mis-specified: The index method returns a &Self::Output, but what it should probably have returned from the start is Result<&Self::Output, Self::Error> for some associated type called Self::Error.

The problem with the current definition is that if the index is out of bounds, you get a panic and there’s nothing you can do about it (save for catching panics which, while sometimes necessary, is smelly as an error mechanism and isn’t even guaranteed to work, depending on how panics are configured and what panic you get). As a consequence of this footgun, I find that it simply is not something I want to use in production code, and thus I opt for an index-like method with a proper Result<_,_> return type instead, even though the [] syntax would really help in stating intent.

Since the Index trait is tightly bound with both indexing semantics and the []-syntax, if this is to be fixed at any time and place, it has to be done in stdlib, and it would have to be at least an Epoch change because of the potential impact (though I have a hard time believing anyone would willingly write their own impl and in so doing introduce potential panics). But personally I think it might be a worthwhile change, actually making the trait usable without being smacked down by the big bad FOP (fear of panics).

What are everybody’s thoughts on this?

I would argue that the current interface is better than returning Result for many uses, but it doesn’t even matter because editions can’t make this change (just like pretty much all “library changes”) since crates of all editions need to be interoperable with each other. So a Rust 2015 crate needs to be able to implement the trait and have its implementation be usable in Rust 20XX code, and conversely an Index implementation in a Rust 20XX crate needs to be usable in Rust 2015 code.

4 Likes

I think you can propose TryIndex trait (with appropriate implementations in the std) which will have blank unwrapping implementation of Index trait, and maybe in future some sugar in addition to [] which will be desugared into TryIndex call.

There could be a new TryResult trait that the new compiler would use instead.

The problem is that (current) index returns a reference, so it can’t allocate anything. So you can’t even make a new Result! That’s also a problem for current collections that can’t transform their internals before returning them.

2 Likes

@hanna-kruppe Why do you consider forced panicking on index failure superior over having the choice of what to do with an out of bounds error value? In general I consider forced panicking a strictly inferior alternative, as it takes the choice out of the hands of the consuming code. This is a similar line of reasoning to why library APIs should never panic.

I do see your point about Epoch stability though, so Index is unlikely to change. Still, some way to leverage [] (or something like it) and also full have control of what happens when indexing is OOB is a useful combination to have.

@newpavlov That could be done but the point here is the syntax, which would still be bound to Index. Thus the net effect would be the same as it is now: a relatively verbose function call for every place I’d rather use [] for indexing.

@kornel I don’t see how a new hypothetical TryResult type would help here?

And indeed, the first post mentions that the new returned type would ideally be Result<&Self::Output, Self::Error> or something to that effect.

As I’ve said you can propose []-like sugar which will utilize TryIndex trait, I think it will be the most reasonable approach to take. I highly doubt that any [] changes similar to the proposed one will get much traction. Just imagine amount of code which will have to be reworked after such change.

You aren’t forced to panic. The get method provides you fallible indexing.

5 Likes

I think the fundamental issue is the lack of clarity around when it is appropriate to Panic, and why, and when it is appropriate to return Option or Result<T,E> (or similar). This seems to be a very fuzzy concept (at least to those coming anew to the language). For example, in this case, you are confusing indexing (which is designed for efficiency at run-time) and “.get( n )” (which is designed to have recoverability in the case of out-of-bounds). A clearer story and message as to the purpose and meaning of Panic could go a long way to making these cases more understandable I would think.

EDIT: Ask yourself this, “What legitimate algorithm involving indexing should perform attempted computations on out-of-bound indexes?” If there is no legitimate use for out-of-bound indexing attempts, then Panic is entirely appropriate for that case because no legitimate algorithm would ever attempt such.

5 Likes

To add to the above: any method encouraging handling via .unwrap() or a generic handler (“lets just see what happens if we ignore that”, as in many Java apps) is poor design IMO. In general, errors which are rare and a logic bug are sensible to handle via panicking. Usually panics can still be caught if necessary (via catch_unwind or thread::spawn), although at this point only high-level handling is possible.

The answer to the larger question here is no. We can deprecate and remove APIs, but their names are reserved forever. We could never replace any definition in std with one with a different signature.

(Also, the real trick is for Index to have a type Output<'a>;, then you could return a result, or any other “proxy,” if you want to)

6 Likes

The only issue there is that [] itself is already bound to the Index trait, so the syntax would have to differ somehow to disambiguate the 2.

@sfackler I’m aware of the get function. What I mean is that I want the [] syntax (or something like it) because it would help clarify intent. But using that would introduce potential for panics in code where there currently is none. My ideal solution is something with semantics similar to get (it could also return a Result) but also with syntax that’s to the point. And looking distinctive from regular method calls doesn’t hurt either. That is also what my comment was referring to: today, if you want the neat [] syntax, you get forced to introduce potential for panicks due to the definition of the Index trait.

@gbutler I’m not confusing anything. What I’m saying is that I want something with properties of both [] syntax and recoverability, and current-day Rust won’t let you do that. I see no reason why something like &my_datastructure[idx]? couldn’t work. Or rather something like it, since the [] syntax ship itself has sailed by now. But what would be appropriate is a tougher question, as the 3 pairs of brackets already serve a purpose in Rust. Perhaps something like &my_datastructure[[idx]]?, or is it too noisy?

What legitimate algorithm involving indexing should perform attempted computations on out-of-bound indexes?

My use case isn’t so much encoding an algorithm as it is making child accesses in a tree like structure stand out. In either case indexing out of bounds is a bug (is there any use case at all where that is not true?), but in my case uncontrolled panicks are simply unacceptable, and here at least in principle it’s preventable. And in fact with the current code (which doesn’t use [] at all) panicks are prevented. It would just be nice to make child indexing stand out in the source code a bit more, which using [] would accomplish.

@dhardy There’s nothing smelly about a method that returns a Result (or an Option for that matter). In fact, if the operation can fail, it’s generally the sane thing to do I’d say. The only reason I see to use a panic is if there is nothing that could possibly be done at all to recover, no matter the circumstances e.g. when the host is out of memory. Well, that’s a fairly small set of circumstances. In pretty every other case using Result as a return type at least allows the caller to decide what to do. If that caller decides to inappropriately call .unwrap() on it, then the program panicks, yielding similar behavior to a forced panic. On the other hand, at least the caller has the option to do something with the error signal.

Also, FTR, personally I see .unwrap() as the thing to use when I’m prototyping something; proper error handling can come later, and the spots are easily grepped for. It’s never something that is meant to replace proper error handling, rather it’s a stall tactic to allow me to temporarily focus on then-more-important matters e.g. the actual program design.

@withoutboats Unfortunately that wouldn’t be enough, as the index method returns a borrow.

In what way does a[1] clarify the intent in comparison to a.get(1)? The intent of a[1] is “give me the 1st element, and panic if that doesn’t exist”, and the intent of a.get(1) is “give me the 1st element, or None if that doesn’t exist”.

3 Likes

The problem with this is that once it is known that your program has attempted an out-of-bounds index access, it is known that you have a flawed algorithm which means that just barreling on and continuing to make further computations is only likely to result in incorrect outcome because your algorithm made incorrect assumptions. That is why panic is preferred for out-of-bound index. You should know the bounds and your algorithm should only compute and use indexes within those bounds; otherwise, it indicates your algorithm is logically flawed and further computation is only likely to result in an incorrect answer.

1 Like

I don’t actually use get as I am not talking about slices or Vecs. Rather the method has a name like child_ref(IndexType), which still isn’t terrible (and does in fact work correctly). I just would like to use indexing syntax there. In general I’d like to be able to integrate DSLs for anything (indexing can be seen as a so-small-its-almost-trivial DSL), but that’s another and longer discussion.

As for the intents of [] vs get: sure, by definiton. All I’m saying is that now something like Index is opinionated, and it would be nice if it wasn’t, if only because I still have not seen even 1 actually good motivation for these definitions being what they are. I get that [] is intended to panick atm if it’s out of bounds; that’s what I have an issue with :slight_smile:

sorry this is what I meant:

trait Index<T> {
    type Output<'a>;
    fn index<'a>(&'a self, idx: T) -> Self::Output<'a>;
}

this can’t be expressed today, because associated types cannot be generic. All the existing impls would have type Output<'a> = &'a {current Output}.

3 Likes

That would be absolutely perfect. But even if GATs were valid today, existing traits methods won’t be changed, so having that bound to syntax would still require something new (unless it was a strict generalization I suppose, in which case the change might be defensible).

The motivation is that once it is known that your algorithm has attempted to use a non-existent index it indicates that your algorithm is logically flawed and continued computation is more than likely going to result in an incorrect computational result.

5 Likes

So the main problem in the name. If we had a name, we could do this (using Index2 as the standin for the name):

impl<I, T> Index2<T> for I where I: Index<T> {
     type Output<'a> = &'a <I as Index<T>>::Output;
}

And the behavior of [] would be controlled by Index2. Anything that implements Index works the same way, but now you can use the more flexible Index2.

Unfortunately, of course, we need a name, and there’s really no better name than std::ops::Index, so deprecating that and using a new trait would have a pretty serious downside.

Well the status quo of TryInto/Into and TryFrom/From leads me to nominate TryIndex.

However, how would the compiler know when to use which trait, if TryIndex is also bound to []? Perhaps default to TryIndex and fall back to Index only when absolutely necessary?

Since it also works for things like HashMap, wouldn’t std::ops::Access or something with similar semantics work?

Edit: Finding a good name for a Deref improvement might be harder though.

4 Likes