Enhancing Rust Range?

Have there been any attempts to address Rust range problems? Mostly things outlined in the blog Rust Ranges and Suffering. Namely:

  • Range isn't Copy
  • Range implements Iterator rather than IntoIterator
  • Maybe allow the to be reversible but not too sold of it.

On top of those issues, or maybe complementing (I'm not too sold on reverse iteration), is there a way to represent the end of a range as you can do in C# e.g.

var x = "Hello World!"
var world = x[^6..^1]
6 Likes

People tried working on this for the 2021 edition, but there wasn't a great way found to do the transition, so nothing ended up happening.

Is there a link to discussions why or why not?

I don’t know whether @scottmcm had some other discussions in mind, but this issue seems quite relevant:

The issue also contains a link to a relevant Zulip discussion

https://rust-lang.zulipchat.com/#narrow/stream/268952-edition-2021/topic/range.20reform

2 Likes

The range transition could definitely be done across an edition - I just didn't have the time to invest in making it happen as 2021 was being finalized.

I think this is the most up to date version of the proposal: Range Reform - HackMD

There's also some discussion in the zulip: https://rust-lang.zulipchat.com/#narrow/stream/268952-edition-2021/topic/range.20reform

3 Likes

I couldn't quickly find reference (wrote post before seeing sfackler's link), but AIUI, it's primarily that it's a not insignificant amount of change for fairly minor benefit. Combine with the fact that ops::Range is the "nice" name for the "wrong" type (with hindsight and a time machine, ops::Range should be the Copy + IntoIterator and iter::Range the Iterator), and it becomes difficult to justify the additional complexity.

That said, I think all std functions which accept ranges are generic (i.e. over RangeBounds or SliceIndex), so introducing new range types which are Copy + IntoIterator would be able to work with minimal breakage.

I think it's mostly a matter of someone dedicated enough to do some measurements to try to qualify what the benefits and costs are and put forward a full proposal. With such, I could see it potentially happening for edition2024, or at least a decision that it's not worth the churn.

(But if we do want to make this switch, the more editions we go without doing it, the more difficult it becomes to do. My own gut feeling is that if we're to do it edition2027 is the absolute latest we'd be somewhat likely to accept it.)

Edition-dependent path resolution existing for another reason would perhaps make it a bit easier (i.e. because ops::Range remains the primary "range" type) but this doesn't justify edition-dependent path resolution on its own. It's a pretty big jump, since so far we've been able to correctly say that editions don't change the library[1], only the language (and even then only slightly).


  1. Even the existing proposal for edition-sensitive name resolution skirts around making the library edition-dependent by a) not changing the library, changing how a language feature works (name resolution, and only when it would otherwise be an ambiguity error) and b) does so in the name of letting std pretend to be locked to your edition/msrv version despite being forcibly updated/tied with the compiler version. std is special because you can't upgrade the compiler without upgrading std versions, and that fact weakens the stability guarantee of always being able to upgrade the compiler a little bit. (And is why "minor breaking changes" are a class of change, and the upgrade promise is "with minimal and mechanical adjustments" and talks about a "fully disambiguated" dialect.) ↩︎

2 Likes

Still, it feels strange that such a small change requires that much work. My proposition changes the literal somewhat, which is also an even bigger change.

It’s a breaking change, and Rust takes backward compatibility seriously. Not only that, it’s a breaking change in the library and as of now those aren’t allowed even across editions. There should at least be an extended grace period during which using a range directly as an iterator will raise a deprecation warning, which could be changed to deny-by-default in the next edition, with a cargo fix available, and sometime after that the Iterator impl could perhaps be removed.

4 Likes

That's only true if ops::Range is changed but, as @CAD97 pointed out, we really need two types: One that implements Iterator and another that implements IntoIterator— The existing type is perfectly suited to the former of these roles.

One way forward, then, would be to define a new Copy + IntoIterator type (RangeConstructor maybe?) and change the desugaring of .. across an edition boundary to refer to the new type.

7 Likes

I think the concerns about Range misdesign are heavily overblown. It's a bit of a newbie roadbump, yes, but other than that I never really suffered from those limitations in real-world code. What are the libraries which would significantly benefit from a breaking change to ranges?

Note that the core purpose of Range is really just to index subslices and to iterate over a range of indices. It'a not really well-suited for dealing with arbitrary mathematical ranges, the API is severely lacking and hard to extend (since it is entirely in stdlib). Perhaps the first step forward could be to make .. and ..= operators into overloadable ones, so that libraries could provide their own domain-specific properly designed range types for their values. This doesn't help much with uN/iN integer types, though, but do they need such support?

What do you mean by that, when they're already "implemented" for all types?

fn make_range<T>(x: T, y: T) -> std::ops::Range<T> {
    x..y
}

In my mind, Range should just be a pair of values denoting lower and upper bounds. If the type is PartialOrd, they define a lattice by that order, which is the case with the current inherent methods contains and is_empty. Similarly, any other special behaviour is based on the properties of the type T.

I don't think it's inherently a huge problem that Range<T: Step> implements Iterator, but it is the primary reason to not impl Copy, and leads to RangeInclusive containing an extraneous bool that's only needed to support using it as an iterator, even when it doesn't impl Iterator because T: !Step.

Lack of Copy for Range feels off when all fields are pub so you can just copy each field:

    let r = 1..7;
    // this would move the range:
    // let c = r;
    // these make a copy:
    let c1 = r.start..r.end;
    let c2 = std::ops::Range { ..r };
2 Likes

What about this:

  • Create an entirely new type for a copyable range, maybe RangeValue or Interval or something else (and RangeInclusiveValue and so on). For concreteness let's call it RangeValue.

  • Make an alias type RangeIter = Range and make existing uses of the type Range deprecated. (this step is optional but if we go through all this trouble, just do it)

  • In a new Rust edition, say 202X, change the desugaring of a .. b to build a RangeValue instead. If you want a RangeIter, just write (a .. b).into_iter(). This doesn't affect iteration because for i in 1 .. 10 still does the same thing (the for desugaring inserts an implicit into_iter)

  • Add a blanked impl that says that if a type implements Index<RangeIter> it should automatically implement Index<RangeValue> too. This makes x[a .. b] continue to work on both Rust editions (even though in Rust 2021 a .. b has type RangeIter and in Rust 202X, a .. b has type RangeValue). This blanket impl must exist at the same compiler version where RangeValue is defined, to prevent people to write manual Index<RangeValue> impls.

  • Make cargo fix update your code with ranges by just calling into_iter, except in specific places where the tool would use heuristics to verify that you can use a bare range. In special, make it so that if you give it code that compiles in Rust 2021, it will spit Rust 202X code that still compiles. Also, if Range was deprecated, update it to RangeIter, too.

What do we gain from doing this? Mainly, the ability to store ranges inside a struct and still have this struct implement Copy. Currently, we need to define our own custom type for that, which isn't interoperable with the rest of the ecosystem.

9 Likes

That's roughly what I proposed in 2020, actually: https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/ridiculousfish.20Range.20feedback/near/211055146

(Though that post is just a sketch, so doesn't have full details. The new type should also implement SliceIndex in std::slice - Rust, for example.)

2 Likes

You may want to store a Range for slicing later. Think about how in Rust we frequently tell people “don't store a self-referential &T in your struct, store indices.” Think about the case where what needs to be stored for later is a range of indices with which to make an &[T]. Range<usize> is the natural type for this because it is for indexing subslices. Yes, you can store a pair of usizes as a tuple or your own struct, but why should we have to when Range is already there with well-defined inclusive/exclusive-bound semantics?

In my own application, I need to store ranges that represents sections of a GPU vertex buffer[1] that was made from a Rust vector and then later needs to be drawn via GPU command. wgpu rightly uses the Range type in the parameters to the draw_*() commands, because it has clear and useful syntax, but those Ranges can't always be conveniently stored even though they fit in all other aspects. (In my case, the lack of Copy is not an impediment to using it anyway, but if ranges were Copy, some code would be simpler.)

In general,

  • “A range of indices” is a concept that is useful for more than just slicing.
  • When it is for slicing, the slicing won't necessarily happen immediately after the range is calculated.
  • std's job is to provide common vocabulary types, and it's bothersome that this particular vocabulary is not as commonly usable as it ought to be.

These are the reasons why I would like to see the .. and ..= syntaxes start producing a type that is Copy.


  1. for graphics people: actually an index buffer, but let's not use the word "index" again here ↩︎

18 Likes

Agreed. It's definitely not a newbie problem/bump. The fact that Range isn't Copy is why this type exists for example: Span in regex_automata - Rust

(I am not necessarily trying to say that my specific use case justifies jumping through hoops to fix the Range problem, but just making a narrow rebuttal against the idea that it's specific to newbies.)

14 Likes

As another example, Use RangeInclusive in SpanData instead of lo/hi · Issue #534 · rust-lang/compiler-team · GitHub was rejected (in part) because RangeInclusive is a bad choice for use in data structures because of this problem.

7 Likes

Here is another can of worms related to Range/PartialOrd, which I found surprising when attempting to treat ranges as an interval... There have been a couple of times where I would have used range if not for it's PartialOrd implementing a total LexicalOrd, and instead implemented an Interval type instead.

Seems difficult to fix given that it more relates to the PartialOrd bounds on various std collections. but seemed worth mentioning.

Whoa! I've been toying with the idea of writing a RFC for that, no idea that this had been already proposed!

So.. anyway, we could have a possibly automated transition from old-style ranges to new-style ranges. And again, we have a Rust edition around the corner (it will be 2024, right?).

Why not do it this time?

1 Like

Maybe ranges should just have a method to implement interval order.

1 Like

I think that'd probably be sufficient? At least thinking back I believe that I always knew a concrete range/inveral types, rather than relying on PartialOrd bounds. At the very least it'd be a nice to have it documentation wise.