No sane way to generate [1., 1.5, 2., 2.5].iter().cloned()

std's range support is missing ranges with explicitly specified increments. There is Iterator::step_by but that is subtly different: it skips values (so cannot be used to iterate downwards). (0 .. -10).step_by(2) is empty and (0 .. -10).step_by(-2) is a compile error. None of this works with floating point numbers because they don't implement Step.

The easiest way to write this at the moment, that I found, is some construction involving Iterator::successors.

Thinking about how to fix this, I think we probably don't want to do anything funky syntax-wise (eg, Feature idea: impl Iterator for tuple (1,2..n) as range with step). We really just want to be able to write:

for v in (0. ..= 2.5).adding(0.5) { ... }
for v in (0 .. -10).adding(-1) { ... }
for v in (0. .. 10.1).adding(1./3.) { ... }

I think this is easily implemented by providing an adding method on Range, RangeFrom and RangeInclusive, and a corresponding RangeAdding struct. Something like:

struct RangeAdding<Idx> {
  next: Idx,
  step: Idx,
  bound: Bound<Idx>,
};
impl<Idx> Iterator for RangeAdding<Idx>
where Idx: Add<Rhs=Idx,Output=Idx> { ... }

The docs for Iterator::step_by ought to refer to adding and vice versa.

The docs for adding ought to mention the floating point rounding issue. This issue is not entirely straightforward and does present a hazard for naive use. But the algorithms for avoiding it typically involve substituting multiplication for addition, or explicitly choosing a bound in between expected values. If the user wants multiplication they can write (0 ..= 30).map(|x| x as f64 * (1./3.)) or something. I don't think we want to provide sugar for that. I also don't think that we should automatically adjust the offset; that would be too confusing,

We don't need .subtracting because either the type is signed and you can just write .adding(-whatever), or you can use .step_by and .rev.

1 Like

I think this – arguably weird – syntax would have to be compared to the alternatives of a simple function range_step(0., 2.5, 0.5) or a struct with public fields RangeStep { from: 0., to: 2.5, step_size: 0.5 }

I guess the step_by+rev approach works for any integer type.

Furthermore, ranges don’t support floats at all right now, i.e. 0.0..10.0 does not implement Iterator.

As you mentioned

which is all-the-more important when the main point of your proposal is just to add floating-point support for ranges, or, a range-like iterator for floating points with explicit step-size.


On prior art in other languages, I know Haskell supports floating point numbers in ranges (and it also supports specifying step size). The way that it handles those is that it considers the specified upper-bound plus half of the step size as the actual upper bound, which makes things like [0.0, 1.0/3.0 .. 1.0] work “as expected”, and it also uses the multiplication approach (they also explain why here). Of course that’s still a trade-off; the fact that [0, 3 .. 5] == [0, 3], but [0.0, 3.0 .. 5.0] == [0.0, 3.0, 6.0] is certainly somewhat confusing as-well.

Going back to Rust, there’s also the problem of DoubleEndedIterator support. With an iterator doing addition on each step, you can’t support DoubleEndedIterator unless you accept that the exact value of the iterator depend on whether you iterate forwards of backwards. In any case, the comparison to Haskell, and in particular the explanation given why mutliplication is used makes me thing that multiplication is the best approach, which – as you mentioned, too – can already be done (easily, explicitly, and without any confusion about imprecise bounds) in Rust with a map.

3 Likes

This behavior is almost always what I use when I'm trying to iterate over a range of floating point numbers, because I don't want to have to think about the gotcha of "sometimes it has one more or less element at the end than you expected".

Syntactic sugar is good for commonly used things with obvious meanings that you don't want the programmer to have to think about. To have a convenience method like this, where it looks like it "just does the thing", encourages the programmer to not worry about the floating-point rounding issue, regardless of what you write in the documentation. So I definitely don't think we should make a convenience method that exhibits the floating-point rounding issue in common cases.

7 Likes

Thanks to both of you for your helpful comments. Let's try to separate syntax from semantics. So

Semantics

I found your argument convincing.

That is interesting. It seems to me I should look at Python (which in many areas is the most "normal" kind of language, even though I don't like it much personally :slight_smile: ). There's numpy.linspace which takes min and max and count, and there's numpy.arange which has a health warning.

linspace has quite a nice API with no gotchas. I think having that in std would be worthwhile, given we have ranges for integer types.

The only problems with linspace is that it can't generate the half-unbounded iterator, and that it will be slower than repeated addition. The slowness is probably OK, especially if the API doesn't suggest that it's actually just addition.

The Haskell folks say it's faster to count in floats than in integers, but for API sanity the argument should be an integer so I think it has to

Syntax/API

I agree that the syntax I proposed looks unusual. But it has one massive advantage over 3-argument functions: there is no doubt about which of the three arguments to the operation are what.

I think this is a compelling reason to make this take a Range of some kind. And once you do that, istm that making it a method on the range is better than a free function.

num's range iterators

There are various range iterator functions in the num crate but they have some problems. There is no discussion of the floating point rounding issue. The range_step_by function takes 3 undistinguished arguments.

conclusion / revised suggestion

impl Range<f64> {
  // for v in (0. .. 2.5.).linspace(5) {
  fn linspace(count: u32) -> impl Iterator
    + DoubleEndedIterator + ExactSizeIterator ...;
}

impl RangeFrom {
  /// for v in (0. ..).adding(0.5) {
  fn adding(step: Idx) -> impl Iterator;
}

(I 'think Range::linspace can be generic over Idx with some trickery. We need to synthesize Step (which f64 shouldn't implement) without using stuff from num_traits, but we can do that if we have count: C where C: Default + Step, Idx: From<C>, knowing that for f64::From(Step::step(i32:default())) ought to be optimised to the constant 1.0f64.)

I agree that counting in integers is better. I actually just opened

because counting in floats for single precision floating point numbers (i.e. f32) has undesirable behavior IMO (it reaches its limits after just about 16 million elements!). The argument of performance, “The benchmark on T7954.hs shows that this approach leads to significant degeneration on performance (33% increase allocation and 300% increase on elapsed time).” also seems quite Haskell-specific to me, at least with regards to the “increase in allocations” (and even in Haskell, I’m not convinced it’s impossible to avoid this increase in allocations).

Hmm. You may well be right. I will do a trivial experiment.

Given that people generally want to be able to represent descending sequences with this, I don't think anything involving Range is the right answer as 10..0 is empty.

You might be interested in https://docs.rs/itertools-num/0.1.1/itertools_num/fn.linspace.html or https://docs.rs/ndarray/0.15.3/ndarray/struct.ArrayBase.html#method.linspace.

And if you're on nightly, you could use the new fNN::lerp to make a pretty simple helper like https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=0bb706f1e35e622abb1cd3e15bea005f

2 Likes

Thanks for your comments.

10..0 is only empty if you look at it as an iterator. What it actually is is a struct containing 10 and 0. With my suggestion (10. .. 0.).linspace(7) would work just fine and not be empty.

Thanks for these references. Interesting. I did look in ndarray but didn't find linspace because I was still looking for ranges. ndarray and itertools_num's linspaces can't be transposed into std because they need Float from num_traits.

Perhaps it is better to expect people to use one of these than to bodge up (in std) some set of trait bounds sufficient for linspace.

I also note that the API which takes two floats and an integer avoids, in most cases, the problem of argument confusion with three-argument float-generating iterators. The range has to be the two floats. I still think linspace ought to take a Range<F> rather than two separate parameters. But maybe that ship has sailed.

When searching for an answer to this problem I found questions on stackoverflow etc. but none of the answers mentioned these two crates. Are we allowed to reference external crates in the docs? I think discussing floating point iteration in the docs for Range as Iterator might be useful.

(10 .. 0) is unnecessary sugar. Its going to be empty because there's nothing to iterate. (IMO, this should be a syntax error if the lower and upper bounds are both known at compile time and it can be determined that lower bound >= upper bound.) If you want to iterate backwards, use rev(): (0..10).rev().

It's also empty if you look at it as an interval:

  • (1.0..9.0).contains(&3.14159) is true, but
  • (9.0..1.0).contains(&3.14159) is false.

Code: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=13713a00923b78886e88ba01e0714fea

7 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.