On ranges and bounds

derekdreery · April 6, 2020, 7:47pm

I'm currently writing a parser (for CSS as it happens), and in CSS there exists the concept of an integer. It's used in a number of places (pun intended), and in some of those places not all integer values are valid. So, I want to create a function that can optionally take some range and return a different result if the parsed integer is out of that range (an error). So my function looks something like

fn parse_int(input: InputType, range: Option<impl RangeBounds>) -> Result<ParsedInt>;

Unfortunately, this ends up not being practical, since the compiler can't know how to monomorphize when None is passed: we haven't given any information about the type of the Option.

Oh well, I thought, since I don't need super cache locality etc., I can just use a trait object. Sadly this doesn't work either, because the RangeBounds trait has a function with a type parameter (contains), which isn't supported by rust's trait object implementation, so I'm a bit stuffed.

At this point I started looking at the details of how ranges are implemented in rust, and I noticed a few things. For starters, start and end bounds are treated differently: end bounds can be open, closed, or unbounded, but start bounds can only be closed or unbounded. Then I noticed that a new Bound enum had been added in 1.17, and I realised that this type was the key to a good range implementation.

So the reason I'm posting this here is that I now have an idea of how I would make a range type in rust if I were unrestrained by backwards compatibility. It would look like this.

enum Bound<T> {
    Closed<T>,
    Open<T>,
    Unbounded,
}

struct Range<T> {
    start: Bound<T>,
    end: Bound<T>
}

These two compound types cover all permutations of start and end bounds, and all built-in syntax could desugar to an instance of Range. This is mentally must less complex than the current arrangement, my only concern would be that it was less efficient, as currently there is no need for branching when evaluating whether a value is contained in something like core::ops::Range.

Of course it's not possible to change what is already in std/core. I thought I would post this to see if others agreed on the general idea, and maybe to spark some other ideas about changes that could be made in a backwards-compatible way. For example, it would be possible to add the Range type (under some other name) so that code outside of libcore/libstd can take a fully generic range argument and it be compatible with code in other libraries.

hanna-kruppe · April 6, 2020, 8:13pm

I believe your original problem can be addressed by just not having the Option wrapper: since RangeFull (the type of ..) implements RangeBounds<T> for all T, you can use it to represent "no range restriction".

As for a unified Range(Bound, Bound) type: the largest problem I see is performance. Different types for different kinds of ranges allow specialized code to be generated that performs less work when one or both bounds are statically trivial. It also allows specialized representations. For example, to make iteration optimizer-friendly, Range and RangeInclusive represent their iteration state differently. I don't think we could do that (quite as well) if all range types were collapsed into one.

mbrubeck · April 6, 2020, 8:13pm

libstd currently has ops::Bound which is equivalent to yours, and the RangeBounds trait which is implemented for (Bound<T>, Bound<T>) pairs.

steffahn · April 6, 2020, 9:09pm

Ignoring that this particular use case may have different and better alternatives, I’d think there is room to improve the experience with unconstrained or under-constrained type variables, especially in cases where there is actually no values of that type involved.

Here’s some code that compiles today:

#![allow(dead_code)]
#![allow(unused_must_use)]

use std::ops::RangeBounds;

struct InputType;
struct ParsedInt;

// ‘Never’ ≙ ‘!’
pub enum Never {}

fn parse_int(
    _input: InputType,
    _range: Option<impl RangeBounds<i32>>,
) -> Result<ParsedInt, ()> {
    unimplemented!()
}

fn foo() {
    parse_int(InputType, Some(1..3));
    parse_int(InputType, None::<Never>); // parse_int(InputType, None::<!>);
}

// should be generated for ‘!’
impl<T: ?Sized> RangeBounds<T> for Never {
    fn start_bound(&self) -> std::collections::Bound<&T> {
        match *self {}
    }
    fn end_bound(&self) -> std::collections::Bound<&T> {
        match *self {}
    }
}

Imagine a ! instead of Never, then I believe that this trait instance at the bottom should exist automatically (and I think I remember seeing some proposal somewhere about this idea).

Then furthermore, in my opinion, it is not unreasonable to expect this code to work with just None in place of what currently is None::<Never> above and what would be None::<!>. If a type is under-constrained enough that even ! is a valid option, I don’t really see a reason why it wouldn’t usually be the best option.

bill_myers · April 6, 2020, 9:44pm

You can just explicitly specify the type for Option::None, and create a type alias for that.

derekdreery · April 7, 2020, 10:03am

Thanks for your suggestions, I was conscious I didn't want this post to be Q&A, but I do appreciate the answers all the same.

system · July 6, 2020, 10:03am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Range syntax is confusing language design	23	5844	November 2, 2019
Inclusive ranges with RangeFrom (making 0.. work like 0..=255 for u8) language design	6	1649	March 25, 2019
Pre-RFC enum from integer language design	24	13803	March 25, 2019
A tale of two's complement	62	24719	March 25, 2019
Make 'index out of bounds' a compile-time error language design	26	8920	March 25, 2019

On ranges and bounds

Related topics