More on ranged integers


#1

Beside ranged integrals Ada has ranged floating point value with optional precision, modular integers, and other built-in things. I think the most useful types are the integers with a range.

Ada gives run-time errors if you go past the range of such specified integers, I’ve found the detailes semantics explained at page 26 here:

https://hal.inria.fr/hal-01238376/document

Signed integer types are usually defined using a range, e.g.:

type Int_10 is range 1 .. 10;

Here, Int_10 represents the set of integers between 1 and 10. Signed integer types have a base type which is another signed integer type. Typically, it is a type corresponding to the smallest possible machine words in which all the value of the range (in the classical 2-complement representation) fit. The choice of the base type is compiler-dependent. For example, the base type of Int_10 will most probably be the type of all signed 8-bits words, ranging from -2^7 to 2^7-1. The semantics of Ada specifies that each single operation on a signed type will raise an exception whenever the result overflows its base type. Additionally, when the result of such a computation is stored back into a variable or given as input to a function call, it is checked that the type’s bounds, here 1 and 10, are not exceeded. Example:

A : Int_10 := 10;

B : Int_10 := (A + 1) - 1; -- No overflow here

C : Int_10 := (A + 127) - 127; -- Exception: overflow in the first addition

D : Int_10 := A + 1; -- Exception: type range exceeded

If Rust gains ranged integers they could have the same semantics above of Ada, or something stricter.


In programming one way to teach the usefulness of something is to show lots and lots of usage examples. To see some usages of ranged values in Ada I’ve taken a look at many small programs here:

http://rosettacode.org/wiki/Category:Ada

Those tiny programs aren’t the best if you look for the usefulness of ranged values because I think ranged values usefulness shines better in larger programs. But that site is quite handy, and the comparisons with other languages allow me to better understand the Ada code, and compare it with solutions in other languages that lack ranged values.

Below I have extracted several usage examples:


15_Puzzle_Game:

subtype Row_Type is Positive range 1 .. Rows;
subtype Col_Type is Positive range 1 .. Cols;

This allows to not confuse the row with the column types when accessing the game table, and it has other adantages.


Tic-tac-toe:

   type The_Range is range 1 .. 3;
   type Board_Type is array (The_Range, The_Range) of Character;

      for I in The_Range loop

         X,Y: The_Range;
      begin
         while not Found loop
            X := Rand.Random(Gen);
            Y := Rand.Random(Gen);

The Rand.Random generates a coordinate in the right range.


Sudoku:

type sudoku_ar_t is array (integer range 0.. 80) of integer range 0 .. 9;


Sierpinski_triangle:

   subtype Practical_Order is Unsigned_32 range 0..4;

   procedure Sierpinski (N : Practical_Order) is

   for N in Practical_Order loop

Set_puzzle:

   type Three is range 1 .. 3;
   type Card is array(1 .. 4) of Three;

Self-referential_sequence:

   subtype Seed is Natural range 0 .. 1_000_000;
   subtype Num is Natural range 0 .. 10;
   type NumList is array (0 .. 10) of Num;

   function Init (innum : Seed) return NumList is
      list : NumList := (others => 0);
      number : Seed := innum;  d : Num;
         d := Num (number mod 10);
         list (d) :=  list (d) + 1;

   for i in Seed'Range loop

Self-describing_numbers:

   subtype Desc_Int is Long_Integer range 0 .. 10**10-1;

   function isDesc (innum : Desc_Int) return Boolean is
      subtype S_Int is Natural range 0 .. 10;
      type S_Int_Arr is array (0 .. 9) of S_Int;
      ref, cnt : S_Int_Arr := (others => 0);
         digit := S_Int (num mod 10);

Permutations:

   subtype Element is Positive range 1 .. N;
   type Permutation is array(Element) of Element;

   procedure Set_To_First(P: out Permutation; Is_Last: out Boolean) is
      for I in P'Range loop

Perfect_shuffle:

subtype index is Natural range 0..2 * half_size - 1;
subtype index_that_move is index range index'first+1..index'last-1;
type deck is array (index) of index;

Pascal’s_triangle:

   function First_Row(Max_Length: Positive) return Row is
      R: Row(0 .. Max_Length) := (0 | 1 => 1, others => 0);

Move-to-front_algorithm:

   subtype Lower_Case is Character range 'a' .. 'z';
   subtype Index is Integer range 0 .. 25;
   type Table is array (Index) of Lower_Case;

Monty_Hall_problem:

   type Door_Index is range 1..3;
   package Random_Prize is new Ada.Numerics.Discrete_Random(Door_Index);

      Chosen : Door_Index := Random(Seed);
      Monty : Door_Index;

Miller-Rabin_primality_test:

  subtype Number_Range is Number range 2 .. N - 1;
  package Random is new Ada.Numerics.Discrete_Random (Number_Range);

Maze_solving:

   X_Size: constant Natural := 45;
   Y_Size: constant Natural := 17;

   subtype X_Range is Natural range 1 .. X_Size;
   subtype Y_Range is Natural range 1 .. Y_Size;

   East:  constant X_Range := 2;
   South: constant Y_Range := 1;

Maze_generation:

generic
   Height : Positive;
   Width : Positive;

   subtype Height_Type is Positive range 1 .. Height;
   subtype Width_Type is Positive range 1 .. Width;

   type Maze_Grid is array (Height_Type, Width_Type) of Cells;

     (Maze   : in out Maze_Grid;
      Row    : Height_Type;
      Column : Width_Type)

            if Column > Width_Type'First then
            if Row < Height_Type'Last then

Matrix_transposition:

  type Fixed is delta 0.01 range -500.0..500.0;

        Put (Fixed'Image (Fixed (X (I, J))));

LZW_compression:

   MAX_CODE : constant := 4095;
   type Codes is new Natural range 0 .. MAX_CODE;
   type Compressed_Data is array (Positive range <>) of Codes;

   function Compress (Cleartext : in String) return Compressed_Data is
      Result : Compressed_Data (1 .. Cleartext'Length);

      type Code_To_String is array (Codes) of UStrings.Unbounded_String;

Luhn_test_of_credit_card_numbers:

  function Luhn_Test (Number: String) return Boolean is
    Digit: Natural range 0 .. 9;

        Sum := Sum + (Digit*2 mod 10) + (Digit / 5);

Langton’s_ant:

   Size: constant Positive := 100; -- change this to extend the playground
   subtype Step is Integer range -1 .. +1;

   procedure Right(N, W: in out Step) is
      Tmp: Step := W;
   begin
      W := - N;
      N := Tmp;
   end Right;

Kaprekar_numbers:

   type Int is mod 2 ** 64;
   subtype Base_Number is Int range 2 .. 36;

   From_Digit : constant array (Character) of Int :=
     ('0'    => 0,
      '1'    => 1,
      '2'    => 2,
      '3'    => 3,
      '4'    => 4,
      '5'    => 5,
      '6'    => 6,
      '7'    => 7,
      '8'    => 8,
      '9'    => 9,
      'a'    => 10,
      'b'    => 11,
      'c'    => 12,
      'd'    => 13,
      'e'    => 14,
      'f'    => 15,
      others => 0);

   function To_String (Item : Int; Base : Base_Number := 10) return String is

    function Is_Kaprekar (N : Int; Base : Base_Number := 10) return Boolean is

      if Is_Kaprekar (I, 17) then

Knight’s_tour:

generic
   Size: Integer;

   subtype Index is Integer range 1 .. Size;
   type Tour is array  (Index, Index) of Natural;

      procedure Visit(X, Y: Index; Move_Number: Positive; Found: out Boolean) is

Guess_the_number:

   subtype Number is Integer range 1 .. 10;
   package Number_IO is new Ada.Text_IO.Integer_IO (Number);
   package Number_RNG is new Ada.Numerics.Discrete_Random (Number);
   Generator  : Number_RNG.Generator;
   My_Number  : Number;
   Your_Guess : Number;

   Number_RNG.Reset (Generator);
   My_Number := Number_RNG.Random (Generator);

Flipping_bits_game:

   subtype Letter is Character range 'a' .. 'z';

   type Matrix is array
     (Letter range 'a' .. Last_Col, Positive range 1 .. Last_Row) of Boolean;

   function Rand_Mat return Matrix is
      M: Matrix;

    for I in M'Range(1) loop
     for J in M'Range(2) loop
        M(I,J) := Boolean_Rand.Random(Gen);

Evolutionary_algorithm:

   subtype DNA_Char is Character range '@' .. 'Z';
   
   Target : constant DNA_String := "[email protected]@[email protected]@[email protected]";

   function Fitness (DNA : DNA_String) return Natural is

   package Random_Char is new Ada.Numerics.Discrete_Random (DNA_Char);
   DNA_Generator : Random_Char.Generator;

Dutch_national_flag_problem:

   type Colour_Type is (Red, White, Blue);

      package Random_Colour is new Ada.Numerics.Discrete_Random(Colour_Type);
      Gen: Random_Colour.Generator;

            B(I) := Random_Colour.Random(Gen);

Constrained_random_points_on_a_circle:

   subtype Coordinate is Integer range -15 .. 15;
   type Point is record
      X, Y : Coordinate;
   end record;
   type Point_List is array (Positive range <>) of Point;

   function Acceptable (Position : Point) return Boolean is
      Squared_Sum : Natural := Position.X ** 2 + Position.Y ** 2;
   begin
      return 10 ** 2 <= Squared_Sum and Squared_Sum <= 15 ** 2;
   end Acceptable;

Caesar_cipher:

   type M26 is mod 26;

   function To_M26(C: Character; Offset: Character) return M26 is
   begin
      return M26(Character'Pos(C)-Character'Pos(Offset));
   end To_M26;

   Key: M26 := 3;

So the correctness of a ranged integers get verified statically or at run-time when it is created. Then you can put it in a data structure or pass around as function argument, and when it’s used as index for a fixed-size array (or a slice that LLVM statically knows its length), it doesn’t need further bound tests.

If you generate a random integer number you don’t need to state the range again if you need a value inside its all range.

Functions that take an integer argument that must be in a certain range don’t need to verify the value of an input if it’s a ranged integer, so the blame of wrong data is pushes outside the function, to the caller, and closer to where the value is created, this often allows to find bugs (or input mistakes) faster/sooner.

Ranged integers work well with fixed size arrays, nudging Rust programmers to use heap less and arrays more, giving often some performance improvement.

Ranged values are also runnable documentation, they allow to write down in the code some specifications in a more clear way.

Ranged values allow to define array and string literals in a more precise and more correct way (enforcing they contain only a range of Ascii bytes or integer values).

Well designed ranged values must work well with %, its result should be convertible in a reliable way to a different ranged integer with nonfailing into()/from() conversions instead of failing try_from()/try_into() ones. Generally non-failing conversions between subsets should be allowed.

Generally ranged integers allow you to create clockwork code that works more like gears and bolts that fit into each other in a reliable way. Ranged integers could be useful in both high level and embedded low-level C-like code. Ranged integers add static constraints so they aren’t the best for the kind of code where you want high run-time flexibility.

To be useful ranged integers need to be integrated as much as possible with the other parts of the language.

A way to define a ranged type:

type Ten = u32[0 .. 10];
type Natural = usize[1 ..];

To define a subset of a range:

type B = Ten[.. 5]; // Subset of Ten.

Some helpers:

Ten::range() ==> (0 .. 10)
Ten::MIN ==> 0
Ten::MAX ==> 9

Example usage:

for i in Ten::range() {}

Some ways to define a ranged value:

let x: Ten = 10;
let x = 10: Ten; 
let x = Ten(10);

Ranged integers play well with exhaustive_integer_patterns ( https://github.com/rust-lang/rust/issues/50907 ):

type C = u32[.. 5];
let x1: C = 3;
match x1 {
    C::MIN => {},
    1 => {},
    2 .. 4 => {},
}

Random number in a range:

let y1: Ten = rand::random();
let y2 = rng.gen::<Ten>();

Explicit fail-less conversions:

type Ten = u32[.. 10];
let x: Ten = 9;
let z1: u8 = x.into();
type B = Ten[.. 5];
let x2: B = 3;
let x3: Ten = x2.into();
let x4: u64 = 21_500;
let x5: Ten = (x4 % 10).into();

Similar things should cut down the number of hard “as” casts in Rust code, improving reliability.

Conversions that can fail:

let x4: u64 = 21_500;
let xy: Ten = x4.try_into().unwrap(); // Will panic.

Array indexing:

type Five = u8[.. 5];

fn foo(a: [Five; 3], i: usize[.. 3]) -> Five {
    a[i] // No array bound test here.
}

Here I’ve defined the range of the argument i in-place. If that structural typing of ranged integers is not desired then you define the type before:

type Five = u8[.. 5];
type Ri = usize[.. 3];

fn foo(a: [Five; 3], i: Ri) -> Five {
    a[i] // No array bound test here.
}

In future ranged integers could play well with packed representations:

type Nibble = u8[.. 16];
// This takes 50_000 bytes.
#[repr(packed_bits)]
let mut factors = [0: Nibble; 100_000];

#2

if it only needs runtime enforcement this is easy to implement with const generics: struct RangedUsize<const MIN: usize, const MAX: usize>(usize); with asserts in all the math ops


#3

What I’m curious about is… is this the same story with arithmetic? I imagine it would be common enough, and if you have to widen or potentially panic with every operation, it starts to look less efficient and/or useful. What would the rules be with respect to overflow? (And would they be compatible with @withoutboats’ idea?)

The semantics of Ada specifies that each single operation on a signed type will raise an exception whenever the result overflows its base type. Additionally, when the result of such a computation is stored back into a variable or given as input to a function call, it is checked that the type’s bounds, here 1 and 10, are not exceeded.


#4

I think some of the usages shown above can’t be done well with a library-defined type. I think ranged integers to be useful need to be integrated as much as possible with all the rest of the language.