Any RFC for Units of Measure?

If we are talking about custom literals, it will be up to the crate that defines inch. I don't think it's reasonable to require that custom literals must always be transparent; e.g. I want 1minute to be converted to Duration::new(60, 0) and not to some Minute(1) type.

Yes, in this particular case you'll need to use an explicit i32::from, but the main idea still stands.

You assume that we always store unit values as floats of the same size, but as written above, we may want to store the same unit (Meter) in different "storages": u32, i64, f32, f64, bigints, etc.

Those two numbers have the same type (either f32 or f64), but km and nm have different types, and I don't think that's fine. I can be convinced otherwise, but avoiding these types of bugs is kind of the point of strongly typing units of measure. If someone wants this behavior, they can store both units as m. @newpavlov mentions that people might want to use different repr types here for storing the values, but I think we won't want this even if both are stored using the same repr type.
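To illustrate the "store both units as m" suggestion, here is a minimal sketch; the Meters newtype and its constructors are hypothetical, not from any existing crate:

```rust
// Hypothetical sketch: store every length in a single base unit (metres)
// so that values that originated as km and nm share one type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

impl Meters {
    fn from_km(km: f64) -> Self { Meters(km * 1_000.0) }
    fn from_nm(nm: f64) -> Self { Meters(nm * 1e-9) }
}

impl std::ops::Add for Meters {
    type Output = Meters;
    fn add(self, rhs: Meters) -> Meters { Meters(self.0 + rhs.0) }
}

fn main() {
    let a = Meters::from_km(1.0);
    let b = Meters::from_nm(500.0);
    // Both values are plain Meters now; adding them cannot silently mix scales.
    let total = a + b;
    println!("{:?}", total);
}
```

The conversion happens once, at the constructor boundary, so the rest of the program only ever sees one length type.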

I prefer C++'s ratio method for this: a compile-time rational number

I prefer this as well. We have discussed using something like this in some future version of the chrono crate to deal with time, but without const generics this is really hard to do. It would be great to have a good compile-time rational number library that's as easy and ergonomic to use as C++ <ratio>. That would be a widely useful thing. Until we get const generics, it's hard to tell how feasible this is.
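For a rough idea of what a `<ratio>`-style compile-time rational could look like once const generics are available, here is a sketch; all names (Ratio, RatioValue, Kilo, Milli) are made up for illustration:

```rust
// Sketch of a C++-std::ratio-like marker type carrying a rational
// number in its type parameters. Hypothetical API, not an existing crate.
struct Ratio<const NUM: i64, const DEN: i64>;

trait RatioValue {
    const NUM: i64;
    const DEN: i64;
    // Conversion factor as a float, derived from the type-level rational.
    fn factor() -> f64 {
        Self::NUM as f64 / Self::DEN as f64
    }
}

impl<const NUM: i64, const DEN: i64> RatioValue for Ratio<NUM, DEN> {
    const NUM: i64 = NUM;
    const DEN: i64 = DEN;
}

type Milli = Ratio<1, 1000>;
type Kilo = Ratio<1000, 1>;

fn main() {
    // 5 km expressed in metres via the compile-time ratio.
    let metres = 5.0 * Kilo::factor();
    println!("{}", metres);
}
```

Type-level arithmetic on the ratios (e.g. multiplying two Ratio types into a reduced third one) is the part that still needs more const-generics machinery than this sketch uses.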

I don't have a solution for this, just a few thoughts.

A lot of these units have domain-specific definitions and rules. For example: currencies have non-constant exchange factors, rounding rules, and requirements for the underlying data store; units of time beyond the SI-derived ones can have very complicated and changing rules (think of leap seconds). It may be possible to integrate all of this into one system, but my intuition is that it's more trouble than it's worth.

SI plus some tools to convert other measurements to SI gives a lot of bang for the buck. If it's OK for all of physics, maybe it's enough for us too.

If I’m not misunderstanding something, this approach would make every expansion of the set of base dimensions (for any “unit” type) a breaking change, without a language feature like variadic generics, which seems like a show-stopper.

Not if we use a HashMap for Unit:

struct Unit {
    dims: HashMap<u64, i32>,
}

// Assumes some const-evaluable `Hasher` and a `hash_map!` const
// constructor; both are hypothetical here.
const fn make_key(module_path: &str, name: &str) -> u64 {
    let mut hasher = Hasher::new();
    hasher.input(module_path);
    hasher.input(name);
    hasher.output()
}

macro_rules! define_unit {
    ($name:ident) => {
        const $name: Unit = Unit {
            dims: hash_map! { make_key(module_path!(), stringify!($name)) => 1 },
        };
    };
}

struct Measure<const U: Unit> {
    val: f64,
}

Then anyone can create and use their own units.

define_unit!(FUNKY);
define_unit!(CHUNKY);

let x: Measure<FUNKY> = ...;
let y: Measure<CHUNKY> = ...;
let z: Measure<CHUNKY * FUNKY> = x * y;

Of course the “true” way to represent units would be for Rust to have existential consts and the ability to reason algebraically at compile time, e.g.

existential const METRE: f64;

let mean: f64 = 2 * METRE;
let std_dev: f64 = 9 * METRE;
let sample: f64 = rand::distributions::Normal(mean, std_dev);
println!("sampled {} metres", sample / METRE);

This assumes the compiler could check that every usage of METRE eventually cancels out, so it never has to be given a specific value. I can’t imagine Rust ever supporting this, but it’s sort of doable in languages like Idris.

One thought orthogonal to the current discussion:

If it's (currently) not possible to create a sufficient compile-time system, what about a runtime system with checks disabled in release builds?

It's a little defeatist, but at least it would make it possible to hash out the ergonomics and guide the language design toward making it possible at compile time. Also, the error (or in that case, panic) messages could be a lot nicer.

I don't think you can really talk about truth in this sort of situation, but for the actual use-case of units of measure in a type system — dimensional analysis — this approach doesn't make sense. If you were programming in a dependently-typed language, you would likely use quotient types or setoids (in a manner similar to the canonicalisation).

This would be possible to do entirely in a library (though it does seem like it loses most of the benefits if you only check at run-time), so I don't think we'd want to consider a run-time solution for Rust itself.

Sorry for being unclear, I was not proposing a built-in solution for Rust. And I don't even think such a library would be a real solution to the problem, just a vehicle to explore the ergonomics.

I'm getting a little lost in the weeds on this topic, so I'm trying to summarize and systematize the current state.

So these are the options discussed so far:

Exponents of a fixed set of base units

The type of the unit is a list of type-level exponents over a fixed set of base units. Say, all SI units.

What we need:

  • type-level integers. Since exponents tend to be small, we might get away without them.

Pros:

  • The most realistic compile-time solution with current or near-future Rust.
  • Simple to implement.

Cons:

  • Limited. Unit systems are closed: users can't extend them at all, and the library author can only do so with a breaking change.
  • Type errors border on unacceptable. For example, m/s^2 would be something like
SI {
    meter: 1,
    second: -2,
    kelvin: 0,
    kg: 0,
    mol: 0,
    candela: 0,
    ampere: 0
}

and that is with native type-level integers.
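For illustration, the bookkeeping such a type-level encoding performs can be sketched with plain runtime values (restricted to metre and second to keep it short; all names are made up):

```rust
// Runtime sketch of the "exponents of a fixed set of base units" idea:
// the same bookkeeping the type system would do, done here with values.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Si { meter: i32, second: i32 }

#[derive(Debug)]
struct Quantity { val: f64, unit: Si }

impl std::ops::Mul for Quantity {
    type Output = Quantity;
    fn mul(self, rhs: Quantity) -> Quantity {
        Quantity {
            val: self.val * rhs.val,
            // Multiplying quantities adds the exponents of each base unit.
            unit: Si {
                meter: self.unit.meter + rhs.unit.meter,
                second: self.unit.second + rhs.unit.second,
            },
        }
    }
}

fn main() {
    let v = Quantity { val: 3.0, unit: Si { meter: 1, second: -1 } }; // m/s
    let t = Quantity { val: 2.0, unit: Si { meter: 0, second: 1 } };  // s
    let d = v * t;
    // velocity * time = distance, i.e. the seconds cancel out
    assert_eq!(d.unit, Si { meter: 1, second: 0 });
}
```

The type-level version would move the `Si` struct into const generic parameters so this addition of exponents happens during type checking rather than at runtime.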

Variadic set of base units with type-level exponents, or a multiset of base types for numerator/denominator

The design space for such a solution is quite wide, so I won't try to summarize it in one sentence. It's more or less a stand-in for the type theorist's wet dream solution.

What we need:

  • a much more powerful type system. Besides type-level integers, we need either type-level sets/multisets, or variadic lists and unification.
  • probably custom type error messages.

Pros:

  • actually what most of us want

Cons:

  • I won’t hold my breath for the required features actually landing in Rust in the foreseeable future.

Compiler magic :rainbow:

Aka the F# solution: bake unit unification into type checking as a special case.

What we need:

  • nothing in particular, since it would be tailored to our needs

Pros:

  • most potential in terms of ergonomics

Cons:

  • High cost in terms of compiler maintenance and language complexity
  • There is (rightly so) quite a bit of resistance against compiler magic
  • Bikeshedding potential is high, so even if the community went along with compiler magic, it’d be hard to iterate on a solution outside the typical “external library goes to nursery goes to std lib” path, since there are so many design parameters

Library solution with runtime checking

Aka giving up. :wink: There is as much design space to explore as with the compile-time solutions, but since it has played little part in this discussion, I'll summarize it as one point.

Pros:

  • Actually possible with current rust
  • There is a lot of precedent in other languages

Cons:

  • Not zero overhead
  • Even if we find that most use cases would be fine with the overhead, it goes very much against the spirit of Rust.

I don’t think this would be a good solution overall, but there are two reasons why I don’t want to discard it completely:

  1. there are actually legitimate use cases for runtime unit checking. If we build a library that can do compile-time checking, maybe it should also be possible to defer checks to runtime
  2. it could be a good way to hash out the ergonomics.

Other stuff

Speaking of ergonomics: IMO that topic should play a bigger part in this discussion. Anything that makes units harder to use will result in people not using them. Also, our minimum bar for type errors should be something like “expected unit of measure m/s, but found kg”, not a printout of the Necronomicon like we get with type trickery à la typenum.


All we would need is const generics, which should land within the next year (or maybe sooner if we all cheer @varkor on). Ideally we would also want the ability for const functions to allocate, but until then we could just have a hard limit on the number of units people can define and make the hash map (or whatever) fixed-size and unboxed.

I would like to propose a solution that has not been discussed yet. The basis for it is this paper about a runtime units-of-measure library for Common Lisp: https://3e8.org/pub/scheme/doc/lisp-pointers/v5i2/p21-cunis.pdf

The paper is quite short and light on detail, but I like the idea very much.

TL;DR: Associate each base unit with a unique prime number and store a compound unit as a rational number. For example:

m ~ 2; s ~ 3
~>
m/s^2 ~ 2/9

Prime factorisation of the numerator and denominator yields the sorted set of base units. Equal compound units have equal rational representations as long as they are reduced. So multiplication of measures only requires multiplication and reduction of the rationals; addition only requires a test for equality.

New units can be defined in a backward compatible way, but the associated prime number has to be unique.
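A minimal runtime sketch of the prime encoding (the unit-to-prime assignments and all names here are illustrative):

```rust
// Each base unit gets a unique prime; a compound unit is a reduced
// rational (numerator, denominator) over those primes.
fn gcd(a: u64, b: u64) -> u64 {
    if b == 0 { a } else { gcd(b, a % b) }
}

#[derive(Debug, PartialEq, Clone, Copy)]
struct Unit { num: u64, den: u64 }

impl Unit {
    // Always store the reduced form so equal units compare equal.
    fn new(num: u64, den: u64) -> Unit {
        let g = gcd(num, den);
        Unit { num: num / g, den: den / g }
    }
    // Multiplying units multiplies the rationals, then reduces.
    fn mul(self, other: Unit) -> Unit {
        Unit::new(self.num * other.num, self.den * other.den)
    }
}

fn main() {
    const METER: Unit = Unit { num: 2, den: 1 };  // m ~ 2
    const SECOND: Unit = Unit { num: 3, den: 1 }; // s ~ 3
    // m/s^2 ~ 2/9
    let accel = Unit::new(METER.num, SECOND.num * SECOND.num);
    assert_eq!(accel, Unit { num: 2, den: 9 });
    // Multiplying m/s^2 by s^2 cancels back to metres.
    let s_squared = Unit::new(9, 1);
    assert_eq!(accel.mul(s_squared), METER);
}
```

Reduction via gcd is what makes cancellation automatic: the seconds literally divide out of the rational.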

With this as the underpinning idea, I'd propose a library that provides a DynamicMeasure and a StaticMeasure type (all names are just stand-ins). StaticMeasure checks at compile time and can be converted into a dynamic measure. Dynamic measures can be created at runtime and check at runtime. So it would even be possible to check user input for correctness, which is by definition impossible with compile-time checking.

The representation is quite memory efficient. It's hard to give a range for the numerator and denominator that is sufficient in practice, since additional units use greater primes and the compound numbers grow fast, but what I found while experimenting is that normal physical compounds fit into a u32, barely. u64 should be fine for everything not extremely weird, and u128 should be on the safe side. So that would mean 16-32 bytes of runtime overhead per number, which should be much more acceptable than a whole hashmap.

const generics could use the same representation, which would make conversion easier, and the fixed-size representation would mean no boxing.

Having compile-time safety with an escape hatch to defer to the runtime when necessary seems quite rusty to me.

The runtime part could be written today, the compile-time part as soon as const generics land.

Open questions:

  • How to enforce prime uniqueness
  • Something like a Display for types would be needed to show the real unit of measure in compile errors

Author of uom here. I don’t really think that an RFC is necessary. Based on the current functionality in dimensioned and uom, Rust already provides what is needed for zero-cost unit libraries. Const generics will be a major boon, but typenum provides the same functionality right now with zero run-time cost.

Neither library has hit 1.0 so I invite everyone who is interested to review them in more detail and consider contributing. Many of the ideas discussed in this thread have already been explored and implemented in these libraries.


One thing I noticed with the uom crate is that it seems to display the same faulty decimal calculation behavior as floats (f32, f64) do, which is unnecessary, I think. Have you ever looked at the Decimal crate? It has lower performance than pure floats, but it can be used without rounding errors causing nonsensical calculations.

I haven’t looked at the decimal crate specifically. The way uom is set up is that it essentially wraps the underlying storage type with quantity/unit information. See the features section where all the different underlying storage types are listed. bigrational is the closest to decimal currently, and there is no reason a decimal feature couldn’t be added as long as the decimal type implements the traits from the num-traits crate.

It has been four days, and I shouldn't even be responding to this, but I can't help it. Call it a pet peeve.

Please, don't fool yourself. Decimal numbers have all of the exact same rounding "bugs" as binary numbers.

#[macro_use]
extern crate decimal;

fn main() {
    let one = d128!(1.0);
    let nine = d128!(9.0);
    assert_eq!(nine * (one/nine), one);
}
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `0.9999999999999999999999999999999999`,
 right: `1.0`', src/main.rs:7:2
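For comparison, here is the binary-float analogue of the same effect: 0.1 and 0.2 are not exactly representable in base 2, just as 1/9 is not exactly representable in base 10.

```rust
// Binary floats exhibit the same class of rounding error as decimals,
// just at different values: 0.1 and 0.2 have no exact base-2 form.
fn main() {
    let sum = 0.1_f64 + 0.2_f64;
    assert_ne!(sum, 0.3);
    println!("{:.17}", sum); // 0.30000000000000004
}
```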

If you wanted to make them closed under division, you could give them a BigInt representation and add repeating decimals. Then bravo! Now you can compute 1/9. But guess what?

  • It'd be even slower.
  • You're still out of luck for fractional exponents and transcendental functions.
  • You still don't need decimal! (Repeating binary numbers exist, too!)

There's nothing but tradeoffs in any direction you look. Nobody can make this decision for everyone, and no decision is "wrong" without first knowing the use-case.


This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.