Any RFC for Units of Measure?

Aside: You may be interested in https://github.com/rust-lang/rfcs/pull/2507, which sort of half-proposed this along with some custom literal stuff, or in rust-lang/rfcs issue #1349 ("type inference for consts/statics"), since const type inference covers a lot of the same design space.


I disagree. Adding two quantities with the same dimensions should be allowed regardless of units, so long as the smaller can be losslessly converted into the larger. (Note that that's different from overflow in the addition, which is always a problem with even the same units.)

So 56km + 23nm => 56000000000023nm is totally fine. Even 12um + 34inches => 863612um is fine, since an inch is an integer multiple of a micrometre. But 1m + 1inch would require an explicit conversion method to allow the possible loss, or the use of two coercions to more-precise types.

(Depending on how the underlying numeric types are used, the checks can even enforce that the lossless conversion cannot overflow, by for example allowing a u8 number of km to become a u32 number of millimetres.)
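A minimal sketch of that last point, assuming hypothetical `Km` and `Mm` wrapper types (neither exists in std): since `u8::MAX` km is 255_000_000 mm, the widening conversion below can never overflow a `u32`, even though the addition afterwards still can.

```rust
use std::ops::Add;

/// Hypothetical wrapper: a whole number of kilometres.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Km<T>(T);

/// Hypothetical wrapper: a whole number of millimetres.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Mm<T>(T);

/// Lossless and overflow-free: 255 km = 255_000_000 mm fits in a u32.
impl From<Km<u8>> for Mm<u32> {
    fn from(km: Km<u8>) -> Self {
        Mm(u32::from(km.0) * 1_000_000)
    }
}

impl Add for Mm<u32> {
    type Output = Mm<u32>;
    fn add(self, rhs: Self) -> Self {
        // Plain u32 addition: this step can still overflow, as noted above.
        Mm(self.0 + rhs.0)
    }
}

/// Adds a coarse and a fine length by first widening the coarse one.
fn add_km_mm(km: Km<u8>, mm: Mm<u32>) -> Mm<u32> {
    Mm::<u32>::from(km) + mm
}
```

With this, `add_km_mm(Km(56), Mm(23))` gives `Mm(56_000_023)`. Whether such a widening should happen implicitly is exactly what the thread is debating; the sketch only shows that the conversion itself is safe.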

However, if units were represented as a Dimension Type / Hierarchy / Unit Components triplet where components of 1 hierarchy could auto-coerce to the closest component of the other hierarchy, this might not be an issue.

Following this logic we should allow 1u32 + 1u8, which is not the case today. If the Rust team allows such auto-conversion for primitive types, it will make sense to do the same for units, but without it I don't think we should make an exception.

But if in 1m + 1inch both 1m and 1inch produce a value in meters (i.e. they are custom literals which resolve to meters at compile time), I have no problem with it.

To me it's like forbidding this snippet:

// Nanometer and Kilometer implement From for each other
let x: Nanometer = Nanometer(1.0f32);
let y: Kilometer = Kilometer(1.0f32);
// forbidden. you must explicitly use `into()` to do chosen conversion
let z = x + y;

Supposing that these are stored as f32? As f64? As integers? What about 56e200km + 23e-100nm? Would that be lossless or lossy? How would we detect this at runtime, e.g., for a fn foo(x: km, y: nm) -> km { x + y }?
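To make the float case concrete, here is a small sketch using plain f64 values (the helper names are made up for illustration). For the huge operand the small one is silently absorbed; for everyday magnitudes the sum is exact.

```rust
/// Kilometres to nanometres: 1 km = 10^12 nm.
fn km_to_nm(km: f64) -> f64 {
    km * 1e12
}

/// Returns true if adding `nm` nanometres to `km` kilometres
/// changes the stored f64 value at all.
fn addition_is_visible(km: f64, nm: f64) -> bool {
    let big = km_to_nm(km);
    big + nm != big
}
```

Here `addition_is_visible(56e200, 23e-100)` is false, while `addition_is_visible(56.0, 23.0)` is true, and nothing at runtime distinguishes the two cases unless we check explicitly.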


FWIW I think that physical units should be orthogonal to data-representation (f32, f64, isize, etc.) so that I can store kilometers in an isize if I want to, and that conversions should follow the same rules as for data-representation. Also, applying conversion factors when it comes to units is weird, e.g. if I have two lengths one in m and another one in in, stored in two i32, how should the conversion happen? Should it use an integer conversion factor? Should it convert both quantities to f32 or f64 first, then convert, then add, then convert back to integers? If all that were to happen implicitly, that would be too much implicitness going on for my taste.


What about if UOM were always represented as a quartlet (is that a word?) consisting of Dimension Enum / Hierarchy Enum / Unit Enum / Big Integer. Auto-coercion is always in the direction of the smaller unit (meaning multiplying the Big Integer to a larger value represented with the smaller unit). That way, there would never be any meaningful loss, it would be relatively efficient, not too non-compact, sized, not too cache unfriendly, etc. If the “Big Integer” part were implemented such that it used an i128… Never mind…the more I think about it and try to come up with a viable alternative that allows sensible auto-coercion between units, the more I realize that any such solution would require unsized types and allocation, which would likely be highly inefficient for little useful gain. I think I’m won over to the camp advocating against auto-coercion.


I agree. One approach can look like this.

Std code:

// assuming we'll get const trait fns
// here const guarantees that the trait has only const methods
// (BitXor takes an i8 exponent, matching the impl in the unit system crate)
trait UnitSystem: const Div + const Mul + const BitXor<i8> { }

struct UnitValue<Value, U: UnitSystem, const UNIT: U> {
    value: Value,
    // not sure if we need a PhantomData field here
    phantom: PhantomData<..>,
}

// `Mul` takes its right-hand side as a type parameter, not an associated type
impl<V1, V2, U, const U1: U, const U2: U> Mul<UnitValue<V1, U, U1>>
    for UnitValue<V2, U, U2>
    where U: UnitSystem, V2: const Mul<V1>
{
    type Output = UnitValue<<V2 as Mul<V1>>::Output, U, U1*U2>;
    // ideally we should be able to define `const fn mul` if `Mul<V1>` is const
    // and `fn mul` if `Mul<V1>` is not const;
    // that should probably be possible with specialization,
    // but to start things we can do this impl only for constant `Mul`s
    const fn mul(self, rhs: UnitValue<V1, U, U1>) -> Self::Output {
        UnitValue { value: self.value*rhs.value, phantom: Default::default() }
    }
}

// generic impls for Div, Add, etc., some inherent methods for f32, f64, etc.

We also can add helper methods to UnitSystem trait for conversion between unit systems, but I am not completely sure how they should look.

Unit system crate:

struct SI {
    meter: i8,
    second: i8,
    kelvin: i8,
    // ..
}

impl Mul for SI {
    type Output = SI;
    
    const fn mul(mut self, rhs: SI) -> Self {
        self.meter += rhs.meter;
        self.second += rhs.second;
        self.kelvin += rhs.kelvin;
        // ..
        self
    }
}

impl Div for SI {
    type Output = SI;
    
    const fn div(mut self, rhs: SI) -> Self {
        self.meter -= rhs.meter;
        self.second -= rhs.second;
        self.kelvin -= rhs.kelvin;
        // ..
        self
    }
}

// we hijack `^` operator here, but it will make code significantly nicer
impl BitXor<i8> for SI {
    type Output = SI;
    
    const fn bitxor(mut self, rhs: i8) -> Self {
        self.meter *= rhs;
        self.second *= rhs;
        self.kelvin *= rhs;
        // ..
        self
    }
}

impl UnitSystem for SI {}

// f32 module
type SIVal<const UNIT: SI> = UnitValue<f32, SI, UNIT>;
const Meter: SI = SI { meter: 1, second: 0, kelvin: 0, .. };
const Second: SI = SI { meter: 0, second: 1, kelvin: 0, .. };
const Hertz: SI = Second^-1;

User code:

fn foo<const U1: SI, const U2: SI>(
    a: SIVal<U1>, b: SIVal<U1>, c: SIVal<U2>
) -> SIVal<U1/U2> {
    (a + b)/c
}

// `^` binds more loosely than `/` in Rust, so the exponent needs parentheses
fn bar(length: SIVal<Meter>, time: SIVal<Second>) -> SIVal<Meter/(Second^2)> {
    2.0*length/(time*time)
}

The drawbacks of this system are:

  • Reliance on features without an implementation plan (const trait fns, const trait bounds). This can be circumvented somewhat by moving the div and mul methods into the UnitSystem trait, but that makes generic code a bit more verbose, as you will not be able to use * and /.
  • The unit system cannot be extended by third-party code, though arguably this shouldn't be a big problem in practice.
  • You will not be able to create derivative unit systems easily, e.g. by replacing meter with millimeter.
  • Generic functions will still be quite unwieldy.
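For what it's worth, the exponent bookkeeping itself already works on stable Rust if the operators are replaced by inherent const fns, which is roughly the workaround hinted at in the first drawback. The sketch below is only that part: the method names `mul`, `div` and `pow` are my own choice, only three base dimensions are spelled out, and using `SI` values as const generic parameters (the actual core of the proposal) is not attempted here.

```rust
/// Exponents of a few SI base dimensions (the rest are left out of this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SI {
    meter: i8,
    second: i8,
    kelvin: i8,
}

impl SI {
    /// Multiplying units adds their exponents.
    const fn mul(self, rhs: SI) -> SI {
        SI {
            meter: self.meter + rhs.meter,
            second: self.second + rhs.second,
            kelvin: self.kelvin + rhs.kelvin,
        }
    }

    /// Dividing units subtracts their exponents.
    const fn div(self, rhs: SI) -> SI {
        SI {
            meter: self.meter - rhs.meter,
            second: self.second - rhs.second,
            kelvin: self.kelvin - rhs.kelvin,
        }
    }

    /// Raising a unit to an integer power scales its exponents.
    const fn pow(self, n: i8) -> SI {
        SI {
            meter: self.meter * n,
            second: self.second * n,
            kelvin: self.kelvin * n,
        }
    }
}

const METER: SI = SI { meter: 1, second: 0, kelvin: 0 };
const SECOND: SI = SI { meter: 0, second: 1, kelvin: 0 };
const HERTZ: SI = SECOND.pow(-1);
const ACCELERATION: SI = METER.div(SECOND.pow(2));

// Evaluated at compile time: m/s^2 has exponents (1, -2, 0).
const _: () = assert!(ACCELERATION.meter == 1 && ACCELERATION.second == -2);
```

Method calls also sidestep the operator-precedence question that comes with reusing `^` for exponentiation.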

Something like this is what I had in mind as well. I hadn’t thought about it as much as you did, but I basically thought that we would just have “something in the spirit of”:

struct Quantity<Value, Unit>(Value, Unit);

where each system has a different Unit type, which are just ZSTs, and Value can be any type. And that then we would “somehow” implement conversions between Quantities via From/TryFrom and friends, e.g.,:

impl<V0, V1, U0, U1> From<Quantity<V1, U1>> for Quantity<V0, U0>
    where V0: From<V1>, U0: UnitFrom<U1> { ... }

All far too hand-wavy to be really useful, but if we go in this direction, that’s in very broad terms what I had in mind.
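A slightly less hand-wavy sketch of that direction, under several assumptions of mine: the unit marker is held in a `PhantomData`, `UnitFrom` is a hypothetical trait carrying an f64 conversion factor, and the conversion is an explicit `convert` method rather than a blanket `From` impl (a fully generic `From` impl would, as far as I can tell, overlap with the standard library's reflexive `From<T> for T`).

```rust
use std::marker::PhantomData;
use std::ops::Add;

// Zero-sized unit markers (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meter;
#[derive(Debug, Clone, Copy, PartialEq)]
struct Millimeter;

/// A value tagged with a unit type; the tag takes no space at runtime.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Quantity<V, U>(V, PhantomData<U>);

/// Hypothetical trait: `Dst: UnitFrom<Src>` gives the factor that turns
/// a value in `Src` into a value in `Dst`.
trait UnitFrom<Src> {
    const FACTOR: f64;
}
impl UnitFrom<Millimeter> for Meter {
    const FACTOR: f64 = 0.001;
}
impl UnitFrom<Meter> for Millimeter {
    const FACTOR: f64 = 1000.0;
}

impl<U> Quantity<f64, U> {
    fn new(value: f64) -> Self {
        Quantity(value, PhantomData)
    }

    /// Explicit unit conversion; nothing happens implicitly.
    fn convert<Dst: UnitFrom<U>>(self) -> Quantity<f64, Dst> {
        Quantity(self.0 * <Dst as UnitFrom<U>>::FACTOR, PhantomData)
    }
}

/// Addition is only defined for two quantities with the same unit type.
impl<V: Add<Output = V>, U> Add for Quantity<V, U> {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Quantity(self.0 + rhs.0, PhantomData)
    }
}
```

Adding a `Quantity<f64, Meter>` to a `Quantity<f64, Millimeter>` directly is a type error here; the caller has to write something like `a + b.convert::<Meter>()`.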

I don't think that's a problem. The last time an SI base unit was added was 47 years ago. IMO it's fair that adding or changing base units is a breaking change.

Your proposed system could have another (10-based) exponent for dimension. For meter it would be 0, for millimeter -3 etc. That would also make for a more systematic conversion between dimensions.

Hm, interesting idea, but I am not sure how the Div and Mul implementations for UnitSystem should look with this modification, or how the generic implementations for UnitValue should change. For example, what should happen if I divide 10 m^2 by 3 mm/s? Should the result be m*s or mm*s? How should conversion and loss of precision be handled if Value=u32? And other similar questions.

Probably we can handle it by making exponents const argument for generic SI type… And thus we will allow creation of derivative unit systems, but not an implicit auto-conversion between them.

I would say:

10m^2 / (3mm/s)
=  (10u32 [10^0, m^2, s^0] ) / (3u32 [10^-3, m^1, s^-1])
= (10u32 / 3u32 ) [10^(0 - (-3)), m^(2-1), s^(0 - (-1))]
= (10u32 / 3u32 ) [10^3, m^1, s^1]
= 3 kms

... whatever a kms is ;).
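The same computation as a small runnable sketch. Note the assumptions: the unit is tracked at runtime in a plain struct (the proposal would track it in the type), and the names `Unit` and `Quantity` are placeholders of mine.

```rust
use std::ops::Div;

/// A unit as exponents: a power-of-ten scale plus two base dimensions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Unit {
    pow10: i8,
    meter: i8,
    second: i8,
}

/// A u32 value together with its unit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Quantity {
    value: u32,
    unit: Unit,
}

impl Div for Quantity {
    type Output = Quantity;
    fn div(self, rhs: Quantity) -> Quantity {
        Quantity {
            // The value uses ordinary (truncating) u32 division.
            value: self.value / rhs.value,
            // The unit exponents are subtracted; the value is never rescaled.
            unit: Unit {
                pow10: self.unit.pow10 - rhs.unit.pow10,
                meter: self.unit.meter - rhs.unit.meter,
                second: self.unit.second - rhs.unit.second,
            },
        }
    }
}

/// 10 m^2 divided by 3 mm/s.
fn area_over_speed() -> Quantity {
    let area = Quantity { value: 10, unit: Unit { pow10: 0, meter: 2, second: 0 } };
    let speed = Quantity { value: 3, unit: Unit { pow10: -3, meter: 1, second: -1 } };
    area / speed
}
```

`area_over_speed()` yields the value 3 with the unit [10^3, m^1, s^1], matching the derivation above.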


Ah, so you want to keep an exponent for the whole unit, not for separate dimensions. Hm, I guess it could work, but there are still difficult questions about precision loss and the handling of different float and integer types. For example, what should happen here: 10u32[m] + 1u32[mm]? How do we handle the computation of 10^n for floats, integers, and maybe arbitrary bigint types (and don’t forget about overflows)? Also, your modification will not work with derivatives that are not SI prefixes. For example, what if for an astronomy application I want distances in au or light years and leave everything else as-is?

Also for some applications it can be important to keep mm, without implicit conversion to km or something compound.

Ok, let's think this through ;)

For me, this would be the most unsurprising behavior. In Rust, 10u32/3u32 is 3u32, so having 10u32 [some unit] / 3u32 [some unit] be 3u32 [some unit] would be consistent and logical.

For the pure unit computation, it wouldn't matter, since the actual values wouldn't be touched by the units.

Where it starts to matter is conversion between dimensions. That would be number-type specific. I don't know if it's possible to abstract that completely; maybe we need number-type-specific constants for that. Example:

let converter = 1000f64 milli; // dimensionless: 1000 * 10^-3 == 1
let x = 4f64 m;
let y = x * converter; // is 4000 mm now

10u32[m] + 1u32[mm] would be a compile-time error. [10^0, m^1] is a different unit than [10^-3, m^1]. You'd have to manually convert one of the values so that the dimensions line up. At that point, you change one of the values and decide what to preserve, range or precision.

That is correct. Inconsistent unit systems are hard. Personally, I would treat it the same way we treat Unicode: have a consistent internal representation and convert at the borders. But I get that this may not be satisfying.

If you have a consistent usage of mm, mm/s, mm/s^2 etc., I cannot imagine a situation where one ends up with km somewhere. You'd have to consciously divide by a nonsensical unit to do that.

If that should happen, the compiler would error out, because all functions in such a system would specify usage of mm and there would be no auto-conversion from km to mm.


Yeah, the conversion framework will require some design work. At the very least we will need a generic way to calculate 10^n. The most straightforward approach would be to simply loop n.abs() times and divide or multiply by 10 depending on the sign.

Ah, indeed. But it will lead to somewhat surprising interactions, as km/s will be equivalent to m/ms. Though it's perfectly sound. :)

That will probably be the most practical solution, especially if we consider Celsius and Fahrenheit degrees...

I meant that you may want to keep 1/(mm^2) instead of converting it to M(1/m^2). But I guess it's a matter of internal representation and shouldn't matter much to users, who should do the necessary conversion at the output/printing stage.

So overall, I think your idea will be indeed a great addition to the proposed system.

I could imagine a trait that one has to implement for a number type to be scalable by SI dimensions (that trait could be as simple as a single function that gives back the constant 10; together with the Mul and Div traits, we have everything we need). Ideally, that would be the only "touching point" between number types and their units.
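Roughly what I mean, as a sketch: the trait name `SiScalable` and the helper `scale_pow10` are made up, and the scaling uses the simple "loop |n| times" approach mentioned earlier in the thread.

```rust
use std::ops::{Div, Mul};

/// Hypothetical trait: the only thing a number type has to provide
/// (besides Mul and Div) is its own representation of ten.
trait SiScalable: Copy + Mul<Output = Self> + Div<Output = Self> {
    fn ten() -> Self;
}

impl SiScalable for f64 {
    fn ten() -> Self {
        10.0
    }
}

impl SiScalable for u32 {
    fn ten() -> Self {
        10
    }
}

/// Multiplies `value` by 10^n: positive n multiplies, negative n divides.
/// For integer types the division truncates, and the multiplication can
/// overflow just like ordinary integer arithmetic.
fn scale_pow10<T: SiScalable>(mut value: T, n: i32) -> T {
    for _ in 0..n.unsigned_abs() {
        value = if n >= 0 {
            value * T::ten()
        } else {
            value / T::ten()
        };
    }
    value
}
```

So `scale_pow10(4.0_f64, 3)` is 4000.0, while `scale_pow10(1_u32, -3)` truncates to 0, which is the range-versus-precision decision the caller has to make explicitly.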

I haven't thought about that... yeah, either always show a canonical form when displaying units or have some clever idea to specify the format...

Thank you :)

Probably we can handle it by making exponents const argument for generic SI type… And thus we will allow creation of derivative unit systems, but not an implicit auto-conversion between them.

So that's what I had in mind.

When there is no good solution for all cases, I tend to prefer to make things a type error, and ask the user to disambiguate. Here the user would need to explicitly convert either the m^2 to mm^2 or the mm to m before performing the division.

How should conversion and loss of precision be handled if Value=u32?

If both quantities use a u32, division should work just like it does for u32s, and conversions should work explicitly in the same way as for other types (e.g. via From when they are "value preserving").

That is, if one tries to convert 1_u32 mm to u32 m, I expect the result to be 0_u32 m. As mentioned above, I don't think these conversions should happen implicitly, not even for floats, so I don't think u32s should be handled any differently.
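As a sketch of what "explicit" could look like for integers (the type names, the `NotWholeMeters` error and the method name are all assumptions of mine, not a proposed API): a truncating method for when losing precision is acceptable, and a `TryFrom` impl for when the conversion must preserve the value.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Millimeters(u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Meters(u32);

/// Error for a millimetre count that is not a whole number of metres.
#[derive(Debug, PartialEq, Eq)]
struct NotWholeMeters;

impl Millimeters {
    /// Explicit lossy conversion: truncates exactly like u32 division.
    fn to_meters_truncating(self) -> Meters {
        Meters(self.0 / 1000)
    }
}

/// Value-preserving conversion: only succeeds when nothing is lost.
impl TryFrom<Millimeters> for Meters {
    type Error = NotWholeMeters;

    fn try_from(mm: Millimeters) -> Result<Self, Self::Error> {
        if mm.0 % 1000 == 0 {
            Ok(Meters(mm.0 / 1000))
        } else {
            Err(NotWholeMeters)
        }
    }
}
```

`Millimeters(1).to_meters_truncating()` is `Meters(0)`, as expected, while `Meters::try_from(Millimeters(1))` reports the loss instead of hiding it.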

Implementation wise, the From/TryFrom/.. impls have some freedom about how to perform the conversions. For example, when converting from 1_f32 mm to 0.001_f64 m we can apply an f32 unit conversion to the f32 and then convert the representation to f64, or convert the representation first to f64 and then apply an f64 conversion factor. I think this is going too much into the details at this point.

Yeah, that's reasonable for SI units in particular, but it also seems reasonable that there could be other unit types (probably including user-defined units), which could hit these problems.

I think it's important that 1inch produces something that's repr(transparent) as 1, so that people can use cgs, mks, fps, or whatever without introducing additional conversions or potential precision loss.

Note that .into() doesn't work for that, the same as it doesn't work for 1_i32 + 1_i8.into().

Probably the same way we detect 5.6e200 + 2.3e-100 at runtime. (AKA we don't, and that's fine.)

Have any examples? The seven SI base dimensions (mass, length, time, temperature, current, light, amount) seem pretty extensive.

I prefer C++'s ratio method for this: a compile-time rational number. That way you can use non-SI units with exact definitions in terms of SI as part of a compatible system, like making the inch exactly 127/5000 of a metre.
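A rough Rust analogue using const generics, as a sketch only: `Length<NUM, DEN>` is a hypothetical type holding an integer count of a unit that is exactly NUM/DEN metres, and `convert` returns `None` whenever the result would not be an exact integer (or would not fit).

```rust
use std::convert::TryFrom;

/// An integer count of a unit that is exactly NUM/DEN metres.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Length<const NUM: i64, const DEN: i64>(i64);

type Metres = Length<1, 1>;
type Micrometres = Length<1, 1_000_000>;
type Inches = Length<127, 5000>;

impl<const NUM: i64, const DEN: i64> Length<NUM, DEN> {
    fn new(value: i64) -> Self {
        Length(value)
    }

    /// Exact conversion to another ratio, or None if it would lose
    /// precision or overflow.
    /// value * NUM/DEN metres == x * N2/D2 metres
    ///   =>  x = value * NUM * D2 / (DEN * N2)
    fn convert<const N2: i64, const D2: i64>(self) -> Option<Length<N2, D2>> {
        let numerator = (self.0 as i128)
            .checked_mul(NUM as i128)?
            .checked_mul(D2 as i128)?;
        let denominator = (DEN as i128) * (N2 as i128);
        if denominator == 0 || numerator % denominator != 0 {
            return None;
        }
        i64::try_from(numerator / denominator).ok().map(Length)
    }
}
```

With these aliases, 34 inches convert exactly to 863600 micrometres, while 1 metre has no exact representation in whole inches, so that conversion yields `None` and the caller has to pick a lossy method explicitly.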

Any time you're working with quantities of objects, you're essentially using the object as a unit itself. You can create wrapper types over integers, etc., but this requires delegating to the inner value (manually or otherwise). Also, it gets awkward when you want ratios.

More generally, there are a number of non-SI units that are commonly used in practice. It's plausible you could define these as some sort of alias, but you run into the same sort of precision issues as raised throughout this thread.
