[Pre-RFC] integer and float literals for custom types

I have a proposal for implementing integer and float literals for custom types that I would like feedback on.


The few proposals I've seen for extending integer and float literals to programmer-defined types have the complication of compile-time-only big integer types and such. This is a much simpler proposal as far as the compiler goes.

The compiler transforms the literal from something like -28_384BigInteger into IntegerLiteral { base: Base::Decimal, negative: true, digits: vec![2, 8, 3, 8, 4] }, which is then fed into BigInteger's FromIntegerLiteral implementation to produce a Result<BigInteger, IntegerLiteralError>.

Through this, parsing is taken care of by the compiler, avoiding the need for every implementation to reimplement it from scratch. Then the compiler can give its normal error messages if needed.

Is there some way of ensuring that existing implementations can run at compile time other than waiting for traits to be able to force implementing methods to be const fns?


pub enum Base {
    Binary,
    Octal,
    Decimal,
    Hexadecimal,
}

pub struct IntegerLiteral {
    pub base: Base,
    pub negative: bool,
    pub digits: Vec<u8>,
}

pub struct FloatLiteral {
    pub base: Base,
    pub negative: bool,
    pub digits_before_point: Vec<u8>,
    pub digits_after_point: Vec<u8>,
    pub exponent_is_negative: bool,
    pub exponent_digits: Vec<u8>,    
}

pub enum IntegerLiteralError {
    TooHigh,
    TooLow,
    /* ... */
}

pub enum FloatLiteralError {
    TooHigh,
    TooLow,
    /* ... */
}

// `Sized` is required because the methods return `Result<Self, _>`.
pub trait FromIntegerLiteral: Sized {
    fn from_integer_literal(literal: IntegerLiteral) -> Result<Self, IntegerLiteralError>;
}

pub trait FromFloatLiteral: Sized {
    fn from_float_literal(literal: FloatLiteral) -> Result<Self, FloatLiteralError>;
}

FloatLiteral also has a base field in case Rust follows the example of Java in allowing literals in another base. The base field doesn't apply to the exponent digits, which are always in decimal.
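To make the proposal concrete, here is a rough sketch of what an implementation of the trait might look like. The types are copied from the definitions above (with a `Sized` bound added so that `Result<Self, _>` is well-formed); `I24` is a made-up 24-bit integer type used only for illustration, and the compiler-side desugaring is assumed rather than shown.

```rust
// Types from the proposal; none of this exists in std.
pub enum Base {
    Binary,
    Octal,
    Decimal,
    Hexadecimal,
}

pub struct IntegerLiteral {
    pub base: Base,
    pub negative: bool,
    pub digits: Vec<u8>,
}

#[derive(Debug, PartialEq)]
pub enum IntegerLiteralError {
    TooHigh,
    TooLow,
}

pub trait FromIntegerLiteral: Sized {
    fn from_integer_literal(literal: IntegerLiteral) -> Result<Self, IntegerLiteralError>;
}

/// Hypothetical 24-bit signed integer, stored in an i32.
#[derive(Debug, PartialEq)]
pub struct I24(i32);

impl I24 {
    pub const MIN: i32 = -(1 << 23);
    pub const MAX: i32 = (1 << 23) - 1;
}

impl FromIntegerLiteral for I24 {
    fn from_integer_literal(literal: IntegerLiteral) -> Result<Self, IntegerLiteralError> {
        // Largest magnitude that can ever be valid (|MIN| = 2^23).
        const LIMIT: i64 = 1 << 23;
        let radix: i64 = match literal.base {
            Base::Binary => 2,
            Base::Octal => 8,
            Base::Decimal => 10,
            Base::Hexadecimal => 16,
        };
        let mut magnitude: i64 = 0;
        for &digit in &literal.digits {
            magnitude = magnitude * radix + i64::from(digit);
            // Bail out early so arbitrarily long literals cannot overflow i64.
            if magnitude > LIMIT {
                return Err(if literal.negative {
                    IntegerLiteralError::TooLow
                } else {
                    IntegerLiteralError::TooHigh
                });
            }
        }
        let value = if literal.negative { -magnitude } else { magnitude };
        // Only +2^23 can still be out of range at this point.
        if value > i64::from(Self::MAX) {
            return Err(IntegerLiteralError::TooHigh);
        }
        Ok(I24(value as i32))
    }
}
```

Under the proposal, -28_384I24 would presumably reach this method as the IntegerLiteral shown earlier, and an Err would be turned into an ordinary compiler diagnostic.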


This has been discussed here a number of times (search for "custom literals"). I think it would be nice, but it's also true that you can get 90% of the way there with macros and const fns (the latter getting more and more useful by the day). I think it would be useful if someone motivated put together a summary of past discussion (in particular in light of more recent const fn capabilities) and a small survey of existing macro/library solutions.
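As a rough illustration of the "macros and const fns" route on stable Rust, the sketch below parses the literal's text in a const fn and forces evaluation at compile time through a const item. `Millis`, `parse_decimal` and `millis!` are hypothetical names invented for this example, not part of any existing crate.

```rust
/// Hypothetical wrapper type, used only for illustration.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Millis(pub u64);

/// Parses an unsigned decimal literal, skipping `_` separators.
/// Panics on anything else; inside a `const` item that panic
/// surfaces as a compile-time error.
pub const fn parse_decimal(text: &str) -> u64 {
    let bytes = text.as_bytes();
    let mut value: u64 = 0;
    let mut seen_digit = false;
    let mut i = 0;
    while i < bytes.len() {
        let b = bytes[i];
        if b != b'_' {
            assert!(b.is_ascii_digit(), "invalid character in literal");
            let digit = (b - b'0') as u64;
            value = match value.checked_mul(10) {
                Some(v) => match v.checked_add(digit) {
                    Some(v) => v,
                    None => panic!("literal does not fit in u64"),
                },
                None => panic!("literal does not fit in u64"),
            };
            seen_digit = true;
        }
        i += 1;
    }
    assert!(seen_digit, "literal has no digits");
    value
}

/// Turns the literal token into text and parses it in a const context.
macro_rules! millis {
    ($lit:literal) => {{
        const VALUE: Millis = Millis(parse_decimal(stringify!($lit)));
        VALUE
    }};
}
```

With this in scope, millis!(1_500) can be used wherever a Millis value is expected, including in the initializer of another const.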

Is there some way of ensuring that existing implementations can run at compile time other than waiting for traits to be able to force implementing methods to be const fns?

Probably just "wait for const fns" would be the easier/more principled path.


I would love that. Here are previous discussions:


IMO the most comprehensive crate on crates.io for units of measure seems to be UoM. Any proposal for custom literals should take into account the many types of units covered by that crate. Also, if it provides a syntax for 2-component complex numbers, that syntax should be extensible to 4-component quaternions (more as a test for the completeness of the approach than for practical reasons).


I recall someone bringing up allowing the production $path ! $literal at some point as sugar for $path!($literal). I think the context was in string interpolation, where people wanted to be able to write f!"foo is {foo}" or similar. I think that this extends in a fairly natural way to custom literals in general: big_int!42, which means the problem of how to represent a "general" literal can be punted to a procedural macro, rather than stapled to std.

Unfortunately it's not compatible with the existing 42u32 syntax, but that syntax is already a lexer-level concern iirc. It's also nice because you can write foo::bar!42 if you so choose. Path resolution seems to consistently be a problem with these proposals. It also has the benefit that you can write foo::bar!(42) if you really want to.
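The $path!$literal sugar does not exist today, but the parenthesized form it would desugar to can already be written. The sketch below uses a declarative macro rather than a procedural one to keep it self-contained; `BigInt`, `from_literal_text` and `big_int!` are hypothetical names, and the conversion happens at run time rather than at compile time.

```rust
/// Hypothetical arbitrary-precision integer: a sign plus decimal digits,
/// most significant digit first.
#[derive(Debug, PartialEq)]
pub struct BigInt {
    pub negative: bool,
    pub digits: Vec<u8>,
}

impl BigInt {
    /// Builds a BigInt from the source text of a decimal integer literal.
    /// Returns None for anything that is not an optional `-` followed by
    /// digits and `_` separators.
    pub fn from_literal_text(text: &str) -> Option<BigInt> {
        let text = text.trim();
        let (negative, rest) = match text.strip_prefix('-') {
            Some(rest) => (true, rest.trim_start()),
            None => (false, text),
        };
        let mut digits = Vec::new();
        for c in rest.chars() {
            match c {
                '_' => continue,
                '0'..='9' => digits.push(c as u8 - b'0'),
                _ => return None,
            }
        }
        if digits.is_empty() {
            None
        } else {
            Some(BigInt { negative, digits })
        }
    }
}

/// The macro receives the literal as a token, so how it is represented
/// is decided here rather than in std.
macro_rules! big_int {
    ($lit:literal) => {
        BigInt::from_literal_text(stringify!($lit)).expect("invalid big_int literal")
    };
}
```

Here big_int!(42) and big_int!(-28_384) both expand to a call on the literal's text, which is roughly what big_int!42 would mean under the suggested sugar.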


Regarding the syntax: something I haven't seen in any of the linked discussions on the topic (but might have missed) is to use type ascription for this purpose - assuming it can be used on sub-expressions - rather than pre-/suffixes.

AFAICT this would not require any additional syntax other than type ascription, and would still look rather nice to me:

const length_explicit = 17 : units::si::Meter;  // can use fully specified paths

use units::si::{m, s};     // assuming 'm', 's' are public aliases for Meter, Second
const distance = 17 : m;
const speed = distance / 1.0 : s;

const found = "foo(bar)?":regex.matches(data);

const c = (1, 4) : Complex;
const q = (1.0, 2.5, 0.0, -3.7) : Quaternion;

Of course, this would need some rules for how ascriptions apply to constants of other (underlying) types, and it does not address the issues of unit handling and underlying storage types.

For the latter, one nice option may be module level generics:

use units::si::<u64>::{m, s}; 
const length = 128 : m;         // base storage type is u64

Do keep in mind that the way I would accomplish this today is with an impl From<i64> for BigInteger that then lets you write 28_384.into(), but that of course doesn't work for anything outside the i64 range. At that point you could use a str, but it isn't as nice.
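A minimal sketch of that approach, assuming a toy `BigInteger` that stores a sign and decimal digits (the representation and the `ParseBigIntegerError` type are invented for this example):

```rust
use std::str::FromStr;

/// Toy big integer: sign plus decimal digits, most significant first.
#[derive(Debug, PartialEq)]
pub struct BigInteger {
    negative: bool,
    digits: Vec<u8>,
}

impl From<i64> for BigInteger {
    fn from(value: i64) -> Self {
        let negative = value < 0;
        // unsigned_abs avoids overflow for i64::MIN.
        let mut magnitude = value.unsigned_abs();
        let mut digits = Vec::new();
        loop {
            digits.push((magnitude % 10) as u8);
            magnitude /= 10;
            if magnitude == 0 {
                break;
            }
        }
        digits.reverse();
        BigInteger { negative, digits }
    }
}

#[derive(Debug, PartialEq)]
pub struct ParseBigIntegerError;

/// Fallback for values that do not fit in an i64.
impl FromStr for BigInteger {
    type Err = ParseBigIntegerError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (negative, rest) = match s.strip_prefix('-') {
            Some(rest) => (true, rest),
            None => (false, s),
        };
        if rest.is_empty() || !rest.bytes().all(|b| b.is_ascii_digit()) {
            return Err(ParseBigIntegerError);
        }
        let digits = rest.bytes().map(|b| b - b'0').collect();
        Ok(BigInteger { negative, digits })
    }
}
```

Values in range go through From<i64>, while larger ones need something like "123456789012345678901234567890".parse::<BigInteger>(), which is checked only at run time.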

Be aware that the lexer might have unexpected limitations that would make your extension not always behave the way you expect.

I verified that macros can parse literals of any length.