[Pre-RFC] Implicit number type widening

Tom-Phinney · June 21, 2019, 3:32am

The primary argument I’ve seen against implicit widening in Rust is about the greater difficulty and potential ambiguity of type inference. If that is a correct assessment, let’s limit the discussion to that impact without going afield to consider the nuances of widening in languages other than Rust.

Tom-Phinney · June 21, 2019, 3:41am

As for widening various uN or iN types to usize for use in indexing, I often just use a macro:

macro_rules! ix {
    ($e:expr) => {{
        let val = $e;
        assert!(val >= 0);  // Checks that e is an integer and non-negative
        val as usize        // Note: `as` truncates if val does not fit in usize
    }};
}

Accessing array[expr], for any integer expr, becomes simply array[ix!(expr)].

scottmcm · June 21, 2019, 4:12am

I think this is a great demonstration that we need to make this easier, because people are already doing workarounds to simplify things, and those sometimes have potential gotchas at the edges (like array[ix!(1_u128 << 64)]). See also heterogeneous comparison, because right now the easiest way to compare an i32 with a u32 is a < b as i32 (or similar), and that's usually the wrong choice.
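To make the comparison pitfall concrete, here is a minimal sketch. The function names are hypothetical and not part of std. It contrasts the tempting cast with a version that widens both operands to i64, which can represent every i32 and every u32.

```rust
/// The tempting version: `b as i32` wraps for values above i32::MAX,
/// so a huge u32 can compare as if it were negative.
fn lt_lossy(a: i32, b: u32) -> bool {
    a < b as i32
}

/// Widens both sides to i64 first, so the comparison is done on the
/// mathematical values and nothing is truncated.
fn lt_widened(a: i32, b: u32) -> bool {
    i64::from(a) < i64::from(b)
}
```

With these definitions, lt_lossy(1, u32::MAX) is false even though 1 is clearly smaller, while lt_widened(1, u32::MAX) is true.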

Tom-Phinney · June 21, 2019, 4:21am

Presumably your concern is about truncation when usize has fewer bits than the type of the expression. We don't need u128 with a shorter usize for that; u64 with usize == u32 suffices. It's easy to extend the macro to include an assert that compares against the upper bound of usize. What your example really demonstrates is the danger of Rust's as operator, which can silently truncate even when that is not the intent.
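One possible way to add that upper-bound check is to go through TryFrom instead of as. The macro name ix_checked is made up for this sketch. It panics rather than truncating, and it also rejects negative values, so the separate assert is no longer needed.

```rust
/// Hypothetical checked variant of `ix!`: converts any primitive integer
/// expression to usize and panics if the value is negative or too large.
macro_rules! ix_checked {
    ($e:expr) => {
        // Fully qualified so the macro does not depend on what the caller imported.
        <usize as ::std::convert::TryFrom<_>>::try_from($e)
            .expect("index does not fit in usize")
    };
}

/// Example use: index a slice with an i64 without a silent truncation.
fn element_at(data: &[i32], i: i64) -> i32 {
    data[ix_checked!(i)]
}
```

On targets where usize is at most 64 bits, ix_checked!(1_u128 << 64) panics instead of silently yielding index 0.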

scottmcm · June 21, 2019, 4:25am

I completely agree. It's often the easiest thing to reach for, and thus it often gets used even when there are other options that might be better (and clippy will suggest) -- so we need either to add ways that are more-ergonomic than as or discourage/deprecate as.

vorner · June 21, 2019, 8:29am

I kind of enjoy the Rust stance of „nothing is going to get converted/cast implicitly anywhere“. That’s clear and simple (both in philosophy and in reasoning about code).

I’m not entirely against the widening conversions in their own name. But the „Rust does not automatically cast anything except this“ is a bigger ergonomic cost when thinking about the code than having to write .into() or as _ from time to time. I’m not talking about performance, or correctness per se; I’m just talking about the fact that with that, some kinds of warning signs in the code would disappear and I’d have to expend additional mental effort to track what exactly should be happening in the code. For me the low cognitive overhead when reading the code and still understanding the details (certainly lower than C++, which does all kinds of magic implicitly, but also lower than most high-level languages, which do a whole bunch of things behind the scenes) is probably the biggest selling point of Rust.

petrochenkov · June 21, 2019, 8:47am

Widening only or not, it doesn’t matter.
If you have to perform a cross-type operation here it still means that you are doing something wrong/weird/special elsewhere and may want to change the types or abstract them away compressed-index style.

kornel · June 21, 2019, 2:44pm

Yes, reliance on widening is a symptom of using differently sized types in different places. Whether that’s good or not can be judged without referencing C’s implementation.

C’s implementation has a terrible reputation of being a “type soup” not only for what it does, but for how it does it (unexpected signed/unsigned changes, int is special, implicit lossy narrowing, float<>int conversions, all integer sizes are vaguely defined, etc.).

This case is different: Rust is a different context (integer sizes clearly defined, overflows defined), and the proposal is smaller (only widening, only lossless).

So “let’s not do implicit widening, because implicit conversions in C are bad” to me seems as pessimistic as saying “let’s not have move semantics in Rust, because C++'s std::move is such a hopeless mess”.

johnthagen · June 21, 2019, 3:22pm

Does Rust guarantee this will be "cheap" for all current (and future?) architectures?

One thought that came to mind was 8-bit architectures that have to emulate larger integers (e.g u64) relatively expensively.

RustyYato · June 21, 2019, 3:35pm

Rust will not support 8-bit architectures; if we were going to, we wouldn’t have made usize: From<u16>, because on 8-bit architectures usize = u8 and that conversion would be lossy.
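For reference, the impl that carries this guarantee is impl From<u16> for usize in std. The helper name below is hypothetical; the point of the sketch is only that the lossless conversion is available for u16 on every target.

```rust
/// Compiles on every target because std provides `impl From<u16> for usize`,
/// which in effect promises that usize is at least 16 bits wide.
fn widen_index(i: u16) -> usize {
    usize::from(i)
}
```

By contrast, usize::from(1_u32) does not compile, since std has no From<u32> for usize impl; that conversion has to go through TryFrom or as.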

josh · June 21, 2019, 3:44pm

FWIW, while my experience with that behavior in C does factor into my consideration of this proposal, I’m also trying to carefully think about the actual feature proposal to only allow widening. C’s behavior would be an emphatic no. The proposed behavior, for me, raises potential concerns but I also find it interesting and worth serious consideration.

I still feel like, on balance, I would want to have the compiler flag mixed integer types as issues, but there’s always a balance there regarding false positives versus false negatives.

josh · June 21, 2019, 3:51pm

I find myself increasingly in favor of appropriate instances of PartialOrd and Ord and PartialEq and Eq. Not having them means catching mismatched types in another place, but the correct way to cast and compare is not at all obvious, and writing instances that Just Do The Right Thing seems like a good idea to me.

(This is something where my position has changed somewhat over time.)

I feel like carefully chosen additions of trait instances allow more nuance and consideration for the likely causes of integer type mismatches. Roughly speaking, for instance, I'd favor comparison operators more than arithmetic operators, arithmetic operators more than indexing, and indexing over arbitrary widening. And at the moment, I'd be inclined to draw the line after comparison and before arithmetic.

newpavlov · June 21, 2019, 3:52pm

I also don’t like implicit widening. Although I think it will be an ergonomic win to allow more freedom in trait-based operations with “obvious” implementations, so code like this would work: slice[1u8], 1u8 > 2i16, etc. But IIRC the main blocker here is ambiguity when deciding the type of literals, so we’ll need some kind of compiler hint, so that one trait implementation will be preferred over others during type inference.

dhm · June 21, 2019, 8:00pm

With number type widening, the most plausible thing here is that both cases lead to an overflow, since that's what x + x does; afterwards, the obtained value is widened to a usize when indexing only (i.e., y is inferred to be a u8). Hence y plays no role here (or, in other words, it is as if y was implicitly used in the first example).

To see why y cannot be inferred to be a usize, and that the widening must occur only at usage site (e.g., indexing site), take the following example: the semantics should not change depending on whether the if false { data[y]; } line exists or not.

let x = u8::MAX;
let mut y = x; // should not be widened even in the presence of the last line
y += x; // so that this, that may overflow, always overflows
// if false { data[y]; }

EDIT: the RFC should therefore require some kind of implicit_widening_on_inferred_integer_type lint (warn by default).
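A runnable sketch of what dhm's snippet describes under today's rules, with the conversion at the indexing site written out explicitly. The function name is hypothetical, and wrapping_add is used here as an assumption to make the overflow deterministic: a plain += would panic in a debug build and wrap only when overflow checks are disabled.

```rust
/// Mirrors the snippet above: y stays a u8, the addition overflows in u8,
/// and only the final value is widened to usize at the indexing site.
fn wrap_then_index(data: &[u8]) -> (u8, Option<u8>) {
    let x = u8::MAX;
    let mut y = x; // y: u8, never widened
    y = y.wrapping_add(x); // 255 + 255 wraps to 254 in u8
    // Widening happens here only, at the point of use.
    (y, data.get(usize::from(y)).copied())
}
```

If y were instead inferred as usize, the sum would be 510 and the indexed element would change, which is exactly the semantic difference the post argues must not depend on a later line.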

PO8 · June 25, 2019, 9:48am

I’d be really excited to see the general case of lossless widening, if the type-inference issues could be worked out.

To be honest, the current situation as I’m working on audio code is quite unpleasant; for any kind of scientific computing, really. The noise level of all the into() and from() and as makes me more error-prone, rather than less, I think: I’m likely to stick in a conversion I don’t want or need as I fiddle around with finding the right types for a given expression, and I’m likely to make outright math errors because I don’t read what I’ve written correctly.

I too am leery of automatic conversion to/from usize (or isize) for portability reasons. I can live without this, I think.

I’d be uncomfortable with the indexing-only version, as it makes it harder to explain and understand how the rules work: there’s now a special case. I think that expressions should be referentially-transparent: any implicit widening should only be to one of the types already present (or inferred) for an expression.

An alternative rule: implicit widening can be done only for an expression whose type is known or can be inferred without reference to the expression’s arguments, and will widen exactly to that type. With this rule,

let x = 5u32 - 7i64;
println!("{}", 5 + 5.0);

would be disallowed, but

let x: i64 = 5u32 - 7i64;
println!("{}", (5 + 5.0) as f64);

would be allowed. To be honest, I haven’t thought this through super-carefully; maybe there’s something terrible about it.
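For comparison, this is roughly what the same two expressions look like in current Rust, where every widening step is spelled out. The function names are hypothetical. The target type comes from the return type, much as the proposed rule would take it from the annotation.

```rust
/// Today's spelling of `let x: i64 = 5u32 - 7i64;` with the widening made explicit.
fn diff(a: u32, b: i64) -> i64 {
    i64::from(a) - b
}

/// Today's spelling of the integer-plus-float case; i32 to f64 is lossless,
/// so std provides `From<i32> for f64`.
fn sum(a: i32, b: f64) -> f64 {
    f64::from(a) + b
}
```

Here diff(5, 7) evaluates to -2 and sum(5, 5.0) to 10.0; under the proposed rule the from() calls would be inserted by the compiler.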

Anyway, really would like to see this in Rust, to the point where I’ve thought about some crazy workarounds for the existing language.

josh · June 26, 2019, 4:08pm

One thing that helps is that in theory you should only be able to write .into() for safe (widening) conversions, and if you ever have to use as then you’re doing a potentially lossy conversion that needs more careful checking. (And in theory you can use the TryInto trait for those, if you want to handle overflow errors.)
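A small sketch of the three options side by side. The function names are made up for illustration; the conversions themselves are the standard From, TryFrom and as.

```rust
use std::convert::TryFrom; // already in the prelude on edition 2021

/// Lossless widening: only compiles because every u8 fits in a u32.
fn lossless(x: u8) -> u32 {
    x.into()
}

/// Checked narrowing: returns None when the value does not fit.
fn checked(x: u32) -> Option<u8> {
    u8::try_from(x).ok()
}

/// Unchecked narrowing: `as` silently keeps only the low 8 bits.
fn truncating(x: u32) -> u8 {
    x as u8
}
```

For an input of 300, checked returns None while truncating returns 44, which is why a bare as deserves a second look in review.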

But I do find myself wanting a few extra operator implementations to handle the common cases.

timvermeulen · June 27, 2019, 12:47am

Are there any obvious downsides to doing this? Any instance of any of the primitive integer types unambiguously represents an element from the mathematical set of integers, so I don't think there ever can be debate about what the correct output should be. And still nothing would be converted implicitly so that aspect of Rust isn't compromised.
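As an illustration of "compare by mathematical value", here is a minimal sketch for one pair of types that has no smaller common supertype. The free function is hypothetical; the proposal would express the same logic as PartialOrd impls in std.

```rust
use std::cmp::Ordering;

/// Compares an i64 with a u64 by mathematical value, without widening:
/// any negative i64 is below every u64, and a non-negative i64 converts
/// to u64 without loss.
fn cmp_i64_u64(a: i64, b: u64) -> Ordering {
    if a < 0 {
        Ordering::Less
    } else {
        (a as u64).cmp(&b) // lossless here because a >= 0
    }
}
```

The as cast is safe only because of the preceding sign check, which is the kind of detail a std-provided impl would get right once for everyone.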

matt1985 · June 27, 2019, 1:23am

The main downside is that it makes type inference somewhat harder.

RustyYato · June 27, 2019, 1:24am

There may be some inference regressions but that should be all. I think that is fine with our backwards compatibility guarantees.

scottmcm · June 27, 2019, 2:15am

A quick experiment shows that even if x < 7 turns 7 into an i32 by literal fallback, LLVM is still smart enough to collapse that back down to a comparison in the original type:

fn lt_mixed(a: u32, b: i32) -> bool {
    b >= 0 && a < b as u32
}

pub fn lt_7_mixed(&x: &u8) -> bool {
    lt_mixed(x.into(), 7)
}

becomes

%1 = icmp ult i8 %0, 7

So it might be worth making a PR with all the PartialOrd combinations to see how bad the perf impact is (from potentially involving more code needing to be optimized-away) and to run a crater check to see how bad the inference problem is (I don’t know how much a < b propagating types between the two variables is critical today).

More data always helps
