This seems like an obvious topic, so I searched around, but it apparently hasn't been discussed for at least 6 years, and as far as I can tell the discussions stopped because they drifted off-topic and were then forgotten. So I think it's appropriate to bring it up again with a stricter scope, providing as much specific reasoning as I can. I guess this is a pre-RFC, but I'm new to this.
Rust's lack of implicit numeric coercions prevents unexpected precision loss and forces conversions to be intentional. That is often valuable, but it overlooks the fact that only some conversions can lose information. The constant use of as adds a lot of noise to code that doesn't deserve it. Adding just the coercions that are lossless would be very convenient while keeping the benefits intact. In fact, I believe it would bring an extra benefit: if you get precision loss somewhere in your code, it came from a lossy conversion, and since only lossy conversions would still require as, they would be much easier to spot. This is similar to how UB typically comes from an unsafe block.
These are the needed changes:
When a value of a known numeric type T is provided but a specific numeric type E is expected, for example in a function call or an assignment, as a last resort before erroring, check if the conversion T -> E is allowed as per the tables below, and if so, perform it. For example, let a: u64 = 1u32; would be valid, and a would be 1u64.
When a binary arithmetic operator is used between two values of known numeric types A and B, but the operation would error due to mismatched types, check if either A -> B or B -> A is an allowed conversion, and if so, perform it to equalize the two types and make the operation valid. For example, 1u64 + 1u32 would be valid, and the result would be 2u64.
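A sketch of what the two rules would accept (none of this compiles today; the comments show the conversion the compiler would effectively insert):

```rust
// Rule 1: a lossless widening where the expected type is known.
let a: u64 = 1u32;      // becomes: let a: u64 = u64::from(1u32);
fn takes_f64(_: f64) {}
takes_f64(1u8);         // becomes: takes_f64(f64::from(1u8));

// Rule 2: mismatched operands of an arithmetic operator are equalized.
let sum = 1u64 + 1u32;  // becomes: 1u64 + u64::from(1u32), i.e. 2u64
```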
These are the main conversions that are obviously safe: by definition, data loss is impossible, and they cannot be reversed without an explicit as:
u8 -> [u16, u32, u64, u128, usize, f32, f64]
u16 -> [u32, u64, u128, usize, f32, f64]
u32 -> [u64, u128, f64]
u64 -> [u128]
usize -> [u128]
i8 -> [i16, i32, i64, i128, isize, f32, f64]
i16 -> [i32, i64, i128, isize, f32, f64]
i32 -> [i64, i128, f64]
i64 -> [i128]
isize -> [i128]
f32 -> [f64]
These are also safe, and I believe they should also be included, but whether they're desirable is slightly more debatable (see the last note):
u8 -> [i16, i32, i64, i128, isize]
u16 -> [i32, i64, i128]
u32 -> [i64, i128]
u64 -> [i128]
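For reference, the fixed-width conversions in both tables already have lossless From impls today (the usize/isize rows mostly don't, I believe, since std avoids committing to pointer widths); the proposal would effectively just let the compiler insert these calls:

```rust
let a = u64::from(1u32);  // u32 -> u64 (first table)
let b = f64::from(1u32);  // u32 -> f64 (first table)
let c = i64::from(1u32);  // u32 -> i64 (second table)
let d = i128::from(1u64); // u64 -> i128 (second table)
```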
Notes:
This is backwards compatible, as it only deals with cases that would error before.
This doesn't interfere with inference or generics, as it doesn't apply if a type isn't already known.
This would allow indexing arrays with u8 and u16 in addition to usize, because the only implementation of Index for an array is Index<usize>, which is currently used as a type inference hint. Allowing indexing with larger types would require a different solution, since u32 is not considered convertible to usize. If such a solution is added in the future, it wouldn't require any change to this proposal, so this is forwards compatible.
Despite being a common "small" substitute for usize on 64-bit platforms, u32 cannot be implicitly converted to usize because usize is allowed to be 16-bit. Changing that is out of scope. Instead, if Rust ever decides to not allow 16-bit usize in a future version, this implicit conversion, as well as From<u32> for usize, should be retroactively added.
Similarly, usize cannot be implicitly converted to u64 because Rust leaves room for usize to be wider than 64 bits on a hypothetical future 128-bit platform. Obviously, it will take a long time for pointers to need more than 64 bits, if they ever do, but that discussion is also out of scope.
Converting unsigned types to signed ones is theoretically less likely to be intended by the programmer, but I don't believe this is a significant issue, because it only happens if you gave the value an "incorrect" type in the first place. For example, I work on things like compilers and emulators, so I often hold "unsigned" values whose signedness is actually meant to be unknown. Say I wanted to check whether a u16 value is between -128 and 127 when interpreted as signed; if I forget to convert it to i16 first, the value might be silently converted directly from u16 to i32, and the check would be wrong. However, since I explicitly typed it as u16 in the first place, I essentially told Rust that this is how the value should be interpreted, and my unusual usage of it is my responsibility, so that behavior seems reasonable to me. Also, since Rust defines u16 as i32 as a zero extension, the language seemingly already agrees that this kind of widening is unambiguous.
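A sketch of that scenario, assuming comparisons would participate in the coercion the way arithmetic does:

```rust
// Intended: reinterpret the u16 as signed, then range-check.
fn fits_in_i8(x: u16) -> bool {
    let s = x as i16; // explicit reinterpretation, required today
    (-128..=127).contains(&s)
}

// If the cast were forgotten under this proposal, x would be silently
// zero-extended to i32: e.g. x = 0xFF80 ("-128" as i16) becomes 65408,
// so a check like -128 <= x && x <= 127 would wrongly return false.
```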
I think I was mistaken in my note about indexing; I'll have to correct the proposal. Yes, that would work. My original thinking was that it would still look for an impl Index<u16> for [u8; 3] and fail. I also thought that implementing Index for other types sounded like a more "complete" solution, but apparently that can't be done because it would break type inference for the index value: inference uses the fact that the only Index impl for arrays takes usize to conclude that any index expression is usize.
But once I realized that, it occurred to me that this means the expected type is known, so the index would naturally be coerced to usize, which would allow indexing with u8 and u16. This might also apply to other kinds of trait method calls where only one impl exists. It's still backwards and forwards compatible and still doesn't interfere with inference, so I see no problem with allowing it; it's just a slight improvement.
u32 wouldn't work sadly, but I often use u8 or u16 as array indices, so it's still useful. For example, think of a compact u8 index into a small list of registers in an emulator, which is getting referenced in the implementation of every single instruction. Also, if some sort of solution is added for indexing with u32/u64/u128 in the future, it wouldn't break this (thus "forwards compatible").
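A minimal sketch of that emulator pattern (names are hypothetical): today the index has to be widened by hand at every use site, while under this proposal the shorter form would just work.

```rust
// A compact register file indexed by a u8 taken from an instruction encoding.
struct Cpu {
    regs: [u32; 16],
}

impl Cpu {
    fn read_reg(&self, r: u8) -> u32 {
        self.regs[usize::from(r)] // what we must write today
        // self.regs[r]           // what the proposal would additionally allow
    }
}
```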
It would be the latter, yes. I don't think that generally makes a difference, but I see that it can overflow. There's no need for a and b to be different types for that, though:
```rust
fn foo(a: u16, b: u16) -> u32 {
    a + b // (a + b) as u32
}
```
That said, this is how additions work in general in Rust: every single invocation of + is technically allowed to overflow. And at least to me, this is what I expect, because an operation on two values depends on their types, and shouldn't be affected by a later operation (the return, in this case). So I suppose you mean that someone might write this carelessly, forgetting to widen their types at all. But still, they did explicitly declare a and b as u16, so I find it unlikely that they don't expect u16 arithmetic there.
A third rule that extends inner operands to prevent overflows would guarantee the former and prevent the overflow, but then I fear that behavior would be unexpected.
Maybe it would help to detect this pattern in a lint regardless?
currently the literal is coerced to i16 after it fails to be i32 (which is the default for literals).
with widening available, I would (naively) expect a as i32 + 1 because literals default to i32.
also: currently inference also works backwards: Rust Playground. how would it work there?
I don't think that's how it works? Literals are affected by type inference first, so they will be inferred to be i16 in the first case and u16 in the second case. Literals default to i32 if a more specific type cannot be found.
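For instance (not the exact playground cases, just the inference order I mean):

```rust
let a: i16 = 1 + 1; // both literals are inferred as i16; the i32 default never applies
let b = 1u16 + 1;   // the bare 1 is inferred as u16
```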
Other than that the issue is the same as before. At the very least it would be caught by a lint like "result of a lower precision operation immediately extended, did you mean to extend the operands first?".
But you're not supposed to overflow in Rust anyway, since it panics in debug mode, so I think widening early could actually make sense? That would solve this issue and guarantee that the results in all of your examples will be correct. The only problem would be if the user expects it to overflow, which doesn't seem to make sense in any situation, because the correct way to intentionally handle overflow is by calling one of the explicit *_add functions; you're not supposed to let the plain + operator overflow. The third rule would then be basically: if a binary operation is immediately widened, the individual operands get coerced to the result type first. Might need some more rules to handle all the edge cases as well.
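Applied to the earlier foo example, a sketch of what that third rule would do:

```rust
fn foo(a: u16, b: u16) -> u32 {
    // Instead of widening the result, i.e. u32::from(a + b), where the
    // u16 addition can overflow, the rule would widen the operands first:
    u32::from(a) + u32::from(b) // cannot overflow
}
```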
I might be wrong, but that honestly doesn't look clear even without automatic widening. This note in the docs for leading_zeros makes me think someone has had the same thought before:
Depending on what you’re doing with the value, you might also be interested in the ilog2 function which returns a consistent number, even if the type widens.
Imo, the issue here isn't exactly that the type conversion is unexpected, but that leading_zeros (and co.) is a member function whose behavior is too dependent on an implicit type, making it unclear. I feel like it should be encouraged to use u32::leading_zeros(x) rather than x.leading_zeros() in general.
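A small illustration of how much the answer depends on the receiver's type:

```rust
let x: u8 = 1;
assert_eq!(x.leading_zeros(), 7);             // answer depends on x being u8
assert_eq!(u32::from(x).leading_zeros(), 31); // same value, different answer
assert_eq!(x.ilog2(), 0);                     // consistent...
assert_eq!(u32::from(x).ilog2(), 0);          // ...even if the type widens
```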
Default type ascription solves a similar set of issues as implicit widening, and then some. Currently we have
```rust
impl Add<u8> for u8 {
    type Output = u8;
    fn add(self, rhs: u8) -> Self::Output { ... }
}
```
Now foo_u8 + bar can infer that bar: u8. In situations like foo_u32 + (bar_u8 * baz_u8) we can error and force the programmer to explicitly figure out what they want the overflow to look like (this is something that any implicit widening proposal needs to cover). But there's no reason we couldn't define some more impls:
```rust
impl Add<u32> for u8 {
    type Output = u32;
    fn add(self, rhs: u32) -> Self::Output { ... }
}

impl Add<u8> for u32 {
    type Output = u32;
    fn add(self, rhs: u8) -> Self::Output { ... }
}
```
Except that breaks inference; we need some way to specify which implementation gets picked in ambiguous cases. And it breaks the warnings/errors, which can be fixed. It's definitely more ergonomic in very simple cases, but any case more complicated than foo_u32 + (bar_u8 * baz_u8) will need the programmer to be explicit.
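A sketch of the breakage (the second impl is hypothetical, not current std):

```rust
let x: u8 = 200;

// Today, Add<u8> for u8 is the only candidate, so the literal must be u8:
let y = x + 1;

// If impl Add<u32> for u8 also existed, the literal could be u8 or u32,
// with different output types, so this would become an ambiguity error
// unless annotated (x + 1u8 vs x + 1u32) or resolved by a defaulting rule.
```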
Which makes me think of how this interacts (if at all) with Wrapping and friends. Do the wrappers still need explicit as casts or into calls? That feels…unfortunate. But then how to deal with something like:
```rust
fn foo(i: Wrapping<i8>) -> i16 {
    i + 1
}
```
Now one could want the late implicit-as cast. The alternative, I suppose, is:
```rust
let j: Wrapping<i8> = i + 1;
j
```
which feels…awfully magical and against the "prefer to be explicit" guideline Rust generally follows. Type ascription could help, but that seems like a heavy anchor to attach new proposals to (last I knew).
Hmm, I thought we had an RFC for this, but it looks like we don't.
My biggest feedback for this RFC is that it needs more specific examples of real situations where this is better than the alternatives.
The core problem is that, while in isolation the trivial, fully-annotated cases like

```rust
let x: u32 = 1;
let y: u64 = x;
```

would be fine to widen, things get murkier once the expressions are any broader.
For example, if you have a: u16 and b: u16 and fn foo(_: u32), then in a call like foo(a + b) it's probably wrong to do foo(u32::from(a + b)), because you likely wanted foo(u32::from(a) + u32::from(b)) instead.
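Concretely (hypothetical names; the wrapped value checked by hand):

```rust
fn foo(_: u32) {}

fn call_it(a: u16, b: u16) {
    // What the implicit rule would insert: add in u16, then widen.
    foo(u32::from(a + b)); // a = b = 60_000: panics in debug, wraps to 54_464 in release

    // What the programmer likely meant: widen first, then add.
    foo(u32::from(a) + u32::from(b)); // 120_000
}
```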
And thus any simple "well, we have a u16, so it's fine to implicitly widen it" rule like the one proposed here is not something the lang team is willing to accept, from my understanding of the current gestalt. (Speaking for myself, not for the team, though we've discussed things similar to this in the past, which is why I have this example readily available.)
The version of this that wouldn't have this problem would essentially be having u16 + u16 produce a u17, and then promoting the u17 to u32 would be fine because the intermediate operation couldn't have overflowed. But that's a massive change that's probably not practical for Rust at this point, since it would need to change nearly every operation first.
I see what you mean. That could handle the case of operations, although it still wouldn't automatically convert in assignments or function calls. I feel like the two features would work together, if that one ever happens. (Also I believe the syntax would be impl<Rhs = u8> Add<Rhs> for u8 {?)
While the ergonomics of Wrapping are a bit unfortunate, as you say, it is a struct in the standard library, so it can't really be covered by these coercion rules, and it doesn't even support as casts. Your code there needs to use into explicitly. Implementing something similar for arbitrary types would be a whole other can of worms, and probably not desirable, for the same reason that .clone() isn't automatically inserted by the compiler.
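For completeness, a sketch of how that example has to be written today:

```rust
use std::num::Wrapping;

fn foo(i: Wrapping<i8>) -> i16 {
    let j = i + Wrapping(1); // Wrapping<i8> adds with another Wrapping<i8>
    j.0.into()               // unwrap the i8, then widen explicitly
}
```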
Thanks for the feedback. The proposal is slowly getting more specific as these concerns are brought up. Did you see our discussion of that case above? The idea is that, since overflowing a plain operator like + is already treated as a crime (it panics in debug mode but wraps in release mode, which honestly feels like a stronger deterrent than if it just panicked consistently), and we have ways for the user to choose the specific behavior they intend (strict_add, checked_add, wrapping_add, Wrapping, etc.), it might be reasonable to propagate conversions backwards into the operands of arithmetic operations, guaranteeing that the result is numerically correct. For example, let a: u32 = 255u8 + 1u8; should cause the individual operands to be converted to u32 first, and result in 256u32, which is the correct result. I believe that would have a similar advantage to the u17 idea you mention. I will update the main post; I'm just still thinking about the specifics. There might also be quirks or problems with this idea, but I'm not sure yet.
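A sketch of the desugaring under that propagation rule:

```rust
// Today `let a: u32 = 255u8 + 1u8;` is a type error, and the u8 addition
// would overflow anyway. Under the rule, the expected type u32 would
// propagate into the operands first:
let a: u32 = u32::from(255u8) + u32::from(1u8);
assert_eq!(a, 256);
```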
Function calls would work if the function were able to accept Into<T> with a default of T. Currently, doing that either means one always has to specify T or callers are always forced to call .into(). Automatic widening in assignments is indeed not covered by type defaults.
Fixed, kind of. The second syntax doesn't compile either way.
Again, this is solution-focused. Please instead show realistic (non-toy) examples of code with these kinds of width inconsistencies and explain why the current things that exist aren't enough. What you need, before talking about the changes you want to make, is to convince people that the need for explicit widening is even a problem in the first place.
Note that arithmetic isn't the only possible problem here. Imagine
```rust
takes_u64(text.parse::<u32>()?)
```
where the language legitimately doesn't know if it should be
```rust
takes_u64(u64::from(text.parse::<u32>()?))
```
or
```rust
takes_u64(text.parse::<u64>()?)
```
and thus the lack of implicit widening is a helpful point to ask the programmer which behaviour they want.