Implicit numeric widening/coercion proposal

I would be strongly opposed to such a change. I have seen (in C++) implicit casts cause bugs in both narrowing and widening directions. At my day job (where we still have C++) we use tooling to forbid all implicit casts.

7 Likes

I feel like Rust's existing implicit coercions (e.g. CoerceUnsized) can already lead to extremely confusing type system errors, particularly in complex cases.

It's an entirely separate set of rules that interacts with things like type inference and trait resolution, and those interactions can blow up in unexpected and counterintuitive ways.

6 Likes

I have been bitten badly by this exact issue in C (on a platform with 16-bit int):

// somewhere:
uint16_t timeout_secs = 150;

// somewhere else:
uint32_t timeout_millis = timeout_secs * 1000;
printf("%d\n", timeout_millis);  // prints 18928

At all points while writing this code I was careful in choosing my types to avoid overflow. But the buggy snippet was a quick change to a small piece of code in a hurry, and it didn't occur to me in the moment that timeout_secs was only 16 bits. (And because I used an artificially short timeout to test the timeout handling code, the bug made it into production and caused the program to abort early.)

You're right that the overflow is the "expected" behavior of these operations, but it's only expected behavior if you expect integer promotion to be happening. Implicit conversion rules can hide a subtle bug by silently inserting an incorrect conversion when you didn't realize the code needed one.

One of the reasons I write Rust today is that this bug can't happen to me anymore. Explicit conversions mean that when an integer needs to be widened, the compiler lets me know and asks me where I want to put the conversion, instead of taking a guess and leaving me in the dark about it.
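For contrast, here's a minimal sketch of the same code in today's Rust (the commented-out line is the one the compiler rejects):

let timeout_secs: u16 = 150;
// let timeout_millis: u32 = timeout_secs * 1000; // error[E0308]: mismatched types
let timeout_millis: u32 = u32::from(timeout_secs) * 1000; // widen first: 150_000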

This works for simple cases, but becomes even more surprising when you have complex expressions or expressions split out into separate variables. For example, the following code snippets will behave differently:

// A
let timeout_secs: u16 = 150;
let timeout_ticks: u32 = timeout_secs * 1000 * TICKS_PER_MS; // operands widened to u32, no overflow

// B
let timeout_secs: u16 = 150;
let timeout_millis = timeout_secs * 1000; // inferred as u16, overflows
let timeout_ticks: u32 = timeout_millis * TICKS_PER_MS;

Overflow and type-conversion bugs are tricky, so I really want my compiler to let me know whenever there's a surprising case instead of trying to figure out what I meant.

13 Likes

The major question here is why it's a good thing to do all of that implicitly, rather than facilitating it by way of compiler errors telling the user that they have the wrong type.

Maybe the user wants to widen the two u16 values before adding them. Maybe they want to go back to where those are defined and change their types. Maybe they have the wrong type on the function argument, and it should be changed to a u16.

We shouldn't guess, and risk unpleasantly surprising the user.

7 Likes

I wonder if this is something we can introduce across an edition. Keep the generic integer types u<N>/i<N> separate from the existing u8/i16/etc. This way you can add all of those widening impls and coercions, just for the generic integers.

The existing u8/i16/etc types would coerce only directly into the same-bitsize generic integer. Then in a new edition:

  • the iN syntax would become a shorthand for generic integers
  • all literals would be represented by the smallest possible generic integer
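To make the shape of that concrete, here's a rough stand-in using ordinary wrapper types (the names GU16/GU32 are invented for illustration; the real feature would be built into the language and use coercions rather than From):

#[derive(Clone, Copy)]
struct GU16(u16); // stand-in for a generic u<16>
#[derive(Clone, Copy)]
struct GU32(u32); // stand-in for a generic u<32>

// an existing u8/u16/etc converts only into the same-bitsize generic integer...
impl From<u16> for GU16 { fn from(x: u16) -> Self { GU16(x) } }
// ...while widening exists between the generic integers themselves
impl From<GU16> for GU32 { fn from(x: GU16) -> Self { GU32(u32::from(x.0)) } }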
2 Likes

TBH if you do this you want not just bitwidth restrictions, but a full Integer<MIN, MAX>. Then you have Integer<A, B> + Integer<C, D> → Integer<{A+C}, {B+D}> and such, along with making the literal 4 have type Integer<4, 4>.

With all that you get a bunch of nice properties. For example, you no longer need a u32::midpoint method, because (a + b) / 2 just works. (For example, you have (Integer<A, B> + Integer<A, B>) / Integer<2, 2> → Integer<{2*A}, {2*B}> / Integer<2, 2> → Integer<A, B>, and thus you end up with the same type you started with, with the middle widened just enough to be correct. And similarly with more complicated kernels like (a + b + c) / 3, which don't work with just bitwidths.) And from start: Integer<A, B> and last: Integer<C, D> you can make a RangeInclusive that iterates over Integer<A, D> -- where you'll notice that B and C are interestingly irrelevant, which is great in combination with the literal rule because then x..=10 will always iterate Integer<_, 10> in a way that the compiler can actually know.

You can then take that even further by using it in array indexing, for example, so there's a type system check that the calculation you did is in-bounds.
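For a feel of how this might look, here's a rough nightly-only sketch using const generics (Integer and its add method are names invented here, and generic_const_exprs is an incomplete feature, so this is an illustration rather than an implementation proposal):

#![allow(incomplete_features)]
#![feature(generic_const_exprs)]

#[derive(Clone, Copy, Debug)]
struct Integer<const MIN: i128, const MAX: i128>(i128);

impl<const MIN: i128, const MAX: i128> Integer<MIN, MAX> {
    fn add<const MIN2: i128, const MAX2: i128>(
        self,
        rhs: Integer<MIN2, MAX2>,
    ) -> Integer<{ MIN + MIN2 }, { MAX + MAX2 }> {
        Integer(self.0 + rhs.0) // can't overflow: the bounds grew to cover every sum
    }
}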


All that said, much as I like it, I don't know that it'd be feasible to move Rust to that.

9 Likes

I was responding to the concern that you specifically brought up as a problem with the proposal, since there was already an idea for solving it. Anyway, as I mentioned before, low-level things like emulators/VMs and compilers are cases that are made particularly hard to read, since they need to store very compact values which actually operate on larger values. In fact, anything that's memory-optimized likely faces this a lot when trying to use small types in certain places. And that's the key: you might think a type should be changed farther back, but if the type was chosen for a reason, changing it isn't an option. So this is about the ergonomics of that. Here is an example of a simple VM: Rust Playground. Many of the casts in this code are a formality and somewhat hurt readability.

If implicit widening existed, it could become a bit cleaner:

Instruction::Xori {dst, lhs, imm} => {
	self.registers[dst] = self.registers[lhs] ^ imm;
}
Instruction::Addi {dst, lhs, imm} => {
	self.registers[dst] = self.registers[lhs].wrapping_add_signed(imm);
}
Instruction::Maddi {dst, mul_imm, add_imm} => {
	self.registers[dst] = (self.registers[dst] as i32)
		.wrapping_mul(mul_imm)
		.wrapping_add(add_imm) as u32;
}
Instruction::Imm {dst, imm} => {
	self.registers[dst] = imm;
}
Instruction::Load {dst, ptr, offset} => {
	let addr = self.registers[ptr].wrapping_add_signed(offset) as usize;
	self.registers[dst] = self.ram[addr] as u32
		| (self.ram[addr + 1] as u32) << 8
		| (self.ram[addr + 2] as u32) << 16
		| (self.ram[addr + 3] as u32) << 24;
}

Part of this is the indexing. That part could already be improved today by adding getters and setters that take u8, but those would be two functions that do nothing but work around the ergonomics, so I don't find that as good as being able to reference the array directly. I also feel like it's not a great solution in general because it reduces consistency: if, say, you actually have a usize at another point, or you used a literal, you might want to use the array directly. And you can't use +=, or grab a reference, without adding yet more helper functions.
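For reference, the workaround described above might look something like this (the method names reg/set_reg are invented here):

struct Vm {
	registers: [u32; 256],
}

impl Vm {
	// two functions that exist only to absorb the u8 -> usize conversion
	fn reg(&self, i: u8) -> u32 { self.registers[usize::from(i)] }
	fn set_reg(&mut self, i: u8, v: u32) { self.registers[usize::from(i)] = v; }
}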

The other part is the arithmetic, in particular when the immediate operands are brought in. With these simple instructions, those are a small part of the code, but this is the focus to me. They always need to be extended, which is again only a formality, as they had the correct type and the intention of the operations is obvious.

Note how some explicit casts are still there. The as i32 and as u32 are needed because the signedness is changing, which needs to be done with care. The as usize is needed because converting u32 to usize isn't considered lossless due to a technicality. Finally, the last as u32 runs into the situation where a smaller inner operation is immediately converted: if the language propagated the coercion into the operands, that cast wouldn't be needed and the result would be correct without it. Otherwise, see the last part of this post.

Instruction::Store {src, ptr, offset} => {
	let addr = self.registers[ptr].wrapping_add_signed(offset) as usize;
	let value = self.registers[src];
	self.ram[addr] = value as u8;
	self.ram[addr + 1] = (value >> 8) as u8;
	self.ram[addr + 2] = (value >> 16) as u8;
	self.ram[addr + 3] = (value >> 24) as u8;
}

This is the opposite: it shows as being required for lossy casts that need to be done with care. Without a bunch of lossless casts around them that also need as, I find these a bit easier to spot. The same goes for the signedness changes in the previous part.


The type argument is explicit, so shouldn't it be impossible for it to become u64? If it were inferred, then it would just be u64. In any case, this has to be limited to arithmetic. When the user uses + on a primitive instead of strict_add or wrapping_add, they are essentially signing a contract that it won't overflow (or that they don't care what happens if it does), in exchange for convenience. If the operands were extended, we know the resulting value will be correct, because that's how those numbers work, and this would be part of that convenience. Since none of this reasoning makes sense for arbitrary functions (as opposed to arithmetic operators on known primitive types), they shouldn't be affected. Arbitrary types shouldn't be affected either.


timeout_millis would be inferred as u32, not u16, because the conversion would be propagated backwards into it. (If this feels weird to you, see below.)


To be clear, the goal here isn't to justify doing it a certain way, it's to find some way to allow this in certain situations where it can improve ergonomics while still avoiding those problems, and change the proposal accordingly.

The idea behind propagating conversions backwards into operands was that it would preserve the correct values and as far as I can tell it can't be incorrect or surprising due to how those operations work. However, another possibility is to use the mechanism for that to simply error. For example, takes_u32(1_u8 + 1_u8) would just error.

let a: u8 = 1;
let b = a + 2;
let c: u32 = b;

would error. The error message could be something like "result of a lower-precision operation was extended implicitly, please convert explicitly" (but better worded). So instead of solving the surprising cases, it would just catch them and tell you. Would this be better? We'd also have to think about whether there are any more edge cases, to make sure they're caught.

While this is an interesting idea, what assembly will this generate? Will it be efficient (especially for u64::midpoint or u128::midpoint)?

Real CPUs support only a handful of fixed sizes, but they have overflow flags, which high-level languages don't generally expose. Relying on the flags can introduce a data dependency, though modern CPUs might be smart enough to deal with that. So I could see this going either way, depending on how smart LLVM is.

1 Like

If the types just grow whenever overflow is possible, then it can simply be compiled as the next supported type. If u16 + u16 = u17, then you can just calculate it as u64 or whatever you have, because there won't be overflow anyway.
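For instance, under that rule a hypothetical u16 + u16 → u17 could be lowered today as (add_u16_exact is an invented name):

// the "u17" sum, computed exactly in the next hardware-supported width
fn add_u16_exact(a: u16, b: u16) -> u32 {
    u32::from(a) + u32::from(b) // at most 131_070, which always fits in u32
}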

I actually don't think implicit casts are going to help as much as you think in the example code, but perhaps. What would definitely help are some array-of-byte style accessors, either on your ram's impl or elsewhere, but the language does come with some by default.

This could be written:

Instruction::Store {src, ptr, offset} => {
    let addr = self.registers[ptr].wrapping_add_signed(offset) as usize;
    let value = self.registers[src];
    self.ram[addr..addr + 4].copy_from_slice(&value.to_le_bytes());
}

Likewise

self.registers[dst] = u32::from_le_bytes(self.ram[addr..addr + 4].try_into().unwrap());

The ergos of that try_into().unwrap() could also be improved, but e.g. Ram could have a store_u32/load_u32 pair that does the same.
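A sketch of that pair, under the assumption that Ram wraps a byte buffer (the type and method names here are invented):

struct Ram(Vec<u8>);

impl Ram {
    fn store_u32(&mut self, addr: usize, value: u32) {
        self.0[addr..addr + 4].copy_from_slice(&value.to_le_bytes());
    }
    fn load_u32(&self, addr: usize) -> u32 {
        u32::from_le_bytes(self.0[addr..addr + 4].try_into().unwrap())
    }
}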

1 Like

To be clear, those snippets are the hypothetical cleaned-up version. The real version, with more explicit casts, is in the playground link. That is indeed a cleaner way to write that part, but that's the part where I intentionally left the casts in because they would still be necessary (unless the propagation makes it in). Good catch though.

The ergos of that try_into().unwrap() could also be improved

Honestly, I always think about this when I do this kind of thing (like creating a slice or having to use try_into). I feel like a "start; length" type of range is really missing from the language.

FYI this is essentially what I intend to battle-test with deranged. As soon as it's possible on stable, I'll be including

within the same primitive type. If we somehow get the ability to determine what the internal primitive type is, I could even make that implicit — essentially matching your idea with the exception of integer literals (another compiler limitation).

On a related note: in lang, we've talked about having some mechanism for making integer literals use a special type so that compile-time computations never overflow, and then coercion to the expected type will error if it's out of range.

That does seem entirely appropriate for integer literals, as opposed to specifically defined types.

4 Likes

I don't find this persuasive at all, because why are you mixing types here in the first place? wrapping_mul and wrapping_add are identical between i32 and u32, so I'd say the program should instead pick one of those two and be consistent about it, rather than converting like this.

(I'd probably pick whichever is more convenient for however immediates get extended, and use that everywhere, because whatever, with .cast_unsigned() and .cast_signed() as needed for things like sdiv or udiv. Plus conversions between i32 and u32 aren't ever explicit widening anyway...)
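For example, a sign-sensitive division could keep everything u32 and reinterpret only at the operation itself (sdiv here is an invented helper; cast_signed/cast_unsigned are stable as of Rust 1.87):

fn sdiv(lhs: u32, rhs: u32) -> u32 {
    // registers stay u32 everywhere; reinterpret only for the signed op
    (lhs.cast_signed() / rhs.cast_signed()).cast_unsigned()
}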

Similarly, the vast majority of what's happening here is just failing to use

self.registers[dst] = u32::from_le_bytes(*self.ram[addr..].first_chunk().unwrap());

(Where I use an unwrap because the indexing also panics if you go past the end of the RAM.)

Because maybe I'm parsing a format that's only expecting u32, and bigger numbers are supposed to fail.

Yup, LLVM already handles it. Give it

define i64 @midpoint_u64(i64 noundef %a, i64 noundef %b) local_unnamed_addr {
    %aa = zext i64 %a to i65
    %bb = zext i64 %b to i65
    %cc = add nuw i65 %aa, %bb
    %dd = lshr i65 %cc, 1
    %d = trunc nuw i65 %dd to i64
    ret i64 %d
}

define i128 @midpoint_u128(i128 noundef %a, i128 noundef %b) local_unnamed_addr {
    %aa = zext i128 %a to i129
    %bb = zext i128 %b to i129
    %cc = add nuw i129 %aa, %bb
    %dd = lshr i129 %cc, 1
    %d = trunc nuw i129 %dd to i128
    ret i128 %d
}

and you get https://llvm.godbolt.org/z/zK7ac84Er

midpoint_u64:
        mov     rax, rdi
        and     rax, rsi
        xor     rdi, rsi
        shr     rdi
        add     rax, rdi
        ret
midpoint_u128:
        mov     rax, rdi
        add     rax, rdx
        adc     rsi, rcx
        setb    cl
        shrd    rax, rsi, 1
        movzx   edx, cl
        shld    rdx, rsi, 63
        ret

For u64 that's the classic &+^+>> trick; for u128 it uses the carry flag and multi-precision shifts. A great result.
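(Written out in Rust, the u64 version LLVM found is the well-known overflow-free average:

fn midpoint_u64(a: u64, b: u64) -> u64 {
    (a & b) + ((a ^ b) >> 1) // shared bits, plus half of the differing bits
}

where the & term carries the bits both numbers share and the ^ term contributes half of the rest.)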

4 Likes

No, no, those snippets were supposed to be the hypothetical cleaner version of the code (which doesn't compile). The full code was in the playground link. You're right that those were unnecessary; in fact, it seems I had already removed one of them in the playground and forgot to update the snippet. But anyway, those were examples of casts that would remain, not casts that would be removed.

An example from that sample again, but with both versions shown:

Instruction::Xori {dst, lhs, imm} => {
	self.registers[dst as usize] = self.registers[lhs as usize] ^ imm as u32;
}
// to
Instruction::Xori {dst, lhs, imm} => {
	self.registers[dst] = self.registers[lhs] ^ imm;
}

Part of the improvement is the indexing. The other part is imm being used in the operation directly, which is the focus to me.

Yes, exactly. You specified u32, so it will be parsed as u32 (and then extended if you need it to be). If you left it up to inference, then it would be inferred as u64. That’s just how it works right now. I don't want to change anything about that.

Indeed. An example issue with unsizing coercion was just recently discussed here.

I'd say any implicit coercion is a potential source of very nasty bugs, and there should be very, very strong reasons to include it. Like, massive ergonomic wins all over the code. Not just a slight simplification in a few places where the types were sloppy to begin with.

2 Likes

I think the meta-disconnect here is around whether the code that was written was correct.

I actually completely agree that if the code is correct, then implicit widening is great. (Similarly, global inference is wonderful when the code is correct, like if you design a language to have terse final code for inclusion in an academic paper.)

The problem is that most of the code I write is at least subtly wrong at first, and these inconsistencies are at least as likely to be me having written the wrong thing as they are to be because I really meant it.

If I wrote .parse::<u32>()? and then gave it to something that wants u64, I think it's good for the compiler to take that moment to say "hey, which thing did you mean?" -- particularly if there's an easy way for it to offer me both fixes in a way I can trivially apply.
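Something like this sketch of the scenario (takes_u64 and demo are invented names):

fn takes_u64(_: u64) {}

fn demo(s: &str) -> Result<(), std::num::ParseIntError> {
    let n = s.parse::<u32>()?;
    // takes_u64(n); // error today: expected u64, found u32
    takes_u64(u64::from(n)); // fix one; fix two is s.parse::<u64>()?
    Ok(())
}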

14 Likes

I gave up on the idea because the reactions have been negative and the latest arguments against it have been reasonable. I believe this thread has fulfilled its purpose of finding out what exact issues, if any, this idea would cause.

I would then like to see the abilities to index arrays with arbitrary types and perform arithmetic between different types, like what was mentioned by @toc, although not necessarily like that. I believe those abilities would be a better way to improve the ergonomics like I wanted without introducing any of these issues. The problem then becomes that, although everyone seems to want those, the ways to make them happen have been stuck in bikeshedding for a long time. But that's its own topic.

1 Like

One question worth asking there: do you want the ability to index the same array with many different types, or do you want the ability to define an array indexed by one specific type that isn't usize?

The former has some of the same problems as coercion.

The latter, on the other hand, seems extremely reasonable and worth pursuing.

2 Likes

The biggest problem I know of is that the transition to do it for indexing is hard, because it needs a fallback, and fallback is historically just fundamentally hard. (Distinct from just bikeshedding, since it's not just "well, if someone decides it's done under the current name, it could ship".)

I really want this for Vec in particular. rustc takes huge advantage of IndexVec which could be even better if it stored u32s instead of usizes.
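A minimal sketch of that shape, in the spirit of IndexVec (all names here are invented for illustration):

use std::marker::PhantomData;
use std::ops::Index;

struct RegId(u32);

trait Idx { fn index(self) -> usize; }
impl Idx for RegId { fn index(self) -> usize { self.0 as usize } }

// a vector that can only be indexed by its one designated index type
struct TypedVec<I, T> {
    data: Vec<T>,
    _marker: PhantomData<I>,
}

impl<I: Idx, T> Index<I> for TypedVec<I, T> {
    type Output = T;
    fn index(&self, i: I) -> &T { &self.data[i.index()] }
}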

1 Like