`u32` as a second fallback type

Isn't that the same argument that is made against bounds-checking array accesses in C++?

The optimizer would be able to eliminate the extra handling in most cases, just as it does for array bounds checks (certainly when you shift by a constant, which is the most common case).

For those rare cases where squeezing out performance matters more than safety and the optimizer can't figure out the range, there could be a special intrinsic (CPU-specific or not) for "shift by x modulo 32".
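For what it's worth, Rust already exposes something close to that masked behavior through the wrapping shift methods. A minimal sketch (this is existing standard-library API, not the intrinsic proposed above):

fn main() {
    let x: u32 = 1;
    // wrapping_shl masks the shift amount by the bit width, i.e. it behaves
    // like "shift by x modulo 32" for u32: 33 & 31 == 1.
    assert_eq!(x.wrapping_shl(33), 1 << 1);
    // checked_shl refuses out-of-range amounts instead of masking them.
    assert_eq!(x.checked_shl(33), None);
    assert_eq!(x.checked_shl(3), Some(8));
}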

The material difference is the memory-safety implications. Unchecked indexing can violate memory safety; a masked shift can only be a logic error.

Uh? Yes, that's what I meant?

When I shift a u32 by a runtime-variable number of bits, much of the time 32 is actually a valid, expected value - often the valid range is 0-32 inclusive. So on multiple occasions I've had to special-case the shift by 32 manually, which probably results in worse performance than what the compiler could generate automatically if it had to handle that case.

For instance, just the other day, I wanted a bitmask of the n bottom bits. One way to do it is u32::MAX >> (32-n). Another way to do it is (1<<n).wrapping_sub(1). But both required a special case because of this problem: the first one for n=0, the second one for n=32.
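A sketch of the special-casing being described (the function names are mine; valid for n in 0..=32):

fn low_mask(n: u32) -> u32 {
    assert!(n <= 32);
    // u32::MAX >> (32 - n) would shift by 32 when n == 0, so that case
    // has to be handled by hand.
    if n == 0 { 0 } else { u32::MAX >> (32 - n) }
}

fn low_mask_alt(n: u32) -> u32 {
    assert!(n <= 32);
    // (1 << n) would shift by 32 when n == 32, so that end needs the branch.
    if n == 32 { u32::MAX } else { (1u32 << n).wrapping_sub(1) }
}

fn main() {
    assert_eq!(low_mask(0), 0);
    assert_eq!(low_mask(5), 0b11111);
    assert_eq!(low_mask(32), u32::MAX);
    assert_eq!(low_mask_alt(0), 0);
    assert_eq!(low_mask_alt(32), u32::MAX);
}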


Sorry, I thought you were kind of referring back to my initial comment about the typical realistic sizes of stack-based arrays in Rust.

The problem is that this would only work on arrays that were exactly 256 elements long, as opposed to everything up to that length.

"User defined types" aren't really useful in many cases when you're actually trying to declare something like a concrete length field in a struct and don't want it to be unnecessarily large (in terms of size in bytes).

For example, in my crate staticvec, I have no real choice but to declare the basic struct like this:

use core::mem::MaybeUninit;

pub struct StaticVec<T, const N: usize> {
  data: MaybeUninit<[T; N]>,
  length: usize,
}

I wouldn't even mind manually doing the conversions to usize for various stuff in the internal implementation involving length if it were possible to declare length as just "whatever data type actually makes sense for an array that has N capacity", but there's no way of doing that either.
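For illustration, a sketch of that manual alternative (the names here are hypothetical, not staticvec's API): committing to a u8 length for a small fixed capacity and widening to usize on the fly at each use site. The missing piece is a way to have the compiler derive the length type from N automatically.

use core::mem::MaybeUninit;

pub struct ByteLenVec<T, const N: usize> {
    data: MaybeUninit<[T; N]>,
    length: u8, // only a sound length type while N <= 255; nothing enforces that here
}

impl<T, const N: usize> ByteLenVec<T, N> {
    pub fn new() -> Self {
        Self { data: MaybeUninit::uninit(), length: 0 }
    }

    // The on-the-fly conversion mentioned above.
    pub fn len(&self) -> usize {
        self.length as usize
    }

    pub fn capacity(&self) -> usize {
        N
    }
}

fn main() {
    let v: ByteLenVec<u32, 16> = ByteLenVec::new();
    assert_eq!(v.len(), 0);
    assert_eq!(v.capacity(), 16);
}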


"Let's just add these impls" doesn't quite work, because making 10000 codebases give this error:

 |     println!("{}", a[i]);
 |                    ^^^^ `[T]` cannot be indexed by `i32`

isn't great. There are a lot of ways to make this work! Like an all-encompassing from-first-principles numeric stack, or default type hinting (as in, make the linked code sample not guess i32), or hacky compiler type defaults, or [fill in the blank]. My guess is that default type hinting (post-chalk?) is the easiest path, and this discussion can be more-or-less put off until that's a thing.
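For context, a small sketch (mine, not the linked code sample) of the inference that new impls would disturb:

fn main() {
    let a = [10, 20, 30];
    let i = 1; // nothing else constrains i, yet this compiles today:
               // usize is the only integer type usable as a slice index,
               // so inference settles on usize.
    println!("{}", a[i]);
    // If Index impls for u32/i32/etc. were added, `i` would become ambiguous
    // here, the i32 fallback would kick in, and code like this would start
    // producing the error quoted above.
}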


TBH I don't really get what your overall point is intended to be. Your playground link fails specifically because of the extremely contrived ambiguity you've created there (within the context of a single module) with (honestly nonsensical) Index implementations specifically for struct A. If you comment out either of the conflicting-for-very-obvious-reasons trait impls, the code compiles correctly.

Beyond that, the fact that rustc gives an error at all there for whatever reason (despite two index impls that output the type it's claiming is invalid) is an implementation detail (and possibly a bug or at least probably not the desired outcome) that is not all that relevant to the discussion here in my opinion. The logical behavior there would IMO be for rustc to either say "hey, your trait impls conflict in a super obvious way, fix that", or just return 42 from whichever one it feels like in the event it actually fully knows that their final concrete output is completely identical.

You seem to have missed the point that I was talking about defaulting to usize specifically for "primitive" arrays (or anything else where usize is "the universal norm" at basically the language level) in the event that the provided index doesn't have an explicitly named type, not general indexing for end-user types.

It's possible this would require some level of "compiler magic" that isn't currently applied to certain things, but to me that seems fine in the event that it's enough of an ergonomic win while amounting to "a thing the compiler absolutely could do relatively easily, but doesn't do currently." I don't think "Compiler magic is bad because it's bad and not for practical / tangible reasons I can explain", as I've sometimes seen at least implied, is a reasonable argument.

Not that I'm trying to argue my opinion is hugely relevant in the grand scheme of things, of course, haha.

Last edit: to be very clear, when I said I'd be fine with "manually doing the conversions to usize" in my other comment, I meant literally doing them on the fly for anything that required it. Not implementing Index or any other trait.

The observation that it would compile just fine if one impl was removed is his point as I understood it. Well, in reverse: if such a second impl was added, the errors would appear. It is unclear to me what you mean by conflicting impls; there's no problem with the impls, the error is with type inference. If you add an impl Index<i32> for A, it once again compiles, and uses that impl because i32 is the assumed default integer type.
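Since the playground link isn't reproduced here, a minimal reconstruction of the situation as described (my sketch, assuming Index<usize> and Index<u64> impls that both return i32):

use std::ops::Index;

struct A;

impl Index<usize> for A {
    type Output = i32;
    fn index(&self, _: usize) -> &i32 { &42 }
}

impl Index<u64> for A {
    type Output = i32;
    fn index(&self, _: u64) -> &i32 { &42 }
}

// With only the two impls above, `a[0]` in main fails: the literal could be
// usize or u64, inference can't pick either, the i32 fallback applies, and
// `A: Index<i32>` isn't satisfied. Adding this third impl makes it compile
// again, and it is the one that gets used, because i32 is the fallback type.
impl Index<i32> for A {
    type Output = i32;
    fn index(&self, _: i32) -> &i32 { &42 }
}

fn main() {
    let a = A;
    println!("{}", a[0]); // prints 42 via the Index<i32> impl
}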

I do think the error could be more helpful by pointing out that impls for usize and u64 do exist, and that if the programmer wanted one of those, they should specify the type one way or another.


The thread quickly veered off to other topics. But back to the "u32 as a second fallback type" idea: I understand the objection that fallbacks interact poorly with the type system, but isn't that already the case with i32? The result already depends on details such as where the fallback is applied first, etc. Does the issue become much worse if you add a second fallback?

No, because it's not a staged thing. After inference runs, the compiler just substitutes every still-unconstrained integer type variable with i32, and if that doesn't work it fails to compile. So defaulting to i32 can't make something else that was otherwise unknown become u32.
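A tiny illustration of that substitution (type_name is used only to make the inferred type visible):

fn show_type<T>(_: &T) -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    let x = 5; // nothing constrains this literal, so the fallback makes it i32
    assert_eq!(show_type(&x), "i32");
}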

I see. Here is a way I think this process could be extended:

  1. Constrain every integer literal to the {integer} trait.
  2. Perform type inference.
  3. Constrain every literal that is still ambiguous to a trait that is "i32 or u32".
  4. Continue type inference.
  5. Constrain every literal that is still ambiguous to i32.
  6. Continue type inference.

My experiments show that the {integer} trait is today treated somewhat specially, in that if the intersection of {integer} and some other trait constraint results in a single type, then that type is deduced. This doesn't seem to work for intersecting any two arbitrary traits (maybe it will in the future with the chalk type-system rewrite?). The "i32 or u32" trait would also have to behave like that.
