`u32` as a second fallback type

Isn't that the same argument that is made against bounds-checking array accesses in C++?

The optimizer would be able to eliminate the extra handling in most cases, just as it does for array bounds checks (certainly when you shift by a constant, which is the most common case).

For those rare cases where squeezing out performance matters more than safety and the optimizer can't figure out the range, there could be a special intrinsic (CPU-specific or not) for "shift by x modulo 32".
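For what it's worth, Rust already exposes something close to that masked behavior through the wrapping shift methods. A minimal sketch (this is existing standard-library API, not the intrinsic proposed above):

fn main() {
    let x: u32 = 1;
    // wrapping_shl masks the shift amount by the bit width, i.e. it behaves
    // like "shift by x modulo 32" for u32: 33 & 31 == 1.
    assert_eq!(x.wrapping_shl(33), 1 << 1);
    // checked_shl refuses out-of-range amounts instead of masking them.
    assert_eq!(x.checked_shl(33), None);
    assert_eq!(x.checked_shl(3), Some(8));
}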

The material difference is the memory-safety implications. Unchecked indexing can violate memory safety; a masked shift can only be a logic error.

Uh? Yes, that's what I meant?

When I shift a u32 by a runtime-variable number of bits, much of the time 32 is actually a valid, expected value - often the valid range is 0-32 inclusive. So on multiple occasions I've had to special-case the shift by 32 manually, which probably results in worse performance than what the compiler could generate automatically if it had to handle that case.

For instance, just the other day, I wanted a bitmask of the n bottom bits. One way to do it is u32::MAX >> (32-n). Another way to do it is (1<<n).wrapping_sub(1). But both required a special case because of this problem: the first one for n=0, the second one for n=32.
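A sketch of the special-casing being described (the function names are mine; valid for n in 0..=32):

fn low_mask(n: u32) -> u32 {
    assert!(n <= 32);
    // u32::MAX >> (32 - n) would shift by 32 when n == 0, so that case
    // has to be handled by hand.
    if n == 0 { 0 } else { u32::MAX >> (32 - n) }
}

fn low_mask_alt(n: u32) -> u32 {
    assert!(n <= 32);
    // (1 << n) would shift by 32 when n == 32, so that end needs the branch.
    if n == 32 { u32::MAX } else { (1u32 << n).wrapping_sub(1) }
}

fn main() {
    assert_eq!(low_mask(0), 0);
    assert_eq!(low_mask(5), 0b11111);
    assert_eq!(low_mask(32), u32::MAX);
    assert_eq!(low_mask_alt(0), 0);
    assert_eq!(low_mask_alt(32), u32::MAX);
}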


Sorry, I thought you were kind of referring back to my initial comment about the typical realistic sizes of stack-based arrays in Rust.

The problem is that this would only work on arrays that were exactly 256 elements long, as opposed to everything up to that length.

"User defined types" aren't really useful in many cases when you're actually trying to declare something like a concrete length field in a struct and don't want it to be unnecessarily large (in terms of size in bytes).

For example, in my crate staticvec, I have no real choice but to declare the basic struct like this:

use core::mem::MaybeUninit;

pub struct StaticVec<T, const N: usize> {
  data: MaybeUninit<[T; N]>,
  length: usize,
}

I wouldn't even mind manually doing the conversions to usize for various stuff in the internal implementation involving length if it were possible to declare length as just "whatever data type actually makes sense for an array that has N capacity", but there's no way of doing that either.
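For illustration, a sketch of that manual alternative (the names here are hypothetical, not staticvec's API): committing to a u8 length for a small fixed capacity and widening to usize on the fly at each use site. The missing piece is a way to have the compiler derive the length type from N automatically.

use core::mem::MaybeUninit;

pub struct ByteLenVec<T, const N: usize> {
    data: MaybeUninit<[T; N]>,
    length: u8, // only a sound length type while N <= 255; nothing enforces that here
}

impl<T, const N: usize> ByteLenVec<T, N> {
    pub fn new() -> Self {
        Self { data: MaybeUninit::uninit(), length: 0 }
    }

    // The on-the-fly conversion mentioned above.
    pub fn len(&self) -> usize {
        self.length as usize
    }

    pub fn capacity(&self) -> usize {
        N
    }
}

fn main() {
    let v: ByteLenVec<u32, 16> = ByteLenVec::new();
    assert_eq!(v.len(), 0);
    assert_eq!(v.capacity(), 16);
}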


"Let's just add these impls" doesn't quite work, because making 10000 codebases give this error:

 |     println!("{}", a[i]);
 |                    ^^^^ `[T]` cannot be indexed by `i32`

isn't great. There are a lot of ways to make this work! Like an all-encompassing from-first-principles numeric stack, or default type hinting (as in, make the linked code sample not guess i32), or hacky compiler type defaults, or [fill in the blank]. My guess is that default type hinting (post-chalk?) is the easiest path, and this discussion can be more-or-less put off until that's a thing.
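For context, a small sketch (mine, not the linked code sample) of the inference that new impls would disturb:

fn main() {
    let a = [10, 20, 30];
    let i = 1; // nothing else constrains i, yet this compiles today:
               // usize is the only integer type usable as a slice index,
               // so inference settles on usize.
    println!("{}", a[i]);
    // If Index impls for u32/i32/etc. were added, `i` would become ambiguous
    // here, the i32 fallback would kick in, and code like this would start
    // producing the error quoted above.
}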


TBH I don't really get what your overall point is intended to be. Your playground link fails specifically because of the extremely contrived ambiguity you've created there (within the context of a single module) with (honestly nonsensical) Index implementations specifically for struct A. If you comment out either of the conflicting-for-very-obvious-reasons trait impls, the code compiles correctly.

Beyond that, the fact that rustc gives an error at all there for whatever reason (despite two index impls that output the type it's claiming is invalid) is an implementation detail (and possibly a bug or at least probably not the desired outcome) that is not all that relevant to the discussion here in my opinion. The logical behavior there would IMO be for rustc to either say "hey, your trait impls conflict in a super obvious way, fix that", or just return 42 from whichever one it feels like in the event it actually fully knows that their final concrete output is completely identical.

You seem to have missed the point that I was talking about defaulting to usize specifically for "primitive" arrays (or anything else where usize is "the universal norm" at basically the language level) in the event that the provided index doesn't have an explicitly named type, not general indexing for end-user types.

It's possible this would require some level of "compiler magic" that isn't currently applied to certain things, but to me that seems fine in the event that it's enough of an ergonomic win while amounting to "a thing the compiler absolutely could do relatively easily, but doesn't do currently." I don't think "Compiler magic is bad because it's bad and not for practical / tangible reasons I can explain", as I've sometimes seen at least implied, is a reasonable argument.

Not that I'm trying to argue my opinion is hugely relevant in the grand scheme of things, of course, haha.

Last edit: to be very clear, when I said I'd be fine with "manually doing the conversions to usize" in my other comment, I meant literally doing them on the fly for anything that required it. Not implementing Index or any other trait.

The observation that it would compile just fine if one impl was removed is his point as I understood it. Well, in reverse: if such a second impl was added, the errors would appear. It is unclear to me what you mean by conflicting impls; there's no problem with the impls, the error is with type inference. If you add an impl Index<i32> for A, it once again compiles, and uses that impl because i32 is the assumed default integer type.
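Since the playground link isn't reproduced here, a minimal reconstruction of the situation as described (my sketch, assuming Index<usize> and Index<u64> impls that both return i32):

use std::ops::Index;

struct A;

impl Index<usize> for A {
    type Output = i32;
    fn index(&self, _: usize) -> &i32 { &42 }
}

impl Index<u64> for A {
    type Output = i32;
    fn index(&self, _: u64) -> &i32 { &42 }
}

// With only the two impls above, `a[0]` in main fails: the literal could be
// usize or u64, inference can't pick either, the i32 fallback applies, and
// `A: Index<i32>` isn't satisfied. Adding this third impl makes it compile
// again, and it is the one that gets used, because i32 is the fallback type.
impl Index<i32> for A {
    type Output = i32;
    fn index(&self, _: i32) -> &i32 { &42 }
}

fn main() {
    let a = A;
    println!("{}", a[0]); // prints 42 via the Index<i32> impl
}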

I do think the error could be more helpful by pointing out that impls for usize and u64 do exist, and that if the programmer wanted one of those, they should specify the type one way or another.


The thread quickly veered off to other topics. But back to the "u32 as a second fallback type" idea: I understand the objection that fallbacks interact poorly with the type system, but isn't that already the case with i32? The result already depends on details such as where the fallback is applied first, etc. Does the issue become much worse if you add a second fallback?

No, because it's not a staged thing. After inference runs, the compiler just substitutes every still-unconstrained integer type variable with i32, and if that doesn't work it fails to compile. So defaulting to i32 can't make something else that was otherwise unknown become u32.
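A tiny illustration of that substitution (type_name is used only to make the inferred type visible):

fn show_type<T>(_: &T) -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    let x = 5; // nothing constrains this literal, so the fallback makes it i32
    assert_eq!(show_type(&x), "i32");
}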

I see. Here is a way I think this process could be extended:

  1. Constrain every integer literal to the {integer} trait.
  2. Perform type inference.
  3. Constrain every literal that is still ambiguous to a trait that is "i32 or u32".
  4. Continue type inference.
  5. Constrain every literal that is still ambiguous to i32.
  6. Continue type inference.

My experiments show that the {integer} trait is today treated somewhat specially, in that if the intersection of {integer} and some other trait constraint results in a single type, then that type is deduced. This doesn't seem to work for intersecting any two arbitrary traits (maybe it will in the future with the chalk type-system rewrite?). The "i32 or u32" trait would also have to behave like that.
