The operator `as`: statistics of usage


#1

The operator as can do a lot of things, and it’s good to know how often it is used in different contexts. In particular, the numbers would be useful as a basis for the discussion about Implicit widening, polymorphic indexing, and similar ideas. So, here are some statistics on usage of the operator as in the Rust codebase. (The data itself can be found here.)

as: 5485

Total number of occurrences of the operator as.

as usize: 436

Casts to usize(uint) not in context of indexing, usually widening.

as *const/mut T: 405

Casts to raw pointers.

as u64/i64: 322

Widening casts to u64/i64 to represent big numbers, often from usize(uint). Sign-changing casts u64 <-> i64.

as libc::*/c_*/*_t/DWORD etc: 263

Casts in FFI context.

as u32/i32: 195

Why do people cast numbers to u32/i32? Who knows.

use x as y;: 183

use and crate imports with renaming.

as i8/u8/i16/u16: 181

Casts to small integral types, predominantly narrowing.

as isize: 164

Casts to isize(int), mostly indexes/sizes to offsets.

as $T: 131

Conversions in macros x as $T, whatever T means.

num.rs: 126

Lots of tests in libstd/num.

a[i as usize]: 91

Conversions to usize(uint) directly in the indexing context.

as &Trait/&mut Trait/Box<Trait>: 62

Casts to trait objects.

as char: 51

Interpreting bytes as characters.

as Something: 48

Everything else.

as f64/f32: 46

Conversions to floating point numbers, from integers and from each other.

<T as Trait>: 38

Universal function call syntax.
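
To make the categories above concrete, here is a small sketch with one instance of most of them. The `Shape` trait, the `Square` type, and the helper function names are made up for illustration and do not come from the Rust codebase. The sketch uses current syntax, so the trait object is written `&dyn Shape` rather than the `&Trait` form that was in use when these numbers were collected.

```rust
// `use x as y;` only renames an import; no value is converted.
use std::collections::HashMap as Map;

// Hypothetical trait and type, used only to show the trait-related forms.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Narrowing cast: silently keeps only the low 8 bits.
fn narrow(x: usize) -> u8 {
    x as u8
}

/// Sign-changing cast: same bits, reinterpreted as a signed number.
fn reinterpret_sign(x: u64) -> i64 {
    x as i64
}

/// A byte interpreted as a character.
fn byte_to_char(b: u8) -> char {
    b as char
}

/// Cast to a trait object, then call through dynamic dispatch.
fn dyn_area(sq: &Square) -> f64 {
    let shape = sq as &dyn Shape;
    shape.area()
}

/// `<T as Trait>` names a trait method explicitly; it is not a cast.
fn ufcs_area(sq: &Square) -> f64 {
    <Square as Shape>::area(sq)
}

fn main() {
    let len: usize = 300;
    let wide = len as u64; // widening: the value is preserved
    let ratio = len as f64 / 8.0; // integer to floating point
    let ptr = &len as *const usize; // reference to raw pointer

    let mut counts: Map<&str, u64> = Map::new();
    counts.insert("wide", wide);

    println!("{} {} {:?} {:?}", narrow(len), ratio, ptr, counts);
    println!("{} {}", reinterpret_sign(u64::MAX), byte_to_char(0x41));

    let sq = Square(2.0);
    println!("{} {}", dyn_area(&sq), ufcs_area(&sq));
}
```

Only the first few helpers are lossy or reinterpreting casts; the `use` renaming and `<T as Trait>` forms share the keyword but convert nothing.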


#2

The most surprising detail is how little direct impact polymorphic indexing would have on the Rust codebase.


#3

Just a little analysis of https://github.com/petrochenkov/rust-as/blob/master/usize.txt. The actual purposes are usually:

  • Use in a shift, a << usize or a >> usize, written before things like a << u32 became possible,
  • Convert raw pointer to integer,
  • Convert char to integer, mainly for parsing integers (and the target is a usize),
  • Convert enum to integer, mainly used in librbml for serialization (but wouldn’t a fixed-size int be more suitable?),
  • Compare (==, <, …) with len() or size_of::<T>,
  • Use as a length or capacity of something (e.g. Vec::with_capacity(x as usize), repeat(N).take(x as usize)),
  • Use as an index outside of indexing context (e.g. objs.remove(x as usize)),
  • The API just requires a usize, though these could be changed after a review (e.g. chmod(path, x as usize)),
  • Some other purposes, but likely due to needing to unify with a variable having one of the above properties.
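
A few of the purposes in the list above, written out as a rough sketch. All the names here (`Tag`, `parse_decimal`, `tag_index`, `address_of`, `fits_in_word`, `buffer_for`, `remove_at`) are hypothetical and only mimic the shape of the lines in usize.txt; none of them are taken from the Rust codebase.

```rust
use std::mem::size_of;

// Hypothetical fieldless enum standing in for a serialization tag.
#[derive(Clone, Copy)]
enum Tag {
    Start = 0,
    Data = 1,
    End = 2,
}

/// char -> integer while parsing a decimal number into a usize.
/// Returns None on an empty string, a non-digit, or overflow.
fn parse_decimal(s: &str) -> Option<usize> {
    if s.is_empty() {
        return None;
    }
    let mut n: usize = 0;
    for c in s.chars() {
        if !c.is_ascii_digit() {
            return None;
        }
        let digit = (c as usize) - ('0' as usize);
        n = n.checked_mul(10)?.checked_add(digit)?;
    }
    Some(n)
}

/// enum -> integer.
fn tag_index(t: Tag) -> usize {
    t as usize
}

/// Raw pointer -> integer (the address of the referenced value).
fn address_of(x: &u32) -> usize {
    x as *const u32 as usize
}

/// Comparison against size_of::<T>().
fn fits_in_word(byte_count: u32) -> bool {
    byte_count as usize <= size_of::<u64>()
}

/// Used as a capacity.
fn buffer_for(count: u32) -> Vec<u8> {
    Vec::with_capacity(count as usize)
}

/// Used as an index, but outside of the `a[i]` indexing syntax.
fn remove_at(items: &mut Vec<String>, idx: u32) -> String {
    items.remove(idx as usize)
}

fn main() {
    let value = 7u32;
    let mut items = vec!["a".to_string(), "b".to_string()];
    println!("{:?}", parse_decimal("436"));
    println!("{} {}", tag_index(Tag::Start), tag_index(Tag::End));
    println!("{:#x}", address_of(&value));
    println!("{}", fits_in_word(8));
    println!("{}", buffer_for(16).capacity());
    println!("{}", remove_at(&mut items, 1));
    let _ = Tag::Data;
}
```

None of these are in the a[i as usize] form, which is why polymorphic indexing alone would not remove them.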

(Also, I see some of the casts to i32/u32 are due to deserialization from [u8], or char as u32 for Unicode code point operations, or just widening/truncating to satisfy an API.)


BTW, 8 lines are incorrectly categorized into usize.txt, e.g.

281: ./librustc_llvm/lib.rs: LLVMAddFunctionAttribute(llfn, idx, self.bits() as uint64_t);

These should be considered widening or FFI. The regex should check for word boundaries. There are also 6 lines which should be counted as used in an indexing context, e.g.

180: ./libstd/sys/windows/timer.rs: match &mut chans[idx as uint - 1] {

These are just small numbers though, compared with all 436 lines.
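
On the word boundary point: I don't know how the original regex was written, so this is only a sketch of the idea, hand-rolled with the standard library instead of a regex crate. The function name `casts_to` is made up. It accepts `as <ty>` only when the characters on both sides of the match are not identifier characters, which is roughly what `\b` would give you in a regex.

```rust
/// Returns true if `line` contains `as <ty>` as whole words, so that
/// `as uint` does not match inside `as uint64_t` or `alias uint`.
fn casts_to(line: &str, ty: &str) -> bool {
    let needle = format!("as {}", ty);
    let is_ident = |c: char| c.is_alphanumeric() || c == '_';

    line.match_indices(needle.as_str()).any(|(start, matched)| {
        // Character immediately before the match, if any.
        let before_ok = line[..start]
            .chars()
            .next_back()
            .map_or(true, |c| !is_ident(c));
        // Character immediately after the match, if any.
        let after_ok = line[start + matched.len()..]
            .chars()
            .next()
            .map_or(true, |c| !is_ident(c));
        before_ok && after_ok
    })
}

fn main() {
    let ffi = "LLVMAddFunctionAttribute(llfn, idx, self.bits() as uint64_t);";
    let index = "match &mut chans[idx as uint - 1] {";
    println!("{}", casts_to(ffi, "uint"));
    println!("{}", casts_to(index, "uint"));
}
```

With this check the uint64_t line no longer lands in the usize(uint) bucket, while the chans[idx as uint - 1] line still matches and would need a separate test for the surrounding brackets to be counted as indexing.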