Adding 16-bit pointer support


#1

I maintain a port of Rust that adds support for the 8-bit AVR microcontroller (which features 16-bit pointers).

Several modifications had to be made (including modifying build system to build a minimised set of stage2 libraries - only core currently).

The biggest change in the Rust codebase however was adding support in the compiler and libraries for 16-bit pointers. I posit that this patch could be useful to others hacking in Rust, notably those porting Rust to others architectures like myself.

What is everybody’s thoughts around having this patch pushed upstream? It is about 100 lines of code, mostly adding constants and match arms which are missing.


#2

Write an RFC, is my opinion.


#3

Write an RFC, is my opinion

I realized that my wording was not clear - I only intend to upstream support for 16-bit pointers (the build system modifications, however useful, are only half-baked right now).

The patch I hope to get merge consists solely of copy-pasting constants:

pub const DTOR_NEEDED_U16: u16 = repeat_u8_as_u16!(DTOR_NEEDED); // added this
pub const DTOR_NEEDED_U32: u32 = repeat_u8_as_u32!(DTOR_NEEDED);

and match arms to handle 16-bit pointers

match target_pointer_width {
    "16" => TyInt(TyI16), // added this
    "32" => TyInt(TyI32),
    ...
}

The Rust RFC README states that things adding solely “better platform coverage” and “additions only likely to be noticed by other developers-of-rust, invisible to users-of-rust” do not require an RFC.

That said, I am happy to write an RFC if it would be preferred :smile:

This post is mostly wondering whether the compiler team thinks that adding support for something that is not even going to be used in-tree is going to be worth the (admittedly small) maintence burden.


#4

I thought LLVM didn’t have an AVR backend in-tree?


#5

I thought LLVM didn’t have an AVR backend in-tree?

It doesn’t - I took an old abandoned fork with AVR support, fixed it up, added machine code/asm support, etc. It is here.


#6

For the curious, the patch can be found here.


#7

The primary problem with a patch like the linked patch is that it adds completely untested codepaths to the compiler, so probably not appropriate as-is.

That said, refactoring code so it works for arbitrary pointer sizes would be fine.


#8

Apparently LLVM does have a 16-bit x86 backend. I’d be surprised if there wasn’t an emulator that can run such code.
This might be a viable path for testing 16-bit support in the compiler.


#9

That said, refactoring code so it works for arbitrary pointer sizes would be fine.

I agree. I will work on this.


#10

This seems somewhat comparable to other “second tier” platforms, such as FreeBSD. I’m not necessarily opposed. Some kind of testing would of course be great.


#11

I attempted to make the Rust codebase independent of pointer size - here is what I found:

We can only compare against a configuration value, not get its value

The code is littered with code such as

let width = if cfg!(target_pointer_width = 32) {
    32
} else if cfg!(target_pointer_width = 64) {
    64
}

It would be good if there were a macro such as cfg_as_integer(target_pointer_width) to solve our problems.

We cannot use the #[cfg] attribute with arbitrary expressions, such as #[cfg(target_pointer_width < 64)] (useful for iterators which could need bounds checks).

We cannot concatenate an identifier with an integer: If we could use cfg_as_integer!(target_pointer_width), we could not then concatenate it with u or i to get for example u16 to generate std::{usize,isize} implementations.

I was however able to make a decent amount of the code pointer-width independent. I’ll send a PR.

Just my two cents :slight_smile:


#12

that’s a great summary, thanks!


#13

There is the constant std::usize::BITS


#14

This would be very helpful for the MSP430(16 bit) as well. Currently, rust does not accept a target-pointer-width of 16.


#15

In the abstract, adding 16-bit support doesn’t seem like a big deal. It is just a little bit more code.

However, it actually does create some complications. For example, it would be convenient to have automatic conversion from u32 to usize without casting. However, if we support 16-bit pointers, then it is not possible to have such automatic conversion on the 16-bit platforms because usize is 16 bits for them and u32 -> u16 would be a truncating conversion.

So, then, we have to decide whether we (a) avoid adding the u32 to usize automatic conversion or (b) we say <32-bit platforms are a special case where code that relies on such conversion may not compile.

IMO, these kind of issues are best solved by option (b). In the long term, I do not think 8/16 bit platforms are going to survive as 32-bit MIPS and ARM get cheaper and cheaper. Also, most people don’t have a convenient way to do testing on <32 bit platforms.


#16

FWIW, there is already a substantial amount of usize == u16 support in rustc, significantly more than for usize == u128. Using the grep string «#[cfg(target_pointer_width = “(16|32|64|128)”)]», here delimited by « and », here are the files which handle usize == u32 and usize == u64 but not usize == u16:

  • lib_alloc/vec_deque.rs
  • libcore/tests/hash/sip.rs
  • librustc_data_structures/fx.rs
  • libserialize/leb128.rs
  • libstd/sys/windows/c.rs
  • sys_common/backtrace.rs
  • test/compile-fail/huge-enum.rs
  • test/compile-fail/issue-17913.rs
  • test/compile-fail/int-exceeding-bitshifts2.rs
  • test/run-pass/enum-discrim-autosizing.rs
  • test/run-pass/extern-types-pointer-cast.rs
  • test/run-pass/huge-largest-array.rs
  • test/run-pass/issue-2895.rs
  • test/run-pass/num-wrapping.rs
  • test/run-pass/struct-return.rs
  • test/run-pass/wrapping_int_api.rs

Edit 1 Note: A larger set of files fail to handle usize == u128. In other words, many files that do not (yet) handle usize == u128 do already handle usize == u16.

Edit 2 Here is the corresponding list of files for usize == u128. Both of these lists were generated from the rustc source as of two days after the release of 1.26.0.

  • lib_alloc/vec_deque.rs
  • libcore/fmt/num.rs
  • libcore/iter/range.rs
  • libcore/num/mod.rs
  • libcore/num/wrapping.rs
  • libcore/tests/hash/sip.rs
  • libcore/tests/iter.rs
  • libcore/tests/mem.rs
  • librustc_data_structures/fx.rs
  • libserialize/leb128.rs
  • sys_common/backtrace.rs
  • test/compile-fail/huge-enum.rs
  • test/compile-fail/issue-17913.rs
  • test/compile-fail/int-exceeding-bitshifts2.rs
  • test/run-pass/const-negation.rs
  • test/run-pass/enum-discrim-autosizing.rs
  • test/run-pass/extern-types-pointer-cast.rs
  • test/run-pass/huge-largest-array.rs
  • test/run-pass/issue-2895.rs
  • test/run-pass/num-wrapping.rs
  • test/run-pass/struct-return.rs
  • test/run-pass/vec-fixed-length.rs
  • test/run-pass/wrapping_int_api.rs

Edit 3: Correctly identified the rustc source that was grep’d.


#17

rustc_mir::interpret will also horribly die on any code that assumes a usize can be u128. You’ll essentially need to replace all uses of u64 in the librustmir/interpret folder