I would like to propose the following small addition to the language. It’s mostly a quality-of-life improvement, but I believe it will be quite useful in some areas.
Summary
Addition of hex literals in the form of h"00 aa cc ff", which will be
transformed by the compiler at compile time to &'static [u8; N], in this case to
&[0u8, 170u8, 204u8, 255u8].
Motivation
Hexadecimal is a very common representation for binary data. Currently Rust has
two ways to provide byte array constants:
- b"foo" notation, which is convenient if the binary data is an ASCII string,
but becomes harder to use for general byte strings that need a lot of \x escaping.
- Explicit arrays: [0x00, 0x01, ..]. This takes about three times more
space than pure hex notation and is thus harder to read and to copy-paste from external sources. Additionally, it's
harder to group bytes, e.g. in groups of 4 or 8.
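To make the comparison concrete, here are the four bytes from the Summary written in both existing notations. This is only an illustration; the function name `existing_notations` is made up for the example.

```rust
/// Returns the same four bytes written in the two notations available today.
/// (Illustrative helper, not part of the proposal.)
pub fn existing_notations() -> (&'static [u8; 4], &'static [u8; 4]) {
    // Byte string literal: every non-ASCII byte needs a \x escape.
    let escaped: &'static [u8; 4] = b"\x00\xaa\xcc\xff";
    // Explicit array: every byte carries a 0x prefix and a separator.
    let explicit: &'static [u8; 4] = &[0x00, 0xaa, 0xcc, 0xff];
    (escaped, explicit)
}
```

With the proposed syntax both would collapse to h"00 aa cc ff".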
By introducing hex literals we can improve readability and writability of code which
works with binary constants. As a side effect we will be able to make code
examples smaller and easier to read. For example:
let udp_data = h"
1111 2222
000c ffff
6461 7461
";
let packet = parse_udp(udp_data);
assert_eq!(packet.source_port, 0x1111);
assert_eq!(packet.dest_port, 0x2222);
assert_eq!(packet.data, b"data");
It will also allow copying and pasting hexadecimal data directly into Rust code without
an additional transformation step.
Guide-level explanation
Literals which start with h are called hex literals. They allow byte array
constants to be conveniently represented in hexadecimal form. The string
inside h"..." accepts the following characters:
- Hexadecimal characters: 0-9, a-f, A-F
- Formatting characters: Unicode whitespace characters, including tab, line feed, and carriage return.
Formatting characters will be ignored by the compiler. A hex literal must contain an even
number of hexadecimal characters; otherwise it will result in a compilation
error. Usage of any other characters will also result in a compilation error.
The hexadecimal string will be converted to a byte array by the compiler at compile time.
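The rules above can be sketched as an ordinary runtime function. This is only a model of the intended semantics, not how the compiler would implement it; the names `decode_hex` and `HexError` are hypothetical, and the real feature would report these errors at compile time.

```rust
/// Errors corresponding to the two compile-time failures described above.
/// (Hypothetical type, used only for this sketch.)
#[derive(Debug, PartialEq, Eq)]
pub enum HexError {
    /// A character that is neither a hex digit nor whitespace.
    InvalidChar(char),
    /// The literal contains an odd number of hex digits.
    OddLength,
}

/// Decodes the contents of a hex literal according to the proposed rules.
pub fn decode_hex(input: &str) -> Result<Vec<u8>, HexError> {
    let mut bytes = Vec::new();
    // Holds the high nibble while waiting for the low nibble of the same byte.
    let mut high: Option<u8> = None;
    for c in input.chars() {
        // Formatting characters (Unicode whitespace) are ignored.
        if c.is_whitespace() {
            continue;
        }
        // to_digit(16) accepts exactly 0-9, a-f and A-F.
        let digit = c.to_digit(16).ok_or(HexError::InvalidChar(c))? as u8;
        match high.take() {
            None => high = Some(digit),
            Some(h) => bytes.push((h << 4) | digit),
        }
    }
    // A leftover high nibble means the digit count was odd.
    if high.is_some() {
        return Err(HexError::OddLength);
    }
    Ok(bytes)
}
```

Under this model, decode_hex("64 61 74 61") yields the same bytes as b"data", while inputs such as "abc" or "0g" are rejected.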
Usage examples:
assert_eq!(h"00ff", &[0u8, 255u8]);
assert_eq!(h"abcdef", h"ABCDEF");
assert_eq!(h"64 61 74 61", b"data");
assert_eq!(h"
00010203 0405060708
", &[0u8, 1, 2, 3, 4, 5, 6, 7, 8]);
assert_eq!(h"
00010203
10111213
", &[
0x00, 0x01, 0x02, 0x03,
0x10, 0x11, 0x12, 0x13
]);
How We Teach This
The book will need a page which introduces and explains all variations of string literals: "...", b"...", r"...", h"..." (and maybe something like s"..." as syntactic sugar for "...".to_string()).
Drawbacks
Additional syntax, which can be perceived by some as overly specialized for niche use cases.
Alternatives
Using a built-in macro such as hex!("00 ff ee") or something similar and, of course, doing nothing.
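For comparison, a rough sketch of what is possible in library code without any language change, assuming a compiler with const generics and panics in const contexts. The names `decode_hex_const`, `hex_digit` and `DATA` are hypothetical. Unlike the proposed literal, this version needs the array length spelled out in the type and, to keep things simple, only skips ASCII whitespace.

```rust
/// Converts one ASCII hex digit to its value; fails const evaluation otherwise.
const fn hex_digit(b: u8) -> u8 {
    match b {
        b'0'..=b'9' => b - b'0',
        b'a'..=b'f' => b - b'a' + 10,
        b'A'..=b'F' => b - b'A' + 10,
        _ => panic!("invalid character in hex string"),
    }
}

/// Decodes a hex string into a fixed-size array, usable in const contexts.
pub const fn decode_hex_const<const N: usize>(src: &str) -> [u8; N] {
    let s = src.as_bytes();
    let mut out = [0u8; N];
    let mut i = 0; // index into the input bytes
    let mut n = 0; // number of hex digits consumed so far
    while i < s.len() {
        let b = s[i];
        i += 1;
        // Simplification: only ASCII whitespace is skipped here.
        if matches!(b, b' ' | b'\t' | b'\n' | b'\r') {
            continue;
        }
        let d = hex_digit(b);
        assert!(n / 2 < N, "too many hex digits for the output array");
        if n % 2 == 0 {
            out[n / 2] = d << 4; // high nibble
        } else {
            out[n / 2] |= d; // low nibble
        }
        n += 1;
    }
    assert!(n == 2 * N, "hex digit count does not match the array length");
    out
}

/// Evaluated at compile time; a malformed string is a compilation error.
pub const DATA: [u8; 4] = decode_hex_const("64 61 74 61");
```

The explicit [u8; 4] annotation is the main ergonomic cost here: a built-in macro or the h"..." literal could infer the length from the string itself.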