Thanks very much @nrc. Your link was very useful.
I think that RFC can work quite well with bit-data. The one amendment would be that you should be able to place tag patterns “inside” bit-fields.
The reason for having this feature is that it is not always clear-cut where to put the tag. For instance, imagine an x86 machine code specification. Sometimes the “tag” (determining the instruction type) is a single bit, but sometimes it involves several bits.
Another example would be the tricks used to represent values in dynamically typed languages. If pointers are forced to be 8-byte aligned, their 3 low bits are always zero, so you can use those 3 free bits to “tag” values differently:
|XXX....XXXXX|0| A 63-bit integer
|XXX....XXX|001| Symbol-table entry
|XXX....XXX|011| An inline string
|XXX....XX|0101| Nil/False
|XXX....XX|1101| True
|XXX....XXX|111| A 64-bit pointer
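To make the layout above concrete, here is a minimal sketch in today's Rust of how such a word could be decoded by hand. The `Decoded` enum and `decode` function are hypothetical names for illustration, and I'm assuming the 63-bit integer is signed and that the tag occupies the low bits of the word, as drawn in the table.

```rust
/// Hypothetical decoded view of a tagged 64-bit word (illustrative names only).
#[derive(Debug, PartialEq)]
enum Decoded {
    Int(i64),       // tag 0: 63-bit integer (assumed signed)
    Symbol(u64),    // tag 001: symbol-table entry
    InlineStr(u64), // tag 011: inline string payload
    NilOrFalse,     // tag 0101
    True,           // tag 1101
    Pointer(u64),   // tag 111: 8-byte-aligned address
}

fn decode(word: u64) -> Decoded {
    if word & 0b1 == 0 {
        // An arithmetic shift on i64 drops the tag bit and keeps the sign.
        return Decoded::Int((word as i64) >> 1);
    }
    match word & 0b111 {
        0b001 => Decoded::Symbol(word >> 3),
        0b011 => Decoded::InlineStr(word >> 3),
        // Clearing the tag bits recovers the aligned address.
        0b111 => Decoded::Pointer(word & !0b111),
        // The only remaining odd pattern is 101; a fourth bit
        // distinguishes the two constants.
        _ => {
            if word & 0b1000 == 0 {
                Decoded::NilOrFalse
            } else {
                Decoded::True
            }
        }
    }
}
```

Note that the tag width varies (1, 3, or 4 bits) depending on the value kind, which is exactly why a fixed, separate tag field is awkward here.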
I imagined making this possible by adding “tag-specifiers”: fixed bit-patterns attached to a field, which are written when a value is constructed and checked when decoding to determine what the remaining bits mean. I proposed the syntax
{field-name}: {bit-type} = {bit-value}
Here’s the same example used in the RFC, but packed into only 32 bits by sacrificing a single bit of precision.
// Compact value storing 31-bit unsigned integers or
// floating-point numbers with one fewer mantissa bit.
union Value {
u: bitdata U { tag: u1 = 0, v: u31 },
f: bitdata F { tag: u1 = 1, s: u1, e: u8, m: u22 }
}
As you can see, the first bit indicates what the value refers to. If the first bit is 0, the value is a u31; otherwise it is a 31-bit floating-point number. Because of this, you can determine if the U-arm or the F-arm should be taken:
// Regular IEEE representation of floating point
bitdata Fp32 : f32 {
sgn: u1, exp: u8, mnt: u23
}
fn is_zero(v: Value) -> bool {
// This match is possible due to the tag specification in the bit-data.
match v {
U { v: 0 } => true,
U { .. } => false,
F { s, e, m } => Fp32 { sgn: s, exp: e, mnt: m<<1 } == 0.0
}
}
Notice that the carrier type for Fp32 is f32, so the constructed value can be used directly as an f32. Also note that the test works for IEEE negative zero (-0.0), because -0.0 == 0.0 compares as true for IEEE floating point.
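For comparison, this is roughly what the same check looks like when written by hand in current Rust. It is only a sketch under assumptions: I take the tag to be the most significant bit and fields to be laid out MSB-first (matching the sgn/exp/mnt order of Fp32), and `TAG_BIT`, `pack_u31`, and `pack_f32` are hypothetical helper names, not part of the RFC.

```rust
/// Assumed position of the tag: the most significant bit of the word.
const TAG_BIT: u32 = 1 << 31;

/// Store an unsigned integer; fails if it does not fit in 31 bits.
fn pack_u31(v: u32) -> Option<u32> {
    if v & TAG_BIT == 0 { Some(v) } else { None }
}

/// Store a float with tag = 1. Shifting the IEEE bits right by one
/// drops the lowest mantissa bit and leaves s, e, m in bits 30..0.
fn pack_f32(x: f32) -> u32 {
    TAG_BIT | (x.to_bits() >> 1)
}

fn is_zero(word: u32) -> bool {
    if word & TAG_BIT == 0 {
        // U arm: the remaining 31 bits are the integer itself.
        word == 0
    } else {
        // F arm: pull out s (1 bit), e (8 bits), m (22 bits) ...
        let s = (word >> 30) & 0x1;
        let e = (word >> 22) & 0xFF;
        let m = word & 0x3F_FFFF;
        // ... and rebuild a regular f32, restoring the dropped bit as 0.
        let bits = (s << 31) | (e << 23) | (m << 1);
        f32::from_bits(bits) == 0.0
    }
}
```

For instance, `is_zero(pack_f32(-0.0))` holds because the rebuilt value is -0.0, which compares equal to 0.0. All of the shifting and masking here is what the tag-specifier syntax would let the compiler derive from the declaration.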