Additional enum tag optimizations?

abonander · June 16, 2016, 11:40am

I’m aware of the nullable pointer optimization for enums. But what about enums that need a discriminant yet still have a pointer at the beginning of each variant?
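For reference, a minimal sketch of the two cases being compared. `TwoBoxes` and `layout_sizes` are hypothetical names made up for illustration. Only the `Option<Box<T>>` size is guaranteed by the language; the size of the two-variant enum is whatever the current compiler picks.

```rust
use std::mem::size_of;

// Hypothetical enum: every variant starts with a pointer, yet a tag is still needed
// to tell the variants apart.
#[allow(dead_code)]
enum TwoBoxes {
    A(Box<u32>),
    B(Box<u64>),
}

/// Returns (size of Box<u32>, size of Option<Box<u32>>, size of TwoBoxes) in bytes.
fn layout_sizes() -> (usize, usize, usize) {
    (
        size_of::<Box<u32>>(),
        // Nullable pointer optimization: `None` is represented by the null pointer,
        // so no separate discriminant is stored.
        size_of::<Option<Box<u32>>>(),
        // Here null cannot distinguish A from B, so a separate tag is typically stored.
        size_of::<TwoBoxes>(),
    )
}
```

The first two values are equal. On typical 64-bit targets the third is likely twice the pointer size, and that extra word is the space this question is about.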

If I remember correctly, x86_64 does not use all the bits in its pointers (any other ISAs?). I don’t know if any OS uses these bits to store information, but perhaps if we wanted to be clever, we could have the compiler use those bits for the discriminant?

I understand the fundamental tradeoff, of course; you save 8 bytes at the beginning of the enum but you spend a couple extra cycles shifting and masking to actually get the discriminant every time you match on it. But in cases where the space savings are preferable, could this be a valuable optimization?
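A rough sketch of the shifting and masking involved, working on plain `u64` values rather than real pointers. It assumes 48 significant address bits and a lower-half address; `ADDR_BITS`, `pack`, `tag_of` and `address_of` are hypothetical names, not anything the compiler provides.

```rust
// Assumption: only the low 48 bits of an address are significant.
const ADDR_BITS: u32 = 48;
const ADDR_MASK: u64 = (1u64 << ADDR_BITS) - 1;

/// Stores `tag` in the top 16 bits of a 64-bit word, above a 48-bit address.
fn pack(addr: u64, tag: u16) -> u64 {
    debug_assert!(addr <= ADDR_MASK, "sketch assumes a lower-half address");
    ((tag as u64) << ADDR_BITS) | addr
}

/// Recovers the discriminant: one shift.
fn tag_of(word: u64) -> u16 {
    (word >> ADDR_BITS) as u16
}

/// Recovers the address: one mask.
fn address_of(word: u64) -> u64 {
    word & ADDR_MASK
}
```

Every match pays for `tag_of`, and every use of the pointer pays for `address_of`.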

Amanieu · June 16, 2016, 12:10pm

The problem with that approach is that you will need to do this masking for all accesses of pointer types. Consider what happens if you take a reference to one of the enum fields. Now you have a &mut Box<T> but you still need to use a mask when dereferencing that reference.

lifthrasiir · June 16, 2016, 5:56pm

See also the prior discussion about enum representations. There are tons of possibilities even ignoring the pointer ~~alignments~~ representations.

abonander · June 16, 2016, 11:07pm

How so? If the ISA ignores those bits on deref, there's no need to mask them out, right?

Edit:

the AMD specification requires that bits 48 through 63 of any virtual address must be copies of bit 47 (in a manner akin to sign extension), or the processor will raise an exception.

I see what you mean now. All right, forget that idea.
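The rule in that quote can be written as a small check. This is only a sketch of the quoted 48-bit rule, and `is_canonical` is a name made up here:

```rust
/// True if bits 48 through 63 of `addr` are all copies of bit 47.
fn is_canonical(addr: u64) -> bool {
    // Shift bit 47 up into the sign position, then shift back arithmetically
    // so it is copied into bits 48..=63. A canonical address survives unchanged.
    let sign_extended = (((addr << 16) as i64) >> 16) as u64;
    sign_extended == addr
}
```

A pointer with a discriminant stored in its upper bits would fail this check in general, so it would have to be restored before any dereference.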

osa1 · June 17, 2016, 10:04am

the AMD specification requires that bits 48 through 63 of any virtual address must be copies of bit 47 (in a manner akin to sign extension), or the processor will raise an exception.

I see what you mean now. All right, forget that idea.

Correct me if I'm wrong, but that's not about alignment: alignment is about the least-significant bits, while this quote is about the most-significant bits.

GHC does something like this. Pointers to evaluated objects have non-zero values in their 3 least-significant bits (2 on 32-bit systems). If the number of alternatives is smaller than or equal to 2^3 - 1 (or 2^2 - 1 on 32-bit), those low bits also give the tag. So you can branch (in a case expression) without actually dereferencing the pointer.
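A minimal Rust sketch of the same low-bit idea. This is not GHC's implementation; `Payload`, `Tagged` and `TAG_MASK` are hypothetical names, and the payload is forced to 8-byte alignment so that the low 3 bits of its address are always zero.

```rust
// Low 3 bits of an 8-byte-aligned address are always zero, so they can hold a tag.
const TAG_MASK: usize = 0b111;

// Hypothetical payload type with alignment forced to 8 on every target.
#[repr(align(8))]
struct Payload(u64);

/// Hypothetical tagged pointer: the address of a boxed `Payload` plus a 3-bit tag.
struct Tagged(usize);

impl Tagged {
    fn new(value: Box<Payload>, tag: usize) -> Tagged {
        assert!(tag <= TAG_MASK, "tag must fit in 3 bits");
        let addr = Box::into_raw(value) as usize;
        debug_assert_eq!(addr & TAG_MASK, 0);
        Tagged(addr | tag)
    }

    /// Reads the tag without touching the pointed-to memory.
    fn tag(&self) -> usize {
        self.0 & TAG_MASK
    }

    /// Masks the tag off before dereferencing.
    fn get(&self) -> u64 {
        let ptr = (self.0 & !TAG_MASK) as *const Payload;
        // Safety: the address came from Box::into_raw in `new` and is still owned.
        unsafe { (*ptr).0 }
    }
}

impl Drop for Tagged {
    fn drop(&mut self) {
        let ptr = (self.0 & !TAG_MASK) as *mut Payload;
        // Safety: same allocation as in `new`, freed exactly once.
        unsafe { drop(Box::from_raw(ptr)) }
    }
}
```

Unlike the upper-bit idea, this does not depend on how many address bits the hardware implements, only on the alignment of the pointee.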

gus · June 22, 2016, 3:49am

Indeed, I believe LLVM (and many other compilers) uses the LSB to indicate that a C++ member function pointer refers to a virtual method, so I imagine masking those bits out before dereferencing isn’t that expensive…

system · March 25, 2019, 8:26am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
