I want to write a small compiler for a simple bytecode language. The bytecode operates on a machine with a fixed amount of memory and reads and writes at constant offsets; a small snippet of the approximate instruction set is below. To execute this instruction set, I allocate the memory up front; for simplicity, my first thought was a Vec<u8>.
Now for the optimization I want to perform: since the memory buffer won't move for the duration of execution and all offsets are known ahead of time, I could calculate the address each opcode operates on ahead of time and "just perform its effect". But, alas, I believe that pointer provenance will actually make this UB.
Hence the question: how would I correctly calculate the address ahead of time, but satisfy any supposed pointer provenance rules?
enum OpCode {
    WriteConstU8 { addr: usize, value: u8 },
    WriteConstU16 { addr: usize, value: u16 },
    // ... more consts
    AddU8 { addr_op1: usize, addr_op2: usize, addr_res: usize },
    AddU16 { addr_op1: usize, addr_op2: usize, addr_res: usize },
    // ... other opcodes
}
type MemoryU8 = *mut u8;
type MemoryU16 = *mut u16;

enum CompiledOp {
    WriteConstU8 { addr: MemoryU8, value: u8 },
    WriteConstU16 { addr: MemoryU16, value: u16 },
    // ... more consts
    // The u8 variant operates on u8 pointers, the u16 variant on u16 pointers.
    AddU8 { addr_op1: MemoryU8, addr_op2: MemoryU8, addr_res: MemoryU8 },
    AddU16 { addr_op1: MemoryU16, addr_op2: MemoryU16, addr_res: MemoryU16 },
    // ... other opcodes
}
fn execute_compiled_op(op: &CompiledOp) {
    // SAFETY: we checked the alignment and OOB conditions when we passed from OpCode to CompiledOp
    match op {
        CompiledOp::WriteConstU8 { addr, value } => unsafe { addr.write(*value) },
        CompiledOp::WriteConstU16 { addr, value } => unsafe { addr.write(*value) },
        CompiledOp::AddU8 { addr_op1, addr_op2, addr_res } => {
            unsafe { addr_res.write(addr_op1.read() + addr_op2.read()) }
        }
        CompiledOp::AddU16 { addr_op1, addr_op2, addr_res } => {
            unsafe { addr_res.write(addr_op1.read() + addr_op2.read()) }
        }
    }
}
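
The step that turns an OpCode into a CompiledOp, which the SAFETY comment relies on, is not shown above. For illustration only, here is a minimal sketch of what such a step could look like for a single variant, assuming each pointer is derived from the buffer's base pointer with `add` instead of being rebuilt from an integer address. The name `compile_op`, its signature, and the reduced enums are hypothetical and are not the playground code.

```rust
use std::mem::{align_of, size_of};

// Hypothetical source opcode, reduced to one variant for brevity.
enum OpCode {
    WriteConstU16 { addr: usize, value: u16 },
}

// Hypothetical compiled form holding a precomputed pointer.
enum CompiledOp {
    WriteConstU16 { addr: *mut u16, value: u16 },
}

/// Lowers one opcode against a buffer of `len` bytes starting at `base`.
/// Returns `None` if the access would be out of bounds or misaligned.
///
/// # Safety
/// `base` must point to a live allocation that is valid for reads and
/// writes of `len` bytes.
unsafe fn compile_op(base: *mut u8, len: usize, op: &OpCode) -> Option<CompiledOp> {
    match *op {
        OpCode::WriteConstU16 { addr, value } => {
            // Bounds check: the whole u16 must lie inside the buffer.
            let end = addr.checked_add(size_of::<u16>())?;
            if end > len {
                return None;
            }
            // SAFETY: `addr < len`, so the offset stays inside the allocation.
            // The resulting pointer is derived from `base` and keeps its
            // provenance; no integer-to-pointer cast is involved.
            let ptr = unsafe { base.add(addr) }.cast::<u16>();
            // Alignment check: this cast only inspects the address.
            if (ptr as usize) % align_of::<u16>() != 0 {
                return None;
            }
            Some(CompiledOp::WriteConstU16 { addr: ptr, value })
        }
    }
}

fn execute_compiled_op(op: &CompiledOp) {
    match *op {
        // SAFETY: `compile_op` checked bounds and alignment; the caller must
        // keep the buffer alive and unmoved while compiled ops exist.
        CompiledOp::WriteConstU16 { addr, value } => unsafe { addr.write(value) },
    }
}
```

In this sketch the base pointer would be obtained once (for example from `as_mut_ptr()` on the buffer), and the buffer must not be reallocated or accessed through a fresh `&mut` borrow while compiled ops are still in use.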
Full example in this playground. I know the current implementation errors under Miri with strict provenance enabled (-Zmiri-strict-provenance). The question is how to fix it.