I am building a static analysis tool that takes LLVM IR as input. I want to know which basic blocks in the IR map to an unsafe scope in the source program.
AFAICT this information is not present in LLVM IR, and it doesn't seem to be present in MIR either. I am looking for answers to the following:
Is there a known method to retain this information in LLVM IR?
If nothing exists, how might I go about mapping an unsafe code block or function to its equivalent LLVM IR?
It is available at mir.source_scopes[source_scope].local_data.safety. Unsafeck has run on MIR instead of HIR or AST for a while now. There is a proposal to move it to THIR though.
Debug info should work. Specifically, following the DILocation -> DILexicalBlock entries will get me the enclosing "unsafe" scope. Ideally, I want to read only the LLVM IR file and not the accompanying source. The solution here would be to add an "unsafe" attribute to the LexicalBlock debug info when the compiler generates it. Something similar would be needed for unsafe functions.
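To make the DILocation -> DILexicalBlock walk concrete, here is a rough sketch that works on the textual IR only. It assumes the metadata nodes appear one per line in the usual `!N = !DIKind(field: value, ...)` form; the names `parse_metadata`, `scope_chain`, and `SAMPLE_IR` are made up for illustration, and the regexes are not a full LLVM IR parser (for example, `inlinedAt` is ignored). Note that today the chain only gives line numbers, so deciding whether a block is "unsafe" still needs the source, which is exactly the gap described above.

```python
import re

# Matches a specialized metadata node, e.g.
#   !7 = distinct !DILexicalBlock(scope: !4, file: !1, line: 3, column: 5)
NODE_RE = re.compile(r'^!(\d+) = (?:distinct )?!(\w+)\((.*)\)\s*$')
# Matches "field: value" pairs; a value is a quoted string, a !N reference,
# or anything else up to the next comma (numbers, flag lists, ...).
FIELD_RE = re.compile(r'(\w+): ("[^"]*"|!\d+|[^,]+)')

# Hand-written sample metadata (an assumption, not real compiler output).
SAMPLE_IR = '''
!4 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 1, spFlags: DISPFlagDefinition, unit: !0)
!7 = distinct !DILexicalBlock(scope: !4, file: !1, line: 3, column: 5)
!9 = !DILocation(line: 4, column: 9, scope: !7)
'''

def parse_metadata(ir_text):
    """Map metadata id -> (node kind, dict of raw field strings)."""
    nodes = {}
    for line in ir_text.splitlines():
        m = NODE_RE.match(line.strip())
        if m:
            node_id, kind, args = m.groups()
            nodes[int(node_id)] = (kind, dict(FIELD_RE.findall(args)))
    return nodes

def scope_chain(nodes, dbg_id):
    """Follow `scope:` links from a !dbg location up to its DISubprogram."""
    chain = []
    current = dbg_id
    while current in nodes:
        kind, fields = nodes[current]
        chain.append((kind, int(fields.get("line", "0"))))
        scope = fields.get("scope", "")
        # Stop at the function, or when the scope is not a !N reference.
        if kind == "DISubprogram" or not scope.startswith("!"):
            break
        current = int(scope[1:])
    return chain

if __name__ == "__main__":
    for kind, line in scope_chain(parse_metadata(SAMPLE_IR), 9):
        print(kind, line)
```

For an instruction carrying `!dbg !9`, the walk yields the location, then each enclosing lexical block, then the function. If the compiler attached an "unsafe" marker to the lexical block, the loop would be the natural place to check for it.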
I see some properties of C++/FORTRAN functions can be marked with spFlags.
Can unsafe functions or unsafe code blocks be marked similarly? Thoughts?
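As a sketch of what consuming such a marker could look like on the tool side: the snippet below reads the `spFlags` field of a textual DISubprogram node and checks for a flag. `DISPFlagRecursive` is an existing flag; `DISPFlagUnsafe` is purely hypothetical (LLVM does not define it today), and the helper names `sp_flags` and `is_marked_unsafe` are made up for illustration.

```python
import re

# Captures the flag list of a DISubprogram, e.g.
#   spFlags: DISPFlagDefinition | DISPFlagRecursive
SPFLAGS_RE = re.compile(r'spFlags: ([\w |]+)')

# Hypothetical flag name: LLVM has no such flag at the time of writing.
HYPOTHETICAL_UNSAFE_FLAG = "DISPFlagUnsafe"

def sp_flags(subprogram_line):
    """Return the set of spFlags named on a textual DISubprogram node."""
    m = SPFLAGS_RE.search(subprogram_line)
    if not m:
        return set()
    return {flag.strip() for flag in m.group(1).split("|")}

def is_marked_unsafe(subprogram_line):
    """True if the (hypothetical) unsafe flag is set on the subprogram."""
    return HYPOTHETICAL_UNSAFE_FLAG in sp_flags(subprogram_line)

if __name__ == "__main__":
    line = ('!4 = distinct !DISubprogram(name: "f", scope: !1, file: !1, '
            'line: 1, spFlags: DISPFlagDefinition | DISPFlagRecursive, unit: !0)')
    print(sorted(sp_flags(line)))
    print(is_marked_unsafe(line))
```

This only covers unsafe functions; unsafe blocks inside a safe function would still need a marker on the lexical block itself.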
That sounds like a good idea to me. I have no idea what the right way to put this in DWARF would be, though.
(Maybe Rust should request a DW_AT_unsafe flag, the way Fortran has things like DW_AT_recursive? Or, since it applies to more than just the function, would it be DW_TAG_unsafe? Or until then maybe it would make sense to set the DW_AT_name to "unsafe" on the lexical block? But today's the first time I've looked at the standard, so I could easily be completely wrong...)
The above looks like a good interim solution to me.
Do we need agreement before code is written? What is the right forum/person/team to discuss this further?