Transitioning to MIR

Perhaps. Do you have a more specific proposal in mind? Figuring out just what is being borrowed does require at least a bit of calculation over the HIR, as mem_categorization does today.

Thinking about it, references to fields of packed structs are an lvalue problem (patterns have their own, different set of problems), so it may make sense to compute safety during lowering, use that to add an unsafe bit to MIR, and do the check later.
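
For anyone unfamiliar with why packed fields are special, here is a minimal user-level sketch. `Header` and `read_value` are made-up names for illustration, and this shows only the source-level symptom in present-day Rust, not the proposed compiler check: the field `value` sits at offset 1, so an ordinary `&u32` to it would be misaligned, and the read has to go through a raw pointer instead.

```rust
use std::ptr;

/// Hypothetical example type. `packed` removes padding, so `value` sits at offset 1.
#[repr(C, packed)]
#[derive(Clone, Copy)]
pub struct Header {
    pub tag: u8,
    pub value: u32,
}

/// Reads the possibly misaligned field without ever creating a `&u32`.
pub fn read_value(h: &Header) -> u32 {
    // `addr_of!` produces a raw pointer to the field with no intermediate reference.
    let p = ptr::addr_of!(h.value);
    // SAFETY: `p` points into a live `Header`, and `read_unaligned` tolerates misalignment.
    unsafe { p.read_unaligned() }
}
```

The lvalue `h.value` is the thing that needs the safety decision here, which is why computing it while lowering lvalues seems natural.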

I’m thinking that a general transformation pass pipeline for MIR functions would be useful. There are a few optimisations that LLVM struggles with because we can’t quite encode the correct semantics in its IR.

I think that the basic passes we need are: drop optimisation, overflow check insertion, inlining (maybe), and some sort of copy elimination based on dataflow (which should cover NRVO at the same time).
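
To make the pipeline idea a bit more concrete, here is a rough sketch over a toy IR. Everything in it (`Stmt`, `Body`, `Pass`, `run_pipeline`, and the two passes) is hypothetical and invented for illustration; it is not the compiler's actual MIR or pass interface, only the shape such a pipeline might take.

```rust
/// Toy stand-in for a MIR function body (hypothetical; real MIR is far richer).
#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    Assign { dest: String, src: String },
    Nop,
}

#[derive(Debug, Default)]
pub struct Body {
    pub stmts: Vec<Stmt>,
}

/// Hypothetical pass interface: each pass rewrites one function body in place.
pub trait Pass {
    fn name(&self) -> &'static str;
    fn run(&self, body: &mut Body);
}

/// Replaces self-assignments (`x = x`) with no-ops.
pub struct RemoveSelfAssign;

impl Pass for RemoveSelfAssign {
    fn name(&self) -> &'static str {
        "remove-self-assign"
    }
    fn run(&self, body: &mut Body) {
        for s in &mut body.stmts {
            let is_self_assign = match &*s {
                Stmt::Assign { dest, src } => dest == src,
                Stmt::Nop => false,
            };
            if is_self_assign {
                *s = Stmt::Nop;
            }
        }
    }
}

/// Deletes the no-ops left behind by earlier passes.
pub struct RemoveNops;

impl Pass for RemoveNops {
    fn name(&self) -> &'static str {
        "remove-nops"
    }
    fn run(&self, body: &mut Body) {
        body.stmts.retain(|s| *s != Stmt::Nop);
    }
}

/// Runs the passes in order and returns the names of the passes that ran.
pub fn run_pipeline(body: &mut Body, passes: &[Box<dyn Pass>]) -> Vec<&'static str> {
    let mut ran = Vec::new();
    for pass in passes {
        pass.run(body);
        ran.push(pass.name());
    }
    ran
}
```

Keeping each pass small and independently named is what would make it cheap to dump the body between passes and see what each one did.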

I should probably spell out my thoughts here more. @arielb1 and I had a brief chat on IRC the other day and it seemed like we had slightly different thoughts so it’s worth discussing more widely. My general feeling is that we should expect to do optimizations on the MIR when it makes sense, and that after the static safety checking is done, we may want to edit the MIR in ways that would no longer type check (but which we believe to be sound).

I also think that, whenever it makes sense, we should do optimizations on the MIR rather than doing them “on the fly” during trans. I believe this will result in more readable code (because the MIR is simpler); more portable optimizations if we ever move beyond LLVM; and more debuggable results, because we can easily dump the MIR at any point and observe what has been done (this is always hard with LLVM output, imo).

Some examples of things I would like to do on the MIR:

  1. Expose the dynamic drop flags as MIR temporaries. After borrowck is done, rewrite DROPs to drop only the paths they need to, and to take a boolean argument that is either a constant true or else a stack variable in the cases where dynamic drop is required.
  2. Constant propagation, particularly as it is useful for deciding what to promote (in LLVM) to static values etc.
  3. Removing temporaries and stripping out other assignments that serve no logical role (e.g., zero-sized types). We must be cautious around destructors here; that said, because calls to DROP and so forth are already inserted explicitly in the MIR, this may not need any special care.
    • example: rewriting tmp0 = ..; tmp1 = ..; x = Foo { f1: tmp0, f2: tmp1 }; to x.f1 = ...; x.f2 = ...;
  4. Removing no-op coercions etc.
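
As a rough illustration of item 1, the explicit drop flag can be emulated by hand in ordinary Rust. This is a sketch under stated assumptions, not how the compiler implements it: `Noisy`, `consume`, `drop_if`, and `conditional_move` are hypothetical names, and `ManuallyDrop` plus a plain `bool` stand in for the MIR-level DROP that takes a boolean argument.

```rust
use std::cell::RefCell;
use std::mem::ManuallyDrop;

/// Hypothetical type that records its own destruction in a shared log.
pub struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Takes ownership, logs that it did, then drops the value on return.
fn consume(n: Noisy<'_>) {
    n.log.borrow_mut().push("consumed");
}

/// Models `DROP(x, flag)`: runs the destructor only if the flag says `x` is still live.
///
/// SAFETY contract: `live` must be true only if `x` has not been moved out or dropped.
unsafe fn drop_if(live: bool, x: &mut ManuallyDrop<Noisy<'_>>) {
    if live {
        ManuallyDrop::drop(x);
    }
}

pub fn conditional_move(move_it: bool, log: &RefCell<Vec<&'static str>>) {
    let mut x = ManuallyDrop::new(Noisy { name: "x", log });
    let mut x_live = true; // the drop flag, here an ordinary stack variable

    if move_it {
        // SAFETY: `x` is still live, and the flag is cleared so it is never touched again.
        let moved = unsafe { ManuallyDrop::take(&mut x) };
        x_live = false;
        consume(moved);
    }

    // End of scope: the drop is conditional on the flag.
    // SAFETY: `x_live` is true exactly when `x` has not been moved out.
    unsafe { drop_if(x_live, &mut x) };
}
```

When the flag is statically known (no conditional move), the boolean would simply be the constant true and the stack variable disappears.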

Drop flags would have to be allocas, not temporaries.

A “temporary” in MIR is just a compiler-generated variable (i.e., it IS an alloca). (Though I don’t think we currently assign to temporaries more than once.)

Wouldn't it be better to move to an even lower representation instead of declaring "now the MIR doesn't type check anymore"? This representation would then have no notion of dropping, but would explicitly contain the drop flag and a "drop terminator" (or even already a call to the drop glue?).

The way I see it, we ARE moving to a lower-level representation by declaring that MIR doesn't have to type-check anymore. That is, we can have two 'logical' representations that both use the same compiler data structures. Changing to an ACTUAL different IR is overhead -- both in terms of compiler maintenance and potentially runtime -- and I think MIR (possibly with some edits over time) is at a pretty low level of abstraction that is quite suitable for most optimizations we might do. It's worth pointing out that MIR already includes a number of bits of type-safety that have been desugared -- for example, bounds checking -- and which are hence no longer easy to check, and I don't think the things I have in mind are so very different.
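
To illustrate the bounds-checking point, here is roughly what `xs[i]` looks like once the check has been made explicit, written as plain Rust rather than actual MIR. `index_desugared` is a hypothetical name and the exact shape is an assumption; the point is only that after desugaring, the access itself is an unchecked operation whose safety depends on a separate assertion.

```rust
/// Hand-desugared version of `xs[i]`: an explicit check followed by an unchecked access.
pub fn index_desugared(xs: &[i32], i: usize) -> i32 {
    let len = xs.len();
    let in_bounds = i < len;
    // The check is now just an ordinary assertion on a boolean temporary.
    assert!(
        in_bounds,
        "index out of bounds: the len is {} but the index is {}",
        len, i
    );
    // SAFETY: `i < len` was asserted above. Nothing in the types ties this
    // access to that assertion, which is why it is hard to re-check later.
    unsafe { *xs.get_unchecked(i) }
}
```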

However, it may be that we can devise other ways to achieve the same effect. I think @arielb1 had some suggestions along these lines. This could also be fine. I basically just don't want to push work onto trans or elsewhere. And certainly I see the potential value in retaining invariants over MIR so that we can have "sanity check" assertions -- I just don't think it outweighs the value of doing things in straightforward ways and at the MIR level, rather than at the LLVM level.
