Blog post: an on-demand SLG solver for Chalk


#1

This blog post describes some experimental work on an alternative solver for Chalk that I think may be a better fit for rustc:

http://smallcultfollowing.com/babysteps/blog/2018/01/31/an-on-demand-slg-solver-for-chalk/

Feedback welcome!


#2

Sounds like you have a good system, except for doing union queries. Union queries are a lot like database joins. There is a lot of ink spilt on how to make databases fast, so maybe some of that will be helpful. Specifically, it made me think of “Worst-case optimal joins, in dataflow”, which roughly suggests doing away with a fixed join plan and switching strategies during the computation.


#3

Thanks for this informative post :slight_smile:

One thing I kept wondering about: You worry about internal run-away on queries like exists<T> { Rc<T>: Debug }, but when do such queries even come up in the compiler? I don’t feel like I want type inference to do open-ended queries like that, but I’m probably missing something.


#4

Good question!

Partly the answer is that I too do not want type inference doing open-ended queries, and for that reason I am paying special attention to them to be sure that we don’t waste too much time when they do happen.

But I guess the question is “how can they happen in the first place”, or maybe “can’t we just not make such queries”. I think to some extent the answer is yes, we can try to hold off on doing queries until the “input types” are fully resolved.

But it’s not always that simple. There are several cases where we rely on the trait solver to help us push type inference through. Some examples:

Associated type normalization. For example, consider this simple case:

impl<A, B> Foo for A
    where A: Iterator<Item = B>
{ ... }

when lowered to Chalk, you get something like:

(?A: Foo) :- 
    ?A: Iterator,
    <?A as Iterator>::Item = ?B

Note that ?B is unconstrained here (of course, it will be derived from ?A).
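To see why it is fine to leave ?B unconstrained, here is a plain-Rust analogue (my own illustration, not from the post): once the concrete iterator type is known, its Item type follows by normalizing the projection, so inference never has to guess it.

```rust
// Hypothetical illustration: the type standing in for B in
// `A: Iterator<Item = B>` is derived from A by normalizing
// `<A as Iterator>::Item`, not enumerated independently.
fn first_item<A: Iterator>(mut iter: A) -> Option<A::Item> {
    iter.next()
}

fn main() {
    // Here A = std::vec::IntoIter<u32>, so <A as Iterator>::Item
    // normalizes to u32 once A is resolved.
    let x = first_item(vec![1u32, 2, 3].into_iter());
    assert_eq!(x, Some(1u32));
}
```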

Partially restricted impls. Similarly, we sometimes have “partially restricted” examples, like Vec<T>: Index<usize> and Vec<T>: Index<Slice<usize>>, where we’d like to be able to infer that the 3 in vec[3] is a usize (as opposed to some other kind of integer). It is initially an unconstrained type variable, so we start out with something like Vec<T>: Index<?U>, and we have to constrain ?U.

Integer literals have a bit of additional complexity here: in the compiler, they are like a special ad-hoc kind of type variable that only unifies with integer types. That helps us here to narrow down the impls. In this SLG solver, you could model that instead with a pair of goals, like Vec<T>: Index<?U> and IsIntegral(?U), where IsIntegral is true for the integral types.
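For concreteness, here is a small Rust snippet (my own example, not from the post) showing the inference in action: the literal used as an index starts as an unconstrained integer variable, and solving the Index obligation pins it to usize.

```rust
// Illustration: indexing a Vec narrows an otherwise-unconstrained
// integer literal to usize, since that is the integral type for which
// the Index obligation on Vec can be satisfied.
fn main() {
    let v = vec![10, 20, 30, 40];
    // `n` starts as an integer-type variable; its use in `v[n]`
    // (roughly, solving Vec<i32>: Index<?U> plus the integral
    // restriction) forces ?U = usize.
    let n = 3;
    let elem = v[n];
    assert_eq!(elem, 40);
}
```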


#5

Regarding the resolution of:

impl<T> Foo for T
  where T: Bar, T: Baz,

I wonder if there would be a way to alternate between T: Bar and T: Baz.

If we name those two subgoals A and B, and imagine that A has 1,000 possible answers (#A) while B has 10 possible answers (#B), then the “Oracle” way is to enumerate B’s answers and cross-check them against A. In this way, B generates 10 answers and A cross-checks 10 answers. This gives the optimal complexity of ϴ(min(#A, #B)) (in the number of answers generated).

The other way around, where A generates 1,000 answers and B cross-checks all 1,000 of them, has a worst-case complexity of ϴ(max(#A, #B)).

So, what about simply alternating?

That is, let A produce an answer, cross-check it with B, then let B produce an answer, cross-check it with A, and keep going. In this way:

  • A generates 10 answers and B cross-checks 10 answers,
  • B generates 10 answers and A cross-checks 10 answers.

In the end, we end up with a worst-case complexity of ϴ(2*min(#A, #B)).

Or, in the general case of N sub-goals, ϴ(N*min(#A[0], #A[1], ..., #A[N-1])), which doesn’t sound so bad compared to the optimal complexity.
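The alternation idea above can be sketched in Rust (my own code, with hypothetical names; the real solver would work over logic goals, not integers). Each subgoal is modeled two ways: as a stream of answers and as a predicate we can cross-check a candidate against. We pull one answer from each side per round; once either side runs dry, every joint answer has been found, since each joint answer is an answer of the exhausted side and was already cross-checked against the other.

```rust
use std::collections::BTreeSet;

// Alternating enumeration of two subgoals' answer streams.
// Terminates after about min(#A, #B) rounds, pulling one answer
// from each side per round.
fn alternating_join(
    mut answers_a: impl Iterator<Item = u32>,
    mut answers_b: impl Iterator<Item = u32>,
    holds_a: impl Fn(u32) -> bool,
    holds_b: impl Fn(u32) -> bool,
) -> BTreeSet<u32> {
    let mut joint = BTreeSet::new();
    loop {
        match answers_a.next() {
            Some(x) if holds_b(x) => { joint.insert(x); }
            Some(_) => {}
            // A is exhausted: all of A's answers were cross-checked
            // against B, and any joint answer is an answer of A.
            None => return joint,
        }
        match answers_b.next() {
            Some(y) if holds_a(y) => { joint.insert(y); }
            Some(_) => {}
            None => return joint,
        }
    }
}

fn main() {
    // A has 1,000 answers, B has 10; their intersection is B itself.
    // We stop after ~10 rounds instead of enumerating all of A.
    let a = 0..1000u32;
    let b = (0..10u32).map(|i| i * 100);
    let joint = alternating_join(a, b, |x| x < 1000, |x| x % 100 == 0);
    assert_eq!(joint.len(), 10);
}
```

Note the BTreeSet deduplicates an answer that happens to be produced by both sides in the same run.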