I have been working in this area for a while, although my approach is more around defining the language around what is natural for the ‘logic’ of the type system. If you look at my GitHub i have a couple of related projects, one is a compositional type system, and the other is Clors which is my equivalent of Chalk. There is an example Clors program that implements the same type compositional type system as my C++ code in tiny amount of code. I have taken the approach of keeping unification simple, but that is because i am defining type equality as unification.
The comment i wanted to make is that Prolog solvers have a common way of dealing with the problem you encountered with associated types, and it is “deferred goals”. In this way you do not need to introduce a new constraint beyond equality to solve associated types, you just add a deferred goal for the reduction that you cannot yet solve. You then replace the type variable with a special “deferred” type variable, which triggers the re-evaluation of the deferred goal if the variable is unified with a ground type in the future. This keeps things ordered and deterministic, instead if introducing a different kind of solver (as constraint solvers have no ordering, and may never terminate).
You will also find this necessary to implement the “does not equal” constraint, and in fact it’s powerful enough to handle general inequalities when you add an integer kind into the mix.
I could probably discuss this whole topic a lot more, as i have spent years on it, most of it getting nowhere on negation. It turns out disequality is easy and safe, but the definition of negation is problematic. Lots of papers have been written on negation as failure (it’s unsound), negation as contradiction, negation as inconsistency etc. In the end the only sound option to me appears to be to rewrite the logic program using disequality, which is easy for simple clauses, but gets really tricky with recursion.
A final comment is i don’t think you need ‘forall’ from first order logic in the logic-language. Prolog like languages generalise variables on instantiation, in other words every clause is implicitly university quantified at the outermost position of the clause head. You can rewrite clauses that have an inner forall into a set of clauses that do not. It really depends on your approach to ‘proof’. My approach has been to keep the core logic small, so it is trivial to prove correct, and then to implement features like type classes by translation into the simple core logic. I guess i may need to reconsider to keep things neet and solving time down, but i have put this kind of decision off until i have implemented negation.
For anyone interested, here is my compositional type-inference algorithm written as a Clors program:
member(Key, cons(Head, Tail), Val) :-
dif(Key,Head),
member(Key, Tail, Val).
member(Key, cons(def(Key, Val), Tail), just(Val)).
member(Key, nil, nothing).
append(cons(H, T1), L, cons(H, T2)) :-
append(T1, L, T2).
append(nil, L, L).
unifyall(cons(T, Tail), T) :-
unifyall(Tail, T).
unifyall(nil, T).
unifyeach(cons(Tapp, Tail), Tdef, Cxt) :-
duplicate_term(Tdef, pair(C1, Tapp)),
unifyeach(Tail, Tdef, Csub),
append(Csub, C1, Cxt).
unifyeach(nil, T, nil).
split(X, cons(def(Y, T), Tail1), Tail2, cons(def(Y, T), Tail3)) :-
dif(X, Y),
split(X, Tail1, Tail2, Tail3).
split(X, cons(def(X, T), Tail1), cons(T, Tail2), Tail3 ) :-
split(X, Tail1, Tail2, Tail3).
split(X, nil, nil, nil).
#-----------------------------------------------------------------------------
nat(zero).
nat(succ(X)) :-
nat(X).
bool(true).
bool(false).
#-----------------------------------------------------------------------------
# type inference
expr(var(X), cons(def(X, Type), nil), Type).
expr(nat(X), nil, nat) :-
nat(X).
expr(bool(X), nil, bool) :-
bool(X).
expr(pair(X, Y), C3, prod(A, B)) :-
expr(X, C1, A),
expr(Y, C2, B),
append(C1, C2, C3).
expr(app(Fun, Arg), Cxt, B) :-
expr(Fun, C1, arrow(A, B)),
expr(Arg, C2, A),
append(C1, C2, Cxt).
expr(lam(Var, Body), Cxt, arrow(A, B)) :-
expr(Body, C1, B),
split(Var, C1, Def, Cxt),
unifyall(Def, A).
expr(let(Var, Body, Rhs), Cxt, T2) :-
expr(Body, C1, T1),
expr(Rhs, C2, T2),
split(Var, C2, Def, C3),
unifyeach(Def, pair(C1, T1), C4),
append(C3, C4, Cxt).
:- expr(
let(cast2, lam(f, lam(x, lam(y, app(app(var(f), var(x)), var(y))))), var(cast2)),
Cxt, Typ).
:- expr(lam(x, let(y, var(x), var(y))), C, T).
:- expr(let(id, lam(x, var(x)), var(id)), C, T).
Its worth noting this is a pure prolog like logic language. The interpreter has no built-in definitions except ‘dif’ which is the disequality constraint, and ‘duplicate_term’ which takes a fresh instantiation of the first argument (with fresh type-variables) and unifies it with the second argument. At the top are the four helper definitions needed for the algorithm, and then we define a couple of primitive types (Bool, and Nat). Finally the algorithm itself which is very short and neat, and some examples. This algorithm infers compositional principal typings. Because it is bottom-up variable capture is not a problem for the algorithm, and we do not need to add first class names or other complexities to cope with variable capture.