This is a rather long post, in which I will discuss the design space that local and global inference give to a programming language. As far as I understand, C++ is much more on the side of global inference, while Rust prefers local inference. I would like to know if a middle ground could be applied to Rust: global inference within the current crate only, while any exported item would still follow the current rules.
Let's consider a binary with a dependency on the library G, which itself has a dependency on the library H. They provide these functions:
- main, f2, f3: from the main binary
- g1, g2, g3: from a first library G
- h1, h2: from a second library H
And now, let's consider this call chain:
main -> f2() -> f3() -> g1() -> g2() -> g3() -> h1() -> h2()
- g1 and h1 are respectively the entry points of G and H (technically I think that main could be considered the entry point of the binary too).
- f3 and g3 are internal functions that call functions from external libraries
- f2, g2 and h2 interact only with the library/binary they belong to
Furthermore:
- all functions are generic, and an object is passed as-is from main to h2
- h2 returns errors that are handled in main
For this example we will consider the implications of these two changes:
- the type of the object created in main is changed, and the internal of h2 must be modified to handle this new type
- h2 can return a new type of error that must be handled by main.
Global inference (C++)
C++ chose to use global inference for many things, like error handling (exceptions) and generics (template errors are generated after monomorphisation).
int main() {
    try {
        f2();
    } catch (SomeError& e) {
        // do something with e
    }
}
// SomeType could be a template parameter or a C++20 concept, it doesn't really matter
// the input could also be passed by reference to allow dynamic polymorphism
auto h2(SomeType input) -> decltype(auto) {
    if (some_condition) {
        throw SomeError{};
    }
    // rest of the implementation
}
All the other functions would look like this:
template <class T>
auto foo(T&& input) -> decltype(auto) {
    // std::forward needs the explicit template argument to preserve the value category
    return bar(std::forward<T>(input));
}
With foo and bar being any pair of functions in the call stack.
Since C++ uses global inference, it's really easy to apply the two proposed changes. Only main() and h2() need to be modified. That's a big upside. No code needs to be changed in any of the other functions, since they forward everything.
The main downside, however, is that changing the interface of h2 (input/output types or concepts), as well as the exceptions it can throw, can silently break semver. This is because the public function h1 re-exports that interface itself. For the same reason, updating H will silently break the semver of G, since g1 re-exports the interface of g2 (and through it that of h1), and so on transitively up to main.
And obviously the error messages are going to be much more complicated to debug, since modifying h2 can break main or vice versa. What makes it really bad is that the error can come from extremely far away in the dependency chain.
Local Inference (Rust)
At the opposite end of the spectrum, Rust uses local inference.
// main binary
fn main() {
    let value = 42; // the type of `value` must implement G::SomeTrait
    match f2(value) {
        Ok(_) => { /* ... */ }
        Err(G::SomeError) => { /* ... */ }
    }
}
fn f2<T: G::SomeTrait>(input: T) -> Result<SomeType, G::SomeError> { /* ... */ }
// library G: re-export of H types and traits
pub use H::SomeError;
pub use H::SomeTrait;
pub fn g1<T: SomeTrait>(input: T) -> Result<SomeType, SomeError> { /* ... */ }
fn g2<T: SomeTrait>(input: T) -> Result<SomeType, SomeError> { /* ... */ }
fn g3<T: SomeTrait>(input: T) -> Result<SomeType, SomeError> { /* ... */ }
// library H
pub struct SomeError; // implements Error and possibly other traits
pub fn h1<T: SomeTrait>(input: T) -> Result<SomeType, SomeError> { /* ... */ }
fn h2<T: SomeTrait>(input: T) -> Result<SomeType, SomeError> {
    if some_condition {
        return Err(SomeError);
    }
    // ...
}
If we want to change the type of value in main, we can do it, as long as the new type implements the trait SomeTrait. If it doesn't, we will get a nice error message saying that the type of value should implement it. However, if we want to modify h2 to accept a NewTrait, we will need to modify the whole call stack (no matter whether NewTrait is a superset, a subset, or unrelated to SomeTrait).
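To make the boilerplate concrete, here is a minimal sketch of what the propagation looks like. The trait bodies, the shortened call chain (f2 -> g1 -> h1 -> h2) and the String return type are assumptions made for the example; only the names SomeTrait and NewTrait come from the discussion above.

```rust
// Hypothetical stand-ins for the traits discussed above.
trait SomeTrait {
    fn describe(&self) -> String;
}

// The bound that h2 newly requires.
trait NewTrait {
    fn size_hint(&self) -> usize;
}

impl SomeTrait for i32 {
    fn describe(&self) -> String {
        format!("i32({})", self)
    }
}

impl NewTrait for i32 {
    fn size_hint(&self) -> usize {
        4
    }
}

// Innermost function: the only one whose body actually uses NewTrait.
fn h2<T: SomeTrait + NewTrait>(input: T) -> String {
    format!("{} uses {} bytes", input.describe(), input.size_hint())
}

// Pure forwarders: their bodies do not change, but every signature
// has to repeat the new bound or the call to the next function is rejected.
fn h1<T: SomeTrait + NewTrait>(input: T) -> String {
    h2(input)
}

fn g1<T: SomeTrait + NewTrait>(input: T) -> String {
    h1(input)
}

fn f2<T: SomeTrait + NewTrait>(input: T) -> String {
    g1(input)
}

fn main() {
    println!("{}", f2(42_i32));
}
```

Removing `+ NewTrait` from any one of the forwarders makes that function fail to compile, with an error pointing at its own body rather than at h2.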
The situation is exactly the same for the error returned by h2. If we want to return another error type, we must propagate it through the whole call stack. This is usually mitigated by using a single error enum for the whole crate. Using a single enum has other downsides: for example, if foo() returns Result<_, AllErrors>, the caller can still try to extract a BarError from it even if foo() will never return such an error.
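A small sketch of that downside, under assumed names: AllErrors, foo and bar follow the text, while the variants, payloads and the handle function are made up for illustration.

```rust
// Hypothetical crate-wide error enum shared by every function.
#[derive(Debug, PartialEq)]
enum AllErrors {
    Foo(String),
    Bar(u32),
}

// bar is the only function that can produce AllErrors::Bar.
fn bar(n: u32) -> Result<u32, AllErrors> {
    if n > 10 {
        Err(AllErrors::Bar(n))
    } else {
        Ok(n)
    }
}

// foo never calls bar, so it can only ever fail with AllErrors::Foo,
// but its signature cannot express that.
fn foo(s: &str) -> Result<usize, AllErrors> {
    if s.is_empty() {
        return Err(AllErrors::Foo("empty input".to_string()));
    }
    Ok(s.len())
}

fn handle(s: &str) -> String {
    match foo(s) {
        Ok(len) => format!("len {}", len),
        Err(AllErrors::Foo(msg)) => format!("foo failed: {}", msg),
        // Never taken in practice, yet the compiler requires an arm for it.
        Err(AllErrors::Bar(n)) => format!("bar failed: {}", n),
    }
}

fn main() {
    println!("{}", handle("abc"));
    println!("{}", handle(""));
    println!("{:?}", bar(11));
}
```

The match in handle has to cover the Bar variant even though no code path through foo can produce it, which is the imprecision described above.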
Local inference has two big advantages: it's really hard to break semver without noticing (you would have to modify the public API of the crate), and the error messages can be as good as possible! The price to pay is a lot of work and boilerplate whenever one wants to change any interface.
Something in between
I would like to know if a middle ground could be considered for the design of new Rust features:
- Would it be possible to allow global inference, as long as the inference cannot cross the boundary of what is exported by the crate?
For errors, this would mean that f2, g2, g3 and h2 could return an anonymous enum of unknown types, but that anonymous enum would have to be converted to a regular enum when exported publicly in g1 and h1. Likewise, the constraint on the type accepted by h2 could be anything, as long as h1 converts it to T: SomeTrait (and likewise for g3/g1). The same strategy can be applied to the ABI: only the public interface of the module must adhere to the ABI of the caller. Internally, anything can be used.
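For comparison, the closest approximation I know of in today's Rust is to use Box<dyn Error> internally and convert to a concrete enum at the public boundary. The sketch below is only an analogy: the error types, the &str input and the doubling in g2 are assumptions, and unlike the proposed anonymous enum the conversion is checked at run time, so a forgotten internal error silently ends up in a catch-all variant.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Public, statically known error type (hypothetical variants).
#[derive(Debug, PartialEq)]
pub enum PublicError {
    Parse(String),
    Negative(i64),
    Other(String),
}

// Internal error type that never appears in a public signature.
#[derive(Debug)]
struct NegativeError(i64);

impl fmt::Display for NegativeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "negative value: {}", self.0)
    }
}

impl Error for NegativeError {}

// Private functions: the error type is erased, so adding a new kind of
// failure in g3 does not require touching the signature of g2.
fn g3(input: &str) -> Result<i64, Box<dyn Error>> {
    let n: i64 = input.trim().parse()?; // ParseIntError is boxed by `?`
    if n < 0 {
        return Err(Box::new(NegativeError(n)));
    }
    Ok(n)
}

fn g2(input: &str) -> Result<i64, Box<dyn Error>> {
    Ok(g3(input)? * 2)
}

// Public entry point: erased errors are converted back to a concrete enum.
pub fn g1(input: &str) -> Result<i64, PublicError> {
    g2(input).map_err(|e| {
        if let Some(neg) = e.downcast_ref::<NegativeError>() {
            PublicError::Negative(neg.0)
        } else if let Some(parse) = e.downcast_ref::<ParseIntError>() {
            PublicError::Parse(parse.to_string())
        } else {
            // Run-time fallback: the compiler cannot tell us a case is missing.
            PublicError::Other(e.to_string())
        }
    })
}

fn main() {
    println!("{:?}", g1("21"));
    println!("{:?}", g1("-3"));
    println!("{:?}", g1("abc"));
}
```

The middle ground asked about here would give the same freedom to g2 and g3 while letting the compiler verify that g1 handles every error that can actually reach it.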
In this middle ground, when applying the proposed modifications, only main and h2 (obviously), as well as g1 and h1, would need to be updated. None of f2, f3, g2 and g3 would need to be modified.
// all exported traits and types must be public and statically known, like today
pub use H::SomeTrait;
pub use H::SomeError;
// public interface, everything is specified like today
pub fn g1<T: SomeTrait>(input: T) -> Result<SomeType, SomeError> {
    g2(input) // errors are analyzed post-monomorphisation
}
// private interface, we can forward things like in C++
// I added the following hypothetical syntax:
// T: ??? means any type
// enum... is an anonymous enum
fn g2<T: ???>(input: T) -> Result<SomeType, enum...> { g3(input) }
fn g3<T: ???>(input: T) -> Result<SomeType, enum...> { h1(input) }
Both T: ??? and enum... would result in post-monomorphisation errors (currently Rust reports almost all of its errors before monomorphisation).
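As a point of comparison, a narrow form of post-monomorphisation error can already be produced on stable Rust through constant evaluation. The sketch below uses an associated constant that is evaluated once per concrete T; the names AssertNotZeroSized and store are made up for the example, and it is only meant to show what such an error looks like in practice, not how the proposed feature would be implemented.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Helper type whose associated constant carries a per-type check.
#[allow(dead_code)]
struct AssertNotZeroSized<T>(PhantomData<T>);

impl<T> AssertNotZeroSized<T> {
    // Evaluated separately for each concrete T that is actually used.
    const CHECK: () = assert!(size_of::<T>() > 0, "zero-sized types are not accepted");
}

// Note that the signature says nothing about the restriction on T.
fn store<T>(value: T) -> Vec<T> {
    // Referencing the constant forces its evaluation for this T.
    let () = AssertNotZeroSized::<T>::CHECK;
    vec![value]
}

fn main() {
    println!("{:?}", store(7_u8));
    // store(()); // rejected only once `store::<()>` is instantiated during a full build
}
```

The restriction is invisible in the signature of store and the error appears at the instantiation site, which is exactly the kind of trade-off that T: ??? would introduce inside a crate.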
Not allowing global inference for any item publicly exported by the current crate guarantees that any change to an internal function that would otherwise change the public API (like we did in C++) results in a compile-time error.
Allowing global inference, even if it's internal to the current crate, has trade-offs compared to the current situation.
To sum up, I'm not asking for any specific feature that would require global inference; I would like to know whether global inference, as long as it's internal to the current crate, is something that could be used when designing new Rust proposals.