Machine Learning in Systems
Jeff Dean (of Google fame) gave a presentation two weeks ago (slides), in which he stated:
Learning Should Be Used Throughout our Computing Systems
Traditional low-level systems code (operating systems, compilers, storage systems) does not make extensive use of machine learning today
This should change
He then went on to show some experiments they have done replacing B-trees, hashmaps, and bloom filters in database indices with neural networks. For B-trees they obtained a 5-100x improvement in memory consumption and ~2x faster lookups, with similar (if not as large) improvements for hashmaps and bloom filters.
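To make the "index as a model" idea concrete, here is a minimal sketch of my own, not the method from the slides: the neural network is swapped for a single straight line fitted through the smallest and largest key, and the names `LearnedIndex`, `predict` and `max_err` are made up for illustration. The model guesses where a key sits in a sorted array, and the worst error seen at build time bounds the search that corrects the guess.

```rust
/// A toy "learned index" over sorted u64 keys (hypothetical type, for illustration).
/// A linear model predicts a key's position; a bounded search corrects the guess.
pub struct LearnedIndex {
    keys: Vec<u64>,
    slope: f64,
    intercept: f64,
    /// Largest |predicted - actual| position over all stored keys.
    max_err: usize,
}

impl LearnedIndex {
    /// Sorts and deduplicates the keys, fits the line, and records the worst error.
    pub fn new(mut keys: Vec<u64>) -> Self {
        keys.sort_unstable();
        keys.dedup();
        let n = keys.len();
        // Deliberately crude model: a line through the first and last key.
        let (slope, intercept) = if n >= 2 {
            let (lo, hi) = (keys[0] as f64, keys[n - 1] as f64);
            let s = (n - 1) as f64 / (hi - lo);
            (s, -s * lo)
        } else {
            (0.0, 0.0)
        };
        let mut idx = LearnedIndex { keys, slope, intercept, max_err: 0 };
        let max_err = (0..n)
            .map(|i| {
                let p = idx.predict(idx.keys[i]);
                if p > i { p - i } else { i - p }
            })
            .max()
            .unwrap_or(0);
        idx.max_err = max_err;
        idx
    }

    /// Predicted position of `key`, clamped to a valid index.
    fn predict(&self, key: u64) -> usize {
        if self.keys.is_empty() {
            return 0;
        }
        let last = (self.keys.len() - 1) as f64;
        let p = self.slope * key as f64 + self.intercept;
        p.round().clamp(0.0, last) as usize
    }

    /// Returns the position of `key` if present, searching only
    /// the window [prediction - max_err, prediction + max_err].
    pub fn get(&self, key: u64) -> Option<usize> {
        if self.keys.is_empty() {
            return None;
        }
        let p = self.predict(key);
        let lo = p.saturating_sub(self.max_err);
        let hi = (p + self.max_err + 1).min(self.keys.len());
        self.keys[lo..hi].binary_search(&key).ok().map(|i| lo + i)
    }
}
```

The memory argument is visible even in this toy: the "index" is two floats and an error bound rather than a tree of nodes, and the better the model fits the key distribution, the smaller the window that has to be searched.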
After some more examples of machine learning applied to systems (datacenter cooling controller, distributed execution device placement), he went on to explain:
[Where should we use ML in systems?] Anywhere We’re Using Heuristics To Make a Decision!
Compilers: instruction scheduling, register allocation, loop nest parallelization strategies, …
Networking: TCP window size decisions, backoff for retransmits, data compression, …
Operating systems: process scheduling, buffer cache insertion/replacement, file system prefetching, …
Job scheduling systems: which tasks/VMs to co-locate on same machine, which tasks to pre-empt, …
ASIC design: physical circuit layout, test case selection, …
As Rust is targeted squarely at systems programming, I think we should be asking how well suited Rust is, as a language, to machine learning, given that ML is likely to make (significant?) inroads into all of these domains that sit squarely in Rust’s sweet spot.
At this point I can imagine you saying, “that’s all well and good, but that’s what we have libraries for! You’ve stumbled into the internals forum by mistake!” And that’s where we come to the second leg of this journey.
Machine Learning is a language/compiler issue
If you have not read the excellent Julia-lang blog post regarding the interplay between languages and machine learning, you should go and do that now because they’ve explained the issue (and opportunity) better than I can do here.
But in summary: most machine learning (especially deep learning) libraries today maintain a separate notion of data-flow graphs with control flow, execution and scoping semantics, and a separate execution runtime which deals with parallelism, distributed execution etc. This is all separate from that provided by the language compiler/runtime/library ecosystem. This duplication of effort causes problems such as difficulty in debugging, lack of type safety, complicated runtime machinery, complicated deployment paths etc.
Combining this with the trend described in the first half of this post, there is a real opportunity here to take a modern language/compiler (such as Rust/rustc), add proper first class support for machine learning primitives (tensor types and supporting machinery mainly, perhaps autodiff), and then extend the compiler to reason about tensor dataflow in addition to the types it already understands.
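For anyone unfamiliar with autodiff, this is roughly what it looks like when done purely in a library today: a small sketch of forward-mode differentiation with dual numbers. The `Dual` type and the function `f` are hypothetical names of mine, not taken from any existing crate, and first class compiler support would presumably go well beyond this.

```rust
use std::ops::{Add, Mul};

/// A dual number: a value together with its derivative with respect to the input.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Dual {
    pub val: f64,
    pub der: f64,
}

impl Dual {
    /// The variable we differentiate with respect to (dx/dx = 1).
    pub fn var(x: f64) -> Self {
        Dual { val: x, der: 1.0 }
    }

    /// A constant (derivative 0).
    pub fn constant(c: f64) -> Self {
        Dual { val: c, der: 0.0 }
    }
}

impl Add for Dual {
    type Output = Dual;
    fn add(self, o: Dual) -> Dual {
        // Sum rule: (u + v)' = u' + v'
        Dual { val: self.val + o.val, der: self.der + o.der }
    }
}

impl Mul for Dual {
    type Output = Dual;
    fn mul(self, o: Dual) -> Dual {
        // Product rule: (u * v)' = u' * v + u * v'
        Dual { val: self.val * o.val, der: self.der * o.val + self.val * o.der }
    }
}

/// f(x) = x^3 + 2x, written as ordinary Rust arithmetic.
pub fn f(x: Dual) -> Dual {
    x * x * x + Dual::constant(2.0) * x
}
```

Calling `f(Dual::var(3.0))` gives both f(3) = 33 and f'(3) = 29 in one pass. The limitation is that the compiler sees only opaque structs and operator overloads here, which is part of the argument for teaching it about these primitives directly.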
There are many benefits that come from doing this, including re-use and leverage of existing compiler optimizations, deployment/debug/profiling tooling, library ecosystem/infrastructure etc. I can expand on these further if anyone would like.
Fortunately, there is already work in this direction: see this recent paper and associated LLVM conference talk, where a staged IR is layered on top of LLVM (working up from the bottom), and a type safe DSL is then embedded into Swift from the top. It is still early and not yet open source, but I like to view this as a sign of things likely to come.
I think that Rust is well positioned to capitalize on these trends, much as Swift is being taken in this direction in the talk and paper above, and I wanted to open a dialogue here:
- Does this community see the importance of machine learning in systems, and the corresponding opportunity for Rust? This is an area I work closely in, so I’d be happy to continue this discussion/give more supporting evidence if others disagree.
- How could rustc iteratively experiment/advance in this space? I think const generics are a really important first step (from what I understand of them), and I’d imagine they give much of the bedrock that first class tensor types would be built on? Then I’d think we’d want MIR optimization passes operating on these tensor types, either doing all the tensor level optimizations at that point or emitting out to a DSL like DLVM? Is it possible to register MIR passes in a library, similar to procedural macros? I think that would be one way this stuff could be experimented on outside of rustc itself.
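To illustrate why I think const generics are the bedrock here, this is a rough sketch of a shape-checked matrix type. It assumes a toolchain where const generics are available (they are stable as of Rust 1.51), and `Matrix` and `matmul` are names I made up, not an existing API.

```rust
/// A fixed-shape matrix whose dimensions are part of its type (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Matrix<const R: usize, const C: usize> {
    data: [[f32; C]; R],
}

impl<const R: usize, const C: usize> Matrix<R, C> {
    pub fn new(data: [[f32; C]; R]) -> Self {
        Matrix { data }
    }

    /// (R x C) * (C x K) -> (R x K). The inner dimension C must match,
    /// and the compiler enforces that through the types.
    pub fn matmul<const K: usize>(&self, rhs: &Matrix<C, K>) -> Matrix<R, K> {
        let mut out = [[0.0f32; K]; R];
        for i in 0..R {
            for j in 0..K {
                for k in 0..C {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Matrix { data: out }
    }
}
```

With this, multiplying a `Matrix<2, 3>` by a `Matrix<3, 2>` type checks and yields a `Matrix<2, 2>`, while multiplying a `Matrix<2, 3>` by another `Matrix<2, 3>` is rejected at compile time. That covers statically known shapes only; what the library level cannot do is hand the optimizer the tensor dataflow itself, which is where the MIR pass question comes in.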