Automatic Differentiation/Differential Programming via LLVM

wsmoses · October 9, 2020, 9:54pm

I just saw your discussion on Automatic Differentiation/Differential Programming Support. Some collaborators and I just open-sourced a tool that I think would be really useful to Rust.

Specifically, we just released Enzyme, a plugin for LLVM that creates high-performance gradients (technically adjoints) from LLVM IR.

One of the cool things we found is that performing AD at the LLVM level lets us do AD after optimization, which can yield significant speedups. It also means that one could differentiate through foreign function calls (assuming the libraries contain embedded bitcode).

There are of course certain limitations (e.g. code has to be statically analyzable [not dlopen'd], some type information has to be passed as metadata, exception support isn't implemented yet, etc.), but in practice it seems to work quite well.

Anyone here interested in helping package this for use in Rust? I've never actually used Rust before but could definitely help with all the LLVM/Enzyme pieces. We also welcome anyone who wants to contribute to the main Enzyme project.

For those more technically inclined, we put out a preprint here that goes into more detail.

H2CO3 · October 10, 2020, 1:57am

Do I understand correctly that this would require no changes to the language itself? That would make it far superior to previous AD proposals.

wsmoses · October 11, 2020, 9:16pm

Not an expert on Rust, but I believe it wouldn't require linguistic changes to function (though some may be helpful). In essence, you would just need to call a fictitious external function that Enzyme would fill in for you during "optimization."

There are, however, some changes to the compiler that may be necessary or helpful for optimization. This includes loading the LLVM plugin, passing down some type information as LLVM metadata, and ensuring that LLVM has access to all the relevant bitcode when AD is run.

It may also be desirable to extend the language to allow the user to specify a custom gradient operation (rather than rely on the derived version). This can be useful for taking advantage of algorithmic information the user may have for optimization.

In Clang (C/C++), we do this by adding two function attributes.

// Custom forward pass
__attribute__((enzyme_augment(mygradient)))
// Custom reverse pass
__attribute__((enzyme_gradient(mygradient)))
double foo(double);

Of course if one doesn't want to enable custom gradients, the linguistic extension isn't necessary.
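
To make the forward/reverse split concrete, here is a minimal sketch in plain Rust. It does not use Enzyme, and every name in it (`Tape`, `foo_augmented`, `foo_reverse`) is hypothetical rather than part of any real API. It assumes `foo` is a square root: the augmented forward pass saves the root, and the reverse pass reuses it instead of recomputing it, which is the kind of algorithmic knowledge a hand-written gradient can exploit.

```rust
// Hypothetical sketch: names and signatures are illustrative, not Enzyme's API.

// The original function whose derivative we want.
fn foo(x: f64) -> f64 {
    x.sqrt()
}

// Values saved by the forward pass for later use by the reverse pass.
struct Tape {
    root: f64,
}

// Augmented forward pass: computes foo(x) and records what the reverse pass needs.
fn foo_augmented(x: f64) -> (f64, Tape) {
    let root = x.sqrt();
    (root, Tape { root })
}

// Reverse pass: maps the adjoint of the output to the adjoint of the input.
// d/dx sqrt(x) = 1 / (2 * sqrt(x)), and sqrt(x) is already on the tape.
fn foo_reverse(tape: &Tape, d_out: f64) -> f64 {
    d_out / (2.0 * tape.root)
}

fn main() {
    let (y, tape) = foo_augmented(4.0);
    assert_eq!(y, foo(4.0));
    let dx = foo_reverse(&tape, 1.0);
    println!("foo(4) = {}, d foo/dx at 4 = {}", y, dx);
}
```

Whether a Rust-side attribute would take a pair of functions shaped like this is an open design question; the sketch only shows the division of labor between the two passes.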

carbotaniuman · October 12, 2020, 3:55am

I mean we could also do stuff with proc macros to get those attributes to work, but the hard part will definitely be getting the plugin to work properly.

mcy · October 13, 2020, 3:53pm

I imagine it would be nice to be able to write #[enzyme] on functions. Unclear if this can be done with a proc-macro, though, since AIUI it requires:

loading an LLVM plugin
passing annotations down to the IR

I'm curious what it would take to thread this kind of information through rustc...

nestordemeure · October 16, 2020, 7:50pm

The ability to indicate that you want the gradient of one function and having it and its dependencies (that might be in other crates and not written with that in mind) differentiated would be extremely powerful.

I wonder how good Rust is at forwarding user-defined information to LLVM... Would it be possible to forward information to Enzyme without a deeply modified Rust compiler (just adding the plugin to LLVM)?

To keep expectations in check: note that you need both automatic differentiation and GPU tensors to get deep learning these days. But nothing stops you from decoupling them.

wsmoses · October 17, 2020, 4:01am

This isn't how it should be used, but as a cute test I got rustc to run Enzyme AD successfully via the linker. Done this way, no compiler modification is necessary.

// test.rs
extern { fn __enzyme_autodiff(_: usize, ...) -> f64; }

fn square(x : f64) -> f64 {
   return x * x;
}

fn main() {
   unsafe {
      println!("Hello, world {} {}!", square(3.0), __enzyme_autodiff(square as usize, 3.0));
   }
}

$ rustc -Clinker=clang -Clink-arg=-fuse-ld=lld-11  test.rs -C link-args="-Wl,-mllvm=-load=$HOME/git/Enzyme/enzyme/build11/Enzyme/ClangEnzyme-11.so" -C lto -Clinker-plugin-lto
$ ./test
Hello, world 9 6!

Obviously the user experience would benefit a lot from massaging both the calling convention and the compilation flags, but hopefully this indicates that it should be possible without being too intrusive.

Also note that importing Enzyme this way doesn't run it at the best point in the optimization pipeline, which a proper integration should fix for performance reasons.
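
As a sanity check on the `6` printed above, here is a small plain-Rust sketch that does not involve Enzyme at all. It compares a hand-written derivative of `square` against a central finite difference. The helpers `square_grad` and `central_diff` are hypothetical names introduced only for this illustration.

```rust
// Plain-Rust sanity check; nothing here calls Enzyme.

fn square(x: f64) -> f64 {
    x * x
}

// Hand-written derivative of `square` (hypothetical helper): d/dx x^2 = 2x.
fn square_grad(x: f64) -> f64 {
    2.0 * x
}

// Central finite difference: a numerical approximation of df/dx at x.
fn central_diff(f: fn(f64) -> f64, x: f64, h: f64) -> f64 {
    (f(x + h) - f(x - h)) / (2.0 * h)
}

fn main() {
    let x = 3.0;
    let exact = square_grad(x);
    let approx = central_diff(square, x, 1e-5);
    println!("exact = {}, finite difference = {}", exact, approx);
    // The two should agree up to floating-point rounding.
    assert!((exact - approx).abs() < 1e-6);
}
```

The finite difference is only an approximation and needs extra function evaluations per input, which is part of why a generated adjoint is attractive.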

wsmoses · October 17, 2020, 4:28am

The information most helpful to Enzyme is the type of data being loaded/stored/memcpy'd. In Clang this is often passed down as TBAA (also beneficial for optimization's sake, since it can prove that more things don't alias). The type metadata doesn't need to be specific to Enzyme (TBAA, for example, isn't).

Offhand, I'm not sure how any custom derivative metadata can be done in a non-Enzyme-specific way. Perhaps it would be useful for Rust to have a generic way to pass metadata, as it does for other use cases like debug info? We may also be able to use a non-metadata approach to passing the information, but I'll have to think about it more.
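
One way to see why the element type of a copy matters: in reverse mode, a copy of floating-point data has a derivative rule, while a copy of integer data carries no derivative at all. The plain-Rust sketch below shows the textbook reverse-mode rule for a copy, under the assumption that the kind of data is known. It is not taken from Enzyme's implementation, and `ElemKind` and `copy_reverse` are hypothetical names.

```rust
// Illustrative only: a hand-written reverse rule for `dst.copy_from_slice(src)`.
// This is the textbook reverse-mode rule, not Enzyme's implementation.

#[derive(Clone, Copy)]
enum ElemKind {
    Float,   // differentiable data
    Integer, // non-differentiable data such as indices or counters
}

// `d_src` and `d_dst` are the adjoint (shadow) buffers of `src` and `dst`.
fn copy_reverse(kind: ElemKind, d_src: &mut [f64], d_dst: &mut [f64]) {
    match kind {
        ElemKind::Float => {
            for (s, d) in d_src.iter_mut().zip(d_dst.iter_mut()) {
                *s += *d; // the adjoint flows back to the source
                *d = 0.0; // the destination was overwritten, so its adjoint is cleared
            }
        }
        // Integer data has no derivative, so there is nothing to propagate.
        ElemKind::Integer => {}
    }
}

fn main() {
    let mut d_src = [0.0, 0.0];
    let mut d_dst = [1.0, 2.0];
    copy_reverse(ElemKind::Float, &mut d_src, &mut d_dst);
    println!("d_src = {:?}, d_dst = {:?}", d_src, d_dst);

    // With Integer, both buffers are left untouched.
    copy_reverse(ElemKind::Integer, &mut d_src, &mut d_dst);
}
```

Without type information, a tool looking at a raw byte copy cannot tell which of the two branches applies.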

system · January 15, 2021, 4:29am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
