Support for AbstractDifferentiation.jl / DifferentiationInterface.jl #37

Closed
gdalle opened this issue Aug 7, 2023 · 11 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@gdalle
Contributor

gdalle commented Aug 7, 2023

AbstractDifferentiation.jl is an interface that makes it easier to call various autodiff backends with the same syntax. I think it would be nice to add bindings as an extension to FastDifferentiation.jl. I can even give it a shot if you agree.

@brianguenter
Owner

brianguenter commented Aug 7, 2023

I'm not going to say no to volunteer labor!

I looked at the docs for AbstractDifferentiation.jl. It shouldn't be hard to implement but making it efficient will be tricky because this interface seems to be designed for code interpreters. FD is fundamentally a compiler.

A naive implementation would have FD generating and factoring the derivative graph and compiling the runtime generated function at each call to gradient, etc.

This would be ludicrously slow. Also, there is a single IdDict global expression cache which is not thread safe. You can't multithread calls to jacobian, hessian, etc. This is a documented limitation of FD, first sentence on the Examples doc page. Maybe it needs to be more prominent.

Someone using the AbstractDifferentiation interface is very likely to call gradient in multithreaded code. To make this work you'd need to compute the runtime generated functions once and cache them, in a thread safe way, something like this:

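function AD.gradient(ab::AD.FastDifferentiationBackend, f, xs...)
    func = get(RUNTIME_FUNCTION_CACHE, f, nothing)
    if func === nothing
        # Cache miss: make the function and add it to the cache.
        # make_gradient_function is a placeholder, and this branch needs a lock to be thread safe.
        func = make_gradient_function(f, xs...)
        RUNTIME_FUNCTION_CACHE[f] = func
    end
    return func(xs)
end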

My assumption in writing FD was the typical workflow would be something like:

  • compute symbolic derivatives in single threaded preprocessing step
  • generate runtime functions, also in single threaded preprocessing step
  • use generated runtime functions in fast multithreaded code
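
A minimal sketch of that three-step workflow, assuming the `@variables`, `jacobian`, and `make_function` calls shown in the FastDifferentiation.jl README. The function being differentiated and the evaluation points are made up for illustration, and the sketch relies on the statement above that the generated functions can be used from multithreaded code.

```julia
using FastDifferentiation

# Step 1 (single threaded): compute the symbolic derivative.
@variables x y
symbolic_jac = jacobian([x^2 * y, x + y], [x, y])

# Step 2 (single threaded): generate the runtime function once.
jac_exe = make_function(symbolic_jac, [x, y])

# Step 3: reuse the generated function from multithreaded code.
points = [[Float64(i), 2.0] for i in 1:8]
results = Vector{Matrix{Float64}}(undef, length(points))
Threads.@threads for i in eachindex(points)
    results[i] = jac_exe(points[i])   # out-of-place call, one result per point
end
```

All symbolic work and compilation happen before the threaded loop, so the global expression cache is never touched concurrently.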

This will only mesh well with AbstractDifferentiation if the cache for the runtime generated functions is efficient and thread safe. Seems like you'd need to put a lock on the global EXPRESSION_CACHE IdDict variable to prevent expression corruption.

IdDicts are not thread safe in Julia. There is ThreadedDicts.jl, but I'm not sure what its performance would be like (probably slow), or whether it supports IdDicts.
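
One way to sketch the locking idea, using only Base Julia: guard a plain `IdDict` with a `ReentrantLock` and build the entry inside the lock. `CACHE_LOCK` and `cached_function` are hypothetical names, and `build` stands in for whatever generates the runtime function; this is an illustration, not code from FD.

```julia
# Hypothetical names; RUNTIME_FUNCTION_CACHE mirrors the snippet earlier in this comment.
const RUNTIME_FUNCTION_CACHE = IdDict{Any,Any}()
const CACHE_LOCK = ReentrantLock()

# Return the cached entry for `f`, calling `build()` at most once per key.
# The lock is held while building, so builds are serialized as well as lookups.
function cached_function(build, f)
    lock(CACHE_LOCK) do
        get!(build, RUNTIME_FUNCTION_CACHE, f)
    end
end

# Usage: the do-block runs only on the first request for `square`.
square = x -> x^2
dsquare = cached_function(square) do
    x -> 2x   # stand-in for an expensive "generate and compile" step
end
```

Holding the lock during the build is deliberate here: if building touches another global such as the expression cache, concurrent builds must be excluded too. The cost is that every lookup takes the lock, which is the performance concern raised above.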

If you can make the caching efficient for multithreaded code then I'm for it. Without caching I'd rather not support the interface - FD would be unusably slow when accessed through the interface.

I believe Enzyme has this same problem. They can't be recompiling derivative functions every time they are called, there must be a cache. Maybe they figured out how to make this efficient and we could mimic them. I've pinged Billy Moses to see if they've got a solution they can share.

I believe the pushforward is equivalent to the Jv function, and the pullback is equivalent to J'v.
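
To make that correspondence concrete, here is a small sketch with a hand-written Jacobian (no autodiff backend involved). The function and the names `jac`, `pushforward`, and `pullback` are chosen for illustration only.

```julia
using LinearAlgebra

# Example function R^2 -> R^2 and its Jacobian, written out by hand.
f(x) = [x[1]^2 * x[2], x[1] + x[2]]
jac(x) = [2*x[1]*x[2]  x[1]^2; 1.0  1.0]

# Pushforward: Jacobian-vector product J*v (forward mode).
pushforward(x, v) = jac(x) * v

# Pullback: transposed-Jacobian-vector product J'*w (reverse mode).
pullback(x, w) = jac(x)' * w

x = [1.0, 2.0]
v = [1.0, 0.0]
w = [0.0, 1.0]
jv = pushforward(x, v)
jtw = pullback(x, w)
```

The two operations are adjoint to each other, so `dot(w, jv)` and `dot(jtw, v)` agree.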

Give it a shot and don't hesitate to ping me if you have questions.

@gdalle
Contributor Author

gdalle commented Aug 8, 2023

Okay, that makes plenty of sense. It looks a bit high risk, low reward for now.

@brianguenter
Owner

Should we close this issue then? Or should I move it to Discussions, where it would be easier to find when we decide to do this in the future? By the way, thank you for fixing the docstrings. That was a lot of work.

@gdalle
Contributor Author

gdalle commented Aug 8, 2023

A discussion sounds good! I have high hopes for AbstractDifferentiation but it needs more love anyway

@brianguenter brianguenter added enhancement New feature or request good first issue Good for newcomers labels Aug 8, 2023
@gdalle
Contributor Author

gdalle commented Aug 8, 2023

And you're welcome for the docstrings. I was mostly scratching an itch ^^

@gdalle gdalle changed the title Support for AbstractDifferentiation.jl Support for AbstractDifferentiation.jl / DifferentiationInterface.jl Mar 17, 2024
@gdalle
Contributor Author

gdalle commented Mar 17, 2024

@brianguenter I'm happy to announce that FastDifferentiation.jl is now supported by my new package https://github.com/gdalle/DifferentiationInterface.jl

You can check out the implementation at https://github.com/gdalle/DifferentiationInterface.jl/blob/main/ext/DifferentiationInterfaceFastDifferentiationExt/allocating.jl. There is still some performance to be gained, but most of it is there

@gdalle
Contributor Author

gdalle commented Mar 17, 2024

In particular, the package is designed around a 2-step "prepare, then differentiate" paradigm. In this setup, it makes a lot of sense to generate the runtime functions during "prepare", and use them during "differentiate", so that's what I did.
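
A rough sketch of what such a two-step split can look like on top of FastDifferentiation. This is not the DifferentiationInterface code: `PreparedGradient`, `prepare_gradient_sketch`, and `gradient_sketch` are hypothetical names, and the only FastDifferentiation calls assumed are `make_variables`, `jacobian`, and `make_function`.

```julia
using FastDifferentiation

# Hypothetical container for the result of the "prepare" step.
struct PreparedGradient{F}
    exe::F   # runtime generated function, built once
end

# "Prepare": trace f on symbolic inputs and compile its derivative once.
function prepare_gradient_sketch(f, n::Integer)
    xs = make_variables(:x, n)   # symbolic inputs x1, ..., xn
    y = f(xs)                    # symbolic scalar output
    jac = jacobian([y], xs)      # 1×n symbolic Jacobian
    return PreparedGradient(make_function(jac, xs))
end

# "Differentiate": a cheap call that reuses the compiled function.
gradient_sketch(prep::PreparedGradient, x::AbstractVector) = vec(prep.exe(x))

h(x) = x[1]^2 * x[2] + sin(x[3])
prep = prepare_gradient_sketch(h, 3)
g = gradient_sketch(prep, [1.0, 2.0, 0.0])
```

The expensive symbolic and compilation work is confined to the prepare step, so repeated gradient calls only pay for evaluating the generated function.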

@rambunctiousapple

That's great, always happy when somebody uses my code.

@gdalle
Contributor Author

gdalle commented Mar 18, 2024

Not only that, but I think it might make it easier for more people to use your code, because now it is as simple as switching the backend in DifferentiationInterface.
I have contributed to some issues on FastDifferentiation about the features needed to get truly optimal performance.

Side note: did you change your user name?

@brianguenter
Owner

I have separate github accounts for personal and professional use. Accidentally responded to your message while I was logged in on my personal account.

@gdalle
Contributor Author

gdalle commented Apr 1, 2024

I think we can close this issue since

  • FastDifferentiation.jl is now fairly well supported by DifferentiationInterface.jl, although performance is not yet optimal (#39, "Change make_function to allow for multiple vector input arguments", would be useful there)
  • supporting it in AbstractDifferentiation.jl would require a change of paradigm that allows preparation of the executable, so I'm not sure it will happen soon

@gdalle gdalle closed this as completed Jun 7, 2024
3 participants