Code Monkey home page Code Monkey logo

implicitad.jl's Introduction

ImplicitAD.jl

Dev Build Status

Summary: Automate steady and unsteady adjoints.

Make implicit functions compatible with algorithmic differentiation (AD) without differentiating inside the solvers (discrete adjoint). Even though one can sometimes propagate AD through a solver, this is typically inefficient and less accurate. Instead, one should use adjoints or direct (forward) methods. However, implementing adjoints is often cumbersome. This package allows for a one-line change to automate this process. End-users can then use your package with AD normally, and utilize adjoints automatically.

We've also enabled methods to efficiently compute derivatives through explicit and implicit ODE solvers (unsteady discrete adjoint). For the implicit solve at each time step we can apply the same methodology. However, both still face memory challenges for long time-based simulations. We analytically propagate derivatives between time steps so that reverse mode AD tapes only need to extend across a single time step. This allows for arbitrarily long time sequences without increasing memory requirements.

As a side benefit the above functionality easily allows one to define custom AD rules. This is perhaps most useful when calling code from another language. We provide fall backs for utilizing finite differencing and complex step efficiently if the external code cannot provide derivatives (ideally via Jacobian vector products). This functionality can also be used for mixed-mode AD.

Author: Andrew Ning and Taylor McDonnell

Features:

  • Compatible with ForwardDiff and ReverseDiff (or any ChainRules compliant reverse mode AD package)
  • Compatible with any solver (no differentiation occurs inside the solver)
  • Simple drop in functionality
  • Customizable subfunctions to accommodate different use cases (e.g., custom linear solvers, factorizations, matrix-free operators)
  • Version for ordinary differentiation equations (i.e., discrete unsteady adjoint)
  • Analytic overrides for linear systems (more efficient)
  • Analytic overrides for eigenvalue problems (more efficient)
  • Can provide custom rules to be inserted into the AD chain. Provides finite differencing and complex step defaults for cases where AD is not available (e.g., calling another language).

Documentation:

  • Start with the tutorial to learn usage.
  • The API is described in the reference page.
  • The theory and also some scaling examples in this paper. A supplementary document deriving the linear and eigenvalue cases is available in the theory section.

Run Unit Tests:

pkg> activate .
pkg> test

Citing:

For now, please cite the following preprint. DOI: 10.48550/arXiv.2306.15243

Other Packages:

Nonconvex.jl and ImplicitDifferentiation.jl are other prior implementations of the nonlinear portion of this package. SciML provides support for continuous unsteady adjoints of ODEs. They have also recently added an implementation for the nonlinear case.

implicitad.jl's People

Contributors

andrewning avatar dingraha avatar juddmehr avatar taylormcd avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

implicitad.jl's Issues

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Alternative method for providing partial derivatives?

The current method for providing user-defined jacobians could be simplified by allowing users to return the jacobian matrix from the solution procedure instead. Then a new function for the jacobian wouldn't have to be defined since it is often already computed as part of the solution procedure. The relevant modifications to make this happen are in the solve-drdy branch.

Partial derivative matrix cannot be safely re-used.

Currently, this package computes the partial derivative matrix A = drdy(residual, y, x, p) as part of the forward pass for use in the reverse pass. This approach appears to yield incorrect results when used as part of an iterative process.

Here's a minimum working example:

    function residual!(r, y, x, p)
        r[1] = (y[1] + x[1])*(y[2]^3-x[2])+x[3]
        r[2] = sin(y[2]*exp(y[1])-1)*x[4]
        return r
    end

    function solve(x, p)
        rwrap(r, y) = residual!(r, y, x[1:4], p)
        res = nlsolve(rwrap, [0.1; 1.2], autodiff=:forward)
        return res.zero
    end

    A = zeros(2, 2)

    function drdy(residual, y, x, p)
        A[1, 1] = y[2]^3-x[2]
        A[1, 2] = 3*y[2]^2*(y[1]+x[1])
        u = exp(y[1])*cos(y[2]*exp(y[1])-1)*x[4]
        A[2, 1] = y[2]*u
        A[2, 2] = u
        return A
    end

    function modprogram(x)
        z = 2.0*x
        w = z + x.^2
        y = implicit(solve, residual!, w, (), drdy=drdy) # first iteration
        y = implicit(solve, residual!, [y[1], y[2], w[3], w[4]], (), drdy=drdy) # second iteration
        return y[1] .+ w*y[2]
    end

    x = [1.0; 2.0; 3.0; 4.0; 5.0]

    J1 = ForwardDiff.jacobian(modprogram, x)
#      5×5 Matrix{Float64}:
#     8.68311  -0.709511   1.14925   7.45103e-16   0.0
#    -2.85187  12.4013     3.37157   1.4703e-15    0.0
#    -5.48354  -4.00227   25.7932    2.48556e-15   0.0
#    -8.86712  -6.47184   10.483    24.138         0.0
#   -13.0026   -9.4902    15.3721    5.38633e-15  28.9657

    J2 = ReverseDiff.jacobian(modprogram, x)
#     5×5 Matrix{Float64}:
#  10.0056    0.255723   0.67288   7.41178e-16   0.0
#   1.02787  15.233      1.97403   1.45878e-15   0.0
#   1.97638   1.4425    23.1061    2.46343e-15   0.0
#   3.19589   2.33259    6.1377   24.138         0.0
#   4.68641   3.42047    9.00023   5.33384e-15  28.9657

The solution is to compute the partial derivative matrix A = drdy(residual, y, x, p) as part of the reverse-pass. The corrected rule would look like this:

# Provide a ChainRule rule for reverse mode
function ChainRulesCore.rrule(::typeof(implicit), solve, residual, x, p, drdy, lsolve)

    # evaluate solver
    y = solve(x, p)

    function pullback(ybar)
        A = drdy(residual, y, x, p)
        u = lsolve(A', ybar)
        xbar = vjp(residual, y, x, p, -u)
        return NoTangent(), NoTangent(), NoTangent(), xbar, NoTangent(), NoTangent(), NoTangent()
    end

    return y, pullback
end

Dual Numbers in (Constant) Parameters

While it is implied that dual numbers shouldn't show up in the (constant) parameters, it could still happen in practice. It might be worth adding a note to the documentation to ensure users don't try this.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.