
SparseDiffTools.jl's Introduction

SparseDiffTools.jl


This package is for exploiting sparsity in Jacobians and Hessians to accelerate computations. Matrix-free Jacobian-vector product and Hessian-vector product operators are provided that are compatible with AbstractMatrix-based libraries like IterativeSolvers.jl for easy and efficient Newton-Krylov implementation. It is possible to perform matrix coloring and to utilize that coloring in Jacobian and Hessian construction.

Either automatic differentiation or numerical (finite) differencing can be used to compute the derivatives.

Example

Suppose we had the function

fcalls = 0
function f(y,x) # in-place
  global fcalls += 1
  for i in 2:length(x)-1
    y[i] = x[i-1] - 2x[i] + x[i+1]
  end
  y[1] = -2x[1] + x[2]
  y[end] = x[end-1] - 2x[end]
  nothing
end

function g(x) # out-of-place
  global fcalls += 1
  y = zero(x)
  for i in 2:length(x)-1
    y[i] = x[i-1] - 2x[i] + x[i+1]
  end
  y[1] = -2x[1] + x[2]
  y[end] = x[end-1] - 2x[end]
  y
end

High Level API

We need to perform the following steps to utilize SparseDiffTools:

  1. Specify a Sparsity Detection Algorithm. There are 3 possible choices currently:
    1. NoSparsityDetection: Regardless of the AD backend chosen, no sparsity pattern is used and the dense Jacobian is computed.
    2. JacPrototypeSparsityDetection: If you already know the sparsity pattern, you can specify it as JacPrototypeSparsityDetection(; jac_prototype=<sparsity pattern>) (see the sketch after the example below).
    3. SymbolicsSparsityDetection: This will use Symbolics.jl to automatically detect the sparsity pattern. (Note that Symbolics.jl must be explicitly loaded before using this functionality.)
  2. Now choose an AD backend from ADTypes.jl:
    1. If using a standard type like AutoForwardDiff(), then we will not use sparsity detection.
    2. If you wrap it inside AutoSparse(AutoForwardDiff()), then we will internally compute the proper sparsity pattern, and try to exploit that.
  3. Now there are 2 options:
    1. Precompute the cache using sparse_jacobian_cache and use the sparse_jacobian or sparse_jacobian! functions to compute the Jacobian. This option is recommended if you are repeatedly computing the Jacobian for the same function.
    2. Directly use sparse_jacobian or sparse_jacobian! to compute the Jacobian. This option should be used if you are only computing the Jacobian once.
For example, using Symbolics-based sparsity detection with a finite-difference backend:

using SparseDiffTools, ADTypes, Symbolics

sd = SymbolicsSparsityDetection()
adtype = AutoSparse(AutoFiniteDiff())
x = rand(30)
y = similar(x)

# Option 1
## OOP Function
cache = sparse_jacobian_cache(adtype, sd, g, x; fx=y) # Passing `fx` is needed if size(y) != size(x)
J = sparse_jacobian(adtype, cache, g, x)
### Or
J_preallocated = similar(J)
sparse_jacobian!(J_preallocated, adtype, cache, g, x)

## IIP Function
cache = sparse_jacobian_cache(adtype, sd, f, y, x)
J = sparse_jacobian(adtype, cache, f, y, x)
### Or
J_preallocated = similar(J)
sparse_jacobian!(J_preallocated, adtype, cache, f, y, x)

# Option 2
## OOP Function
J = sparse_jacobian(adtype, sd, g, x)
### Or
J_preallocated = similar(J)
sparse_jacobian!(J_preallocated, adtype, sd, g, x)

## IIP Function
J = sparse_jacobian(adtype, sd, f, y, x)
### Or
J_preallocated = similar(J)
sparse_jacobian!(J_preallocated, adtype, sd, f, y, x)
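
If the sparsity pattern is already known up front (step 1.2 above), JacPrototypeSparsityDetection can be used instead of Symbolics-based detection. A minimal sketch, assuming the tridiagonal pattern of the in-place f defined at the top of this README:

using SparseDiffTools, ADTypes, SparseArrays, LinearAlgebra

x = rand(30)
y = similar(x)
jac_prototype = sparse(Tridiagonal(ones(29), ones(30), ones(29))) # known tridiagonal pattern of f
sd = JacPrototypeSparsityDetection(; jac_prototype = jac_prototype)
adtype = AutoSparse(AutoForwardDiff())

J = sparse_jacobian(adtype, sd, f, y, x) # IIP form, as in Option 2 above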

Lower Level API

For this function, we know that the sparsity pattern of the Jacobian is a Tridiagonal matrix. However, if we didn't know the sparsity pattern for the Jacobian, we could use the Symbolics.jacobian_sparsity function to automatically detect the sparsity pattern. We declare that the function f outputs a vector of length 30 and takes in a vector of length 30, and jacobian_sparsity returns a SparseMatrixCSC:

using Symbolics
input = rand(30)
output = similar(input)
sparsity_pattern = Symbolics.jacobian_sparsity(f,output,input)
jac = Float64.(sparsity_pattern)

Now we call matrix_colors to get the colorvec vector for that matrix:

using SparseDiffTools
colors = matrix_colors(jac)

Since maximum(colors) is 3, this means that finite differencing can now compute the Jacobian in just 4 f-evaluations. Generating the sparsity pattern used 1 (pseudo) f-evaluation, so f is called a total of 5 times to compute both the sparsity pattern and the entire 30x30 Jacobian:

using FiniteDiff
FiniteDiff.finite_difference_jacobian!(jac, f, rand(30), colorvec=colors)
@show fcalls # 5

In addition, a faster forward-mode autodiff call can be utilized as well:

forwarddiff_color_jacobian!(jac, f, input, colorvec = colors)

If one only needs to compute products, one can use the operators. For example,

x = rand(30)
J = JacVec(f,x)

makes J into a matrix-free operator which calculates J*v products. For example:

v = rand(30)
res = similar(v)
mul!(res,J,v) # Does 1 f evaluation

makes res = J*v. Additional operators exist for Hessian-vector products (HesVec), including HesVecGrad which allows one to utilize a user-supplied gradient function. These operators are compatible with iterative solver libraries like IterativeSolvers.jl; for example, the following performs the linear solve at the heart of a Newton-Krylov update:

using IterativeSolvers
gmres!(res,J,v)
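
For context, here is a hedged sketch of how one full Newton-Krylov step could look with these pieces for the in-place residual f above; u is the current iterate and the linear solve uses gmres from IterativeSolvers.jl:

using SparseDiffTools, IterativeSolvers

u = rand(30)
residual = similar(u)
f(residual, u)          # evaluate the in-place residual at u
J = JacVec(f, u)        # matrix-free Jacobian operator at u
du = gmres(J, residual) # solve J * du = f(u) without ever forming J
u .-= du                # Newton update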

Documentation

Matrix Coloring

This library extends the common ArrayInterfaceCore.matrix_colors function to allow for coloring sparse matrices using graphical techniques.

Matrix coloring allows you to reduce the number of f calls required by finite differencing to maximum(colors)+1, or to reduce automatic differentiation to using maximum(colors) partials. Since these counts would normally be length(x), this can yield significant savings.

The API for computing the colorvec vector is:

matrix_colors(A::AbstractMatrix,alg::SparseDiffToolsColoringAlgorithm = GreedyD1Color();
              partition_by_rows::Bool = false)

The first argument is the abstract matrix which represents the sparsity pattern of the Jacobian. The second argument is the optional choice of coloring algorithm; it defaults to a greedy distance-1 coloring, though if your matrix type carries more structural information, such as Tridiagonal or BlockBandedMatrix, the colorvec vector will be computed analytically instead. The keyword argument partition_by_rows allows you to partition the Jacobian by rows instead of columns and generate a corresponding coloring vector, which can be used for reverse-mode AD. Its default value is false.

The result is a vector which assigns a color to each column (or row) of the matrix.
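
For instance, a short sketch of the coloring call on a tridiagonal pattern; the structured Tridiagonal type takes the analytical path, while converting to SparseMatrixCSC exercises the greedy algorithm (the explicit GreedyD1Color() argument is the documented default):

using SparseDiffTools, SparseArrays, LinearAlgebra

A = Tridiagonal(ones(9), ones(10), ones(9))
matrix_colors(A)                          # analytical coloring for Tridiagonal: only 3 colors
matrix_colors(sparse(A), GreedyD1Color()) # greedy distance-1 coloring on the sparse pattern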

Colorvec-Assisted Differentiation

Colorvec-assisted numerical differentiation is provided by FiniteDiff.jl, and colorvec-assisted automatic differentiation is provided by ForwardDiff.jl.

For FiniteDiff.jl, one simply has to use the provided colorvec keyword argument. See the FiniteDiff Jacobian documentation for more details.

For forward-mode automatic differentiation, use of a colorvec vector is provided by the following function:

forwarddiff_color_jacobian!(J::AbstractMatrix{<:Number},
                            f,
                            x::AbstractArray{<:Number};
                            dx = nothing,
                            colorvec = eachindex(x),
                            sparsity = nothing)

Notice that if a sparsity pattern is not supplied, then the Jacobian that gets built is the compressed Jacobian: to obtain the decompressed Jacobian, sparsity must be a sparse matrix or a structured matrix (Tridiagonal, banded, etc., conforming to the ArrayInterfaceCore.jl specs) with the appropriate sparsity pattern.

This call will allocate the cache variables each time. To avoid allocating the cache, construct the cache in advance:

ForwardColorJacCache(f, x, _chunksize = nothing;
                     dx = nothing,
                     colorvec = 1:length(x),
                     sparsity = nothing)

and utilize the following signature:

forwarddiff_color_jacobian!(J::AbstractMatrix{<:Number},
                            f,
                            x::AbstractArray{<:Number},
                            jac_cache::ForwardColorJacCache)

dx is a pre-allocated output vector which is used to declare the output size, and thus allows for specifying a non-square Jacobian.
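
A short sketch of the cached in-place workflow, assuming the f, jac, and colors from the example above are in scope:

using SparseDiffTools

x = rand(30)
jac_cache = ForwardColorJacCache(f, x; dx = similar(x), colorvec = colors, sparsity = jac)
forwarddiff_color_jacobian!(jac, f, x, jac_cache) # reuses the cache, so repeated calls avoid the setup allocations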

If one is using an out-of-place function f(x), then the alternative form can be used:

jacout = forwarddiff_color_jacobian(g, x,
                                    dx = similar(x),
                                    colorvec = 1:length(x),
                                    sparsity = nothing,
                                    jac_prototype = nothing)

Note that the out-of-place form is efficient and compatible with StaticArrays. One can specify the type and shape of the output Jacobian by giving an additional jac_prototype to the out-of-place forwarddiff_color_jacobian function; otherwise it will be a dense matrix. If jac_prototype and sparsity are not specified, then the out-of-place Jacobian is assumed to be square. If that is not the case, specify the output shape by passing dx.
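
A hedged sketch of the out-of-place call, reusing the g, jac, and colors from the example above:

using SparseDiffTools

x = rand(30)
J = forwarddiff_color_jacobian(g, x; colorvec = colors, sparsity = jac, jac_prototype = jac) # output takes the type and shape of jac_prototype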

Similar functionality is available for Hessians, using finite differences of forward derivatives. Given a scalar function f(x), a vector value for x, and a color vector and sparsity pattern, this can be accomplished using numauto_color_hessian or its in-place form numauto_color_hessian!.

H = numauto_color_hessian(f, x, colorvec, sparsity)
numauto_color_hessian!(H, f, x, colorvec, sparsity)

To avoid unnecessary allocations every time the Hessian is computed, construct a ForwardColorHesCache beforehand:

hescache = ForwardColorHesCache(f, x, colorvec, sparsity)
numauto_color_hessian!(H, f, x, hescache)

By default, these methods use a mix of numerical and automatic differentiation, namely by taking finite differences of gradients calculated via ForwardDiff.jl. Alternatively, if you have your own custom gradient function g!, you can specify it as an argument to ForwardColorHesCache:

hescache = ForwardColorHesCache(f, x, colorvec, sparsity, g!)

Note that any user-defined gradient needs to have the signature g!(G, x), i.e. updating the gradient G in place.
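
A minimal sketch of the Hessian workflow; fscalar is an illustrative scalar function (not part of the package) whose Hessian happens to be tridiagonal:

using SparseDiffTools, SparseArrays, LinearAlgebra

fscalar(x) = sum(abs2, diff(x)) # scalar function with a tridiagonal Hessian
x = rand(10)
sparsity = sparse(Tridiagonal(ones(9), ones(10), ones(9)))
colorvec = matrix_colors(sparsity)

H = numauto_color_hessian(fscalar, x, colorvec, sparsity)

# Reuse a cache for repeated evaluations:
hescache = ForwardColorHesCache(fscalar, x, colorvec, sparsity)
numauto_color_hessian!(H, fscalar, x, hescache)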

Jacobian-Vector and Hessian-Vector Products

Matrix-free implementations of Jacobian-vector and Hessian-vector products are provided in both operator and function forms. For the functions, each has in-place and out-of-place variants, and the in-place versions can accept cache vectors to be non-allocating. For in-place use, the function signature for Jacobians is f!(du,u), while the out-of-place form is du = f(u). For Hessians, all functions must be of the form f(u) and return a scalar.

The functions for Jacobians are:

auto_jacvec!(dy, f, x, v,
                      cache1 = ForwardDiff.Dual{DeivVecTag}.(x, v),
                      cache2 = ForwardDiff.Dual{DeivVecTag}.(x, v))

auto_jacvec(f, x, v)

# If compute_f0 is false, cache1 is assumed to already contain f(x); otherwise f(cache1, x) is evaluated first
num_jacvec!(dy,f,x,v,cache1 = similar(v),
                     cache2 = similar(v),
                     cache3 = similar(v);
                     compute_f0 = true)
num_jacvec(f,x,v,f0=nothing)
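
A small usage sketch of these functions with the in-place f and out-of-place g from the example at the top of this README:

using SparseDiffTools

x = rand(30); v = rand(30); dy = similar(v)
auto_jacvec!(dy, f, x, v) # dy ≈ J(x) * v via ForwardDiff dual numbers
num_jacvec!(dy, f, x, v)  # dy ≈ J(x) * v via finite differencing
auto_jacvec(g, x, v)      # out-of-place variant with the out-of-place g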

For Hessians, the following are provided:

num_hesvec!(dy,f,x,v,
             cache1 = similar(v),
             cache2 = similar(v),
             cache3 = similar(v),
             cache4 = similar(v))

num_hesvec(f,x,v)

numauto_hesvec!(dy,f,x,v,
                 cache = ForwardDiff.GradientConfig(f,v),
                 cache1 = similar(v),
                 cache2 = similar(v),
                 cache3 = similar(v))

numauto_hesvec(f,x,v)

autonum_hesvec!(dy,f,x,v,
                 cache1 = similar(v),
                 cache2 = ForwardDiff.Dual{DeivVecTag}.(x, v),
                 cache3 = ForwardDiff.Dual{DeivVecTag}.(x, v))

autonum_hesvec(f,x,v)

In addition, the following forms allow you to provide a gradient function g(dy,x) or dy=g(x) respectively:

num_hesvecgrad!(dy,g,x,v,
                     cache1 = similar(v),
                     cache2 = similar(v),
                     cache3 = similar(v))

num_hesvecgrad(g,x,v)

auto_hesvecgrad!(dy,g,x,v,
                     cache2 = ForwardDiff.Dual{DeivVecTag}.(x, v),
                     cache3 = ForwardDiff.Dual{DeivVecTag}.(x, v))

auto_hesvecgrad(g,x,v)

The numauto and autonum methods both mix numerical and automatic differentiation, with the former almost always being more efficient and thus being recommended.
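
A hedged usage sketch with a simple scalar test function; fs and its hand-written gradient gs! are illustrative and not part of the package:

using SparseDiffTools

fs(x) = sum(abs2, x) / 2 # the Hessian of fs is the identity matrix
gs!(dy, x) = (dy .= x)   # in-place gradient of fs
x = rand(5); v = rand(5); dy = similar(v)

numauto_hesvec!(dy, fs, x, v)  # dy ≈ H(x) * v, here simply v
num_hesvecgrad!(dy, gs!, x, v) # same product, using the supplied gradient gs!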

Optionally, if you load Zygote.jl, the following numback and autoback methods are available and allow numerical/ForwardDiff over reverse mode automatic differentiation respectively, where the reverse-mode AD is provided by Zygote.jl. Currently these methods are not competitive against numauto, but as Zygote.jl gets optimized these will likely be the fastest.

using Zygote # Required

numback_hesvec!(dy,f,x,v,
                     cache1 = similar(v),
                     cache2 = similar(v),
                     cache3 = similar(v))

numback_hesvec(f,x,v)

# Currently errors! See https://github.com/FluxML/Zygote.jl/issues/241
autoback_hesvec!(dy,f,x,v,
                     cache2 = ForwardDiff.Dual{DeivVecTag}.(x, v),
                     cache3 = ForwardDiff.Dual{DeivVecTag}.(x, v))

autoback_hesvec(f,x,v)

J*v and H*v Operators

The following produce matrix-free operators which are used for calculating Jacobian-vector and Hessian-vector products where the differentiation takes place at the vector u:

JacVec(f,x::AbstractArray;autodiff=true)
HesVec(f,x::AbstractArray;autodiff=true)
HesVecGrad(g,x::AbstractArray;autodiff=false)

These all have the same interface, where J*v utilizes the out-of-place Jacobian-vector or Hessian-vector function, whereas mul!(res,J,v) utilizes the appropriate in-place versions. To update the location of differentiation in the operator, simply mutate the vector u: J.u .= ....
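
A hedged sketch of the Hessian-vector operator, mirroring the JacVec example above and using the interface as described; fs is an illustrative scalar function:

using SparseDiffTools, LinearAlgebra

fs(x) = sum(abs2, x) / 2
x = rand(5); v = rand(5); res = similar(v)
H = HesVec(fs, x)
mul!(res, H, v) # res ≈ ∇²fs(x) * v, matrix-free
H * v           # out-of-place product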


SparseDiffTools.jl's Issues

Make Jacobian calculation with coloring compatible with GPUs

            for i in 1:length(cols_index)
                if color[cols_index[i]]==color_i
                    J[rows_index[i],cols_index[i]] = dx[rows_index[i]]
                end
            end

is the only part that doesn't play nicely with the GPU. If that can be made into a broadcast expression, then GPU-based Jacobians would be a reality. This would be very nice since finite differencing really doesn't do well with GPUs, and we know that if the code is GPU compatible then it is likely to be pure-Julia code and thus AD-compatible.

finite_difference_jacobian! and forwarddiff_color_jacobian!

Hi, I was wondering what the difference is between finite_difference_jacobian! and forwarddiff_color_jacobian!. Can I make use of both functions to solve my system f?

using LinearAlgebra, ForwardDiff, SparseArrays, SparsityDetection, SparseDiffTools, BenchmarkTools, FiniteDiff

fcalls = 0
function f(dx,x) # in-place
  global fcalls += 1
  for i in 2:length(x)-1
    for j = i-1:i+1
      dx[i] += 2x[i]*x[j]
    end
  end
  dx[1] = -2x[1] + x[2]
  dx[end] = x[end-1] - 2x[end]
  #nothing
  dx
end

input = rand(10)
output = similar(input)
sparsity_pattern = jacobian_sparsity(f,output,input)
jac = Float64.(sparse(sparsity_pattern));

colors = matrix_colors(jac);

@time FiniteDiff.finite_difference_jacobian!(jac, f, rand(10), colorvec=colors) #Does not work

jac
10×10 SparseMatrixCSC{Float64,Int64} with 28 stored entries:
[1 , 1] = -2.0
[2 , 1] = 0.813004
[1 , 2] = 1.0
[2 , 2] = 2.72799e7
[3 , 2] = 1.08469
[2 , 3] = 0.813004
[3 , 3] = 3.42811e7
[4 , 3] = 1.60417
[3 , 4] = 1.08469
[4 , 4] = 5.36339e7
[5 , 4] = 0.138877
[4 , 5] = 1.60417

[6 , 6] = 8.37575e6
[7 , 6] = 1.14902
[6 , 7] = 0.265017
[7 , 7] = 3.84164e7
[8 , 7] = 0.0726532
[7 , 8] = 1.14902
[8 , 8] = 2.43784e6
[9 , 8] = 1.95844
[8 , 9] = 0.0726532
[9 , 9] = 6.18958e7
[10, 9] = 1.0
[9 , 10] = 1.95844
[10, 10] = -2.0

@time forwarddiff_color_jacobian!(jac, f, input, colorvec = colors) #Works

jac
10×10 SparseMatrixCSC{Float64,Int64} with 28 stored entries:
[1 , 1] = -2.0
[2 , 1] = 1.65884
[1 , 2] = 1.0
[2 , 2] = 5.08291
[3 , 2] = 0.171734
[2 , 3] = 1.65884
[3 , 3] = 2.02608
[4 , 3] = 0.0237699
[3 , 4] = 0.171734
[4 , 4] = 0.895975
[5 , 4] = 0.676701
[4 , 5] = 0.0237699

[6 , 6] = 5.01525
[7 , 6] = 1.32135
[6 , 7] = 1.5086
[7 , 7] = 4.84081
[8 , 7] = 0.689509
[7 , 8] = 1.32135
[8 , 8] = 4.03742
[9 , 8] = 1.33705
[8 , 9] = 0.689509
[9 , 9] = 3.42855
[10, 9] = 1.0
[9 , 10] = 1.33705
[10, 10] = -2.0

Issue Precompiling SparseDiffTools.jl

Using a clean Julia 1.2.0 install on 64-bit Ubuntu 18.04.2, I get a precompilation error when trying to install SparseDiffTools:

julia> using Pkg

julia> Pkg.add("SparseDiffTools")
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
 Resolving package versions...
 Installed Adapt ──────────────── v1.0.0
 Installed Inflate ────────────── v0.1.1
 Installed Tokenize ───────────── v0.5.6
 Installed SparseDiffTools ────── v0.9.0
 Installed CommonSubexpressions ─ v0.2.0
 Installed BlockBandedMatrices ── v0.5.0
 Installed BinaryProvider ─────── v0.5.6
 Installed BlockArrays ────────── v0.10.0
 Installed URIParser ──────────── v0.4.0
 Installed Requires ───────────── v0.5.2
 Installed CSTParser ──────────── v0.6.2
 Installed VertexSafeGraphs ───── v0.1.0
 Installed Compat ─────────────── v2.1.0
 Installed DataStructures ─────── v0.17.0
 Installed OrderedCollections ─── v1.1.0
 Installed FillArrays ─────────── v0.7.0
 Installed BandedMatrices ─────── v0.11.0
 Installed ForwardDiff ────────── v0.10.3
 Installed LightGraphs ────────── v1.3.0
 Installed DiffEqDiffTools ────── v1.3.0
 Installed DiffRules ──────────── v0.0.10
 Installed ArrayInterface ─────── v1.2.1
 Installed StaticArrays ───────── v0.11.0
 Installed NaNMath ────────────── v0.3.2
 Installed DiffResults ────────── v0.0.4
 Installed MatrixFactorizations ─ v0.1.0
 Installed SimpleTraits ───────── v0.9.0
 Installed LazyArrays ─────────── v0.11.0
 Installed MacroTools ─────────── v0.5.1
 Installed SpecialFunctions ───── v0.8.0
 Installed BinDeps ────────────── v0.8.10
 Installed ArnoldiMethod ──────── v0.0.4
  Updating `~/.julia/environments/v1.2/Project.toml`
  [47a9eef4] + SparseDiffTools v0.9.0
  Updating `~/.julia/environments/v1.2/Manifest.toml`
  [79e6a3ab] + Adapt v1.0.0
  [ec485272] + ArnoldiMethod v0.0.4
  [4fba245c] + ArrayInterface v1.2.1
  [aae01518] + BandedMatrices v0.11.0
  [9e28174c] + BinDeps v0.8.10
  [b99e7846] + BinaryProvider v0.5.6
  [8e7c35d0] + BlockArrays v0.10.0
  [ffab5731] + BlockBandedMatrices v0.5.0
  [00ebfdb7] + CSTParser v0.6.2
  [bbf7d656] + CommonSubexpressions v0.2.0
  [34da2185] + Compat v2.1.0
  [864edb3b] + DataStructures v0.17.0
  [01453d9d] + DiffEqDiffTools v1.3.0
  [163ba53b] + DiffResults v0.0.4
  [b552c78f] + DiffRules v0.0.10
  [1a297f60] + FillArrays v0.7.0
  [f6369f11] + ForwardDiff v0.10.3
  [d25df0c9] + Inflate v0.1.1
  [5078a376] + LazyArrays v0.11.0
  [093fc24a] + LightGraphs v1.3.0
  [1914dd2f] + MacroTools v0.5.1
  [a3b82374] + MatrixFactorizations v0.1.0
  [77ba4419] + NaNMath v0.3.2
  [bac558e1] + OrderedCollections v1.1.0
  [ae029012] + Requires v0.5.2
  [699a6c99] + SimpleTraits v0.9.0
  [47a9eef4] + SparseDiffTools v0.9.0
  [276daf66] + SpecialFunctions v0.8.0
  [90137ffa] + StaticArrays v0.11.0
  [0796e94c] + Tokenize v0.5.6
  [30578b45] + URIParser v0.4.0
  [19fa3120] + VertexSafeGraphs v0.1.0
  [2a0f44e3] + Base64 
  [ade2ca70] + Dates 
  [8bb1440f] + DelimitedFiles 
  [8ba89e20] + Distributed 
  [b77e0a4c] + InteractiveUtils 
  [76f85450] + LibGit2 
  [8f399da3] + Libdl 
  [37e2e46d] + LinearAlgebra 
  [56ddb016] + Logging 
  [d6f4376e] + Markdown 
  [a63ad114] + Mmap 
  [44cfe95a] + Pkg 
  [de0858da] + Printf 
  [9abbd945] + Profile 
  [3fa0cd96] + REPL 
  [9a3f8284] + Random 
  [ea8e919c] + SHA 
  [9e88b42a] + Serialization 
  [1a1011a3] + SharedArrays 
  [6462fe0b] + Sockets 
  [2f01184e] + SparseArrays 
  [10745b16] + Statistics 
  [8dfed614] + Test 
  [cf7118a7] + UUIDs 
  [4ec0a83e] + Unicode 
  Building SpecialFunctions → `~/.julia/packages/SpecialFunctions/ne2iw/deps/build.log`

julia> Pkg.build("SparseDiffTools")
  Building SpecialFunctions → `~/.julia/packages/SpecialFunctions/ne2iw/deps/build.log`
false

julia> using SparseDiffTools
[ Info: Precompiling SparseDiffTools [47a9eef4-7e08-11e9-0b38-333d64bd3804]
ERROR: Failed to precompile SparseDiffTools [47a9eef4-7e08-11e9-0b38-333d64bd3804] to /home/mattjohnson/.julia/compiled/v1.2/SparseDiffTools/w4L8R.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1253
 [3] _require(::Base.PkgId) at ./loading.jl:1013
 [4] require(::Base.PkgId) at ./loading.jl:911
 [5] require(::Module, ::Symbol) at ./loading.jl:906

Sparse Diff Roadmap

  1. @pkj-m is doing graph coloring methods, so methods that will give a coloring given a sparsity pattern. Sparsity patterns will be passed just by passing a sparse matrix itself, since that's essentially as concise as you can make that information.
  2. Graph coloring algorithms will output a colorvec of integers, where each integer gives a valid direction for integration. colorvec=[1,1,2,2,3] for example is 3 directional derivatives, and when differentiation is done that builds the compressed Jacobian.
  3. @pkj-m will build the tooling for getting the sparse Jacobian from the compressed Jacobian (which is simply finding the non-zero column in the row, since you shouldn't overlap non-zeros. Then there are approximate colorings...)
  4. @huanglangwen you can make overloads for types like BandedMatrix (since there's an easy analytical solution to those colorings), and then make DiffEqDiffTools and ForwardDiff specialize on color vectors.
  5. @huanglangwen with the help of @YingboMa make sure that these are usable from the stiff ODE solvers by specifying jac_prototype and a new arg for a color vector (so it doesn't have to be recomputed each time).
  6. @shashi is doing a Cassette-based automatic sparsity detection which will look at the function AST and return a sparse matrix for the Jacobian/Hessian.
  7. @HarrisonGrodin we need to make sure that sparse on a ModelingToolkit matrix of variables with a lot of zeros actually returns a sparse matrix, and maybe specialize the computation to make it cheaper if we know beforehand that we don't need to differentiate some things? That would be our analytical solution sparse Jacobian generator.
  8. @shashi and maybe a new hire will get this all integrated with Zygote.

Julia 1.0

Project.toml allows Julia 1.0, but on this version the package errors with `eachrow not defined`.

Jacobian row partitioning and reverse-mode AD

Reverse-mode AD implementations calculate a Jacobian row by row instead of column by column. Thus it would be nice to have a way to do row-wise matrix partitioning and coloring. Once we have these color vectors, we have to specialize the Jacobian function in Zygote to allow for and exploit a color vector.
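
For reference, the coloring API documented above already exposes a row-wise mode via the partition_by_rows keyword; a quick hedged sketch:

using SparseDiffTools, SparseArrays, LinearAlgebra

A = sparse(Tridiagonal(ones(9), ones(10), ones(9)))
row_colors = matrix_colors(A; partition_by_rows = true) # colors rows instead of columns, for reverse-mode AD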

auto_hesvec

using ForwardDiff
function seed_duals(x::AbstractArray{V},::Type{T},::ForwardDiff.Chunk{N} = ForwardDiff.Chunk(x,typemax(Int64))) where {V,T,N} 
    seeds = ForwardDiff.construct_seeds(ForwardDiff.Partials{N,V}) 
    duals = [ForwardDiff.Dual{T}(x[i],seeds[i]) for i in eachindex(x)] 
end 
function g(x)
    x[1]^3*x[2]^3
end
x = [2.0, 3.0]
v = [4.0, 5.0]
ForwardDiff.gradient(g,x)'*v # 2376
ForwardDiff.hessian(g,x)*v # [2916, 2016]

test1 = seed_duals(ForwardDiff.Dual{Nothing}.(x,v), Nothing)
ForwardDiff.partials.(g(test1).partials)

test2 = ForwardDiff.Dual{Nothing}.(seed_duals(x,Nothing), v)
g(test2).partials[1].partials

Do these make sense?

Cassette dependency warning

Hello,

I am running Julia 1.0.4 and just updated some packages, which now require SparseDiffTools.
I just wanted to let you know about this warning:

┌ Warning: Package SparseDiffTools does not have Cassette in its dependencies:
│ - If you have SparseDiffTools checked out for development and have
│ added Cassette as a dependency but haven't updated your primary
│ environment's manifest file, try Pkg.resolve().
│ - Otherwise you may need to report an issue with SparseDiffTools
└ Loading Cassette into SparseDiffTools from project dependency, future warnings for SparseDiffTools are suppressed.

Forward over reverse Hessian methods

An implementation of forward-over-reverse, i.e. applying forward mode autodiff to the result of reverse autodiff, can be a way to exploit AD for the computation of Hessians. It might make sense to directly implement this into Zygote.

Automated sparsity detection

@shashi got the first round going. From talks with @vchuravy and all, it seems like there are a few ways to go about this:

  • Trace with ModelingToolkit. Shown here. This is nice because it gets the analytical solution for the Jacobian as well, but can be expensive to form the computational graphs.
  • Shashi's PR #9 . This improves upon the ModelingToolkit form because it doesn't form a whole computational graph and instead just stores an interaction vector, making it scale much better memory-wise. However, it runs with a given input X, and so it just takes the branches that input sees. If X is a good representative input, then it'll get the true sparsity pattern.
  • Branch elimination. Essentially it's what Shashi's PR was, except at every branch, you just eliminate the branch and inline both sides of it. This could give an overestimate to the sparsity pattern, but being slightly larger is okay (being slightly smaller isn't). However, because this doesn't have a representative value, it won't work well with while loops with state-dependent flow. I think one way to do this is to just inline the while loop (making it run like a do-while false), but that will always be a problem with this method. However, for a large portion of DiffEq codes, this will work and likely be all that's needed.
  • Concolic fuzzing. This will give the full true estimate, but would require a SAT solver. This is likely what you'd want if you were running it only once, but not likely something you'd default to running on the fly.

So I think the API on sparsity should have a dispatch between the different methods since they all have trade-offs: sparsity!(f!, Y, X, method = TraceGraph(), S = Sparsity(length(Y), length(X))). A last method could be a DefaultDetection() that works like:

  • Checks for any control flow. If none, just do TraceGraph().
  • Otherwise, if there's no state-dependent while loops, do BranchEliminationFlow().
  • Otherwise, do CFuzz()

As for tests, the test equations would use (du,u)->f(du,u,p,t) from definitions of differential equations. It would also be nice if there's just a dispatch on DEProblem that remakes the problem with sparse matrix support, but let's leave that for later. The Lorenz equation is good for unit tests: http://docs.juliadiffeq.org/latest/tutorials/ode_example.html#Example-2:-Solving-Systems-of-Equations-1 . For a bigger example, https://github.com/JuliaDiffEq/DiffEqProblemLibrary.jl has a few. The Bruss problem is particularly interesting

https://github.com/JuliaDiffEq/DiffEqProblemLibrary.jl/blob/master/src/ode/brusselator_prob.jl

since it matches a lot of things we'd typically see in GPU-based PDE code. The different forms of the PDE from http://juliadiffeq.org/DiffEqTutorials.jl/html/introduction/optimizing_diffeq_code.html are also interesting, and we should make sure we support both the optimized and unoptimized forms well.

Is this Curtis-Powell-Reid seeding?

This sounds like Curtis-Powell-Reid seeding (per Griewank and Walther, Evaluating Derivatives, Chapter 8), but no mention of this is given in the README.

Use standard package layout

We should be able to use the standard Pkg tooling with the repo, so the structure needs to match the standard package layout.

The following commands need to work:

julia> ]dev https://github.com/pkj-m/SparseDiffTools.jl
pkg> test SparseDiffTools

Hessian coloring

For Hessians, one can exploit symmetry in order to further reduce the number of computations that are necessary by finding and using a star coloring.

Error with matrix_colors and BandedBlockBandedMatrix

matrix_colors cannot compute the colors of this BandedBlockBandedMatrix

using BlockBandedMatrices, SparseDiffTools
l = 4
B = BandedBlockBandedMatrix(Zeros(2 * l , 2 * l), ([l, l] ,[l, l]), (1, 1), (1, 1))
matrix_colors(B)
#> ERROR: ArgumentError: reducing over an empty collection is not allowed

(I think it is a valid BandedBlockBandedMatrix so it should work)

Coloring benchmark problem

using SparseDiffTools, LinearAlgebra, SparseArrays
N = 256
Mx = Tridiagonal([1.0 for i in 1:N-1],[-2.0 for i in 1:N],[1.0 for i in 1:N-1])
My = copy(Mx)
Iy = SparseMatrixCSC(I,N,N)
Ix = SparseMatrixCSC(I,N,N)
fJ = ones(3,3)
Dz = [1 0 0
      0 0 0
      0 0 0]
jacsparsity = kron(Dz,Iy,sparse(Mx)) + kron(Dz,sparse(My),Ix) + kron(fJ,Iy,Ix)
@time colorvec = matrix_colors(jacsparsity)

Move to a Julia org

We should make sure this moves to a Julia org so it can get routine maintenance by many individuals. I would be fine with it in JuliaDiffEq, but it may find a better home in JuliaDiff (if that's still a thing?). I can take one of the admin spots and will commit to maintenance going forward.

out of place Jacobian decomposition mutates

Performance results are highly variable when using cached out of place methods. I've gotten segfaults, although I cannot reliably reproduce that portion of the issue. Using CuArrays seems to amplify the issue.

using Revise
using Flux, BenchmarkTools, CuArrays, CUDAnative, ForwardDiff, LinearAlgebra, Random

function mwe(N, ::Type{T}=Float32) where T<:Real
  A::Matrix{T} = rand(T, N,N)
  cuA = A |> gpu

  function f!(out, A)
    out .= A .+ A .* A .+ 1f0
  end

  krn(x) = x + x*x + 1f0
  function f!(out, A::CuMatrix{Float32})
    out .= krn.(A)
  end

  function f(A)
    return A .+ A .* A .+ 1f0
  end

  function f(A::CuMatrix{Float32})
    return krn.(A)
  end

  J = rand(T, N^2, N^2)
  @info "test cpu (inplace)"
  cache = SparseDiffTools.ForwardColorJacCache(f!,A, dx = similar(A))
  SparseDiffTools.forwarddiff_color_jacobian!(J, f!, A, cache)
  (N<5) && @info "test ∇f cpu inplace: $(J)"
  (N>5) && @btime SparseDiffTools.forwarddiff_color_jacobian!($J, $f!, $A, $cache)

  @info "test cpu (out of place)"
  cacheoos = SparseDiffTools.ForwardColorJacCache(f,A, dx = similar(A))
  J = SparseDiffTools.forwarddiff_color_jacobian(f, A, cacheoos)
  (N<5) && @info "test ∇f cpu oop: $(J)"
  (N>5) && @btime SparseDiffTools.forwarddiff_color_jacobian($f, $A, $cacheoos)


  @info "test gpu (inplace)"
  cuJ = J |> gpu
  cucache = SparseDiffTools.ForwardColorJacCache(f!,cuA, dx = similar(cuA))
  SparseDiffTools.forwarddiff_color_jacobian!(cuJ, f!, cuA, cucache)
  (N<5) && @info "test ∇f gpu inplace: $(cuJ)"
  (N>5) && @btime SparseDiffTools.forwarddiff_color_jacobian!($cuJ, $f!, $cuA, $cucache)

  @info "test gpu (outofplace)"
  cucacheoop = SparseDiffTools.ForwardColorJacCache(f,cuA, dx = similar(cuA))
  cuJ = SparseDiffTools.forwarddiff_color_jacobian(f, cuA, cucacheoop)
  (N<5) && @info "test ∇f  gpu oop: $(cuJ)"
  (N>5) && @btime SparseDiffTools.forwarddiff_color_jacobian($f, $cuA, $cucacheoop)

end

mwe(12)

Output:

[ Info: test cpu (inplace)
  46.500 μs (8 allocations: 320 bytes)
[ Info: test cpu (out of place)
  181.946 ms (80271 allocations: 853.10 MiB)
[ Info: test gpu (inplace)
  12.860 ms (10965 allocations: 402.14 KiB)
[ Info: test gpu (outofplace)
  3.110 s (3516919 allocations: 122.65 MiB)

Hessian-Vector Products

Another way to exploit sparsity is via direct computation of Hessian-vector products. Like the Jacobian vector products, but just forward-over-reverse.

Get rid of Int64

There are a lot of Int64s floating around in the code. I suggest replacing all of them with Int.

Drop DataStructures v0.17?

From @devmotion:

DataStructures 0.18 now actually deprecated find_root, so it might be preferable to drop support of DataStructures 0.17 and replace find_root with find_root!.

Hessian is wrong with intervals

julia> using IntervalArithmetic, SparseDiffTools

julia> f(X) = ( (x, y) = X; x^2 + y^2 )
f (generic function with 2 methods)

julia> @edit forwarddiff_color_jacobian((f), IntervalBox(0..0, 2).v)

julia> forwarddiff_color_jacobian((f), IntervalBox(0..0, 2).v)
2×2 StaticArrays.MArray{Tuple{2,2},Interval{Float64},2,4} with indices SOneTo(2)×SOneTo(2):
 [0, 2]  [0, 0]
 [0, 0]  [0, 2]

The result should be [2..2 0..0; 0..0 2..2].

oop jacobian can't preserve array-like objects

forwarddiff_color_jacobian automatically converts an array-like input object into a plain array. When f depends on a specific property of the input object, the Jacobian computation will fail. MWE:

using OrdinaryDiffEq, SparseDiffTools
mutable struct SimTypeg{T,T2} <: DEDataMatrix{T}
  x::Array{T,2} # two dimensional
  f1::T2
end
function f(u)
  u.f1*u[1]
end
u0 = SimTypeg(fill(0.00001,2,2),0.0)
forwarddiff_color_jacobian(f,u0)

Tag?

I'm not sure about the tagging policy here but it may be nice to have a release that admits compat with Adapt v2.

First pass

  • Convert sparse matrix to graph
  • Perform coloring on graph
  • Spit out color vector
  • Make finite differencing choose directions based on color vector
  • Add the Jacobian decompression
  • Make dual numbers in the directions of the color vector
  • Decompress the partials into the Jacobian

sparse array not returned

I have been wanting to try this fantastic tool, but I am not able to get the expected result. Basically, the call to forwarddiff_color_jacobian seems slow and it does not return a sparse Jacobian.

It is possible that I am not using your package correctly... I am posting this in case it is a bug:

using SparsityDetection, SparseArrays, SparseDiffTools, ForwardDiff, Test

source_term(x; a = 0.5, b = 0.01) = 1 + (x + a*x^2)/(1 + b*x^2)
function F_chan(x, α, β = 0.)
	f = similar(x)
	n = length(x)
	f[1] = x[1] - β
	f[n] = x[n] - β
	for i=2:n-1
		f[i] = (x[i-1] - 2 * x[i] + x[i+1]) * (n-1)^2 + α * source_term(x[i], b = β)
	end
	return f
end

N = 100
_input = rand(N)
_output = similar(_input)

The jacobian is

_J = ForwardDiff.jacobian(x -> F_chan(x, 3.,0.01), _input) |> sparse
colors = matrix_colors(_J)
@show maximum(colors) # this returns 3

Then, the issue:

# does not return sparse array
J = @time forwarddiff_color_jacobian(x -> F_chan(x, 3.,0.01), _output; colorvec = colors, sparsity = _J, jac_prototype = _J)

Thank you for your help,

Performance of tall Jacobians

Let H be a matrix, x a vector, and f! a matrix-valued function. To make this fit the interface, I allow H to be a vector and reshape H:

using SparseArrays, SparseDiffTools, BenchmarkTools, ProfileView, ForwardDiff
function f!(H, x)
    if length(size(H)) == 1
        H = reshape(H, 500000, 4)
    end
    M = size(H)[1]
    N = size(H)[2]

    for m in 1:M
        for n in 1:N
            if isodd(n)
                H[m, n] = 2*x[1]
            else
                H[m, n] = 3*x[2]
            end
        end
    end
end
H = zeros(500000, 4)
x = 5.0 * ones(200)
@views f!(H[:], x) # same as f!(H, x)

Of the 200 variables only the first two are relevant, but even they collapse to one color.

pattern = sparse([1, 3, 2, 4], [1, 1, 2, 2], 1, size(H, 2), length(x))
jac = Float64.(sparse(pattern))
colors = matrix_colors(jac)
pattern = kron(pattern, ones(size(H, 1)))
jac = Float64.(sparse(pattern))
cache = ForwardColorJacCache(f!, x, dx=H, colorvec=colors ,sparsity=pattern);
@btime forwarddiff_color_jacobian!(jac, f!, x, cache);

108.402 ms (4 allocations: 160 bytes)

It is still much faster than full ForwardDiff:

full_jac = Matrix(jac);
full_jac .= 0;
@time ForwardDiff.jacobian!(full_jac, f!, H, x);

2.270 s (6 allocations: 198.39 MiB)

but @profview forwarddiff_color_jacobian!(jac, f!, x, cache); shows that most time is spent in FiniteDiff._colorediteration!.

Precompile issue with mybinder environment

Hi, it seems to be the same issue as #62.

I use the mybinder.org environment with Julia 1.2.0, and during `using OrdinaryDiffEq` I encounter:

Info: Precompiling OrdinaryDiffEq [1dea7af3-3e70-54e6-95c3-0bf5283fa5ed]
└ @ Base loading.jl:1242
ERROR: LoadError: Failed to precompile SparseDiffTools [47a9eef4-7e08-11e9-0b38-333d64bd3804] to /srv/julia/pkg/compiled/v1.2/SparseDiffTools/w4L8R.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1253
 [3] _require(::Base.PkgId) at ./loading.jl:1013
 [4] require(::Base.PkgId) at ./loading.jl:911
 [5] require(::Module, ::Symbol) at ./loading.jl:906
 [6] include at ./boot.jl:328 [inlined]
 [7] include_relative(::Module, ::String) at ./loading.jl:1094
 [8] include(::Module, ::String) at ./Base.jl:31
 [9] top-level scope at none:2
 [10] eval at ./boot.jl:330 [inlined]
 [11] eval(::Expr) at ./client.jl:432
 [12] top-level scope at ./none:3
in expression starting at /srv/julia/pkg/packages/OrdinaryDiffEq/UsWa2/src/OrdinaryDiffEq.jl:49

Failed to precompile OrdinaryDiffEq [1dea7af3-3e70-54e6-95c3-0bf5283fa5ed] to /srv/julia/pkg/compiled/v1.2/OrdinaryDiffEq/DlSvy.ji.

Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1253
 [3] _require(::Base.PkgId) at ./loading.jl:1013
 [4] require(::Base.PkgId) at ./loading.jl:911
 [5] require(::Module, ::Symbol) at ./loading.jl:906
 [6] top-level scope at In[2]:1

Troubles with `ForwardColorJacCache`

It seems using ForwardColorJacCache does not work with anonymous functions or when the function fed to it is different from the function fed to forwarddiff_color_jacobian!. (Sorry I couldn't think of anything more specific for the title.)

Minimal example

Below, I'm simply trying to get the Jacobian of an (in-place) linear function x -> jac * x:

using SparseDiffTools, LinearAlgebra # SparseDiffTools v1.13.0
nt, nb = 3, 4
jac = vcat((hcat((Diagonal(ones(nb)) for _ in 1:nt)...) for _ in 1:nt)...) # nt×nt blocks of diagonals
colors = matrix_colors(jac)
x = ones(nt*nb)
jac2 = similar(jac)
# works without ForwardColorJacCache:
forwarddiff_color_jacobian!(jac2, (du,u)->mul!(du,jac,u), x, colorvec=colors, sparsity=jac)
# does not work with ForwardColorJacCache:
jac_cache = ForwardColorJacCache((du,u)->mul!(du,jac,u), x, colorvec=colors, sparsity=jac)
forwarddiff_color_jacobian!(jac2, (du,u)->mul!(du,jac,u), x, jac_cache)

stack trace

julia> forwarddiff_color_jacobian!(jac2, (du,u)->mul!(du,jac,u), x, jac_cache)
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#27#28", Float64}, Float64, 3})
Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  (::Type{T})(::T) where T<:Number at boot.jl:760
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50
  ...
Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#27#28", Float64}, Float64, 3})
    @ Base ./number.jl:7
  [2] convert(#unused#::Type{ForwardDiff.Dual{ForwardDiff.Tag{var"#25#26", Float64}, Float64, 3}}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#27#28", Float64}, Float64, 3})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/sqhTO/src/dual.jl:371
  [3] setindex!(A::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#25#26", Float64}, Float64, 3}}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#27#28", Float64}, Float64, 3}, i1::Int64)
    @ Base ./array.jl:839
  [4] macro expansion
    @ ./broadcast.jl:984 [inlined]
  [5] macro expansion
    @ ./simdloop.jl:77 [inlined]
  [6] copyto!
    @ ./broadcast.jl:983 [inlined]
  [7] copyto!
    @ ./broadcast.jl:936 [inlined]
  [8] materialize!
    @ ./broadcast.jl:894 [inlined]
  [9] materialize!
    @ ./broadcast.jl:891 [inlined]
 [10] forwarddiff_color_jacobian!(J::SparseArrays.SparseMatrixCSC{Float64, Int64}, f::var"#27#28", x::Vector{Float64}, jac_cache::ForwardColorJacCache{Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#25#26", Float64}, Float64, 3}}, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#25#26", Float64}, Float64, 3}}, Vector{Float64}, Vector{Vector{Tuple{Bool, Bool, Bool}}}, Vector{Int64}, SparseArrays.SparseMatrixCSC{Float64, Int64}})
    @ SparseDiffTools ~/.julia/packages/SparseDiffTools/AhWaW/src/differentiation/compute_jacobian_ad.jl:277
 [11] top-level scope
    @ REPL[26]:1

Additional notes

The problem seems to go away when the function is not anonymous:

f!(du,u) = mul!(du,jac,u) # use f! instead of anonymous function
jac_cache = ForwardColorJacCache(f!, x, colorvec=colors, sparsity=jac)
forwarddiff_color_jacobian!(jac2, f!, x, jac_cache) # works!

But the problem comes back when the function used to build ForwardColorJacCache is a different instance than the one passed to forwarddiff_color_jacobian!:

g!(du,u) = mul!(du,jac,u)  # A function with a different name
jac_cache = ForwardColorJacCache(g!, x, colorvec=colors, sparsity=jac) # use g!
forwarddiff_color_jacobian!(jac2, f!, x, jac_cache) # use f! -> throws the same error

FWIW, I would actually like to use this second pattern with genuinely different functions that share the same Jacobian sparsity.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

Depwarn in matrix_colors

┌ Warning: `find_root` is deprecated, use `find_root!` instead.
│   caller = find at acyclic_coloring.jl:195 [inlined]
└ @ Core C:\Users\accou\.julia\dev\SparseDiffTools\src\coloring\acyclic_coloring.jl:195

@pkj-m

Value type
