Hi @Magalame !
Can you maybe elaborate on where the bottleneck is? Do you suspect that if you could thread the model evaluation to calculate the residuals on multiple cores, your performance issues would be mitigated?
If you can maybe post some code that illustrates the issue, we can try to experiment / locate the issue.
PS: is support for lazy structures (like iterators, or lazy arrays) planned?
What kind of lazy arrays and where in the chain? Do you mean as the parameter vector, or the residual vector?
from lsqfit.jl.
Hi!
My project basically consists of a simulation of a physical model, and then extracting some properties from it. Running the simulation itself is fine: extracting the properties I want is the bottleneck, and I need to fit a curve in order to do so.
I have fairly large input/output arrays (~1 million cells), and applying curve_fit to them accounts for about 72% of the run time. I got some good improvements by providing the analytical Jacobian, and then optimizing it as much as I could.
Here's the relevant portion of the code:
function numerical(ttL::Array{Complex{Float64},1}, params::Params)
    GtimeL2, ttn = fourier1d(ttL, params.dE)
    tind::Array{Int64,1} = findall(ttn .> 0)
    tfft::Array{Float64,1} = ttn[tind]
    Gfft::Array{Complex{Float64},1} = GtimeL2[tind]
    logabs::Array{Float64} = log.(abs.(Gfft))
    taulin = 0
    durbinlin = 0
    let
        top = logabs[1]
        down = logabs[end]
        a0 = (down - top)/(tfft[end] - tfft[1])
        b0 = logabs[1]
        fit = curve_fit(linmodel, jacobianlin, tfft, logabs, [a0, b0]) # This takes about 12% of the run time
        taulin = fit.param
        durbinlin = durbinwatson(fit.resid)
    end
    taupiece = 0
    durbinpiece = 0
    let
        # [...] lots of preliminary computations go here, mainly to obtain a good estimate of the guess parameters
        fit = curve_fit(piecemodel, jacobianpiece, tfft, logabs, [a, b, m, c, x0]) # Takes about 50% of run time
        taupiece = fit.param
        durbinpiece = durbinwatson(fit.resid)
    end
    # [...] some postprocessing computations
end
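The durbinwatson helper used above is not shown in this thread; a minimal sketch, assuming the standard Durbin-Watson definition (sum of squared successive residual differences over the sum of squared residuals), could look like:

```julia
# Hedged sketch: standard Durbin-Watson statistic on a residual vector.
# The actual durbinwatson used in the code above may differ.
durbinwatson(resid) = sum(abs2, diff(resid)) / sum(abs2, resid)
```

For perfectly alternating residuals like [1.0, -1.0, 1.0, -1.0] this evaluates to 3.0, near the upper bound of 4, as expected for strong negative autocorrelation.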
The fitted functions (a linear function, and a piecewise function composed of two linear pieces) and their Jacobians are the following:
@. linmodel(x, p) = p[1]*x + p[2]

function jacobianlin(x, p)
    J = Array{Float64}(undef, length(x), length(p))
    J[:,1] .= x # dmodel/dp[1]
    J[:,2] .= 1 # dmodel/dp[2]
    J
end

@. piecemodel(x, p) = piecewise(x, p...)

function piecewise(x::Float64, a, b, m, c, x0)
    if x <= x0
        b + a*x
    else
        c + m*x
    end
end

function jacobianpiece(x, p)::Array{Float64,2}
    J::Array{Float64,2} = zeros(Float64, length(x), length(p))
    dx = x[2] - x[1]
    i::Int = 1
    while i <= length(x) && x[i] <= p[end]
        J[i,1] = x[i]
        i += 1
    end
    if x[end] + dx >= p[end] # otherwise no x[end] dependence
        if i <= length(x) # checks that we are within data range
            J[i,5] = (piecewise(x[i], p[1], p[2], p[3], p[4], p[5]+dx) - piecewise(x[i], p[1], p[2], p[3], p[4], p[5]-dx))/(2*dx)
        end
        if i-1 >= 1
            J[i-1,5] = (piecewise(x[i-1], p[1], p[2], p[3], p[4], p[5]+dx) - piecewise(x[i-1], p[1], p[2], p[3], p[4], p[5]-dx))/(2*dx)
        end
        if i-2 >= 1
            J[i-2,5] = (piecewise(x[i-2], p[1], p[2], p[3], p[4], p[5]+dx) - piecewise(x[i-2], p[1], p[2], p[3], p[4], p[5]-dx))/(2*dx)
        end
    end
    j::Int = 1
    while j <= length(x) && x[j] <= p[end]
        J[j,2] = 1
        j += 1
    end
    while i <= length(x) && x[i] > p[end]
        J[i,3] = x[i]
        i += 1
    end
    while j <= length(x) && x[j] > p[end]
        J[j,4] = 1
        j += 1
    end
    J
end
Thanks for your help!
Also, about the lazy structures: I was thinking that laziness at the level of the residuals array could help reduce memory consumption when the input vector is large.
Okay, so did you actually profile this? I could imagine re-allocating that Jacobian all the time could be expensive. This is something we allow for in Optim and NLsolve: just pre-allocate that Jacobian once and reuse it. I should add that possibility to LsqFit as well. What happens if you check out the master branch and use
curve_fit(f, p0; autodiff=:forward)
?
Indeed I did. Reusing the Jacobian would probably be great; in fact, the curve_fit function accounts for 99% of the memory allocations of the whole program.
With automatic differentiation I get an error:
MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{getfield(NLSolversBase, Symbol("#ff!#1")){getfield(LsqFit, Symbol("#f#10")){typeof(piecemodel),Array{Float64,1},Array{Float64,1}}},Float64},Float64,5}) ...
It breaks when it comes to fitting the piecewise function; it would make sense that automatic differentiation doesn't support functions that aren't differentiable on the entire real axis.
I then tried using autodiff only for the linear fits, but it makes the memory allocation skyrocket (from 25 GB to 190 GB) and slows the execution down considerably (~4.7 times slower).
It's actually complaining about your use of ::Float64
because ForwardDiff wants to pass in a special number type (that's the Dual
part). You just need to make that parametric. Instead of
function f(x)
    cache = Array{Float64, 3}(undef, 3, 3, 3)
end

you'd put

function f(x::AbstractArray{T}) where T
    cache = Array{T, 3}(undef, 3, 3, 3)
end

or

function f(x)
    cache = Array{eltype(x), 3}(undef, 3, 3, 3)
end
Until I get around to enabling in-place operations, you can use a closure or a callable type.
Xdata = ...
p0 = ...

struct LinJacobian{T}
    J::T
end
LinJacobian(x, p) = LinJacobian(zeros(eltype(x), length(x), length(p)))
function (lj::LinJacobian)(x, p)
    J = lj.J
    J[:,1] .= x # dmodel/dp[1]
    J[:,2] .= 1 # dmodel/dp[2]
    J
end
struct PieceJacobian{T}
    J::T
end
PieceJacobian(x, p) = PieceJacobian(zeros(eltype(x), length(x), length(p)))
function (pj::PieceJacobian)(x, p)
    J = pj.J
    dx = x[2] - x[1]
    i::Int = 1
    while i <= length(x) && x[i] <= p[end]
        J[i,1] = x[i]
        i += 1
    end
    if x[end] + dx >= p[end] # otherwise no x[end] dependence
        if i <= length(x) # checks that we are within data range
            J[i,5] = (piecewise(x[i], p[1], p[2], p[3], p[4], p[5]+dx) - piecewise(x[i], p[1], p[2], p[3], p[4], p[5]-dx))/(2*dx)
        end
        if i-1 >= 1
            J[i-1,5] = (piecewise(x[i-1], p[1], p[2], p[3], p[4], p[5]+dx) - piecewise(x[i-1], p[1], p[2], p[3], p[4], p[5]-dx))/(2*dx)
        end
        if i-2 >= 1
            J[i-2,5] = (piecewise(x[i-2], p[1], p[2], p[3], p[4], p[5]+dx) - piecewise(x[i-2], p[1], p[2], p[3], p[4], p[5]-dx))/(2*dx)
        end
    end
    j::Int = 1
    while j <= length(x) && x[j] <= p[end]
        J[j,2] = 1
        j += 1
    end
    while i <= length(x) && x[i] > p[end]
        J[i,3] = x[i]
        i += 1
    end
    while j <= length(x) && x[j] > p[end]
        J[j,4] = 1
        j += 1
    end
    J
end
and then you create
jacobianlin = LinJacobian(Xdata, p0)
jacobianpiece = PieceJacobian(Xdata, p0)
and pass those to curve_fit. I think it'll be a big performance improvement.
It seems like the slowdown I observed from switching to autodiff is actually due to moving to master alone. If I go back to the initial version of the code but stay on master, the code is just as slow. I should have been more careful.
Thanks for the tip on callable types! I think I'm not using it correctly though:
> jacobianlin = LinJacobian(xdata, p0)
> fit = curve_fit(linmodel, jacobianlin, xdata, ydata,p0)
MethodError: no method matching curve_fit(::typeof(linmodel), ::LinJacobian{Array{Float64,2}}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1})
Ah, I see. This is because curve_fit
specifies that they have to be <:Function
- that is something we need to change. You can probably do
> jacobianlin = LinJacobian(xdata, p0)
> fit = curve_fit(linmodel, (x,p)->jacobianlin(x,p), xdata, ydata,p0)
such that it's an anonymous function instead.
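As a generic illustration of that workaround (using a hypothetical Adder type, not anything from LsqFit): a plain callable struct is not a subtype of Function, but an anonymous function that forwards to it is, so the wrapper passes a ::Function or <:Function constraint.

```julia
# Hedged sketch: a callable struct vs. an anonymous-function wrapper.
struct Adder{T}
    c::T
end
(a::Adder)(x) = x + a.c   # makes Adder instances callable

add2 = Adder(2)           # callable, but Adder is not <: Function
wrapped = x -> add2(x)    # anonymous function; wrapped isa Function
```

Both `add2(3)` and `wrapped(3)` return 5, but only `wrapped` would get past a method signature restricted to Function.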
It bothers me that you say that master is slower than the official version. That is something I have to investigate. Thanks.
I don't know if you've noticed, I'm trying to get this package up to speed again so I can tag a new version :)
One thing I did do was raise maxIter, so maybe it just capped out at 100 iterations all the time before, and now it runs for longer?
maxIter certainly seems to play a role; with it back at 100 the performance is similar again. Running some small tests with maxIter=100:
31.835962 seconds (451.36 k allocations: 33.421 GiB, 12.11% gc time) # master
27.380634 seconds (438.64 k allocations: 25.311 GiB, 9.40% gc time) # release
21.280030 seconds (423.66 k allocations: 11.331 GiB, 7.31% gc time) # release + callable type
The run time has some variability, but memory consumption, allocations, and GC time are essentially constant.
The callable type is a clear improvement, particularly memory-wise, thanks!
Also thanks a lot for working on this project, curve fitting is hugely important in the field I work in! If you need any help with anything, I'll be glad to contribute.