
MCIntegration.jl's People

Contributors

dcerkoney, houpc, iintsjds, kunyuan, littlebug, lxvm


MCIntegration.jl's Issues

add Graph variable to support Discrete variables sampled from their neighbors

Some discrete variables are better sampled from a set of neighbors than globally. Once such a discrete variable is implemented, the MC update changeOrder in :mcmc is no longer needed, and the :mcmc solver can be merged with :vegasmc.

Once this is implemented, it probably makes sense to rename :vegas to :mc and :vegasmc to :mcmc. Indeed, Vegas is the name of the adaptive map used for continuous variables; it is better not to use it as the name of the entire MC algorithm.
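The proposed neighbor-based sampling could look roughly like this (a hypothetical sketch; GraphDiscrete and propose are not part of the current API):

```julia
# Hypothetical sketch: sample a discrete variable from its graph neighbors
# instead of globally. neighbors[i] lists the states reachable from state i.
struct GraphDiscrete
    current::Int
    neighbors::Vector{Vector{Int}}   # adjacency list over the discrete states
end

# Propose a move to a uniformly chosen neighbor of the current state.
function propose(v::GraphDiscrete)
    nbrs = v.neighbors[v.current]
    return nbrs[rand(1:length(nbrs))]
end
```

For example, propose(GraphDiscrete(1, [[2, 3], [1], [1]])) returns 2 or 3, never a state that is not adjacent to 1.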

simplify measurement interface

Currently, the measurement interface is complicated. For example,

function measure(config)
    obs = config.observable
    factor = 1.0 / config.reweight[config.curr]  # undo the reweighting factor
    extidx = config.var[3][1]                    # external index from the third variable pool
    weight = integrand(config)
    obs[extidx] += weight / abs(weight) * factor
end

Users should not be exposed to internal concepts like config.reweight.
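For comparison, a possible simplified interface (a hypothetical sketch, not the current API; the framework would compute the reweighting factor internally and expose only the variables, observable, and weight):

```julia
# Hypothetical simplified measure: no config.reweight or config.curr in sight.
# The framework applies the reweighting factor before calling this function.
function measure(vars, obs, weight)
    extidx = vars[3][1]          # external index from the third variable pool
    obs[extidx] += weight / abs(weight)
end
```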

Incompatibility with built-in functions when `var` is provided as a tuple.

Hi,

When the var parameter is provided as a tuple of e.g. Continuous variables, the integration gives the following error:

julia> θ = Continuous(0.0, π / 2)
Adaptive Continuous variable ∈ [0.0, 1.5707963267948966). Max number = 16. Learning rate = 2.0.

julia> φ = Continuous(0.0, 2π)
Adaptive Continuous variable ∈ [0.0, 6.283185307179586). Max number = 16. Learning rate = 2.0.

julia> ℓ=0.1
0.1

julia> XY = Continuous(-ℓ / 2, ℓ / 2)
Adaptive Continuous variable ∈ [-0.05, 0.05). Max number = 16. Learning rate = 2.0.

julia> int_config = Configuration(var=(θ, φ, XY), dof=[[1, 1, 2]])
Configuration for 1 integrands involves 3 types of variables.
Number of variables for each integrand: [[1, 1, 2]].


julia> prob = integrate((x, c) -> (sin(x[1]) * cos(x[1])^2), config=int_config)
ERROR: MethodError: no method matching sin(::Continuous{Vector{Float64}})
Closest candidates are:
  sin(::T) where T<:Union{Float32, Float64} at special/trig.jl:29
  sin(::LinearAlgebra.Hermitian{var"#s884", S} where {var"#s884"<:Complex, S<:(AbstractMatrix{<:var"#s884"})}) at ~/julia-downloads/julia-1.8.3/share/julia/stdlib/v1.8/LinearAlgebra/src/symmetric.jl:731
  sin(::Union{LinearAlgebra.Hermitian{var"#s885", S}, LinearAlgebra.Symmetric{var"#s885", S}} where {var"#s885"<:Real, S}) at ~/julia-downloads/julia-1.8.3/share/julia/stdlib/v1.8/LinearAlgebra/src/symmetric.jl:727
  ...
Stacktrace:
 [1] (::var"#6#7")(x::Tuple{Continuous{Vector{Float64}}, Continuous{Vector{Float64}}, Continuous{Vector{Float64}}}, c::Configuration{1, Tuple{Continuous{Vector{Float64}}, Continuous{Vector{Float64}}, Continuous{Vector{Float64}}}, Nothing, Vector{Float64}, Float64})
   @ Main ./REPL[12]:1
 [2] montecarlo(config::Configuration{1, Tuple{Continuous{Vector{Float64}}, Continuous{Vector{Float64}}, Continuous{Vector{Float64}}}, Nothing, Vector{Float64}, Float64}, integrand::var"#6#7", neval::Float64, print::Int64, save::Int64, timer::Vector{Any}, debug::Bool; measure::Nothing, measurefreq::Int64, inplace::Bool)
   @ MCIntegration.VegasMC ~/.julia/packages/MCIntegration/LUtOl/src/vegas_mc/montecarlo.jl:109
 [3] _block!(configs::Vector{Configuration{1, Tuple{Continuous{Vector{Float64}}, Continuous{Vector{Float64}}, Continuous{Vector{Float64}}}, Nothing, Vector{Float64}, Float64}}, obsSum::Vector{Vector{Float64}}, obsSquaredSum::Vector{Vector{Float64}}, summedConfig::Vector{Configuration{1, Tuple{Continuous{Vector{Float64}}, Continuous{Vector{Float64}}, Continuous{Vector{Float64}}}, Nothing, Vector{Float64}, Float64}}, solver::Symbol, progress::ProgressMeter.Progress, integrand::Function, nevalperblock::Float64, print::Int64, save::Int64, timer::Vector{Any}, debug::Bool, measure::Nothing, measurefreq::Int64, inplace::Bool, parallel::Symbol)
   @ MCIntegration ~/.julia/packages/MCIntegration/LUtOl/src/main.jl:230
 [4] integrate(integrand::var"#6#7"; solver::Symbol, config::Configuration{1, Tuple{Continuous{Vector{Float64}}, Continuous{Vector{Float64}}, Continuous{Vector{Float64}}}, Nothing, Vector{Float64}, Float64}, neval::Float64, niter::Int64, block::Int64, print::Int64, printio::Base.TTY, save::Int64, saveio::Nothing, timer::Vector{Any}, gamma::Float64, adapt::Bool, debug::Bool, reweight_goal::Nothing, ignore::Int64, measure::Nothing, measurefreq::Int64, inplace::Bool, parallel::Symbol, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ MCIntegration ~/.julia/packages/MCIntegration/LUtOl/src/main.jl:140
 [5] top-level scope
   @ REPL[12]:1

I think the only case in the tutorials pertaining to a tuple of input variables is "Measure Histogram". Does this mean that a custom measure function is required when providing more than one variable? Thanks for the explanation.

Best,
Charles
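A likely cause, judging from the stack trace: when var is a tuple, each element of the x passed to the integrand is a whole variable pool rather than a scalar, so sin(x[1]) fails. Indexing into each pool may resolve the error (an untested sketch):

```julia
# Possible workaround: take the first sample from each variable pool,
# e.g. x[1][1] for θ, x[2][1] for φ, x[3][1] and x[3][2] for the two XY copies.
prob = integrate((x, c) -> sin(x[1][1]) * cos(x[1][1])^2, config=int_config)
```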

DocStringExtensions / HDF5 outdated

Currently, MCIntegration requires DocStringExtensions v0.8.6, which is incompatible with the newest versions of ITensors.jl. The install dropped me down to v0.3.36. Is it possible to upgrade this, and HDF5 to v0.17.1?

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

cached variable

Sometimes an n-dimensional variable needs to be transformed before it can be used in the integrand, for example the spherical coordinates (r, theta, phi) to Cartesian (x, y, z). It is more efficient for the variable to cache the transformed values if the integrand involves many variables.

One possible implementation is to add a cached array and a transform function as members of the variable struct. If the transform function is set, the variable is transformed and saved to the cached array after every update.
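A minimal sketch of the idea (hypothetical names; not the current API):

```julia
# Hypothetical variable with a cached transform. If `transform` is set, it is
# applied after every MC update and the result stored in `cache`, so the
# integrand can read the transformed values without recomputing them.
mutable struct CachedContinuous
    data::Vector{Float64}                # raw samples, e.g. (r, θ, φ)
    cache::Vector{Float64}               # transformed values, e.g. (x, y, z)
    transform::Union{Function,Nothing}   # e.g. spherical → Cartesian
end

function update!(v::CachedContinuous)
    if v.transform !== nothing
        v.transform(v.cache, v.data)     # refresh the cache after each update
    end
end
```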

adaptive variable sampling

For each kind of variable, allow the user to supply a discrete grid. The code will accumulate weight on this grid, which will then be used for importance sampling.

Sample from learned variables directly

Hi,

Thanks for making this integration library. I'm wondering if it's possible to sample from the learned variables (in the Configuration) directly? This would be helpful for evaluating other quantities that depend on the variable distribution after the integration and optimization are done. It would also be helpful to output the pdf of the optimized variables.

Best,
Charles

improve MC sampling algorithm

The following is the roadmap

  1. Allow two different types of mesh: (a) the adaptive mesh from the vegas algorithm; (b) a user-defined mesh.
  2. Keep track of a histogram for the mesh points.
  3. Sample the histogram to propose the variable, so that the acceptance ratio is optimized.

Remark:

  1. As long as one uses a high-quality mesh, the histogram will be relatively flat. Therefore, one could sample the histogram with a naive MC algorithm (namely, sample the variable uniformly, then reject or accept according to the distribution derived from the histogram).
  2. If the histogram is not so flat, the variables sampled from it may be strongly self-correlated. Then the integral evaluations (expensive!) with these variables could be strongly self-correlated, too. To avoid this, one could sample the histogram before the actual MC integration to learn the autocorrelation time Nt, then do one integral evaluation per Nt histogram samplings (the latter are extremely cheap anyway).
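Step 3 of the roadmap, combined with the naive accept/reject scheme from Remark 1, could be sketched as:

```julia
# Rejection sampling from a histogram over mesh bins: propose a bin uniformly,
# then accept with probability hist[i] / maximum(hist). The accepted bins are
# distributed according to the (normalized) histogram; the flatter the
# histogram, the higher the acceptance ratio.
function sample_from_histogram(hist::Vector{Float64})
    hmax = maximum(hist)
    while true
        i = rand(1:length(hist))          # uniform proposal over mesh bins
        rand() * hmax <= hist[i] && return i
    end
end
```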

Fail to deal with integrands with singularities: a simple case

Dear Prof. Chen,

Recently I have been working on the following integral, whose integrand has a singularity at $x=2$. It turns out that both your package and the standard 1D numerical integration package QuadGK.jl fail to handle it.

using MCIntegration
using SpecialFunctions
using QuadGK

z0(x) = 2 ≤ x ≤ 3 ? 3 + 2(3 - x) - (x / 2)^2 : 4 * (3 - x)
z1(x) = 2 ≤ x ≤ 3 ? 4 * (3 - x) : 3 + 2 * (3 - x) - (x / 2)^2
dos(x) = 1 / ((z0(x)) * π^2) * ellipk(z1(x) / z0(x))

f_integrand(x, c) = dos(x[1])
rslt1 = integrate(f_integrand, var=Continuous(-6, 2 - 1e-5)) # return error
quadgk(dos, -6, 2 - 1e-5) # return error

As a workaround, one can skip the singular point manually by splitting the integration domain into several parts where necessary. A naive way to make this kind of integral work would thus be a keyword argument specifying points to exclude from the integration. However, I suspect there are better ways to deal with this problem at a more fundamental level, in the algorithms themselves.

BTW, I've checked that this integral can be done perfectly in commercial software such as Mathematica.
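As an aside, for an integrable singularity at a known interior point, QuadGK already supports the splitting workaround directly: extra arguments to quadgk are treated as interior panel boundaries, and the Gauss-Kronrod rule never evaluates the integrand at panel endpoints. A generic illustration (not the dos integrand above, where the singularity sits at the domain edge):

```julia
using QuadGK

# 1/√|x| has an integrable singularity at x = 0; passing 0 as an interior
# point places a panel boundary there, so the integrand is never evaluated
# at the singular point.
val, err = quadgk(x -> 1 / sqrt(abs(x)), -1, 0, 1)
# val ≈ 4.0
```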

Best,
Junsen Wang

Make MPI into a package extension

It's a heavy dependency, so requiring it by default when only a small subset of applications uses it is unnecessary. If it's made into a package extension, I don't think any functionality is lost.
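Since Julia 1.9, this can be declared in Project.toml via weakdeps and extensions (a sketch; the extension module name is hypothetical, and the UUID should be checked against the General registry):

```toml
[weakdeps]
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"   # MPI.jl's registered UUID

[extensions]
# Loaded automatically only when the user also loads MPI.
MCIntegrationMPIExt = "MPI"
```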

refactor user API

  1. All variables should be initialized with size = 1 by default. (Better to call it copy instead of size.)
  2. In an MC update, when passing a variable x to the integrand: if it has size = 1, the first element of the x pool is passed, so that the user can refer to the variable simply as x instead of x[1].
  3. In the initialization of Configuration, if the user provides a dof that specifies more than one copy of the variable x, the program automatically resizes x to accommodate the extra copies. In this case, the MC update passes the whole x pool to the integrand, and the user accesses the copies as x[1], x[2], ...
  4. Add a paragraph to the abstract explaining how to specify dof properly.
  5. If there is only one integrand, result.mean holds a single observable; otherwise it is a vector of observables. This can cause confusion when the observables themselves are arrays: result.mean[1] returns the first element of the observable in the first case, but the first observable in the second.

GPU support

  1. Support batch sampling
  2. Parallelize the batches on GPU

Variables must be initialized randomly

Currently, the variables are initialized with fixed values, which can be dangerous.

For example, if k1 and k2 are both initialized to exactly the same value, a singular integrand such as 1/(k1-k2)^2 diverges.
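A two-line illustration of the failure mode:

```julia
# Why fixed initialization is dangerous: identical initial values make a
# singular integrand blow up on the very first evaluation.
k1, k2 = 0.5, 0.5        # both variables start at the same fixed value
w = 1 / (k1 - k2)^2      # Inf — the MC weight diverges immediately
```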

Bug with vegasmc and vegas with user-defined datatype

In montecarlo.jl of the vegasmc algorithm, if the user computes a single integral with a user-defined vector-like datatype (say, TYPE), the following lines of code have a bug:

[Screenshot 2023-05-26 at 12:36:57 PM: the relevant lines of montecarlo.jl]

The _weights before the integrand call is something like [TYPE(data),], while the _weights after the call becomes TYPE(data).

[Screenshot 2023-05-26 at 12:41:01 PM: _weights before and after the integrand call]

Then line 15 causes an error, because weights is something like [TYPE(data),] while _weights is TYPE(data). If TYPE is iterable, _weights[1] will have a different type than weights[1].
