juliamath / sobol.jl Goto Github PK
View Code? Open in Web Editor NEWgeneration of Sobol low-discrepancy sequence (LDS) for the Julia language
License: Other
generation of Sobol low-discrepancy sequence (LDS) for the Julia language
License: Other
This was working fine on 1.5.0-DEV.854 (2020-05-05)
(9 days old) but today gives an error:
$ /Applications/Julia-1.6.app/Contents/Resources/julia/bin/julia --startup-file=no
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.6.0-DEV.41 (2020-05-15)
_/ |\__'_|_|_|\__'_| | Commit 8370133584 (0 days old master)
|__/ |
julia> using Sobol
[ Info: Precompiling Sobol [ed01d8cd-4d21-5b2a-85b4-cc3bdc58bad4]
ERROR: LoadError: LoadError: syntax: expression too large
Stacktrace:
[1] top-level scope at /Users/me/.julia/packages/Sobol/UKDQ8/src/soboldata.jl:2171
[2] include at ./Base.jl:368 [inlined]
[3] include(::String) at /Users/me/.julia/packages/Sobol/UKDQ8/src/Sobol.jl:1
[4] top-level scope at /Users/me/.julia/packages/Sobol/UKDQ8/src/Sobol.jl:5
[5] include(::Module, ::String) at ./Base.jl:368
[6] top-level scope at none:2
[7] eval at ./boot.jl:331 [inlined]
[8] eval(::Expr) at ./client.jl:448
[9] top-level scope at ./none:3
in expression starting at /Users/me/.julia/packages/Sobol/UKDQ8/src/soboldata.jl:2171
in expression starting at /Users/me/.julia/packages/Sobol/UKDQ8/src/Sobol.jl:1
ERROR: Failed to precompile Sobol [ed01d8cd-4d21-5b2a-85b4-cc3bdc58bad4] to /Users/me/.julia/compiled/v1.6/Sobol/ohkOh_nZr7I.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1328
[3] _require(::Base.PkgId) at ./loading.jl:1030
[4] require(::Base.PkgId) at ./loading.jl:928
[5] require(::Module, ::Symbol) at ./loading.jl:923
julia>
Not necessarily an issue, but this seems like the best place to put this.
There seems to be tables for higher dimensional Sequences available online. Would this be interesting to incorporate?
I'm interested in using Sobol sequences in financial applicaitons where the dimension needs to be very large, but from what I've read quasi-monte-carlo does poorly in high dimension. Perhaps it is unnecessary to code a Sobol sequence generator for such high dimensional problems?
Thanks for this great library, I find it very useful. However, I am not sure that the library supports the iteration interface of Julia. Eg Sobol.SobolSeq
returns not an iterator, but an iterator and a state. So, for example,
import Sobol
function collect_take(iter, n)
v = Vector{Float64}(0)
for (i, x) in enumerate(iter)
push!(v, x[1])
if (i >= n)
break
end
end
v
end
double_take(iter, n) = vcat(collect_take(iter,5), collect_take(iter,5))
double_take(Sobol.SobolSeq(1), 5)
returns the first 10 elements (instead of the first 5, twice). Cf, for example,
double_take(countfrom(1),5)
Also, idioms like
[x[1] for x in take(Sobol.SobolSeq(1),5)]
don't work.
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.3) and the nightly build of the unstable version (0.4). The results of this script are used to generate a package listing enhanced with testing results.
Tests pass.
Package doesn't load.
Tests pass.
means that PackageEvaluator found the tests for your package, executed them, and they all passed.
Package doesn't load.
means that PackageEvaluator did not find tests for your package. Additionally, trying to load your package with using
failed.
This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt-out of these status-change messages, reply to this message saying you'd like to and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl please file an issue at the repository. For example, your package may be untestable on the test machine due to a dependency - an exception can be added.
Test log:
>>> 'Pkg.add("Sobol")' log
INFO: Cloning cache of Sobol from git://github.com/stevengj/Sobol.jl.git
INFO: Installing BinDeps v0.3.0
INFO: Installing MathProgBase v0.2.5
INFO: Installing NLopt v0.1.1
INFO: Installing SHA v0.0.2
INFO: Installing Sobol v0.1.0
INFO: Installing URIParser v0.0.2
INFO: Building NLopt
INFO: Package database updated
INFO: METADATA is out-of-date a you may not have the latest version of Sobol
INFO: Use `Pkg.update()` to get the latest versions of your packages
>>> 'using Sobol' log
ERROR: type: ccall: expected Symbol, got Array{Any,1}
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:54
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:51
in include at ./boot.jl:245
in include_from_node1 at loading.jl:128
in process_options at ./client.jl:285
in _start at ./client.jl:354
in _start_3B_1699 at /home/idunning/julia03/usr/bin/../lib/julia/sys.so
while loading /home/idunning/pkgtest/.julia/v0.3/NLopt/src/NLopt.jl, in expression starting on line 375
while loading /home/idunning/pkgtest/.julia/v0.3/Sobol/src/Sobol.jl, in expression starting on line 3
while loading /home/idunning/pkgtest/.julia/v0.3/Sobol/testusing.jl, in expression starting on line 1
>>> test log
no tests to run
>>> end of log
Might be nice to use StaticArrays (i.e. return an SVector
, not a Vector
), since dimension of a SobolSeq
is part of the type already.
However, for very high dimensions, a Vector
is probably more appropriate, so the array type should probably be an additional type parameter.
Maybe this is kind of neat for sub-random number generation:
https://en.wikipedia.org/wiki/Low-discrepancy_sequence#Additive_recurrence
You can use the golden ratio and some other ones:
https://en.wikipedia.org/wiki/Silver_ratio
https://en.wikipedia.org/wiki/Plastic_number
Luckily the wikipedia entries have the values in hexadecimal for coding and to check if the final digit in 32 or 64 bit is odd, which is better than even.
In my application, I draw parameters from a SobolSeq
and do some expensive simulations with them. I distribute the calculations across machines, so each machine is supposed to use a certain range of Sobol numbers.
Lets say, I want Sobol numbers n:(n+m)
. How do I get them other than
[next!(s) for 1:n+m][n+1:end]
Is there any benefit in using the existing skip function? Does it do something clever, or does it also compute and throw away the unwanted ones?
Is there interest in adding this function to the package?
function skip_exact(s, n)
n_eff = 2 ^ floor(Int, log(n) / log(2))
Δ = n - n_eff
skip(s, n)
for i in 1:Δ
next!(s)
end
s
end
# want to get the nth element
n = 10
s1 = SobolSeq(1)
x1 = [next!(s1)[1] for i in 1:n][n]
s2 = SobolSeq(1)
skip_exact(s2, n-1)
x2 = next!(s2)[1]
@assert x1 == x2
https://github.com/stevengj/Sobol.jl/blob/bc88e65341786898e2be250f53b447d9871ccd7e/src/Sobol.jl#L65
The function trailing_zeros
seems to be different from what's mentioned in some of the papers I found. Also see https://web.maths.unsw.edu.au/~fkuo/sobol/joe-kuo-notes.pdf#page=2&zoom=auto,-207,420 — some test values are given below eqn(5).
I didn't understand your implementation entirely, so I might be wrong!
It would be interesting to compare against this implementation, both for correctness and performance.
I believe your "skip" function should be altered to actually skip one less than it currently does. As a simple benchmark, if I take a vector of floats and convert each to boolean by testing with >=0.5, I would like it to be the case that a set of 2^n samples covers each binary vector exactly once. This works if I skip 2^n - 1
times:
s = SobolSeq(4); for u=1 : 15;next!(s);end
sort(map(bs ->
sum(map(x->x[2] >= 0.5 ? x[1] : 0, zip(2 .^ [0 1 2 3], bs))),
map(x->next!(s), 1:16)
))
returns all numbers from 0 to 15 exactly once. If I instead run
s = SobolSeq(4);for u=1:16;next!(s);end
sort(map(bs ->
sum(map(x->x[2] >= 0.5 ? x[1] : 0, zip(2 .^ [0 1 2 3], bs))),
map(x->next!(s), 1:16)
))
then I get a repeat of 12, and no instance of 8. But skipping the 2^n - 1 seems to produce correct results for all cases I tried.
The advice of "skip the largest power of 2 below your desired one" seems quite bad, actually. Suppose I only want to produce 15 values of 4-bits -- I would hope to get no repeats, and one missing; maybe something less perfect than that. But it's actually quite terrible!
julia> s = SobolSeq(4);skip(s,15)
julia> sort(map(bs ->
sum(map(x->x[2] >= 0.5 ? x[1] : 0, zip(2 .^ [0 1 2 3], bs))),
map(x->next!(s), 1:15)
))
15-element Array{Int64,1}:
4
5
5
6
6
7
7
8
8
9
9
10
10
11
11
That is to say, every vector produced has (bit 2 XOR bit 3) true! That should only be true in half of them! This is the result of only skipping 8. If you skip 15, as I suggest, then you get better results. (And, again: this seems true at various higher numbers as well.)
from julia-users:
https://groups.google.com/forum/#!topic/julia-users/Jpj_Zb7EmcQ
Fixed in 28c1860. Should we add a test?
I haven't looked into the issue itself, but could have a look at the end of the week.
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
There seems to be a bit of a performance dip (10x) in 0.5-rc3 compared to 0.4.6, especially for scaled Sobol Sequences
e.g.
The benchmark on 0.5-rc3
S=SobolSeq(1,4,6)
@benchmark next(S)
BenchmarkTools.Trial:
samples: 10000
evals/sample: 9
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 96.00 bytes
allocs estimate: 1
minimum time: 2.81 μs (0.00% GC)
median time: 2.91 μs (0.00% GC)
mean time: 2.90 μs (0.00% GC)
maximum time: 6.16 μs (0.00% GC)
While on 0.4.6
S=SobolSeq(1,4,6)
@benchmark next(S)
BenchmarkTools.Trial:
samples: 10000
evals/sample: 251
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 80.00 bytes
allocs estimate: 1
minimum time: 306.00 ns (0.00% GC)
median time: 321.00 ns (0.00% GC)
mean time: 329.61 ns (1.27% GC)
maximum time: 6.40 μs (93.92% GC)
Any ideas?
First of all, thank youfor writing Sobol.jl. I have been learning about Quasi Monte Carlo (QMC) recently and I'm so glad that there is a Sobol module in Julia. I want to suggest a fairly significant API change, but I hope to convince you that it's a very good idea.
1) Include the initial
The first value in the Sobol sequence is zero. This recent paper by Art Owen shows that the common practice of dropping the
using PyPlot
for x in 0:(1/4):1
for y in 0:(1/4):1
plot([x,x],[0,1],"C1--")
plot([0,1],[y,y],"C1--")
end
end
dim = 2
seq = SobolSeq(dim)
# Include the (0,0) point, as per Owen (2021).
plot([0],[0],"ko")
for j in 1:(N-1)
local x,y = next!(seq)
plot([x],[y],"C0o")
end
# The 'extra' point.
x,y = next!(seq)
plot([x],[y],"ro")
Notice how Sobol.jl leaves the bottom-left square empty and instead puts two points in another square. So that ruins the digital net, and the convergence. You can fix this by skipping the next
2) Add support for scrambled Sobol points
Back in the 90's Owen proposed scrambling digital nets to create a hybrid method that brings the best features of MC and QMC. In Randomized QMC (RQMC) you scramble the points in a way that preserves the digital net but creates a random instance and removes the corner points like
Since Owen's proposal, several authors have proposed various ways to scramble the points and have shown empirically that RQMC has better convergence than QMC. But more importantly, RQMC recovers a feature from MC that is lost in QMC: The ability to measure uncertainty. In MC, if you draw
A scrambled Sobol is already implemented in scipy.stats.qmc.Sobol and in Owen's R package (rsobol). Unfortunately I was not able to figure out either method works, so I could not translate them to Julia. It doesn't look complicated at all; I just don't understand SciPy, R, or Sobol direction vectors well enough. They don't seem to be doing exactly the same thing, but they both seem to be doing something short and clever with the direction vectors.
I should mention that the R code from Owen is BSD-licensed, so if you can read R, you can just take his method and convert it to Julia.
Notice that the SciPy and R implementations both default to scrambled points and you have to explicitly ask for unscrambled points.
3) Don't produce Sobol points one at a time, draw
At this point I hope I have convinced you that drawing anything other than a power of 2 worth of Sobol points ruins the digital net, and misses what makes Sobol powerful. In Owen's article he says
Using 1000 points of a Sobol’ sequence may well be less accurate than using 512.
The SciPy API strongly encourages you to draw next!()
method and instead draw a power of two.
I realize that this was a long request and that I'm suggesting breaking the API, but I hope I have made a convincing case that this would turn Sobol.jl into a very powerful tool for anyone computing high-dimensional integrals with RQMC methods. Not coincidentally, this is precisely what I am hoping to do with it.
Let me know what you think.
References:
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.