juliastats / klara.jl
MCMC inference in Julia
License: Other
I really like the MCMC package, in particular the neat interface and the autodiff possibility.
Unfortunately, I can't parse model expressions. For example:
modelxpr = quote
  v ~ Normal(0, 1)
end
mymodel3 = model(modelxpr, v=ones(3))
throws an error:
ERROR: [parseModel] unmanaged expr type macrocall
in error at error.jl:21
in explore at /home/ich/.julia/MCMC/src/autodiff/parsing.jl:86
in explore at /home/ich/.julia/MCMC/src/autodiff/parsing.jl:85
in explore at /home/ich/.julia/MCMC/src/autodiff/parsing.jl:100
in explore at /home/ich/.julia/MCMC/src/autodiff/parsing.jl:85
in parseModel! at /home/ich/.julia/MCMC/src/autodiff/parsing.jl:119
in generateModelFunction at /home/ich/.julia/MCMC/src/autodiff/parsing.jl:429
in MCMCLikelihoodModel at /home/ich/.julia/MCMC/src/modellers/likmodel.jl:93
in model at /home/ich/.julia/MCMC/src/modellers/mcmcmodels.jl:28
I'm using MCMC 0.0.3, and the versioninfo() output is:
Julia Version 0.3.0-prerelease
Platform Info:
System: Linux (x86_64-linux-gnu)
CPU: Intel(R) Core(TM) i7-3687U CPU @ 2.10GHz
WORD_SIZE: 64
BLAS: libblas.so.3
LAPACK: liblapack.so.3
LIBM: libopenlibm
I hope that helps.
The current codebase seems to be a little liberal with memory. If we're hoping to achieve good performance, we probably need to avoid lines like:
oldbeta = copy(beta)
beta += randn(model.size) * scale
I'm still working through everything, but this code seems to involve allocating a new vector for every sample, which we shouldn't really be doing. At least in principle, we can perform all memory allocation at the start of an MCMC run.
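As a minimal sketch of what pre-allocating could look like for a random-walk proposal (illustrative only — `rwm_steps!` is a made-up helper, not the package's code, written in the Julia 0.3 style used elsewhere in this thread):

```julia
# Illustrative sketch, not the package's actual code: allocate all buffers
# once before the loop and mutate them in place on every iteration.
function rwm_steps!(beta::Vector{Float64}, scale::Float64, nsteps::Int)
    oldbeta = similar(beta)      # scratch copy, allocated once
    noise   = similar(beta)      # proposal noise buffer, allocated once
    for i in 1:nsteps
        copy!(oldbeta, beta)     # save current state without a fresh copy()
        randn!(noise)            # fill the existing buffer with N(0,1) draws
        for j in 1:length(beta)
            beta[j] += noise[j] * scale
        end
        # ... evaluate the log-target here and, on rejection, restore the
        # state with copy!(beta, oldbeta) ...
    end
    return beta
end
```

With this shape, the only allocations per run are the two `similar` calls before the loop.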
How general would you want the specification of a Bayes GLM model to be? If the prior on the coefficients for the linear predictor is multivariate normal then the calculation of the estimates is straightforward. Just replace the iteratively reweighted least squares (IRLS) algorithm by a penalized iteratively reweighted least squares (PIRLS) algorithm. We do this when fitting GLMMs. If a more general specification of the logprior density is desired then other methods will be needed.
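For concreteness, with a multivariate normal prior $\beta \sim \mathcal{N}(\mu_0, \Sigma_0)$ on the coefficients, each PIRLS iteration solves the usual IRLS working least-squares problem with a ridge-like penalty (a standard textbook form, stated here as a sketch rather than as this package's implementation):

```latex
\hat{\beta}^{(t+1)} = \left( X^\top W^{(t)} X + \Sigma_0^{-1} \right)^{-1}
                      \left( X^\top W^{(t)} z^{(t)} + \Sigma_0^{-1} \mu_0 \right),
```

where $W^{(t)}$ is the IRLS weight matrix and $z^{(t)}$ the working response at iteration $t$; setting $\Sigma_0^{-1} = 0$ recovers plain IRLS.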
I'm trying to get the simple examples to run.
julia> using MCMC
Loading RMW(scale) sampler
Loading MALA(driftStep, monitorPeriod) sampler
Loading HMC(innerSteps, stepSize) sampler
julia> mymodel = Model(v-> -dot(v,v), 3, ones(3))
Model(# function,[:beta=>PDims(1,(3,))],3,[1.0,1.0,1.0])
julia> res = mymodel * RWM(0.1) * (100:1000)
ERROR: no method MCMCChain(Dict{Any,Any},Array{Float64,1},MCMCTask,Float64)
in run at /home/simon/.julia/MCMC/src/runners/run.jl:17
in * at operators.jl:47
My build of Julia is as follows:
Version 0.2.0-prerelease+2820
Commit 44c9887* 2013-07-26 06:32:24 UTC
x86_64-linux-gnu
This is a placeholder for now, will get back to it.
The following loading messages are, in my humble opinion, not very beneficial, especially during batch execution.
Loading MCMCLikModel model
Loading ARS(logCandidate, logCandidateScalingFactor, scale, tuner) sampler
Loading IMH(logCandidate, randCandidate!) sampler
Loading RWM(scale, tuner) sampler
Loading RAM(scale, rate) sampler
Loading MALA(driftStep, tuner) sampler
...
So I propose that these loading messages should be printed to the screen only in an interactive session. The isinteractive() function is helpful for judging whether the execution is interactive or not.
Moreover, this macro will scrap the boilerplate code:
macro loading(component)
  if isinteractive()
    quote
      println("Loading $(string($component)) $(super($component)):")
      for method in methods($component)
        print(" ")
        println(method)
      end
    end
  end
end
...
@loading HMC
This prints much more informative messages (though the file locations look pesky):
Loading HMC MCMCSampler:
HMC(nLeaps::Int64,leapStep::Float64) at /Users/kenta/.julia/v0.3/MCMC/src/samplers/HMC.jl:78
HMC(nLeaps::Int64,leapStep::Real,tuner::Union(Nothing,MCMCTuner)) at /Users/kenta/.julia/v0.3/MCMC/src/samplers/HMC.jl:71
HMC() at /Users/kenta/.julia/v0.3/MCMC/src/samplers/HMC.jl:80
HMC(tuner::Union(Nothing,MCMCTuner)) at /Users/kenta/.julia/v0.3/MCMC/src/samplers/HMC.jl:76
HMC(nLeaps::Int64) at /Users/kenta/.julia/v0.3/MCMC/src/samplers/HMC.jl:77
HMC(nLeaps::Int64,tuner::Union(Nothing,MCMCTuner)) at /Users/kenta/.julia/v0.3/MCMC/src/samplers/HMC.jl:77
HMC(leapStep::Float64) at /Users/kenta/.julia/v0.3/MCMC/src/samplers/HMC.jl:79
HMC(leapStep::Float64,tuner::Union(Nothing,MCMCTuner)) at /Users/kenta/.julia/v0.3/MCMC/src/samplers/HMC.jl:79
Automatic differentiation is currently used only when the model is specified via expression parsing. We should enable optional use of automatic differentiation when the model is specified explicitly via a function or distribution too.
Would there be interest in some code for a couple of convergence diagnostics?
The routines that are meant to compute the asymptotic variance of Monte Carlo chains are not functional anymore. This is because of recent relevant advances in DataFrames and StatsBase. The issue is that we store the MCMC samples in a dataframe in the MCMC package. We then look at these samples columnwise, and therefore work with DataArrays. The problem is that at the moment none of the autocovariance and autocorrelation functions in StatsBase operate on DataArrays. How shall we bridge this gap between the JuliaStats packages to ensure portability and integration of code?
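As a possible stopgap until the packages converge (hypothetical glue code, assuming DataArrays' `dropna` and StatsBase's `autocor` with their circa-2014 signatures), one could strip the DataArray wrapper before computing autocorrelations:

```julia
using DataArrays, StatsBase

# Hypothetical glue: convert a DataArray column of MCMC samples to a plain
# Vector{Float64} so that StatsBase's autocorrelation functions accept it.
da  = @data([0.1, 0.3, NA, 0.2, 0.4])        # samples with a missing entry
v   = convert(Vector{Float64}, dropna(da))   # drop NAs, unwrap to a plain vector
acf = autocor(v, 0:2)                        # autocorrelation at lags 0, 1, 2
```

This of course discards the NA bookkeeping, so a proper fix inside StatsBase or DataArrays would be preferable.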
@fredo-dedup, @StefanKarpinski, @johnmyleswhite, @lindahua, @dmbates
Andrew Gelman contacted me about the possibility of a Julia interface to Stan, similar to the rstan package for R. I'll just quote our exchange here. Let me know if it would be more appropriate to move this to the milestones issue under StatsBase or to one of the discussion groups.
On Mon, Apr 28, 2014 at 6:01 AM, Andrew Gelman [email protected] wrote:
Hi Doug. I hope all is well with you. Bob Carpenter was telling me that you’ve been active in the Stan list recently. I’ve also heard you’ve been working with Julia. We have some interest in putting together a Julia/Stan link, at the very least having a way to run Stan from Julia. One reason for this is that soon Stan should have the capability of quickly fitting a lot of models that we now do in R (lm, glm, lmer, etc) using point estimation of various sorts as well as full HMC. Do you have any suggestions/thoughts about how a Julia link could be built? I don’t think any of us in stan-core use Julia, but we’ve been having a lot of contributions from outsiders (for example, Allen Riddell who wrote PyStan and now is part of stan-core). Any advice would be appreciated. Thanks.
-- My reply
There is definitely interest in the Julia community in Stan. To get an idea of the current statistical capabilities in Julia take a look at the JuliaStats group on github.com. A more complete list of packages is at
http://julia.readthedocs.org/en/latest/packages/packagelist/
An interface between Julia and Stan is, in principle, no more complicated than rstan and could, in fact, be somewhat easier than rstan. A package implementing such an interface would typically be called something like JuliaStan (camel-case names for packages are preferred) and stored in a repository named JuliaStan.jl. There are sophisticated, although somewhat complicated, ways of building binary dependencies on various operating systems. I am currently struggling with implementing such dependencies for the Metis graph-partitioning package, now mostly complete.
Julia has very flexible facilities for calling C functions, see the section on "calling C and Fortran code" in http://docs.julialang.org/en/latest/ (note that the latest Julia release, 0.2, is badly out-of-date now relative to the 0.3-prerelease version and developers mostly work in 0.3). At present it is not easy to call C++ functions or methods directly because of name mangling but anything declared within
extern "C" {
...
}
is okay.
The package structure is reasonably easy to pick up for anyone familiar with github.
I can imagine several ways of proceeding.
It really depends on whether you want to emphasize the Julia aspects or the Stan aspects of the project.
If you wish, I could block out a JuliaStan.jl repository based on rstan under my github login and then we could determine where to transfer it.
May I forward this message or a summary of this message to the JuliaStats members to indicate that the Stan community is interested in Julia?
There is an MCMC.jl project that seeks to implement HMC and NUTS samplers in Julia and there is active development of forward and backward automatic differentiation. I don't know the developers working on that project well and cannot vouch for their capabilities.
How can I retrieve the function values from an MCMC object, say, if I want to find out which sample point had the highest (posterior) density?
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.3) and the nightly build of the unstable version (0.4). The results of this script are used to generate a package listing enhanced with testing results.
The status changed: Tests pass. → Package doesn't load.

"Tests pass." means that PackageEvaluator found the tests for your package, executed them, and they all passed.
"Package doesn't load." means that PackageEvaluator did not find tests for your package. Additionally, trying to load your package with `using` failed.
This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt-out of these status-change messages, reply to this message saying you'd like to and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl please file an issue at the repository. For example, your package may be untestable on the test machine due to a dependency - an exception can be added.
Test log:
>>> 'Pkg.add("MCMC")' log
INFO: Cloning cache of MCMC from git://github.com/JuliaStats/MCMC.jl.git
INFO: Cloning cache of ReverseDiffSource from git://github.com/JuliaDiff/ReverseDiffSource.jl.git
INFO: Installing ArrayViews v0.4.8
INFO: Installing Distributions v0.5.5
INFO: Installing MCMC v0.0.9
INFO: Installing PDMats v0.2.4
INFO: Installing ReverseDiffSource v0.0.5
INFO: Installing StatsBase v0.6.9
INFO: Package database updated
INFO: METADATA is out-of-date -- you may not have the latest version of MCMC
INFO: Use `Pkg.update()` to get the latest versions of your packages
>>> 'using MCMC' log
Julia Version 0.3.1
Commit c03f413* (2014-09-21 21:30 UTC)
Platform Info:
System: Linux (x86_64-unknown-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
LAPACK: libopenblas
LIBM: libopenlibm
LLVM: libLLVM-3.3
ERROR: invalid method definition: not a generic function
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:51
in include at ./boot.jl:245
in include_from_node1 at loading.jl:128
in process_options at ./client.jl:285
in _start at ./client.jl:354
in _start_3B_1712 at /home/idunning/julia03/usr/bin/../lib/julia/sys.so
while loading /home/idunning/pkgtest/.julia/v0.3/MCMC/src/jobs/jobs.jl, in expression starting on line 1
while loading /home/idunning/pkgtest/.julia/v0.3/MCMC/src/MCMC.jl, in expression starting on line 94
while loading /home/idunning/pkgtest/.julia/v0.3/MCMC/testusing.jl, in expression starting on line 2
>>> test log
no tests to run
>>> end of log
@scidom, here are a few ideas, tell me what you think before I proceed...
- A model() function that would act as a single entry point for creating models; this would be more user friendly and less intimidating. The large set of model types that we have created (and will certainly continue to grow) would remain available for more advanced users.
- A scale field added to the model type, either a scalar or a vector (or left empty), to give a hint to the samplers as to how the parameters should be moved (I am thinking of RWM, but it would surely be useful elsewhere). This kind of information more logically belongs to the model type, and we could still force another scaling at sampler creation if needed.

I'm a little worried by the duplication of so many distributions in this codebase. Julia should have exactly one implementation of all distributions. If something's wrong with how Distributions.jl is implemented, we should find a way to fix that.
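The single entry point could be as thin as a set of dispatch methods. A hypothetical sketch follows — the `MCMCLikelihoodModel` name comes from the stack traces in this thread, but its constructor here is a stub just to make the sketch self-contained:

```julia
# Stub stand-in for the package's model types, only so the sketch runs;
# the real constructors live in src/modellers/ (Julia 0.3 syntax).
abstract MCMCModel
type MCMCLikelihoodModel <: MCMCModel
    spec
end

# Hypothetical single entry point: dispatch on how the user specifies the model.
model(f::Function) = MCMCLikelihoodModel(f)
model(e::Expr)     = MCMCLikelihoodModel(e)
```

Advanced users could keep calling the concrete constructors directly, while `model()` stays the friendly front door.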
Apart from the samples field of the MCMCChain type, another field is needed, say gradient, which will hold the computed gradient at each point of the chain. The gradient is needed for post-processing the chain (for calculating control variates, for instance). @fredo-dedup, would you agree that this extra field is the most plausible way of returning the gradient values to the user?
We may need to consider how to organize the fields of MCMCChain more generally - this relates to the seqMC work in progress too.
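A rough sketch of the proposed layout — the package actually stores samples in a DataFrame and has further fields, so plain matrices are used here only to keep the sketch self-contained:

```julia
# Sketch only: the real MCMCChain has more fields and uses a DataFrame.
type MCMCChain
    samples::Matrix{Float64}    # one row per Monte Carlo draw
    gradient::Matrix{Float64}   # proposed: log-target gradient at each draw
end

# Post-processing such as control variates could then read chain.gradient
# columnwise alongside chain.samples.
chain = MCMCChain(randn(100, 2), randn(100, 2))
```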
DataArray isn't defined in SerialMC.jl; a fix is to explicitly import it:
import DataArrays.DataArray
and/or to add DataArrays to REQUIRE.
Hi @scidom
I have a procedural question. Starting to work using the MCMC framework, occasionally I will have questions. Would you like to see those raised as issues here or in some other way? I do realize we are all busy.
As a concrete example, the "accept" key in diagnostics is ignored in the MCMC stats functions like mean(). This is ok for MH, but not for ARS (Acceptance Rejection Sampling).
The accept key right now does not indicate valid posterior samples. There are many ways around this, but just wanted to get your input from the procedural side.
@IainNZ I tried to update all the badges on README. travis and readthedocs are now fine, while the pkgeval badge doesn't seem to be ok yet (I have changed MCMC->Lora in the pkgeval URLs). No rush with this, it is not urgent, just brought it to your attention.
This issue has been opened solely for keeping a record of how the MCJob and MCRunner types are interweaved and designed (in case these thoughts need to be reproduced).
MCJob subtypes hold a field of type MCRunner. The main concern has been whether it would be more sensible to define the field to be of type MCRunner or Vector{MCRunner}. The answer is that the field should be of type MCRunner.
To understand why, notice that no matter whether a serial, sequential or parallel tempering Monte Carlo runner is defined, it regulates (and in that sense characterizes) the whole MCJob. In other words, the runner dictates how to run the job. This makes sense in plain English at a first, verbose level, i.e. the runner runs the job. Moreover, it makes sense logically, meaning that an algorithm indicates how to run a specific job, and the name "runner" is attached to this algorithm.
The only vague point that needs to be clarified is that a runner may hold fields such as the burnin or total Monte Carlo length or the thinning interval (see the coded runners). Now, all existing runners assume that these fields are unique, therefore scalar Float64 entities. For example, a sequential Monte Carlo runner runs several chains in parallel, but even then the runner presumes that the burnin, thinning and total length are shared between the parallel chains.
If a new sequential or tempering method is designed one day, it may be the case that a different burnin, thinning or total length will be designated to each of the involved chains. Then it would make sense to have a vector holding the burnin lengths of the chains, and similarly a vector for the thinning and a vector for the total length of the chains. Nevertheless, the method is unique, that is, there is a unique algorithm (runner) that consumes the job. For this reason, the right design is to encapsulate the burnin, thinning and total lengths (vectors) in the runner instead of creating a vector of runners, each one holding the burnin, thinning and total length (scalars) of a single chain.
So, this is right:
type NewMCRunner <: MCRunner
  burnin::Vector{Float64}
  thinning::Vector{Float64}
  nsteps::Vector{Float64}
end

type NewMCJob <: MCJob
  runner::NewMCRunner
  ...
end

run(r::NewMCRunner, ...)
On the other hand, this is conceptually wrong:
type NewMCRunner <: MCRunner
  burnin::Float64
  thinning::Float64
  nsteps::Float64
end

type NewMCJob <: MCJob
  runner::Vector{NewMCRunner}
  ...
end

run(r::Vector{NewMCRunner}, ...)
E.g., after:
@time mcchain = run(m * [HMC(1, 0.025), NUTS()] * SerialMC(1000:10000))
describe(mcchain) would produce something like:
|                 | HMC(1, 0.025)         | NUTS()                 |
|-----------------|-----------------------|------------------------|
| **Parameter 1** |                       |                        |
| Min             | -0.9911931944813289   | -0.9784462069915573    |
| Mean            | -0.7659593771072393   | -0.7652180273205432    |
| Max             | -0.5299229353887581   | -0.5228078563712695    |
| MC Error        | 0.0007079890147378    | 0.0006563835134460041  |
| ESS             | 6561.211265417861     | 7905.80020498207       |
| AC Time         | 1.3718503544370717    | 1.1385311754182401     |
| **Parameter 2** |                       |                        |
| Min             | -1.3720937309757602   | -1.3720937309757602    |
| ...             | ...                   | ...                    |
If this is reasonable, I can take a stab at a first version. Clearly in this example I'm not showing aspects like thinning.
@johnmyleswhite do you manage the permissions for JuliaStats? I would like to kindly ask for re-granting me moderator access to MCMC.jl; I can't access the package's settings so as to enable the readthedocs webhook. Cheers.
In the most recent update we have declared abstract MCJob in input.jl, and then in jobs.jl we try to define various methods named MCJob. This doesn't work.
Another thing to notice about the current state of jobs.jl is that the MCJob function is not type stable: depending on the j parameter, it returns a PlainMCJob or a TaskMCJob.
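One way to restore type stability would be to move the branch into dispatch. A sketch under assumptions follows — the lowercase `mcjob` name is made up to avoid clashing with the abstract type, and the real constructors take more arguments:

```julia
# Julia 0.3 syntax, matching the rest of the codebase.
abstract MCJob

type PlainMCJob <: MCJob end
type TaskMCJob  <: MCJob end

# Instead of one MCJob(...) function branching on a runtime flag j, let the
# caller pick the concrete type, so each method's return type is fixed and
# inferable by the compiler:
mcjob(::Type{PlainMCJob}) = PlainMCJob()
mcjob(::Type{TaskMCJob})  = TaskMCJob()
```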
This issue is being filed by a script, but if you reply, I will see it.
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.2) and the nightly build of the unstable version (0.3).
The results of this script are used to generate a package listing enhanced with testing results.
The status of this package, MCMC, on...
'No tests, but package loads.' can be due to there being no tests (you should write some if you can!) but can also be due to PackageEvaluator not being able to find your tests. Consider adding a test/runtests.jl file.
'Package doesn't load.' is the worst-case scenario. Sometimes this arises because your package doesn't have BinDeps support, or needs something that can't be installed with BinDeps. If this is the case for your package, please file an issue and an exception can be made so your package will not be tested.
This automatically filed issue is a one-off message. Starting soon, issues will only be filed when the testing status of your package changes in a negative direction (gets worse). If you'd like to opt-out of these status-change messages, reply to this message.
@fredo-dedup I was thinking that the next massive improvement for our package would be to generalize some aspects of model specification, namely:
I haven't touched model specification at all until today. Does the relevant code reside in src/parsers/modelparser.jl? Have you made any progress in any other branch or repo, to make sure we avoid duplication of work? I was thinking of using the Julia Graphs package for model parsing (still in an imperative fashion, I don't plan to go for declarative programming), what do you think, would that be a good direction?
Installing the MCMC package and then loading it gives a warning that Distributions is not found.
WARNING: Distributions not found
at C:\Users\K.julia\MCMC\src\autodiff/distributions.jl:9
at C:\Users\K.julia\MCMC\src\MCMC.jl:33
at In[2]:1
in require at loading.jl:44
in include at boot.jl:238
in include_from_node1 at loading.jl:118
in include at boot.jl:238
in include_from_node1 at loading.jl:118
in reload_path at loading.jl:144
I was trying to test a simple model using the MCMC package, but noticed that generated code differs between invocations of "model" and usually this code returns the wrong result. After some looking around and testing, I have figured out that the ReverseDiffSource package causes the problem, it seems to return the wrong value. I'm reporting the issue here because ReverseDiffSource has drastically changed since 0.0.4.
My example code is published as a gist: https://gist.github.com/tomhaber/691bd1a8b8672cdd894f
The lowered code clearly shows the problem, but is easily overlooked :)
Tom
@fredo-dedup, as I was writing a test file for the empirical tuner, I wondered if we need the tuner field in the NUTS sampler. I was thinking that if we choose an adaptive version of NUTS it will probably be standalone (in a similar way that HMCDA is separate from HMC). If you agree with this line of thinking and approve it, I will then remove tuner from NUTS.
@fredo-dedup, do you know what causes the failure? See also papamarkou/ZVSimulator.jl#1
In runners.jl there is a lot of manual type checking (isa(t.runner, SerialMC)).
Would it be better to parameterize the MCMCTask type on the type of the runner, so that it can be dispatched via Julia's type system rather than by manually checking types?
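A sketch of the parameterization — the type names follow this thread, but the field layout and the `process` function are assumptions made only to illustrate the dispatch:

```julia
# Stand-ins for the runner hierarchy (Julia 0.3 syntax).
abstract MCRunner
type SerialMC <: MCRunner end
type SeqMC    <: MCRunner end

# Parameterize the task on its runner type...
type MCMCTask{R<:MCRunner}
    runner::R
end

# ...so multiple dispatch replaces the manual isa(t.runner, SerialMC) checks:
process(t::MCMCTask{SerialMC})       = "serial-specific path"
process{R<:MCRunner}(t::MCMCTask{R}) = "generic path"
```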
This is a set of notes towards designing model specification using factor graphs as part of the MCMC package.
To start with, here is a first thought on how to define a factor graph as a type. There are three main components that fully specify a factor graph:
- A set V, holding the local variables. The terminology comes from [Frey, 1997]. To comply with the Graphs.jl package, these would also be termed the variable vertices.
- A set S (also called the subset vertices), holding the associated local functions. In an MCMC context these functions would be the log-targets of the associated variable vertex blocks, specified as functions or as distributions. Apparently V and S are disjoint.
- A set of edges E. Since a factor graph is a bipartite graph, there should only be edges connecting a vertex from S to a vertex from V, which would tell which variables from V the corresponding function from S updates. It should be relatively easy to code an is_bipartite() algorithm operating on existing graphs, since the Graphs.jl package supports depth-first as well as breadth-first traversal.

The Mamba.jl package does something similar by using stochastic, logical and input nodes. Using a factor graph, all these nodes would belong to V. The functions in S, which could be priors or likelihoods, would update the corresponding blocks of nodes.
This of course still needs to be pushed forward some more, but it is already rather clear that an imperative exploitation of factor graphs is not all that hard to achieve.
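The three components above could map onto a type along these lines — a rough sketch in which the field names V, S, E follow the notes, and everything else is an assumption:

```julia
# Rough sketch of a factor graph type (Julia 0.3 syntax).
type FactorGraph
    V::Vector{Symbol}          # variable vertices (local variables)
    S::Vector{Function}        # subset vertices (local log-target functions)
    E::Vector{(Int, Int)}      # edges as (index into S, index into V) pairs
end

# Bipartiteness holds by construction here, since an edge can only pair an
# element of S with an element of V.
fg = FactorGraph([:v], [v -> -dot(v, v)], [(1, 1)])
```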
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.3) and the nightly build of the unstable version (0.4). The results of this script are used to generate a package listing enhanced with testing results.
The status changed: Tests pass. → Tests fail, but package loads.

"Tests pass." means that PackageEvaluator found the tests for your package, executed them, and they all passed.
"Tests fail, but package loads." means that PackageEvaluator found the tests for your package, executed them, and they didn't pass. However, trying to load your package with `using` worked.
Special message from @IainNZ: This change may be due to breaking changes to Dict
in JuliaLang/julia#8521, or the removal of deprecated syntax in JuliaLang/julia#8607.
This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt-out of these status-change messages, reply to this message saying you'd like to and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl please file an issue at the repository. For example, your package may be untestable on the test machine due to a dependency - an exception can be added.
Test log:
>>> 'Pkg.add("MCMC")' log
INFO: Installing ArrayViews v0.4.6
INFO: Installing Distributions v0.5.4
INFO: Installing MCMC v0.0.7
INFO: Installing PDMats v0.2.4
INFO: Installing ReverseDiffSource v0.0.4
INFO: Installing StatsBase v0.6.6
INFO: Package database updated
INFO: METADATA is out-of-date -- you may not have the latest version of MCMC
INFO: Use `Pkg.update()` to get the latest versions of your packages
>>> 'using MCMC' log
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/scalarstats.jl:98.
Use "Dict{T,Int}()" instead.
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/scalarstats.jl:122.
Use "Dict{T,Int}()" instead.
WARNING: deprecated syntax "(T=>Float64)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:162.
Use "Dict{T,Float64}()" instead.
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:192.
Use "Dict{T,Int}()" instead.
WARNING: deprecated syntax "(T=>W)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:193.
Use "Dict{T,W}()" instead.
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/misc.jl:66.
Use "Dict{T,Int}()" instead.
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/misc.jl:77.
... truncated ...
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/HMC.jl:140.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/HMC.jl:146.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/HMCDA.jl:111.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/HMCDA.jl:116.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/NUTS.jl:176.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/SMMALA.jl:103.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/SMMALA.jl:108.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/PMALA.jl:120.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/PMALA.jl:126.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/RMHMC.jl:164.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/RMHMC.jl:168.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/runners/SerialMC.jl:41.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/runners/SeqMC.jl:108.
Use "Dict{Any,Any}(a=>b, ...)" instead.
Julia Version 0.4.0-dev+998
Commit e24fac0 (2014-10-07 22:02 UTC)
Platform Info:
System: Linux (x86_64-unknown-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
LAPACK: libopenblas
LIBM: libopenlibm
LLVM: libLLVM-3.3
>>> test log
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/scalarstats.jl:98.
Use "Dict{T,Int}()" instead.
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/scalarstats.jl:122.
Use "Dict{T,Int}()" instead.
WARNING: deprecated syntax "(T=>Float64)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:162.
Use "Dict{T,Float64}()" instead.
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:192.
Use "Dict{T,Int}()" instead.
WARNING: deprecated syntax "(T=>W)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:193.
Use "Dict{T,W}()" instead.
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/misc.jl:66.
Use "Dict{T,Int}()" instead.
WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/misc.jl:77.
... truncated ...
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/PMALA.jl:126.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/RMHMC.jl:164.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/samplers/RMHMC.jl:168.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/runners/SerialMC.jl:41.
Use "Dict{Any,Any}(a=>b, ...)" instead.
WARNING: deprecated syntax "{a=>b, ...}" at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/runners/SeqMC.jl:108.
Use "Dict{Any,Any}(a=>b, ...)" instead.
ERROR: `Dict{K,V}` has no method matching Dict{K,V}(::Array{Symbol,1}, ::Array{(Int64,(Int64,)),1})
in MCMCLikelihoodModel at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/modellers/MCMCLikModel.jl:123
in model at /home/idunning/pkgtest/.julia/v0.4/MCMC/src/modellers/models.jl:28
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in anonymous at no file:14
in include at ./boot.jl:245
in include_from_node1 at loading.jl:128
in process_options at ./client.jl:293
in _start at ./client.jl:362
in _start_3B_3789 at /home/idunning/julia04/usr/bin/../lib/julia/sys.so
while loading /home/idunning/pkgtest/.julia/v0.4/MCMC/test/test_hmc.jl, in expression starting on line 23
while loading /home/idunning/pkgtest/.julia/v0.4/MCMC/test/runtests.jl, in expression starting on line 12
Running tests:
* test_hmc.jl *
Testing basic HMC constructors...
Testing that HMC sampler can run...
INFO: Testing MCMC
================================[ ERROR: MCMC ]=================================
failed process: Process(`/home/idunning/julia04/usr/bin/julia /home/idunning/pkgtest/.julia/v0.4/MCMC/test/runtests.jl`, ProcessExited(1)) [1]
================================================================================
INFO: No packages to install, update or remove
ERROR: MCMC had test errors
in error at error.jl:21
in test at pkg/entry.jl:719
in anonymous at pkg/dir.jl:28
in cd at ./file.jl:20
in cd at pkg/dir.jl:28
in test at pkg.jl:68
in process_options at ./client.jl:221
in _start at ./client.jl:362
in _start_3B_3789 at /home/idunning/julia04/usr/bin/../lib/julia/sys.so
>>> end of log
Note to self: define jobs as parametric types, parameterized by the choice of runner. This is because some job fields (such as samplers and tasks) may need to be vectors depending on the runner type.
I am working on writing a generic Gibbs sampler that shares design principles with the code here. (the WIP is on the gibbs branch of my fork)
I was wondering why we don't always return a fresh task after running a chain? Specifically, I noticed that a task is stopped before being returned in parallel (run_serialmc_exit), but not if you run in serial. Unless I'm mistaken, this is just a way to tell Julia that it is ok to serialize the task so it can be returned from another process.
When I wrote a simple random walk Metropolis routine, I found that it was easier to always spin a new task at the end of running a chain (in serial or in parallel), because that way the task is always ready to resume (again, in serial or in parallel). I found that I could get the task to resume at the same place without losing any samples simply by consuming the task one more time to find a new model.init value, spinning a new task with that updated model, then making sure that the first produce in the sampler was model.init.
Is there a reason we don't do that here?
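For what it's worth, the resume-by-respawning idea is easy to sketch. The toy below uses a Channel-backed task (the modern replacement for produce/consume, so it runs on current Julia); all names are illustrative and none of them are the package's API. The point is only that seeding the new task with the last consumed state lets the chain pick up where it stopped:

```julia
# Toy random-walk Metropolis chain produced by a task; the first value put on
# the channel is the initial state, mirroring "the first produce is model.init".
function rwm_task(init::Float64, n::Int)
    Channel{Float64}(0) do ch
        x = init
        put!(ch, x)                    # first value produced is the initial state
        for _ in 1:n
            xp = x + 0.1 * randn()
            # toy target: standard normal log-density
            if log(rand()) < (-0.5 * xp^2) - (-0.5 * x^2)
                x = xp
            end
            put!(ch, x)
        end
    end
end

chain1 = collect(rwm_task(0.0, 100))
newinit = chain1[end]                  # consume the last state as the new init
chain2 = collect(rwm_task(newinit, 100))
```

In the package, model.init would play the role of newinit here.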
Hi Julians,
As you may have seen, the package has gone through a (semi-complete) major upgrade. I am currently documenting the package so that users can catch up with the changes. This will take me a few more days, as I am writing up bits of doc sections daily whenever I can squeeze some time in my timetable.
In the meantime, we have been thinking about how we could broaden the range of models accommodated by the package. For this reason, I would like to invite those of you interested in MCMC to catch up on Google Hangout. @spencerlyon2 and I are going to have a chat tomorrow at 11:00am New York time (3:00pm London time). Please feel free to drop in if interested in participating. My hope is that we can make these e-meetings more regular, meeting once per fortnight or month to discuss progress. I will set up a new repo to upload minutes, possibly under JuliaStats.
I have given some thought to how factor graphs could possibly be used for MCMC model specification; see issue #72. Nevertheless, tomorrow's chat will be focused mostly on Gibbs sampling. @spencerlyon2 will explain how he has coded Gibbs to fit the existing state of MCMC.jl. I think it would be good for us to understand how he set up the updating mechanism and get his code merged soon, so that the relevant functionality is available. We can return to the more general issue of model specification later.
If you have any other MCMC issues that you would like to discuss in future e-meetings, you can mention them tomorrow.
@fredo-dedup @spencerlyon2 @goedman @johnmyleswhite @lindahua @dmbates @StefanKarpinski (please forward this to people I may have missed out)
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.2) and the nightly build of the unstable version (0.3). The results of this script are used to generate a package listing enhanced with testing results.
Tests pass. → Package doesn't load.
"Tests pass." means that PackageEvaluator found the tests for your package, executed them, and they all passed.
"Package doesn't load." means that PackageEvaluator did not find tests for your package. Additionally, trying to load your package with using failed.
This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt-out of these status-change messages, reply to this message saying you'd like to and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl please file an issue at the repository. For example, your package may be untestable on the test machine due to a dependency - an exception can be added.
Test log:
INFO: Installing ArrayViews v0.4.6
INFO: Installing Distributions v0.5.2
INFO: Installing MCMC v0.0.6
INFO: Installing PDMats v0.2.1
INFO: Installing ReverseDiffSource v0.0.4
INFO: Installing StatsBase v0.5.3
INFO: Package database updated
ERROR: NumericExtensions not found
in require at loading.jl:47
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:54
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:54
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:51
in include at ./boot.jl:245
in include_from_node1 at loading.jl:128
in process_options at ./client.jl:285
in _start at ./client.jl:354
while loading /home/idunning/pkgtest/.julia/v0.3/PDMats/src/PDMats.jl, in expression starting on line 9
while loading /home/idunning/pkgtest/.julia/v0.3/Distributions/src/Distributions.jl, in expression starting on line 4
while loading /home/idunning/pkgtest/.julia/v0.3/MCMC/src/MCMC.jl, in expression starting on line 9
while loading /home/idunning/pkgtest/.julia/v0.3/MCMC/testusing.jl, in expression starting on line 1
INFO: Package database updated
In examples/probit_regression.jl, for example, it is better to use
vaso = readdlm(Pkg.dir("MCMC","examples","vaso.txt"), ' ')
than the current
vaso = readdlm("vaso.txt", ' ')
so that the script can be run from other directories.
I can submit a pull request if desired.
In the near future, once code and documentation are satisfactorily settled. See Gadfly as an example of how the badges would look.
I haven't quite formulated in my mind how we would do this, but it would make sense to define two types related to MCMCChain, one for vectors and one for dataframes, following the paradigm of DataFrame and DataArrays being separate entities. When one works on summary stats on MCMC output it becomes obvious that some restructuring is needed towards this direction. This is open for discussion as to how it will be settled.
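One hypothetical shape for that split, with all names invented here for illustration (a DataFrame-backed sibling type would then wrap a DataFrame in the same way that DataFrame and DataArrays are separate entities):

```julia
# Sketch only: an abstract chain type with a vector/matrix-backed subtype.
# A DataFrameMCMCChain <: AbstractMCMCChain would live alongside it.
abstract type AbstractMCMCChain end

struct VectorMCMCChain <: AbstractMCMCChain
    samples::Matrix{Float64}   # niter × npars
    names::Vector{Symbol}      # parameter names
end

# summary statistics could then dispatch on the abstract type
nsamples(c::AbstractMCMCChain) = size(c.samples, 1)

c = VectorMCMCChain(randn(100, 2), [:a, :b])
```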
@LaurenceA, your suggestion in #34 to somehow allow use of MCMCChain and samples with external models is very useful. I want to work in this direction, and my first impression is that your suggestion is feasible and attainable. I would like to ask you for some input from a user's perspective, in order to see how exactly to achieve the end goal.

Let's say that you have a matrix of Monte Carlo samples that you have somehow computed. This matrix will populate the MCMCChain.samples field. You may (or may not) have the relevant scores (the gradient of the log-likelihood, for instance) stored in a separate matrix, which could populate the MCMCChain.gradients field. Assuming this situation, I would like to ask you the following:

MCMC package. There is a reason of course for this, since tasks make it much easier to handle the simulations. This task needs in turn a triplet (model, sampler, runner) as its fields. The sampler and runner are easy to instantiate using the existing samplers and runners. What you want is to define your own model rather than using the likelihood model of the MCMC package. Do I understand correctly what you are trying to achieve?

MCMCModel, or else we would have to simply change the MCMCTask.model field so as to be a subtype of Model instead of MCMCModel. The main question is what kind of statistical model you want to define. This is a more general question, since one may want to input here a model from GLM or from any other Julia package. @fredo-dedup, @johnmyleswhite, @dmbates, @andreasnoackjensen, @lindahua, I am copying you in here so that you can follow up the discussion (and contribute if you have some ideas). If you can all give me an idea of possible models and situations that one may want to use with MCMC, then I can start extending the codebase in MCMC to better work with external models.

I am not sure what is the most Julia-compatible way of naming the various samplers, models and runners. I think that camel case is not desired among the Julians, is this correct? For example, we have the HMC type, for which it is probably fine to use capitals, but I guess the actual external constructors should be called hmc instead? Furthermore, is it better to call the sampler file HMC.jl or hmc.jl?
To calculate marginal likelihoods via power posteriors (Friel and Pettitt 2008; DOI: 10.1111/j.1467-9868.2007.00650.x) or stepping stone sampling (DOI: 10.1093/sysbio/syq085), one needs to run MCMC, sampling from
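To make the power-posterior computation concrete: Friel and Pettitt's identity expresses the log marginal likelihood as an integral over a temperature ladder, log p(y) = ∫₀¹ E_{p_t}[log p(y|θ)] dt with p_t(θ|y) ∝ p(y|θ)^t p(θ). The sketch below is plain Julia, not package API; in practice each expectation would come from an MCMC run at the corresponding temperature, but here a conjugate normal model is used so the tempered expectations are in closed form and the trapezoid quadrature can be checked against the analytic evidence.

```julia
# Model: y_i ~ N(θ, 1), prior θ ~ N(0, 1); tempered posterior is
# θ | y, t ~ N(tS/(tn+1), 1/(tn+1)) with S = Σ y_i.
y = [0.8, 1.2, -0.3, 1.5, 0.1]
n, S, Q = length(y), sum(y), sum(abs2, y)

# E_t[log p(y|θ)] under the tempered posterior, in closed form
function e_loglik(t)
    prec = t * n + 1
    m, v = t * S / prec, 1 / prec
    -n/2 * log(2π) - (Q - 2m*S + n*m^2 + n*v) / 2
end

# powered temperature ladder t_k = (k/K)^5 (as suggested by Friel & Pettitt)
# and trapezoid-rule integration over t
K = 1000
ts = [(k / K)^5 for k in 0:K]
logml = sum((ts[k+1] - ts[k]) * (e_loglik(ts[k+1]) + e_loglik(ts[k])) / 2 for k in 1:K)

# analytic log evidence: y ~ N(0, I + 11'), |Σ| = n + 1, y'Σ⁻¹y = Q - S²/(n+1)
exact = -n/2 * log(2π) - log(n + 1)/2 - (Q - S^2 / (n + 1)) / 2
```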
I was wondering whether it would be better to rename the existing HMC sampler to LMC (standing for Lagrangian Monte Carlo). If I am right, the existing sampler relies on the velocities and the Lagrangian, rather than the momentum and the Hamiltonian. These two samplers are mathematically equivalent, yet it may be better to have two different implementations, called LMC and HMC respectively. If you are ok with this @fredo-dedup, I will rename the existing HMC to LMC and will port its Hamiltonian equivalent under the name HMC.
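For reference when porting the Hamiltonian version, a minimal leapfrog step in the momentum/Hamiltonian formulation might look like the following; this is a generic sketch with a unit mass matrix, not the existing sampler's code:

```julia
# gradlogp: gradient of the log target; ϵ: step size; L: number of leapfrog steps
function leapfrog(x, p, gradlogp, ϵ, L)
    p = p .+ (ϵ / 2) .* gradlogp(x)     # initial half step for momentum
    for _ in 1:L-1
        x = x .+ ϵ .* p                 # full position step
        p = p .+ ϵ .* gradlogp(x)       # full momentum step
    end
    x = x .+ ϵ .* p
    p = p .+ (ϵ / 2) .* gradlogp(x)     # final half step for momentum
    x, p
end

# for a standard normal target the Hamiltonian x²/2 + p²/2 is nearly conserved
xnew, pnew = leapfrog([1.0], [0.5], x -> -x, 0.05, 20)
```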
@scidom and @fredo-dedup - Great job so far on MCMC.jl. I had already tweaked RWM.jl to accept a vector of scale parameters, but it looks as though you're already on top of this. A few requests (that I am happy to help with):
The way the samplers and runners are integrated makes it probably impossible for the samplers to access the diagnostics field of the MCMCChain. It is necessary for the samplers to access the diagnostics; for example, they are needed for calculating the acceptance rate for tuning. Do we have an easy way to change this, or do we need to drastically redesign the sampler-runner-modeller scheme?
We currently have the following 7 Julia sources for automatic and symbolic differentiation, with varying approaches to the matter:
It may be challenging to merge all this code because the above approaches to autodiff and symbolic differentiation vary. We may need to open a discussion as to how to achieve the unification.
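To make the comparison concrete, the core of one of these approaches, forward-mode autodiff with dual numbers, fits in a few lines; this is a toy sketch, not taken from any of the seven sources:

```julia
# A dual number carries a value and a derivative; arithmetic propagates both.
struct Dual
    v::Float64   # value
    d::Float64   # derivative
end
Base.:+(a::Dual, b::Dual) = Dual(a.v + b.v, a.d + b.d)
Base.:*(a::Dual, b::Dual) = Dual(a.v * b.v, a.d * b.v + a.v * b.d)  # product rule

# seed the derivative slot with 1 and read it off the result
derivative(f, x) = f(Dual(x, 1.0)).d

derivative(x -> x * x * x, 2.0)   # d/dx x³ at x = 2 is 12.0
```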
I opened this issue as a placeholder-reminder to add expression parsing and autodiff for correlated model parameters, i.e. for multivariate distributions with a covariance matrix that is neither isotropic nor diagonal. For example, we would want to draw Monte Carlo samples from a multivariate Normal:
modelxpr = quote
v ~ MultivariateNormal(zeros(2), [1 0.3; 0.3 1])
end
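For reference, what the parser and autodiff would ultimately need to emit for this example is the log-density of the correlated multivariate Normal and its gradient, ∇ log p(v) = -Σ⁻¹(v - μ). A hand-written sketch (modern Julia syntax, not generated code), with a finite-difference check of the gradient:

```julia
using LinearAlgebra

μ = zeros(2)
Σ = [1.0 0.3; 0.3 1.0]

# log p(v) = -d/2 log(2π) - 1/2 log|Σ| - 1/2 (v-μ)'Σ⁻¹(v-μ);
# log|Σ| is computed from the Cholesky factor as 2 Σ log L_ii
function logpdf_mvn(v)
    r = v - μ
    C = cholesky(Σ)
    -length(v)/2 * log(2π) - sum(log, diag(C.L)) - dot(r, Σ \ r) / 2
end

grad_mvn(v) = -(Σ \ (v - μ))          # ∇ log p(v) = -Σ⁻¹ (v - μ)

# central finite differences at a test point (exact here, since log p is quadratic)
v = [0.5, -0.2]
ϵ = 1e-6
fd = [(logpdf_mvn(v + ϵ*e) - logpdf_mvn(v - ϵ*e)) / (2ϵ) for e in ([1.0, 0.0], [0.0, 1.0])]
```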
One of the most important aspects of MCMC simulation is the adaptive (in the broad sense of the word) tuning of the Monte Carlo algorithm's parameters (e.g. the MALA drift step or the HMC leap step). Our MCMC code lacks this crucial feature so far. This could be done either on the basis of a user-provided function, which tunes the step size according to the current acceptance ratio, or in a more automated way in lieu of such user input. A first implementation of this adaptation is available in the GeometricMCMC package (in src/setStep.jl), but I would like to discuss how this can be done more elegantly (and possibly more efficiently). I believe this is one of the highest-priority enhancements from the user's point of view.
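As a starting point for the discussion, here is one simple acceptance-rate adaptation (a Robbins-Monro style update on the log step size with diminishing adaptation, not the GeometricMCMC implementation), applied to a toy random-walk Metropolis sampler; all names are illustrative:

```julia
using Random
Random.seed!(1)

# Nudge log(scale) toward a target acceptance rate after each batch;
# 0.234 is the classic asymptotically optimal rate for random-walk Metropolis.
function adapt_rwm(logtarget; n_batches=200, batch=50, target=0.234)
    x, logscale = 0.0, 0.0
    rate = 0.0
    for b in 1:n_batches
        acc = 0
        for _ in 1:batch
            xp = x + exp(logscale) * randn()
            if log(rand()) < logtarget(xp) - logtarget(x)
                x, acc = xp, acc + 1
            end
        end
        rate = acc / batch
        logscale += (rate - target) / sqrt(b)   # diminishing adaptation step
    end
    exp(logscale), rate
end

scale, rate = adapt_rwm(x -> -x^2 / 2)   # standard normal target
```

The 1/sqrt(b) factor makes the adaptation diminish over time, which is what keeps adaptive chains ergodic.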
I am going to write up a Travis yml test configuration for our package by the end of the day. @fredo-dedup, if you manage to get the autodiff-related tests working by then, this would be great; if you need more time, that is ok, I will temporarily disable the autodiff testing until it runs successfully.
Syntax:
Use of Julia Tasks:
Looking forward:
I'm really impressed by the number of samplers and diagnostic techniques you have here, and it'd be awesome to incorporate some of that functionality into work that doesn't neatly fit into the MCMC.jl framework. Perhaps you might want to:
Number 1 seems pretty straightforward, especially if you can thin out all but the last sample. But (at some point in the distant future) it might be nice to have functions that are optimized for giving you just one sample, rather than a full MCMCChain object.
Number 2 is a bit more difficult - I can't see how to construct an MCMCChain object with only samples from my custom sampler.
I'm particularly interested in this, because I'd like to incorporate some of your functionality in Church.jl, a probabilistic programming language that focuses more on non-parametrics. But I think there are lots of people in the ML community who would appreciate modular implementations of MCMC methods and diagnostics that they can use easily in their own samplers.