joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences.

Home Page: https://dynesty.readthedocs.io/
License: MIT License
Although no one should ever use this, I should add in a simple rejection sampler that always samples uniformly from the unit N-cube. This will be useful for shrinkage tests and baseline performance tests.
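A minimal sketch of what this could look like (the function and argument names are hypothetical, not dynesty's API): propose uniformly from the unit N-cube and reject until the current likelihood bound is satisfied.

```python
import random

def sample_unit_cube(loglike, logl_min, ndim, rstate=random):
    """Brute-force rejection sampling: draw uniformly from the unit
    N-cube until a draw exceeds the current log-likelihood bound."""
    ncall = 0
    while True:
        u = [rstate.random() for _ in range(ndim)]  # uniform in [0, 1)^ndim
        ncall += 1
        if loglike(u) > logl_min:
            return u, ncall

# toy constraint: accept points whose first coordinate is below 0.5
u, ncall = sample_unit_cube(lambda u: -u[0], logl_min=-0.5, ndim=3)
```

The efficiency decays exponentially as the constrained volume shrinks, which is exactly what makes it a useful worst-case baseline for timing and shrinkage tests.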
Now's the time to actually write stuff up! Testing should proceed simultaneously as the final features get added.
While the current implementation of `dynesty` computes the first and second moments of the volume weights `logwt`, the distribution of the full set of weights is (in theory) exactly known. The shrinkage factor `t_i` at iteration `i` is i.i.d. `Beta(K, 1)`, where `K` is the current number of live points, leading to a cumulative shrinkage of `t_1 * ... * t_i`. The prior volume associated with the `j`-th remaining live point (after `j` iterations once we've terminated our run after collecting `N` dead points) is distributed as `Beta(K-j+1, j)`. The entire set of live-point volumes can be jointly distributed as `(Y_1 + ... + Y_{K-j+1}) / (Y_1 + ... + Y_{K+1})`, where the `K+1` `Y`'s are i.i.d. `Expo` (standard exponential).

Add in a function that can quickly simulate these statistical uncertainties by drawing `N` `t_i ~ Beta(K, 1)` and `K+1` `Y_i ~ Expo` random variables.
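A rough stdlib-only sketch of such a simulator using the distributions quoted above (the function name and run representation are illustrative):

```python
import math
import random

def simulate_volumes(ndead, nlive, rstate=random):
    """Simulate one realization of the log prior volumes for a static run:
    `ndead` dead points followed by the `nlive` final live points."""
    logvols = []
    logx = 0.0
    # Dead points: X_i = t_1 * ... * t_i with t_i ~ Beta(K, 1),
    # drawn via the inverse CDF, t = u**(1/K).
    for _ in range(ndead):
        logx += math.log(rstate.random()) / nlive
        logvols.append(logx)
    # Final live points: joint volumes expressed through K+1 i.i.d.
    # standard-exponential draws Y_1, ..., Y_{K+1}.
    y = [rstate.expovariate(1.0) for _ in range(nlive + 1)]
    total = sum(y)
    for j in range(1, nlive + 1):
        ratio = sum(y[: nlive - j + 1]) / total  # marginally Beta(K-j+1, j)
        logvols.append(logx + math.log(ratio))
    return logvols
```

Repeatedly calling this and recomputing `logwt`/`logz` from each realization would give the exact sampling distribution of the weights rather than just its first two moments.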
Need to fix those utilities to deal with results from `DynamicSampler`. Also, maybe I should implement an "approximate" vs "exact" mode that simulates from the marginals rather than the joint distribution (depending on timing tests).
Users might want to "continue" a previous run after they've dumped the set of previous results to disk. Either:

- allow a saved `Results` dictionary to be used to initialize a sampler, or
- combine the old and new runs afterwards (`merge_runs`).

When using `dynesty` on large parallel CPU clusters, `matplotlib` is often not needed and sometimes not available, resulting in an `ImportError`. It would therefore be better/more efficient if `dynesty` only imported `matplotlib` when the user wants to use the visualization submodule.
According to @guillochon, `six` is better for Python 2/3 compatibility.
Although the code is written in a parallelizable-ish form, I haven't done any actual tests. Those need to happen!
Two issues here. Currently, the target efficiencies for `rwalk` and `rtraj` are hard-coded. I should make these arguments that can be passed by the user in case they want to play around with them.

In addition, the `blob` arguments I send back to the chosen `NestedSampler` are only used to update the scale and are not saved anywhere. Users might want access to these if they want to tune parameters and examine additional quantities besides just the scale factors. I should have both the scale factors and blobs returned to the user. This should be implemented using a `return_tuning=True` flag, which returns the scale factors (`scale`) and tuning parameters (`blob`).
This is a multi-part issue. Essentially, the typical way we compute uncertainties (using the KL divergence, aka the "information gain" from prior to posterior) only works if the number of live points is fixed. I believe this generalizes in a straightforward manner to deal with uncertainties under variable prior volume compression (logarithmic vs uniform, and for varying numbers of points). Implement this fix in both the `NestedSampler` and `DynamicSampler`.
Buchner (2014) outlined a simple algorithm to propose new points that passes shrinkage tests using the Euclidean (`RADFRIENDS`) and sup (`SUPFRIENDS`) norms. I'd like to implement these within the collection of local sampler classes.
Currently some defaults are accidentally hard-coded. Go through and fix this so that users can plot things with more flexibility.
While I'm sure there are bugs that I'll need to rigorously test out, I think the general version has enough base functionality to finally think about adding in dynamic sampling. There are two basic approaches that I can take.
- The simple approach
- The integrated approach

Both would begin with an initial run using `nlive=2*ndim`, followed by one of the other approaches (default `'multi'`) afterwards, as set by the user. There also needs to be some criterion to trigger proposal updates using the live points at each iteration. The user should be able to determine this schedule (for both proposal changes and updates).

Both approaches will require the ability to regenerate past runs, which will require some additional functionality. This might require some/a lot of internal changes to how I currently have the `Sampler` objects organized, but not a lot of modifications to the base methods.
In the PolyChord paper, they discuss slice sampling in spaces that have been transformed such that the appropriate step size is ~1. I had interpreted this as saying we should propose steps and directions based on, e.g., the set of bounding ellipsoids. However, I now believe that this may be incorrect: the correct way is to always propose orthogonal steps on the native unit cube, but scale the initial step size based on the proposal distribution in that direction. This is because linear transformations make it more likely that we propose in certain directions vs others, which we fail to correct for. This doesn't matter as much for random walks since we always satisfy detailed balance with any static proposal (and the "right" way to do proposals is Quite Hard), but does matter for slices since we've now distorted our sampling space.
This shouldn't be a huge change I think: we just need to map the orthogonal unit cube basis onto the ellipsoid to get proper scale factors.
Skilling (2006) demonstrated that different nested sampling runs can be trivially added together. This property of nested sampling has its roots (I think) in the properties of Poisson processes. One of the key insights presented in Higson et al. (2017) is that this property also works in reverse: it is straightforward to deconstruct a nested sampling run with `K` live points into a series of `K` independent runs with 1 live point each (a "strand"). This enables you to probe some of the underlying sampling errors (which are separate from the simulated weights) within a single run using, e.g., bootstrap techniques.
To do:

- Add a `split` function that takes a run with `K` live points and splits it into `K` strands.
- Add a `merge` function that takes a set of `N` runs with `K_1, ..., K_N` live points and merges them into one run with `K_1 + ... + K_N` live points.
- Add a `resample` function that splits a run with `K` live points into `K` strands, samples `K` strands with replacement, and recombines them into one run.
- Add a `simulate_error` function that resamples the run and then simulates a corresponding set of weights.

In my initial set of `dynesty` commits when changing over from `nestle`, I removed the entire test suite (along with other things) to clear up the repository. However, I'd like to add the tests back in, both as an executable and/or as an interactive notebook. This should also include shrinkage tests.
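A toy sketch of the split/merge/resample idea (the `strand`/`logl` record layout is invented for illustration; dynesty would need to track strand membership internally):

```python
import random

def split(samples):
    """Split a run into strands keyed by the id of the live-point
    'thread' each sample belongs to."""
    strands = {}
    for s in samples:
        strands.setdefault(s["strand"], []).append(s)
    return list(strands.values())

def merge(strands):
    """Merge strands back into one run by sorting on log-likelihood,
    which is how nested sampling runs combine (Skilling 2006)."""
    merged = [s for strand in strands for s in strand]
    return sorted(merged, key=lambda s: s["logl"])

def resample(samples, rstate=random):
    """Bootstrap a run: split into strands, draw strands with
    replacement, and merge back into a single run."""
    strands = split(samples)
    boot = [rstate.choice(strands) for _ in range(len(strands))]
    return merge(boot)
```

Repeating `resample` many times and recomputing the evidence from each bootstrap replicate probes the sampling error, separate from the simulated-weight error.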
Add this to the default output print function regardless of whether we're sampling in a batch or not.
The current version of `dynesty` implements a first-order integration scheme to estimate the volume weights `logwt` from the current expected volume `logvol`, both for the nested set of "dead" points and the final remaining set of "live" points. We want to switch to a second-order scheme to both reduce integration error and improve clarity.
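For reference, a plain-Python sketch of the second-order (trapezoid-rule) weight estimate, `w_i = L_i * (X_{i-1} - X_{i+1}) / 2`. It works with linear volumes for clarity and assumes `X_0 = 1` and a final volume of 0; a production version would stay in log space throughout:

```python
import math

def trapezoid_logwt(logl, logvol):
    """Second-order (trapezoid-rule) estimate of the importance weights,
    w_i = L_i * (X_{i-1} - X_{i+1}) / 2."""
    # Pad with X_0 = 1 (the full prior) and a final volume of 0
    # (assuming the run integrates all the way down).
    vols = [1.0] + [math.exp(lv) for lv in logvol] + [0.0]
    logwt = []
    for i, ll in enumerate(logl):
        dvol = 0.5 * (vols[i] - vols[i + 2])  # (X_{i-1} - X_{i+1}) / 2
        logwt.append(ll + math.log(dvol))
    return logwt

# two dead points at volumes 0.5 and 0.25 with flat likelihood L = 1
wts = trapezoid_logwt([0.0, 0.0], [math.log(0.5), math.log(0.25)])
```

Compared to the first-order `w_i = L_i * (X_{i-1} - X_i)`, the centered difference halves the truncation error of the evidence quadrature.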
Buchner (2014) outlined a nice "shrinkage test" to check whether nested sampling algorithms shrink the prior volume at the expected (proper) rate (i.e., by `Beta(K, 1)` for `K` live points). He found that the ellipsoidal decomposition algorithm employed by `MultiNEST` (similar to the one used in `nestle` and `dynesty`) shrinks too quickly, and suggested a fix using bootstrapping to organically derive the ellipsoid enlargement factors to mitigate this. A similar scheme was implemented in the construction of his `RADFRIENDS` algorithm.
As I've gone along and started adding new sampling methods, I've noticed that no sampler can really function completely independently of some method for bounding the distribution -- they all require some tunable proposal distributions, especially to deal with multiple modes. As such, it makes more and more sense to allow a sampler to bound with one method and propose with another. A list of these methods (so far) is below.

Bounding:

Sampling:

I should change how the samplers work to allow users to specify both options via some `bound` and `sampling` keywords. This also solves several other problems: you want to propose at different modes based on their volume, you want your proposal tuned (to some extent) to where you are based on the distribution of live points, etc.
When computing the estimated `logzerr`, the estimates can often be ill-defined, leading to warnings being raised during runtime. These interrupt the print statements and are generally annoying because they occur reasonably often, so by default they should probably be suppressed.
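One option (a sketch, not necessarily how dynesty should do it) is to scope the suppression narrowly so unrelated warnings still surface, using `warnings.catch_warnings()` or `numpy.errstate`:

```python
import warnings
import numpy as np

# Scope the suppression to this block only, so genuine problems
# elsewhere in a run still surface.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    logzvar = np.array([0.04, -1.0])  # an ill-defined (negative) variance
    logzerr = np.sqrt(logzvar)        # second entry becomes nan, no warning

# numpy's own context manager does the same for floating-point errors:
with np.errstate(invalid="ignore"):
    logzerr = np.sqrt(logzvar)
```

Both context managers restore the previous warning state on exit, which is safer than a global `warnings.filterwarnings("ignore")`.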
After some work, I realized that the simple `1/K_i` compression scheme outlined in Higson et al. (2017) is actually incorrect. When there is a decreasing number of live points, we are actually sampling from the set of order statistics associated with the previous (local) maximum number of live points. This leads to the following behavior:

This is fine, but slightly changes the weights that we will assign our samples. I need to change the default behavior in `DynamicSampler` to accommodate this.
Plotting the raw weights has so far confused a lot of initial users. I think the better strategy is to natively implement a KDE to generate the smoothed weight distribution in `runplot`, but give users an option to disable it if they want to see the actual weights. My guess is this should be implemented in all plots that explicitly show the weights.
I should add one.
When running the dynamic nested sampler, it failed (presumably after completing its inner loop) with the following message:
```
run: 0 | iter: 1513 | nc: 1 | ncall: 51328 | eff(%): 2.950 | logz: -7727.260 +/- 0.227 | dlogz: 13.417 > 0.500
run: 0 | iter: 1514 | nc: 1 | ncall: 51329 | eff(%): 2.952 | logz: -7716.199 +/- 0.660 | dlogz: 2.083 > 0.500
run: 0 | iter: 1515 | nc: 1 | ncall: 51330 | eff(%): 2.953 | logz: -7714.875 +/- nan | dlogz: 0.660 > 0.500
Traceback (most recent call last):
  File "pcmd_integrate.py", line 55, in <module>
    results = fit_model.nested_integrate(**args)
  File "/n/home01/bcook/pixcmd/pcmdpy/fit_model.py", line 202, in nested_integrate
    sampler.run_nested(ninit=N_points, maxcall_init=max_call, dlogz_init=dlogz)
  File "/n/home01/bcook/dynesty/dynesty/dynamicsampler.py", line 1121, in run_nested
    logl_bounds = wt_function(res, wt_kwargs)
  File "/n/home01/bcook/dynesty/dynesty/dynamicsampler.py", line 82, in weight_function
    bounds = (min(bounds) - lpad, min(max(bounds) + lpad, nsamps - 1))
ValueError: min() arg is an empty sequence
```

It seems like, for some reason, the default weight function is not finding any "bounds" in `dynamicsampler.py`, line 81:

```python
bounds = np.arange(nsamps)[weight > maxfrac * max(weight)]
```
First of all, thanks for this awesome tool!

However, there is a small bug in version 0.8.0 (line 1145 in cd655fd): the return value of the function `make_eigvals_positive` is not initialized if the `if` statement fails.
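A sketch of the fix, based on the `nestle` function this code descends from (exact variable names may differ from dynesty 0.8.0): return the input matrix unchanged when no eigenvalues need repairing.

```python
import numpy as np

def make_eigvals_positive(am, targetprod):
    """Increase any tiny eigenvalues of the symmetric matrix `am` so the
    product of the repaired eigenvalues matches `targetprod`. Unlike the
    buggy version, this returns a value on *every* code path."""
    w, v = np.linalg.eigh(am)          # eigh, since am is symmetric
    mask = w < 1e-10
    if np.any(mask):
        nzprod = np.prod(w[~mask])     # product of the "healthy" eigenvalues
        nzeros = mask.sum()
        w[mask] = (targetprod / nzprod) ** (1.0 / nzeros)
        am = np.dot(v, np.dot(np.diag(w), v.T))
    return am                          # previously undefined when mask was empty
```

The regression is exactly the no-op case: a well-conditioned matrix should round-trip unchanged instead of raising `UnboundLocalError`.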
In high dimensions, we cannot rely on standard rejection sampling to properly probe the distribution: no matter how clever we make our proposals (ellipsoids, MCMC, etc.), sooner or later we will inevitably lose to the `~e^-d` efficiency falloff as the volume of the ellipsoids blows up. To solve this, we can use slice sampling, which instead scales as `~d^3` and has been used in nested sampling via, e.g., `PolyChord`.

To start, take a look at @pacargile's slice-sampling `nestle` fork based on @bd-j's `squish` code, and combine this with the basic algorithm outlined in the `PolyChord` paper.
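For orientation, a stdlib-only sketch of a single univariate slice update under the hard nested sampling constraint (Neal 2003 stepping-out plus shrinkage; a simplification for illustration, not PolyChord's actual algorithm):

```python
import random

def slice_1d(loglike, x0, logl_min, w=1.0, rstate=random):
    """One univariate slice-sampling update subject to the hard
    constraint loglike(x) > logl_min: step out, then shrink."""
    # Step out: randomly position an interval of width w around x0,
    # then expand each end until it falls outside the constraint.
    left = x0 - w * rstate.random()
    right = left + w
    while loglike(left) > logl_min:
        left -= w
    while loglike(right) > logl_min:
        right += w
    # Shrink: sample uniformly on the interval, shrinking toward x0
    # after each rejection, until we land inside the constraint.
    while True:
        x1 = left + (right - left) * rstate.random()
        if loglike(x1) > logl_min:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1

# toy example: L(x) = -x^2 with logl_min = -1, so the slice is (-1, 1)
x = slice_1d(lambda x: -x * x, x0=0.0, logl_min=-1.0)
```

In the nested sampling setting the "slice height" is fixed at `logl_min`, so each update draws uniformly from the constrained region along one direction; this per-direction cost is what gives the polynomial (rather than exponential) dimension scaling.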
Notes:
This is typically referred to as "Galilean Monte Carlo" (GMC) (see, e.g., this). The basic idea is to initialize a particle with some velocity `v` and launch it in a random direction. It then proceeds unimpeded until it reflects off the likelihood boundary set by `L_j > L_i`, computed using the normal vector at that location. If the reflected particle is itself still outside the boundary, we reverse our trajectory. This procedure is analogous to Hamiltonian Monte Carlo (HMC) methods, except that our potential function is a uniform well with infinitely steep boundaries rather than a smooth log-posterior.

While the name makes sense (I believe it's a derivative of "Hamiltonian"), I find the whole MCMC analogy (and renaming) to be more confusing than enlightening at this point, especially concerning the underlying dynamics. For instance, the "MCMC"-like procedure employed to choose new live points is really a bounded random walk, so why not just call it a random walk? Likewise, Galilean MC is just a bounded random trajectory bouncing off surfaces, so why not just call it a random trajectory?
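Whatever the name, the core reflection step is simple; a minimal pure-Python sketch (assuming the boundary normal is already a unit vector):

```python
def reflect(v, n):
    """Specular reflection of velocity v off a boundary with unit
    normal n: v' = v - 2 (v . n) n."""
    vn = sum(vi * ni for vi, ni in zip(v, n))
    return [vi - 2.0 * vn * ni for vi, ni in zip(v, n)]

# bouncing off a wall with normal (1, 0) flips the x-component
v_new = reflect([1.0, 2.0], [1.0, 0.0])  # -> [-1.0, 2.0]
```

In practice the normal would come from the gradient of the log-likelihood at the crossing point; if the reflected step still lands outside the constraint, the trajectory is reversed as described above.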
Notes:
It might be cleaner to move the logic in `bounding_ellipsoids` for generating the ellipsoids into the ellipsoids themselves (as a method). You could then supply `points` and `pointvol` to this method and the ellipsoid would update itself (instead of having to instantiate a new `Ellipsoid` each time).
It might be useful to allow the sampling technique to vary during a fit as a function of efficiency (or, in `emcee` parlance, "acceptance fraction") and the number of dimensions. For example, if the efficiency of a uniform sampler drops to 10% or less in a high-dimensional space, it would save time to switch to a slice sampler.
Several people have mentioned trying to add some type of `emcee`-style logo. The best recommendation so far has been a crown in a bird's nest (or something similar).
At each iteration, we also want to save the relevant scale factor. This is important for initializing a good guess in the dynamic sampler when using non-uniform sampling methods such as `slice`, since bad scale factors near the posterior can lead to extremely correlated samples that can corrupt our results.
`xrange` only exists in Python 2. However, the `concurrent.futures` package with `ThreadPoolExecutor` or `ProcessPoolExecutor` only exists in Python 3.
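For reference, the Python 3 interface in question; `ThreadPoolExecutor` and `ProcessPoolExecutor` share the same `Executor.map` API (the likelihood and points below are toy examples):

```python
from concurrent.futures import ThreadPoolExecutor

def loglike(x):
    """Toy likelihood: negative squared radius."""
    return -sum(xi * xi for xi in x)

points = [[0.1 * i, 0.2 * i] for i in range(8)]

# Evaluate the likelihood over a batch of proposed points in parallel.
# ThreadPoolExecutor shares memory; ProcessPoolExecutor has the same
# API and is the better fit for CPU-bound likelihoods.
with ThreadPoolExecutor(max_workers=4) as executor:
    logls = list(executor.map(loglike, points))
```

Since `map` preserves input order, the results drop straight into the usual per-point bookkeeping.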
The user-defined `live_points` input is not being sent to the `DynamicSampler` method within the `DynamicNestedSampler` function.
The URL for the documentation in the README.md should be http://dynesty.readthedocs.io/, not dynesty.readthedocs.io. Currently the latter is interpreted as a local file.
While dynamic sampling might eliminate the need for this, I want to make sure I keep things structured so this is still possible. The idea is that once we have a `results` object (or, more generally, a list of samples and associated proposals), we'll be able to run this over the results to get additional "bonus" weights as an add-on. This should be fine as long as I keep track of all the rejected proposals during each iteration.
When running a test using the dynamic nested sampler, logzvar came out negative after a couple thousand iterations:
```
run: 0 | iter: 1592 | nc: 1 | ncall: 50649 | eff(%): 3.145 | logz: -7747.346 +/- 0.219 | dlogz: 36.407 > 0.500
run: 0 | iter: 1593 | nc: 1 | ncall: 50650 | eff(%): 3.147 | logz: -7744.512 +/- 0.454 | dlogz: 33.439 > 0.500
run: 0 | iter: 1594 | nc: 1 | ncall: 50651 | eff(%): 3.149 | logz: -7741.552 +/- 0.457 | dlogz: 30.325 > 0.500
run: 0 | iter: 1595 | nc: 1 | ncall: 50652 | eff(%): 3.151 | logz: -7740.474 +/- 0.242 | dlogz: 29.065 > 0.500
Traceback (most recent call last):
  File "pcmd_integrate.py", line 55, in <module>
    results = fit_model.nested_integrate(**args)
  File "/n/home01/bcook/pixcmd/pcmdpy/fit_model.py", line 202, in nested_integrate
    sampler.run_nested(ninit=N_points, maxcall_init=max_call, dlogz_init=dlogz)
  File "/n/home01/bcook/dynesty/dynesty/dynamicsampler.py", line 1099, in run_nested
    logzerr = math.sqrt(logzvar)
ValueError: math domain error
```
Currently `dynesty` is unable to regenerate the proposal distributions used over the course of the run for any sampler. Add in functionality that saves the proposal distribution directly (e.g., a single `Ellipsoid`) or indirectly (via the initial state used to construct the proposal) so they can be regenerated later. This will be crucial for visualizing runs. It will also be useful if we want to start moving toward utilizing (pseudo-)importance sampling weights.
@bd-j and @pacargile would like to turn the main nested sampling loop into some type of `Iterator`, similar to how `emcee` works.
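A toy sketch of what an iterator-style interface could look like (the generator body and result fields are invented placeholders, not dynesty's API):

```python
def sample(n_iter):
    """Toy generator mirroring the proposed interface: yield
    per-iteration results so users can monitor or interrupt the run,
    emcee-style."""
    logz = 0.0
    for it in range(n_iter):
        logz -= 1.0 / (it + 1)  # placeholder update; real code would sample
        yield {"it": it, "logz": logz}

# Users drive the loop themselves and can break out at any point:
history = []
for res in sample(5):
    history.append(res["logz"])
    if res["logz"] < -2.0:
        break
```

The appeal is that progress reporting, checkpointing, and custom stopping criteria all live in user code rather than in a monolithic `run` method.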
Currently, the only errors included are in `runplot` and use the "first-order" approximation. However, `simulate_run`, `resample_run`, and `sample_run` can all numerically simulate the error (entirely or in part). This feature should be incorporated into at least `runplot`.
Could you add `args=` and `kwargs=` when initializing the `NestedSampler` object that would take an argument list and/or a keyword dictionary to pass to the likelihood function? This would allow for likelihood functions of the form `lnlike(v, *args, **kwargs)` and for backwards compatibility with `emcee`.
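One way this could be wired up internally (a sketch mirroring `emcee`'s convention; the class and argument names are illustrative, not dynesty's actual API):

```python
class LogLikelihood:
    """Wrapper that carries extra positional/keyword arguments through
    to the user's likelihood, emcee-style."""

    def __init__(self, lnlike, args=None, kwargs=None):
        self.lnlike = lnlike
        self.args = args or []
        self.kwargs = kwargs or {}

    def __call__(self, v):
        # The sampler only ever sees a one-argument callable.
        return self.lnlike(v, *self.args, **self.kwargs)

# usage: lnlike(v, data, sigma=...) without closures or globals
def lnlike(v, data, sigma=1.0):
    return -sum((vi - di) ** 2 for vi, di in zip(v, data)) / (2 * sigma ** 2)

wrapped = LogLikelihood(lnlike, args=[[1.0, 2.0]], kwargs={"sigma": 0.5})
```

The sampler then stores and calls only the wrapped object, so no other internal code needs to know about the extra arguments.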
Many sampling codes (`PyMC3`, etc.) have some simple plotting utilities to visualize their final results and/or their overall runs. I'd like to add a few of these to `dynesty` so users can make quick-and-dirty diagnostics without too much hassle.
To do:
Bounding the original set of points sampled from the unit cube is not helpful for two reasons. First, the shape of the points follows that of the cube, leading to weird decompositions that do not conform to the shape of the general likelihood surface. Second, and more importantly, the volume encompassed by the bound significantly exceeds the original cube by construction, increasing exponentially as a function of dimension. To fix this, I should try and institute some type of "bounding delay" on the first set of points where we continue to sample from the prior until the overall efficiency drops below some threshold. If possible, this should be implemented in both the standard nested sampler and the dynamic sampler.
As it stands, the entirety of `dynesty` is stored in one giant file whose overall structure is inherited from `nestle`. It would be great to break this up into a collection of files to allow for better organization of classes, etc.
While trying the `prospector` mock demo with `dynesty`, I get

```
RuntimeError: Negative h encountered (h=-1.5538444519). Please report this as a likely bug.
```

when using the `NestedSampler` in multi/unif mode with `bootstrap=0`. See
https://github.com/bd-j/prospector/blob/dydemo/demo/Interactive_Nested_Demo.ipynb
Currently, the dynamic sampler works by "hacking" an internal `nestedsampler` object. One consequence is that the log-(prior) volume quantities tracked internally (and used to determine the minimum `pointvol`) continue to shrink even though the actual log-volume probed in each batch is different. Since the internal results are garbage anyway, and by default the `dynamicsampler` object doesn't save any samples within the internal `nestedsampler` object, I should implement a fix that replaces the internal `sampler.saved_logvol` and `sampler.dlv` values each time it updates with a new batch, ensuring the computed `pointvol` remains robust.
I need to test whether using the KL divergence between a "baseline" run and a "sampled" (resampled + reweighted) run can be used to determine convergence to the posterior distribution.
While I removed the `mcmc` class and `'classic'` sampling method from `nestle` in the current implementation of `dynesty`, I'd like to add them back in soon.
Currently, a number of utilities do not work properly with dynamic nested sampling runs. These include `resample_run` (and by extension `sample_run`), `merge_runs`, and `unravel_run`.
There are two issues. First, to properly merge runs with variable numbers of live points, I need to know the lower `logl_min` bound that was used to initially add each point. I can then use the same procedure as the one implemented in the dynamic nested sampler to merge strands in one at a time (or in batches).

Second, I need a way to resample the strands that properly preserves the information content of a dynamic sampling run (i.e., the "many paths" interpretation that works so well for standard runs). This requires splitting the run into two groups of points:

Resampling then needs to be done within each group of points before combining the runs, to preserve their respective information content. Afterwards, points can be combined using the same procedure as used for merging.

To get these to work, I have to include additional information within the `Results` from dynamic runs, such as the batch ID associated with each group of points and the `logl` bounds associated with each batch.
To accompany this bugfix, I should also make sure to test the veracity of our results using repeated nested sampling runs.
To do:
- Update `unravel_run` to store relevant batch information when unraveling a dynamic run (allowing for re-assembly).
- Update `merge_runs` to properly deal with dynamic sampling runs.
- Update `resample_run` to deal with dynamic sampling runs, as described above.

Currently these are done sequentially, but there's absolutely no reason why they can't also be done in parallel using the same scheme as the live-point proposals. Once the parallelization scheme is finalized I can come back to this.