samplers¶
The samplers modules contains a set of classes which are meant to be inherited by model classes. These classes contain many of the ‘standard’ methods for sampling with their respective algorithm.
- class pbjam.samplers.DynestySampling[source]¶
Generic dynesty sampling methods to be inherited.
The inheriting class must have a callable lnlikelihood function, a dictionary of callable prior ppf functions, and an integer ndims attribute.
- initSamples(ndims, nlive, nliveMult=4, logl_kwargs={}, **kwargs)[source]¶
Initializes live points for nested sampling.
The method generates nliveMult * nlive points uniformly in the unit cube, and transforms these points into the parameter space using the prior transform. Only the finite log-likelihood values are kept, and the top nlive points are returned.
- Parameters:
ndims (int) – The number of dimensions in the parameter space.
nlive (int) – The number of live points to use in the nested sampling.
nliveMult (int, optional) – The multiplier for the number of live points generated initially. Default is 4.
logl_kwargs (dict, optional) – Additional keyword arguments to pass to the likelihood function. Default is an empty dictionary.
**kwargs (dict) – Additional keyword arguments.
- Returns:
A list containing: - u (ndarray): The live points in the unit cube. - v (ndarray): The live points transformed to the parameter space. - L (ndarray): The log-likelihood values of the live points.
- Return type:
list
- ptform(u)[source]¶
Transform a set of random variables from the unit hypercube to a set of random variables distributed according to specified prior distributions.
- Parameters:
u (jax device array) – Set of points distributed randomly in the unit hypercube.
- Returns:
theta – Set of random variables distributed according to specified prior distributions.
- Return type:
jax device array
Notes
This method uses the inverse probability integral transform (also known as the quantile function or percent point function) to transform each element of u using the corresponding prior distribution. The resulting transformed variables are returned as a JAX device array.
Examples
>>> from scipy.stats import uniform, norm >>> import jax.numpy as jnp >>> class MyModel: ... def __init__(self): ... self.priors = {'a': uniform(loc=0.0, scale=1.0), 'b': norm(loc=0.0, scale=1.0)} ... def ptform(self, u): ... theta = jnp.array([self.priors[key].ppf(u[i]) for i, key in enumerate(self.priors.keys())]) ... return theta ... >>> model = MyModel() >>> u = jnp.array([0.5, 0.8]) >>> theta = model.ptform(u) >>> print(theta) [0.5 0.84162123]
- runSampler(dynamic=False, progress=False, minSamples=5000, logl_kwargs={}, sampler_kwargs={}, **kwargs)[source]¶
Runs the nested sampling algorithm, either in static or dynamic mode, to sample the posterior distribution.
By default assumes 50*ndim nlive points and sampling method is rwalk.
- Parameters:
dynamic (bool, optional) – Whether to use dynamic nested sampling. Default is False (static nested sampling).
progress (bool, optional) – Whether to display progress during sampling. Default is False.
minSamples (int, optional) – The minimum number of samples to generate. Default is 5000.
logl_kwargs (dict, optional) – Additional keyword arguments to pass to the likelihood function. Default is an empty dictionary.
sampler_kwargs (dict, optional) – Additional keyword arguments to pass to the sampler. Default is an empty dictionary.
- Returns:
samples – An array of posterior samples.
- Return type:
array-like
- class pbjam.samplers.EmceeSampling[source]¶
Class used for handling MCMC sampling with Emcee
This class is meant to be inherited by various model classes to perform affine invariant sampling using the Emcee package.
- burnInSampler(nchains, DEfrac, earlyStop, walltime, nsteps, t0, ncontrol, DtauLimit)[source]¶
Runs the burn-in phase of an MCMC sampler with convergence checking. The sampling is performed in sets of nsteps to prevent memory issues. Only ncontrol chains are retained for the entire run, otherwise only nsteps are retained.
Convergence is primarily monitored using the integrated autocorrelation time (tau), where the change in the AC time is the main requirement for convergence, stopping the burn-in when the change in tau drops below DtauLimit.
The chains can be run using a mixture of differential evolution moves and stretch moves (see Emcee documentation).
Stops if run exceeds the specified walltime.
- Parameters:
nchains (int) – The number of chains (walkers) to use in the MCMC sampler.
DEfrac (float) – The fraction of differential evolution (DE) moves to use in the MCMC moves compared to stretch moves.
earlyStop (bool) – Whether to stop the burn-in early if convergence is detected.
walltime (float) – The maximum allowed runtime for the burn-in phase (in minutes).
nsteps (int) – The number of steps to take in each iteration of the burn-in. ~1000 is usually a good choice.
t0 (float) – The start time used to measure elapsed runtime.
ncontrol (int) – The number of chains to monitor for autocorrelation convergence, 10-20 is enough to get an OK average.
DtauLimit (float) – The target limit for the fractional change in autocorrelation time per step. Should be about 0.005%, setting it lower will make the sampler run longer.
- Returns:
sampler (emcee.EnsembleSampler) – The MCMC sampler after burn-in.
ctrlChain (ndarray) – The control chain used for autocorrelation diagnostics.
Notes
You may run into memory issues if you set this nsteps >20,000 steps.
It is recommended that nsteps is ~1000 to avoid running long when the chain has already converged.
- initSamples(nchains, spread=[0.45, 0.55])[source]¶
Initializes the starting samples for MCMC chains.
Draws the samples from the respective parameter priors according the percentiles given by spread.
- Parameters:
nchains (int) – The number of chains (starting points) to initialize.
spread (list of float, optional) – The range of percentiles used to draw from the initial samples from the priors. Default is [0.45, 0.55].
- Returns:
An array of shape (nchains, ndims) containing the initial samples for each chain.
- Return type:
ndarray
- Raises:
ValueError – If any of the initial parameter draws result in a non-finite posterior evaluation.
- lnpost(theta)[source]¶
Computes the log-probability of the posterior distribution for a given set of parameters.
- Parameters:
theta (array-like) – A vector of parameter values.
- Returns:
lnp – The log-probability of the posterior, which is the sum of the log-likelihood and the log-prior.
- Return type:
float
- lnprior(theta)[source]¶
Computes the log-probability of the prior for a given set of parameters.
Assumes the prior distributions for the parameters are independent.
If the computed log-prior is finite, it returns the value; otherwise, it returns negative infinity (-jnp.inf).
- Parameters:
theta (array-like) – A vector of parameter values.
- Returns:
lnp – The log-probability of the prior.
- Return type:
float
- runSampler(nchains=None, DEfrac=1, earlyStop=True, walltime=99999, nsamples=5000, checkEvery=1000, maxThin=1000, ncontrol=10, DtauLimit=5e-05)[source]¶
Runs the MCMC sampler, including burn-in and posterior sampling phases.
- Parameters:
nchains (int, optional) – The number of chains (walkers) to use in the MCMC sampler. If None, defaults to 6 times the number of dimensions.
DEfrac (float, optional) – The fraction of differential evolution (DE) moves to use in the MCMC moves. Default is 1.
earlyStop (bool, optional) – Whether to stop the burn-in early if convergence is detected. Default is True.
walltime (float, optional) – The maximum allowed runtime in minutes. Default is 60.
nsamples (int, optional) – The number of independent samples desired. Default is 5000.
checkEvery (int, optional) – The number of steps to take before checking for convergence. Default is 1000.
maxThin (int, optional) – The maximum thinning interval to use. Default is 1000.
ncontrol (int, optional) – The number of chains to monitor for autocorrelation convergence. Default is 10.
DtauLimit (float, optional) – The target limit for the fractional change in autocorrelation time per step. Default is 0.005%.
- Returns:
The posterior samples, flattened across thinned chains.
- Return type:
ndarray
- samplePosterior(pos, tau, DEfrac, nsamples, maxThin, walltime, t0, maxArrSize=500000000.0)[source]¶
Sample the posterior distribution using MCMC, starting from a given position. Assumes the chains have burnt in already.
Draws enough samples to ensure that, given the thinning, at least nsamples independent samples are obtained. The thinning is derived from the AC time, tau.
The sampling is done in sets corresponding to an integer number of the thinning factor given by tau. This is done to conserve memory. If you are confident in the amount of memory on your machine you can increase maxArrSize.
- Parameters:
pos (ndarray) – The initial positions of the walkers.
tau (array-like) – The integrated autocorrelation times for the parameters.
DEfrac (float) – The fraction of differential evolution (DE) moves to use in the MCMC moves.
nsamples (int) – The number of independent samples required.
maxThin (int) – The maximum thinning interval to use, to ensure the sampler doesn’t run forever if given a bad tau estimate. Can be large though (~10,000 steps).
walltime (float) – The maximum allowed runtime for sampling (in minutes).
t0 (float) – The start time (typically time.time()) used to measure elapsed walltime.
maxArrSize (float, optional) – The maximum size of the array to store the chain, in elements. Default is 5e8. If you run into memory issues you can safely decrease this.
- Returns:
samples (array-like) – The posterior samples, flattened across the thinned chains.
chain (array-like) – The full set of thinned MCMC chains.
sampler (emcee.EnsembleSampler) – The MCMC sampler from the last sampling step.