chains_utils
Utilities for working with the MCMC chains output by PTArcade.
Functions
params_loader
Load a parameter file and return a dictionary of the parameters with their respective values or None.
PARAMETER | DESCRIPTION |
---|---|
file | The path to the prior.txt file. |
RETURNS | DESCRIPTION |
---|---|
params | A dictionary with the parameters' prior ranges. |
Source code in src/ptarcade/chains_utils.py
convert_chains_to_hdf
convert_chains_to_hdf(
chains_dir: str | Path,
burn_frac: float = 0.0,
quick_import: bool = False,
chain_name: str = "chain_1.txt",
dest_path: Path | None = None,
**kwargs
) -> None
Convert the raw output of PTArcade to HDF format and write to disk.
PARAMETER | DESCRIPTION |
---|---|
chains_dir | Name of the directory containing the chains. |
burn_frac | Fraction of the chain that is removed from the head (default is 0). |
quick_import | Flag to skip importing the red noise portion of chains (default is False). |
chain_name | The name of the chain files, including the file extension. Compressed files with extension ".gz" can be used (default is "chain_1.txt"). |
dest_path | The destination path including filename (default is to save in chains_dir with a unique timestamp). |
**kwargs | Additional arguments passed to pandas.DataFrame.to_hdf. |
RETURNS | DESCRIPTION |
---|---|
None | Saves an HDF5 file to |
Source code in src/ptarcade/chains_utils.py
import_to_dataframe
import_to_dataframe(
chains_dir: str | Path,
burn_frac: float = 0.0,
quick_import: bool = False,
chain_name: str = "chain_1.txt",
merge_chains: bool = True,
) -> tuple[pd.DataFrame, pd.DataFrame]
Import the chains and their parameter file as pandas dataframes.
Given a chains_dir that contains chains, this function imports the chains, removes the leading burn_frac fraction as burn-in, and returns a dataframe of the parameters and their values together with a dataframe of the merged chains.
PARAMETER | DESCRIPTION |
---|---|
chains_dir | Name of the directory containing the chains. |
burn_frac | Fraction of the chain that is removed from the head (default is 0). |
quick_import | Flag to skip importing the red noise portion of chains (default is False). |
chain_name | The name of the chain files, including the file extension. Compressed files with extension ".gz" and HDF5 files with extension ".h5" can be used (default is "chain_1.txt"). |
merge_chains | Whether to merge the chains into one dataframe (default is True). |
RETURNS | DESCRIPTION |
---|---|
params | Dataframe containing the parameter names and their values. |
chains | Dataframe containing the merged chains without the burn-in region. Can also optionally return a list of unmerged chains if |
RAISES | DESCRIPTION |
---|---|
SystemExit | Raised when the chains have different parameters. |
Source code in src/ptarcade/chains_utils.py
import_chains
import_chains(
chains_dir: str | Path,
burn_frac: float = 1 / 4,
quick_import: bool = True,
chain_name: str = "chain_1.txt",
) -> tuple[dict, NDArray]
Import the chains and their parameter file.
Given a chains_dir that contains chains, this function imports the chains, removes the leading burn_frac fraction as burn-in, and returns a dictionary mapping parameters to their values and a numpy array of the merged chains.
PARAMETER | DESCRIPTION |
---|---|
chains_dir | Name of the directory containing the chains. |
burn_frac | Fraction of the chain that is removed from the head (default is 1/4). |
quick_import | Flag to skip importing the red noise portion of chains (default is True). |
chain_name | The name of the chain files, including the file extension. Compressed files with extension ".gz" and HDF5 files with extension ".h5" can be used (default is "chain_1.txt"). |
RETURNS | DESCRIPTION |
---|---|
params | Dictionary containing the parameter names and their values. |
mrgd_chain | Numpy array containing the merged chains without the burn-in region. |
RAISES | DESCRIPTION |
---|---|
SystemExit | Raised when the chains have different parameters. |
Source code in src/ptarcade/chains_utils.py
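The burn-in removal and merging described for import_chains can be illustrated with a minimal sketch. It is not PTArcade's implementation: the helper names `drop_burn_in` and `merge_chains` are hypothetical, chains are plain lists of rows rather than numpy arrays, and it assumes the burn-in is dropped from each chain separately before the chains are stacked.

```python
def drop_burn_in(chain, burn_frac):
    """Return the chain without its first burn_frac fraction of rows."""
    if not 0.0 <= burn_frac < 1.0:
        raise ValueError("burn_frac must be in [0, 1)")
    n_burn = int(len(chain) * burn_frac)
    return chain[n_burn:]


def merge_chains(chains, burn_frac=0.25):
    """Drop the burn-in from each chain, then stack the remaining rows."""
    merged = []
    for chain in chains:
        merged.extend(drop_burn_in(chain, burn_frac))
    return merged


# Two toy chains with 8 and 4 samples of a single parameter.
chain_a = [[float(i)] for i in range(8)]
chain_b = [[10.0 + i] for i in range(4)]
merged = merge_chains([chain_a, chain_b], burn_frac=0.25)
```

With burn_frac set to 1/4, the first two rows of the longer chain and the first row of the shorter chain are discarded before merging.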
chain_filter
chain_filter(
chain: NDArray,
params: list[str],
model_id: int | None,
par_to_plot: list[str] | None,
) -> tuple[NDArray, list[str]]
Filter chains.
This function filters the provided chain according to the specified model and parameters. It keeps the rows that correspond to the specified model ID and the columns of the parameters whose posteriors are to be plotted.
PARAMETER | DESCRIPTION |
---|---|
chain | The Markov Chain Monte Carlo (MCMC) chain to be filtered. This should be a multi-dimensional array where each row represents a state in the chain, and each column represents a parameter. |
params | The names of the parameters in the chain. This should be a list of strings with the same length as the number of columns in the chain. |
model_id | The ID of the model to filter the chain for. This should be either 0 or 1. If None, the function will select rows for model 0. |
par_to_plot | The names of the parameters to filter the chain for. If None, the function will select all parameters except 'nmodel', 'log_posterior', 'log_likelihood', 'acceptance_rate', and 'n_parall', and parameters containing '+' or '-'. |
RETURNS | DESCRIPTION |
---|---|
chain | The filtered chain, containing only rows corresponding to the specified model ID and parameters. |
filtered_par | The list of filtered parameter names. |
RAISES | DESCRIPTION |
---|---|
SystemExit | If the provided |
Notes
This function filters the chain in-place, meaning that the original chain will be modified.
Source code in src/ptarcade/chains_utils.py
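The filtering rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the helper `filter_chain` is hypothetical, it assumes a row belongs to a model when its 'nmodel' value rounds to that model's ID, and unlike chain_filter it returns a new list instead of modifying the input.

```python
# Bookkeeping columns that are dropped when par_to_plot is None.
EXCLUDED = {"nmodel", "log_posterior", "log_likelihood", "acceptance_rate", "n_parall"}


def filter_chain(chain, params, model_id=None, par_to_plot=None):
    """Keep rows of one model and the columns of the requested parameters."""
    model = 0 if model_id is None else model_id
    if "nmodel" in params:
        col = params.index("nmodel")
        # Assumption: a sample belongs to the model its nmodel value rounds to.
        rows = [row for row in chain if round(row[col]) == model]
    else:
        rows = list(chain)
    if par_to_plot is None:
        keep = [p for p in params
                if p not in EXCLUDED and "+" not in p and "-" not in p]
    else:
        keep = [p for p in par_to_plot if p in params]
    idx = [params.index(p) for p in keep]
    return [[row[i] for i in idx] for row in rows], keep


params = ["log10_A", "gamma", "nmodel", "log_posterior"]
chain = [
    [-14.2, 4.1, 0.1, -10.0],
    [-14.8, 3.9, 0.9, -11.0],
    [-14.5, 4.3, -0.3, -9.5],
]
filtered, names = filter_chain(chain, params, model_id=0)
```

Here the second row is dropped because its 'nmodel' value rounds to 1, and the 'nmodel' and 'log_posterior' columns are removed.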
calc_df
Calculate the dropout Bayes factor.
PARAMETER | DESCRIPTION |
---|---|
chain | Input dropout chain with shape assumed to be (n_bootstrap, n_samples). |
RETURNS | DESCRIPTION |
---|---|
bayes_facs | Dropout Bayes factor with shape (n_bootstrap,). |
Source code in src/ptarcade/chains_utils.py
bf_bootstrap
Compute the mean and variance of the Bayes factor.
This function computes the mean and variance of the Bayes factor after bootstrapping for a given chain.
PARAMETER | DESCRIPTION |
---|---|
chain | The Markov Chain Monte Carlo (MCMC) chain to be analyzed. This should be a multi-dimensional array where each row represents a state in the chain, and each column represents a parameter. |
burn | The burn-in period to be discarded from the start of the chain. This should be a non-negative integer. If not provided, no burn-in period will be discarded. |
RETURNS | DESCRIPTION |
---|---|
mean | The mean of the bootstrapped Bayes factor distribution. |
var | The variance of the bootstrapped Bayes factor distribution. |
Notes
This function uses the acor library to compute the autocorrelation time of the chain, which is then used to thin the chain. The thinned chain is then bootstrapped using the 'bootstrap' function with the calc_df user statistic to obtain a distribution of Bayes factors. The mean and variance of this distribution are then computed.
Source code in src/ptarcade/chains_utils.py
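The thin-then-bootstrap procedure in the notes can be sketched with the standard library alone. Everything here is an assumption made for illustration: the `thin` argument stands in for the autocorrelation time that PTArcade obtains from acor, `model_ratio` is a toy statistic rather than calc_df, and `bootstrap_mean_var` is a hypothetical helper, not the 'bootstrap' function the notes refer to.

```python
import random
import statistics


def model_ratio(nmodel_samples):
    """Toy statistic: samples rounding to 0 divided by samples rounding to 1."""
    n_zero = sum(1 for x in nmodel_samples if round(x) == 0)
    n_one = sum(1 for x in nmodel_samples if round(x) == 1)
    return n_zero / n_one


def bootstrap_mean_var(samples, statistic, thin=1, n_boot=200, seed=0):
    """Thin the samples, resample with replacement, and summarize the statistic."""
    rng = random.Random(seed)
    thinned = samples[::thin]  # keep every thin-th sample
    stats = []
    for _ in range(n_boot):
        resample = rng.choices(thinned, k=len(thinned))
        stats.append(statistic(resample))
    return statistics.fmean(stats), statistics.variance(stats)


# Toy model-index samples: about two thirds of them fall in model 0.
toy_rng = random.Random(1)
nmodel = [toy_rng.choice([0.1, 0.1, 0.9]) for _ in range(900)]
mean, var = bootstrap_mean_var(nmodel, model_ratio, thin=3)
```

Thinning before resampling matters because bootstrap resampling treats the samples as independent, which consecutive MCMC states are not.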
compute_bf
Compute the Bayes factor and estimate its uncertainty.
PARAMETER | DESCRIPTION |
---|---|
chain | The Markov Chain Monte Carlo (MCMC) chain to be analyzed. This should be a multi-dimensional array where each row represents a state in the chain, and each column represents a parameter. The 'nmodel' and 'log_posterior' columns should be used to specify the model index and the log of the posterior probabilities. |
params | The names of the parameters in the chain. This should be a list of strings with the same length as the number of columns in the chain. It is expected to contain 'nmodel' and 'log_posterior', which will be used to filter the chain based on the model index and compute the Bayes factor. |
bootstrap | A flag indicating whether to compute the Bayes factor using a bootstrap method. If True, the Bayes factor will be computed using the 'get_bf' function. The bootstrap calculation is significantly slower. Defaults to False. |
RETURNS | DESCRIPTION |
---|---|
bf | The computed Bayes factor. This gives the evidence for model 0 over model 1. A higher value provides stronger evidence for model 0. |
unc | The computed uncertainty of the Bayes factor. |
RAISES | DESCRIPTION |
---|---|
SystemExit | If |
Source code in src/ptarcade/chains_utils.py
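When a chain samples a model index alongside the parameters, a common estimate of the Bayes factor (for equal prior odds) is the ratio of the number of samples spent in each model. The sketch below illustrates that counting estimate only. The helper `bayes_factor_from_counts` is hypothetical, the assignment of samples to models by rounding 'nmodel' is an assumption, and no uncertainty is computed.

```python
def bayes_factor_from_counts(chain, params, num_model=0, den_model=1):
    """Ratio of samples in num_model to samples in den_model."""
    col = params.index("nmodel")
    # Assumption: a sample belongs to the model its nmodel value rounds to.
    n_num = sum(1 for row in chain if round(row[col]) == num_model)
    n_den = sum(1 for row in chain if round(row[col]) == den_model)
    if n_den == 0:
        raise ZeroDivisionError("no samples in the denominator model")
    return n_num / n_den


params = ["log10_A", "nmodel", "log_posterior"]
chain = [
    [-14.1, 0.2, -10.0],
    [-14.3, -0.1, -10.5],
    [-14.0, 0.4, -9.8],
    [-14.6, 0.8, -11.2],
]
bf = bayes_factor_from_counts(chain, params)
```

Three of the four toy samples fall in model 0 and one in model 1, so the estimate is 3.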
bisection
Find roots for real-valued function using bisection method.
This function implements the bisection method for root finding of a real-valued function. It recursively divides the interval [a, b] into two subintervals until the absolute value of f evaluated at the midpoint is less than the specified tolerance, at which point it returns the midpoint as an approximation of the root.
PARAMETER | DESCRIPTION |
---|---|
f | The function for which the root is to be found. It must be real-valued and continuous on the interval [a, b]. |
a | The left endpoint of the interval in which the root is sought. It must be less than b. |
b | The right endpoint of the interval in which the root is sought. It must be greater than a. |
tol | The tolerance for the root approximation. The function will return when the absolute value of f evaluated at the midpoint is less than tol. It must be greater than 0. |
RETURNS | DESCRIPTION |
---|---|
float or None | The midpoint of the final subinterval if a root is found; None otherwise. The root approximation m is guaranteed to satisfy |f(m)| < tol if the function converges. |
RAISES | DESCRIPTION |
---|---|
SystemExit | If a is not less than b, or if tol is not greater than 0. |
Notes
This is a recursive implementation of the bisection method. The bisection method assumes that the function f changes sign over the interval [a, b], which implies that a root exists in this interval by the Intermediate Value Theorem.
Source code in src/ptarcade/chains_utils.py
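A minimal recursive sketch of the bisection method as described above is given here for illustration. It is not the source of the documented function: the name `bisect_root` is hypothetical, and it raises ValueError on bad input where the documented function raises SystemExit.

```python
def bisect_root(f, a, b, tol):
    """Recursively halve [a, b] until |f(midpoint)| < tol."""
    if a >= b or tol <= 0:
        raise ValueError("need a < b and tol > 0")
    fa, fb = f(a), f(b)
    # An endpoint may already be close enough to a root.
    if abs(fa) < tol:
        return a
    if abs(fb) < tol:
        return b
    if fa * fb > 0:
        return None  # no sign change, so no root is bracketed
    m = (a + b) / 2
    fm = f(m)
    if abs(fm) < tol:
        return m
    # Keep the half interval on which f still changes sign.
    if fa * fm < 0:
        return bisect_root(f, a, m, tol)
    return bisect_root(f, m, b, tol)


root = bisect_root(lambda x: x * x - 2.0, 0.0, 2.0, 1e-8)
no_root = bisect_root(lambda x: x * x + 1.0, 0.0, 1.0, 1e-8)
```

The first call approximates the square root of 2. The second returns None because x² + 1 never changes sign on the interval.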
k_ratio_aux_1D
k_ratio_aux_1D(
sample: MCSamples,
bf: float,
par: str,
par_range: list[float],
k_ratio: float,
) -> float | None
Returns the bound value for a given k-ratio in a 1D posterior density plot.
PARAMETER | DESCRIPTION |
---|---|
sample | An instance of the MCSamples class, containing the multivariate Monte Carlo samples on which the function is operating. |
bf | The Bayes factor for the exotic + SMBHB vs. SMBHB model. Represents the strength of evidence in favour of the exotic model. |
par | The name of the parameter for which the k-ratio bound should be computed. |
par_range | The lower and upper prior limits for the parameter. It is represented as a list where the first element is the lower limit and the second element is the upper limit. |
k_ratio | The fraction of plateau height at which the height level is determined. This is used to compute the height_KB, which represents the height at which the bound is computed. |
RETURNS | DESCRIPTION |
---|---|
k_val | The computed k-ratio bound value. This is the value of the parameter at which the 1D posterior density crosses the height_KB. |
RAISES | DESCRIPTION |
---|---|
ValueError | If the integration or the bisection search fails due to numerical issues or if the specified parameter range does not contain a valid root. |
Source code in src/ptarcade/chains_utils.py
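The idea of a k-ratio bound, the parameter value at which the 1D density falls to a given fraction of its plateau, can be sketched on a toy density. The sketch makes several assumptions: the plateau height is passed in directly (the documented function derives its height level from the samples, the Bayes factor, and the prior range, which is not reproduced here), and `k_ratio_crossing` and `toy_density` are hypothetical names.

```python
import math


def k_ratio_crossing(density, lo, hi, plateau_height, k_ratio, tol=1e-10):
    """Find x in [lo, hi] where density(x) equals k_ratio * plateau_height."""
    target = k_ratio * plateau_height
    g_lo = density(lo) - target
    g_hi = density(hi) - target
    if g_lo * g_hi > 0:
        return None  # the density does not cross the target level
    # Iterative bisection on density(x) - target.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = density(mid) - target
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return (lo + hi) / 2


def toy_density(x):
    """Plateau of height 1 at small x that falls off around x = 0."""
    return 1.0 / (1.0 + math.exp(4.0 * x))


bound = k_ratio_crossing(toy_density, -5.0, 5.0, plateau_height=1.0, k_ratio=0.1)
```

For this toy density the crossing can be checked analytically: the density equals 0.1 at x = ln(9)/4.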
k_ratio_aux_2D
k_ratio_aux_2D(
sample: MCSamples,
bf: float,
par_1: str,
par_2: str,
par_range_1: list[float],
par_range_2: list[float],
k_ratio: float,
) -> float
Returns the height level corresponding to the given k-ratio in a 2D posterior density plot.
PARAMETER | DESCRIPTION |
---|---|
sample | An instance of the MCSamples class, containing the multivariate Monte Carlo samples on which the function is operating. |
bf | The Bayes factor for the exotic + SMBHB vs. SMBHB model. Represents the strength of evidence in favour of the exotic model. |
par_1 | The name of the first of the two parameters for which the k-ratio bound should be computed. |
par_2 | The name of the second of the two parameters for which the k-ratio bound should be computed. |
par_range_1 | The lower and upper prior limits for the first parameter, given as a list where the first element is the lower limit and the second element is the upper limit. |
par_range_2 | The lower and upper prior limits for the second parameter, given as a list where the first element is the lower limit and the second element is the upper limit. |
k_ratio | The fraction of plateau height at which the height level is determined. This is used to compute the height_KB, which represents the height at which the bound is computed. |
RETURNS | DESCRIPTION |
---|---|
height_KB | The computed height level in the 2D posterior density plot, i.e. the density value that corresponds to the given k-ratio. |
RAISES | DESCRIPTION |
---|---|
ValueError | If the double integration fails due to numerical issues. |
Source code in src/ptarcade/chains_utils.py
get_k_levels
get_k_levels(
sample: MCSamples,
pars: list[str],
priors: dict,
bf: float,
k_ratio: float,
) -> tuple[NDArray, NDArray]
Compute and return the 1D and 2D k-ratio bounds for a given set of parameters.
PARAMETER | DESCRIPTION |
---|---|
sample | An instance of the MCSamples class, containing the multivariate Monte Carlo samples on which the function is operating. |
pars | The list of all parameters for which the k-ratio bounds should be computed. The parameters 'gw-bhb-0' and 'gw-bhb-1' are excluded from this computation. |
priors | A dictionary containing the lower and upper prior limits for each parameter. Each key-value pair in the dictionary corresponds to a parameter and its limits, respectively. |
bf | The Bayes factor for the exotic + SMBHB vs. SMBHB model. Represents the strength of evidence in favour of the exotic model. |
k_ratio | The fraction of plateau height at which the height level is determined. |
RETURNS | DESCRIPTION |
---|---|
NDArray | numpy array representing the 1D k-ratio bounds. Each element in the array is a list where the first elements are the parameter names and the last element is the computed k-ratio bound. |
NDArray | numpy array representing the 2D k-ratio bounds. Each element in the array is a list where the first elements are the parameter names and the last element is the computed k-ratio bound. |
Source code in src/ptarcade/chains_utils.py
get_bayes_est
Compute and return the Bayesian estimates for a given set of parameters based on a sample of data.
PARAMETER | DESCRIPTION |
---|---|
samples | An instance of the MCSamples class, containing the multivariate Monte Carlo samples on which the function is operating. |
params | The list of parameters for which the Bayesian estimates should be computed. |
RETURNS | DESCRIPTION |
---|---|
x | A dictionary representing the Bayesian estimates for each parameter. Each key-value pair in the dictionary corresponds to a parameter and its Bayesian estimate, respectively. Each estimate is represented as a tuple, where the first element is the mean and the second element is the standard deviation. |
Source code in src/ptarcade/chains_utils.py
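The shape of the returned dictionary, one (mean, standard deviation) tuple per parameter, can be illustrated with a small sketch. It works on a plain list of rows instead of an MCSamples instance, ignores sample weights, and uses the sample standard deviation. These are assumptions of the sketch, and `bayes_estimates` is a hypothetical helper.

```python
import statistics


def bayes_estimates(chain, params):
    """Map each parameter name to the (mean, standard deviation) of its column."""
    estimates = {}
    for j, name in enumerate(params):
        column = [row[j] for row in chain]
        estimates[name] = (statistics.fmean(column), statistics.stdev(column))
    return estimates


params = ["a", "b"]
chain = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
est = bayes_estimates(chain, params)
```

For the toy chain, parameter "a" has mean 2 and standard deviation 1, while the constant parameter "b" has mean 10 and zero spread.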
get_max_pos
get_max_pos(
params: list[str],
bayes_est: dict[str, tuple[float, float]],
sample: MCSamples,
priors: dict[str, tuple[float, float]],
spc: int = 10,
) -> dict[str, float]
Compute and return the maximum posterior position for a given set of parameters.
PARAMETER | DESCRIPTION |
---|---|
params | The list of parameters for which the maximum posterior position should be computed. |
bayes_est | A dictionary containing the Bayesian estimates for each parameter. Each key-value pair in the dictionary corresponds to a parameter and its Bayesian estimate, respectively. Each estimate is represented as a tuple, where the first element is the mean and the second element is the standard deviation. |
sample | An instance of the MCSamples class, containing the multivariate Monte Carlo samples on which the function is operating. |
priors | A dictionary containing the lower and upper prior limits for each parameter. Each key-value pair in the dictionary corresponds to a parameter and its limits, respectively. |
spc | The number of equally spaced points to be considered within the bounds of each parameter when searching for the maximum posterior position. Default is 10. |
RETURNS | DESCRIPTION |
---|---|
out | A dictionary representing the maximum posterior positions for each parameter. Each key-value pair in the dictionary corresponds to a parameter and its maximum posterior position, respectively. |
Source code in src/ptarcade/chains_utils.py
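The role of spc can be illustrated with a one-dimensional grid search: evaluate a density on spc equally spaced points between the prior limits and keep the point with the largest value. This is only a sketch of that idea. The documented function also takes the Bayesian estimates and the samples as inputs, and how it uses them is not reproduced here. The names `grid_max_position` and `toy_log_density` are hypothetical.

```python
def grid_max_position(density, lower, upper, spc=10):
    """Return the grid point in [lower, upper] where density is largest."""
    if spc < 2:
        raise ValueError("spc must be at least 2")
    step = (upper - lower) / (spc - 1)
    grid = [lower + i * step for i in range(spc)]
    return max(grid, key=density)


def toy_log_density(x):
    """Toy log-density peaked at x = 0.3."""
    return -(x - 0.3) ** 2


priors = {"log10_A": (0.0, 1.0)}
out = {
    name: grid_max_position(toy_log_density, lo, hi, spc=11)
    for name, (lo, hi) in priors.items()
}
```

The resolution of the result is limited by the grid spacing, so a larger spc gives a finer estimate at a higher cost.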
get_c_levels
Compute and return the highest posterior interval (HPI) for a given set of parameters and confidence levels.
PARAMETER | DESCRIPTION |
---|---|
sample | An instance of the MCSamples class, containing the multivariate Monte Carlo samples on which the function is operating. |
pars | The list of parameters for which the HPI should be computed. |
levels | The list of confidence levels for which the HPI should be computed. Each value in the list should be between 0 and 1. |
RETURNS | DESCRIPTION |
---|---|
NDArray | A numpy array representing the HPI for each parameter and each confidence level. Each element in the array is a list, where the first element is a parameter name and the second element is a list of HPIs for each confidence level. |
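For a unimodal posterior, a highest posterior interval at a given level can be estimated from samples as the shortest interval that contains that fraction of them. The sketch below shows this sample-based estimate for illustration. It is not necessarily how get_c_levels computes the HPI from an MCSamples instance, and `highest_posterior_interval` is a hypothetical helper.

```python
import math


def highest_posterior_interval(samples, level):
    """Shortest interval containing a fraction `level` of the samples."""
    if not 0.0 < level <= 1.0:
        raise ValueError("level must be in (0, 1]")
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(level * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


samples = [0.0, 1.0, 1.1, 1.15, 2.0, 5.0]
interval = highest_posterior_interval(samples, 0.5)
```

With six samples and a level of 0.5, the window holds three samples, and the narrowest such window runs from 1.0 to 1.15.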