MCMC API¶

This page documents the SGLD-based MCMC helpers provided by deepuq.methods.mcmc. These helpers expose a lower-level workflow than the wrapper-style APIs, so the notes here focus on sample collection and predictive aggregation.

Public objects¶

SGLDOptimizer
collect_posterior_samples
predict_with_samples
predict_with_samples_uq

Parameter and variable conventions¶

Name	Meaning
`lr`	SGLD step size
`weight_decay`	L2 penalty added to the stochastic gradient
`n_steps`	total SGLD updates
`burn_in`	fraction of early updates discarded before collecting samples
`loss_fn`	loss used to compute stochastic gradients
`samples`	list of state-dict snapshots collected after burn-in
`apply_softmax`	convert logits to probabilities before aggregating
`device`	device used for optimization or evaluation

Workflow expectations¶

instantiate a deterministic model
call collect_posterior_samples(...) with a training loader and loss
reuse the returned samples with predict_with_samples(...) or predict_with_samples_uq(...)

Input and output shapes¶

collect_posterior_samples(...) expects minibatches (x, y) from data_loader.
predict_with_samples(...) returns tensors with the same trailing shape as one model forward pass.
classification helpers typically use outputs shaped [batch, n_classes].

`UQResult` mapping¶

predict_with_samples_uq(...) populates:

regression: mean, epistemic_var, total_var
classification (apply_softmax=True): mean, probs, probs_var, and epistemic_var

Common preconditions and failure modes¶

the architecture used for prediction must match the architecture used to collect samples
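burn_in is a fraction of n_steps and should lie in [0, 1); values close to 1 leave few or no posterior samples
loss_fn must match the task; when loss_fn is None the default is cross-entropy, so regression models need an explicit loss such as torch.nn.MSELoss()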
apply_softmax=True should only be used when the model emits logits
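
The last precondition can be checked by hand. The sketch below uses only the standard library; its `softmax` helper is defined here for illustration and is not part of deepuq. It shows what happens when softmax is applied to values that are already probabilities: the result still sums to 1, but the distribution is flattened.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 1.0, 0.0]
probs = softmax(logits)   # correct: the model emitted logits
double = softmax(probs)   # wrong: softmax applied to probabilities

# Both lists sum to 1, but the second is far less confident than the first.
print([round(p, 3) for p in probs])
print([round(p, 3) for p in double])
```

If a model already ends in a softmax layer, passing apply_softmax=False avoids this distortion.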

Minimal example¶

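import torch
from deepuq.methods.mcmc import collect_posterior_samples, predict_with_samples_uq

# model, train_loader, and x_test are assumed to be defined elsewhere.
samples = collect_posterior_samples(
    model,
    train_loader,
    n_steps=500,
    lr=1e-4,
    loss_fn=torch.nn.CrossEntropyLoss(),
    device="cuda",
)
uq = predict_with_samples_uq(model, samples, x_test, apply_softmax=True)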

deepuq.methods.mcmc ¶

MCMC utilities based on Stochastic Gradient Langevin Dynamics (SGLD) and Stochastic Gradient Hamiltonian Monte Carlo (SGHMC).

CyclicalSGMCMC ¶

Cyclical Stochastic Gradient MCMC for posterior sampling.

Uses cosine annealing within each cycle and collects samples at the end of each cycle (low LR region).

Parameters:

Name	Type	Description	Default
`model`	`Module`	Neural network to sample from.	required
`base_optimizer_cls`		Optimizer class (e.g. SGHMCOptimizer or SGLDOptimizer).	required
`cycle_length`	`int`	Number of training steps per cycle.	`50`
`n_cycles`	`int`	Number of full cycles to run.	`4`
`samples_per_cycle`	`int`	Number of posterior samples to collect at the end of each cycle.	`3`
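
As a rough illustration of the schedule described above, the sketch below computes a cosine-annealed step size that restarts at each cycle, and lists the steps that fall at the end of each cycle. The exact formula and the rule for choosing which steps are kept are assumptions, not taken from the library source; `cyclical_lr`, `sample_steps`, and `base_lr` are hypothetical names.

```python
import math


def cyclical_lr(step, base_lr, cycle_length):
    """Cosine-annealed step size that restarts at every cycle boundary."""
    position = (step % cycle_length) / cycle_length  # in [0, 1)
    return 0.5 * base_lr * (math.cos(math.pi * position) + 1.0)


def sample_steps(cycle_length, n_cycles, samples_per_cycle):
    """Global step indices in the last `samples_per_cycle` steps of each cycle."""
    steps = []
    for cycle in range(n_cycles):
        end = (cycle + 1) * cycle_length
        steps.extend(range(end - samples_per_cycle, end))
    return steps


# Defaults from the parameter table above.
cycle_length, n_cycles, samples_per_cycle = 50, 4, 3
base_lr = 1e-4  # hypothetical peak step size

schedule = [cyclical_lr(s, base_lr, cycle_length) for s in range(cycle_length * n_cycles)]
collected = sample_steps(cycle_length, n_cycles, samples_per_cycle)
print(len(collected), collected[:3])
```

Under these assumptions the default settings yield n_cycles * samples_per_cycle snapshots, all taken where the step size is a small fraction of its peak.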

run ¶

run(train_loader, loss_fn) -> list[dict[str, torch.Tensor]]

Execute cyclical SGMCMC and return collected posterior samples.

Parameters:

Name	Type	Description	Default
`train_loader`		Iterable of (inputs, targets) mini-batches.	required
`loss_fn`		Loss function for computing gradients.	required

Returns:

Type	Description
`list[dict[str, Tensor]]`	Collected state-dict snapshots.

SGHMCOptimizer ¶

Bases: Optimizer

Stochastic Gradient Hamiltonian Monte Carlo optimizer.

Maintains a velocity buffer per parameter and applies the SGHMC update:

    v = (1 - momentum_decay) * v - lr * grad + N(0, 2 * momentum_decay * lr) * noise_scale
    theta = theta + v

Parameters:

Name	Description	Default
`params`	Iterable of parameters to optimize.	required
`lr`	Step size.	`0.0001`
`momentum_decay`	Friction coefficient for the velocity.	`0.01`
`noise_scale`	Scaling factor for the injected noise.	`1.0`
`num_training_samples`	Number of training samples (used for gradient scaling context).	`1000`
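
To make the update rule concrete, here is a scalar sketch that applies it to a toy quadratic potential using only the standard library. The function `sghmc_step` is a hypothetical name, it is not the optimizer's own code, and it ignores `num_training_samples`.

```python
import math
import random


def sghmc_step(theta, v, grad, lr=1e-4, momentum_decay=0.01, noise_scale=1.0, rng=random):
    """One scalar SGHMC update following the rule quoted above."""
    noise_std = math.sqrt(2.0 * momentum_decay * lr)  # std of N(0, 2 * momentum_decay * lr)
    noise = rng.gauss(0.0, noise_std) * noise_scale
    v = (1.0 - momentum_decay) * v - lr * grad + noise
    theta = theta + v
    return theta, v


# Toy target: U(theta) = theta**2 / 2, so grad U(theta) = theta.
rng = random.Random(0)
theta, v = 3.0, 0.0
for _ in range(2000):
    theta, v = sghmc_step(theta, v, grad=theta, lr=1e-2, momentum_decay=0.1, rng=rng)
print(theta, v)
```

Setting noise_scale=0.0 removes the stochastic term and leaves a damped momentum update, which is a convenient way to check the arithmetic.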

step ¶

step()

Apply one SGHMC parameter update in-place.

SGLDOptimizer ¶

Bases: Optimizer

Stochastic Gradient Langevin Dynamics optimizer.

This optimizer performs an SGD-like update with additive Gaussian noise calibrated by the step size, following Welling & Teh (2011).

Parameters:

Name	Description	Default
`params`	Iterable of parameters to optimize.	required
`lr`	SGLD step size.	`0.001`
`weight_decay`	Optional L2 penalty added to the stochastic gradient.	`0.0`
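
The following scalar sketch shows one common form of the SGLD update: a gradient step plus Gaussian noise with variance 2 * lr. This page does not state the noise scaling the optimizer actually uses, so that scaling is an assumption, and `sgld_step` is a hypothetical name.

```python
import math
import random


def sgld_step(theta, grad, lr=1e-3, weight_decay=0.0, rng=random):
    """One scalar SGLD update: gradient step plus N(0, 2 * lr) noise (assumed scaling)."""
    grad = grad + weight_decay * theta           # optional L2 penalty
    noise = rng.gauss(0.0, math.sqrt(2.0 * lr))  # noise tied to the step size
    return theta - lr * grad + noise


# Toy target: standard normal, U(theta) = theta**2 / 2, so grad U(theta) = theta.
rng = random.Random(0)
theta, draws = 0.0, []
for _ in range(20000):
    theta = sgld_step(theta, grad=theta, lr=0.05, rng=rng)
    draws.append(theta)

mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, variance)  # variance should land near 1 for this target
```

With this scaling the chain targets exp(-U), which is why the sample variance ends up close to that of the standard normal.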

step ¶

step()

Apply one SGLD parameter update in-place.

Returns:

Type	Description
`None`	The update is applied directly to the optimizer parameters.

collect_posterior_samples ¶

collect_posterior_samples(
    model: Module,
    data_loader,
    n_steps=1000,
    lr=0.0001,
    weight_decay=0.0001,
    burn_in=0.2,
    loss_fn=None,
    device="cpu",
)

Run SGLD and collect posterior parameter snapshots.

Parameters:

Name	Type	Description	Default
`model`	`Module`	Neural network to sample.	required
`data_loader`		Iterable of mini-batches.	required
`n_steps`		Total SGLD updates.	`1000`
`lr`		SGLD step size.	`0.0001`
`weight_decay`		L2 penalty added to the stochastic gradient.	`0.0001`
`burn_in`		Fraction of updates to skip before collecting snapshots.	`0.2`
`loss_fn`		Loss used to compute stochastic gradients. Defaults to cross-entropy.	`None`
`device`		Device on which optimization runs.	`'cpu'`

Returns:

Type	Description
`list[dict[str, Tensor]]`	State-dict snapshots collected after burn-in. Each element can be fed into `predict_with_samples` or `predict_with_samples_uq`.
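
For readers who want a picture of what "state-dict snapshots collected after burn-in" can look like, the sketch below runs a noisy gradient loop in plain PyTorch and stores a deep copy of the state dict at each kept step. It is a hypothetical stand-in, not the implementation of `collect_posterior_samples`; the rounding rule for burn-in and the noise scaling are assumptions.

```python
import copy

import torch
from torch import nn


def collect_snapshots_sketch(model, batches, n_steps=50, lr=1e-3, burn_in=0.2):
    """Hypothetical stand-in for a sampler loop; not deepuq's implementation."""
    loss_fn = nn.CrossEntropyLoss()
    first_kept = int(burn_in * n_steps)  # assumed rounding rule
    samples = []
    for step in range(n_steps):
        x, y = batches[step % len(batches)]
        model.zero_grad()
        loss_fn(model(x), y).backward()
        with torch.no_grad():
            for p in model.parameters():
                noise = torch.randn_like(p) * (2.0 * lr) ** 0.5
                p.add_(-lr * p.grad + noise)  # SGLD-style noisy gradient step
        if step >= first_kept:
            # Deep copy: state_dict() tensors share storage with the live parameters.
            samples.append(copy.deepcopy(model.state_dict()))
    return samples


torch.manual_seed(0)
model = nn.Linear(4, 3)
batches = [(torch.randn(8, 4), torch.randint(0, 3, (8,))) for _ in range(5)]
samples = collect_snapshots_sketch(model, batches)
print(len(samples), sorted(samples[0].keys()))
```

The deep copy matters: without it every list entry would point at the same tensors and all snapshots would equal the final parameters.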

predict_with_samples ¶

predict_with_samples(
    model: Module,
    samples,
    x,
    apply_softmax=True,
    device="cpu",
)

Predictive mean and variance from stored parameter samples.

Parameters:

Name	Type	Description	Default
`model`	`Module`	Model architecture compatible with the saved state dicts.	required
`samples`		Posterior parameter snapshots, typically from `collect_posterior_samples`.	required
`x`		Evaluation inputs.	required
`apply_softmax`		If `True`, convert logits into probabilities before aggregation.	`True`
`device`		Device used for model evaluation.	`'cpu'`

Returns:

Type	Description
`(mean, var)`	Predictive mean and variance over the posterior sample dimension.
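
The aggregation described above can be sketched in plain PyTorch as follows. `aggregate_predictions` is a hypothetical helper written for this illustration, and the choice of the biased variance estimator is an assumption; the library function may differ in both respects.

```python
import torch
from torch import nn


def aggregate_predictions(model, samples, x, apply_softmax=True):
    """Hypothetical helper: mean and variance of outputs across snapshots."""
    model.eval()
    outputs = []
    with torch.no_grad():
        for state in samples:
            model.load_state_dict(state)   # architecture must match the snapshot
            out = model(x)
            if apply_softmax:
                out = torch.softmax(out, dim=-1)
            outputs.append(out)
    stacked = torch.stack(outputs, dim=0)  # [n_samples, batch, n_classes]
    return stacked.mean(dim=0), stacked.var(dim=0, unbiased=False)


torch.manual_seed(0)
model = nn.Linear(4, 3)
base = model.state_dict()
# Fake "posterior samples": small random perturbations of one state dict.
samples = [{k: v + 0.1 * torch.randn_like(v) for k, v in base.items()} for _ in range(5)]
mean, var = aggregate_predictions(model, samples, torch.randn(8, 4))
print(mean.shape, var.shape)
```

Both returned tensors have the shape of a single forward pass, here [batch, n_classes], because the reduction runs over the sample dimension only.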

predict_with_samples_uq ¶

predict_with_samples_uq(
    model: Module,
    samples,
    x,
    apply_softmax=True,
    device="cpu",
) -> UQResult

Return posterior-sample predictive moments in UQResult form.

epistemic_var stores the variance across posterior samples. No separate aleatoric component is estimated.
