MCMC API

This page documents the SGLD-based MCMC helpers provided by deepuq.methods.mcmc. These helpers expose a lower-level workflow than the wrapper-style APIs, so the notes here focus on sample collection and predictive aggregation.

Public objects

  • SGLDOptimizer
  • collect_posterior_samples
  • predict_with_samples
  • predict_with_samples_uq

Parameter and variable conventions

Name           Meaning
lr             SGLD step size
weight_decay   L2 penalty added to the stochastic gradient
n_steps        total SGLD updates
burn_in        fraction of early updates discarded before collecting samples
loss_fn        loss used to compute stochastic gradients
samples        list of state-dict snapshots collected after burn-in
apply_softmax  convert logits to probabilities before aggregating
device         device used for optimization or evaluation

Workflow expectations

  1. instantiate a deterministic model
  2. call collect_posterior_samples(...) with a training loader and loss
  3. reuse the returned samples with predict_with_samples(...) or predict_with_samples_uq(...)

Input and output shapes

  • collect_posterior_samples(...) expects minibatches (x, y) from data_loader.
  • predict_with_samples(...) returns tensors with the same trailing shape as one model forward pass.
  • classification helpers typically use outputs shaped [batch, n_classes].
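The trailing-shape guarantee can be checked on a stack of per-sample forward passes. The shapes here are illustrative, not prescribed by the API:

```python
import torch

# Illustrative shapes: 10 posterior samples, batch of 4, 3 classes.
n_samples, batch, n_classes = 10, 4, 3
per_sample = torch.randn(n_samples, batch, n_classes)  # stacked forward passes

mean = per_sample.mean(dim=0)
var = per_sample.var(dim=0)

# Both moments keep the trailing shape of a single forward pass.
assert mean.shape == (batch, n_classes)
assert var.shape == (batch, n_classes)
```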

UQResult mapping

predict_with_samples_uq(...) populates:

  • regression: mean, epistemic_var, total_var
  • classification (apply_softmax=True): mean, probs, probs_var, and epistemic_var
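How these classification fields relate can be sketched on stacked per-sample probabilities. The stacking layout is an assumption for illustration; it follows from epistemic_var being the variance across posterior samples, not from the library's internals:

```python
import torch

# Hypothetical stacked per-sample probabilities: [n_samples, batch, n_classes].
probs = torch.softmax(torch.randn(8, 4, 3), dim=-1)

mean = probs.mean(dim=0)        # predictive mean probabilities
probs_var = probs.var(dim=0)    # per-class variance across posterior samples
epistemic_var = probs_var       # no separate aleatoric component is estimated

# Averaging valid probability vectors still yields rows that sum to one.
assert torch.allclose(mean.sum(dim=-1), torch.ones(4))
```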

Common preconditions and failure modes

  • the architecture used for prediction must match the architecture used to collect samples
  • burn_in should be in [0, 1) to keep a meaningful number of posterior samples
  • loss_fn must match the task; the default is cross-entropy
  • apply_softmax=True should only be used when the model emits logits
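The burn_in constraint is easy to sanity-check with the arithmetic it implies; the exact rounding inside collect_posterior_samples may differ from this sketch:

```python
# burn_in is a fraction in [0, 1); at 1.0 every update would be discarded.
n_steps, burn_in = 500, 0.2
n_burned = int(burn_in * n_steps)  # early updates discarded
n_kept = n_steps - n_burned        # updates eligible to yield snapshots

assert (n_burned, n_kept) == (100, 400)
```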

Minimal example

import torch
from deepuq.methods.mcmc import collect_posterior_samples, predict_with_samples_uq

samples = collect_posterior_samples(
    model,
    train_loader,
    n_steps=500,
    lr=1e-4,
    loss_fn=torch.nn.CrossEntropyLoss(),
    device="cuda",
)
uq = predict_with_samples_uq(model, samples, x_test, apply_softmax=True)

deepuq.methods.mcmc

MCMC utilities based on Stochastic Gradient Langevin Dynamics (SGLD).

SGLDOptimizer

Bases: Optimizer

Stochastic Gradient Langevin Dynamics optimizer.

This optimizer performs an SGD-like update with additive Gaussian noise calibrated by the step size, following Welling & Teh (2011).
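Schematically, one update is an SGD step plus Gaussian noise whose scale depends on the step size. This is a sketch of the Welling & Teh update, not the optimizer's actual internals; noise-scaling conventions vary between implementations:

```python
import torch

def sgld_update(param, lr=1e-3, weight_decay=0.0):
    """One schematic SGLD update: SGD step plus sqrt(2 * lr) Gaussian noise.

    Illustrative only; SGLDOptimizer.step() may use a different noise
    scaling convention.
    """
    grad = param.grad
    if weight_decay != 0.0:
        grad = grad + weight_decay * param.detach()
    noise = torch.randn_like(param) * (2.0 * lr) ** 0.5
    with torch.no_grad():
        param.add_(-lr * grad + noise)

param = torch.zeros(5, requires_grad=True)
param.grad = torch.ones(5)
sgld_update(param)
```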

Parameters:

Name          Description                                            Default
params        Iterable of parameters to optimize.                    required
lr            SGLD step size.                                        0.001
weight_decay  Optional L2 penalty added to the stochastic gradient.  0.0

step

step()

Apply one SGLD parameter update in-place.

Returns:

Type  Description
None  The update is applied directly to the optimizer parameters.

collect_posterior_samples

collect_posterior_samples(
    model: Module,
    data_loader,
    n_steps=1000,
    lr=0.0001,
    weight_decay=0.0001,
    burn_in=0.2,
    loss_fn=None,
    device="cpu",
)

Run SGLD and collect posterior parameter snapshots.

Parameters:

Name          Type    Description                                                            Default
model         Module  Neural network to sample.                                              required
data_loader           Iterable of mini-batches.                                              required
n_steps               Total SGLD updates.                                                    1000
lr                    SGLD step size.                                                        0.0001
weight_decay          L2 penalty added to the stochastic gradient.                           0.0001
burn_in               Fraction of updates to skip before collecting snapshots.               0.2
loss_fn               Loss used to compute stochastic gradients. Defaults to cross-entropy.  None
device                Device on which optimization runs.                                     'cpu'

Returns:

Type                     Description
list[dict[str, Tensor]]  State-dict snapshots collected after burn-in. Each element can be fed into predict_with_samples or predict_with_samples_uq.
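Note that state_dict() returns references to the live parameter tensors, so snapshots must be independent copies or every element would alias the final parameters. A sketch of the collection side (hypothetical; the real loop interleaves SGLD updates between snapshots):

```python
import copy
import torch

model = torch.nn.Linear(3, 2)
samples = []
for _ in range(3):
    # ... SGLD updates would run here ...
    samples.append(copy.deepcopy(model.state_dict()))

# Each snapshot is an independent state dict keyed like the model's parameters.
assert len(samples) == 3
assert set(samples[0]) == {"weight", "bias"}
```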

predict_with_samples

predict_with_samples(
    model: Module,
    samples,
    x,
    apply_softmax=True,
    device="cpu",
)

Predictive mean and variance from stored parameter samples.

Parameters:

Name           Type    Description                                                               Default
model          Module  Model architecture compatible with the saved state dicts.                 required
samples                Posterior parameter snapshots, typically from collect_posterior_samples.  required
x                      Evaluation inputs.                                                        required
apply_softmax          If True, convert logits into probabilities before aggregation.            True
device                 Device used for model evaluation.                                         'cpu'

Returns:

Type         Description
(mean, var)  Predictive mean and variance over the posterior sample dimension.
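The aggregation this implies can be sketched as a hypothetical re-implementation; the library's own loop may differ in details such as batching or variance correction:

```python
import torch

def aggregate_over_samples(model, samples, x, apply_softmax=True):
    """Hypothetical sketch: average forward passes over posterior snapshots."""
    outs = []
    for state in samples:
        model.load_state_dict(state)  # swap in one posterior sample
        model.eval()
        with torch.no_grad():
            out = model(x)
        outs.append(torch.softmax(out, dim=-1) if apply_softmax else out)
    stacked = torch.stack(outs)  # [n_samples, batch, ...]
    return stacked.mean(dim=0), stacked.var(dim=0)

model = torch.nn.Linear(3, 2)
state = model.state_dict()
# Two identical cloned snapshots, so the sample variance must be zero.
samples = [{k: v.clone() for k, v in state.items()} for _ in range(2)]
mean, var = aggregate_over_samples(model, samples, torch.randn(4, 3))
```

With identical snapshots the variance collapses to zero, which is a quick way to verify the posterior-sample dimension is the one being reduced.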

predict_with_samples_uq

predict_with_samples_uq(
    model: Module,
    samples,
    x,
    apply_softmax=True,
    device="cpu",
) -> UQResult

Return posterior-sample predictive moments in UQResult form.

epistemic_var stores the variance across posterior samples. No separate aleatoric component is estimated.