Variational Inference (Bayes by Backprop)¶
deepuq implements Bayes by Backprop (Blundell et al., 2015) through variational layers and vi_elbo_step.
1) Motivation¶
Exact Bayesian inference for neural-network weights is generally intractable. Variational inference replaces the true posterior with a tractable family and turns inference into optimization.
This yields a practical path to uncertainty-aware deep learning while preserving standard stochastic-gradient training workflows.
2) What Uncertainty Is Quantified¶
Variational inference in deepuq quantifies epistemic uncertainty via a learned distribution over weights.
Posterior predictive distribution:

\[
p(y^* \mid x^*, \mathcal D) = \int p(y^* \mid x^*, w)\, p(w \mid \mathcal D)\, \mathrm{d}w .
\]
Monte Carlo approximation, replacing \(p(w \mid \mathcal D)\) with the learned \(q_\phi(w)\):

\[
p(y^* \mid x^*, \mathcal D) \approx \frac{1}{S} \sum_{s=1}^{S} p(y^* \mid x^*, w^{(s)}), \qquad w^{(s)} \sim q_\phi(w) .
\]
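In code, this amounts to averaging stochastic forward passes. The sketch below is plain PyTorch, not a deepuq API; it assumes a model whose forward pass draws fresh weights from \(q_\phi(w)\) on every call, as variational layers do.

```python
import torch

@torch.no_grad()
def mc_predictive(model, x, n_samples=50):
    """Average n_samples stochastic forward passes; each call to
    model(x) uses a fresh weight sample w^(s) ~ q_phi(w)."""
    draws = torch.stack([model(x) for _ in range(n_samples)])  # (S, batch, out)
    return draws.mean(dim=0)  # MC estimate of the posterior predictive mean
```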
3) Mathematical Setup / Notation¶
Let \(\mathcal D=\{(x_i,y_i)\}_{i=1}^N\), prior \(p(w)\), and variational family \(q_{\phi}(w)\).
Mean-field Gaussian parameterization:

\[
q_\phi(w) = \prod_{j} \mathcal N\!\left(w_j \mid \mu_j, \sigma_j^2\right).
\]
A common unconstrained scale parameterization is the softplus:

\[
\sigma_j = \log\!\left(1 + e^{\rho_j}\right), \qquad \rho_j \in \mathbb R, \quad \phi = \{\mu_j, \rho_j\}_j .
\]
Reparameterization trick:

\[
w_j = \mu_j + \sigma_j\, \epsilon_j, \qquad \epsilon_j \sim \mathcal N(0, 1),
\]

which makes weight samples differentiable with respect to \(\phi\).
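Putting the three pieces together, a mean-field layer can be sketched in plain PyTorch. This is a minimal illustration of the parameterization above, not deepuq's actual variational layer; the initialization constants are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MeanFieldLinear(nn.Module):
    """Linear layer with a factorized Gaussian over weights:
    sigma = softplus(rho), and w = mu + sigma * eps (reparameterization)."""

    def __init__(self, in_features, out_features, prior_std=1.0):
        super().__init__()
        self.w_mu = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -5.0))
        self.prior_std = prior_std

    def forward(self, x):
        w_sigma = F.softplus(self.w_rho)                     # sigma_j = log(1 + exp(rho_j))
        b_sigma = F.softplus(self.b_rho)
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)  # w = mu + sigma * eps
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, w, b)

    def kl(self):
        """Closed-form KL(q_phi(w) || p(w)) for p(w) = N(0, prior_std^2 I)."""
        total = 0.0
        for mu, rho in [(self.w_mu, self.w_rho), (self.b_mu, self.b_rho)]:
            sigma = F.softplus(rho)
            total = total + (torch.log(self.prior_std / sigma)
                             + (sigma ** 2 + mu ** 2) / (2 * self.prior_std ** 2)
                             - 0.5).sum()
        return total
```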
4) Core Method Equations¶
Canonical ELBO (maximization form):

\[
\mathcal L(\phi) = \mathbb E_{q_\phi(w)}\!\left[\log p(\mathcal D \mid w)\right] - \mathrm{KL}\!\left(q_\phi(w) \,\|\, p(w)\right).
\]
Equivalent minimization form used in training:

\[
\mathcal F(\phi) = -\,\mathcal L(\phi) = \mathrm{KL}\!\left(q_\phi(w) \,\|\, p(w)\right) - \mathbb E_{q_\phi(w)}\!\left[\log p(\mathcal D \mid w)\right].
\]
Mini-batch objective with \(N_b\) optimizer steps per epoch:

\[
\mathcal F_b(\phi) = \beta \,\mathrm{KL}\!\left(q_\phi(w) \,\|\, p(w)\right) - \mathbb E_{q_\phi(w)}\!\left[\sum_{i \in \mathcal B_b} \log p(y_i \mid x_i, w)\right], \qquad \beta = \frac{1}{N_b},
\]

so that, summed over all batches in an epoch, the KL term is counted exactly once.
Relationship to posterior KL:

\[
\log p(\mathcal D) = \mathcal L(\phi) + \mathrm{KL}\!\left(q_\phi(w) \,\|\, p(w \mid \mathcal D)\right),
\]

so maximizing the ELBO is equivalent to minimizing the KL divergence from \(q_\phi(w)\) to the true posterior.
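In deepuq this objective is computed by vi_elbo_step. Its exact signature is not documented here, so the stand-in below only shows the shape of the computation, assuming the model exposes a kl() method that sums the closed-form KL over its variational layers (as in the layer sketch above).

```python
def elbo_step(model, x, y, nll_fn, beta):
    """Minimization-form objective: NLL term plus beta-weighted KL term.
    nll_fn(preds, y) should return -log p(y | x, w) for the batch."""
    preds = model(x)        # single weight sample w ~ q_phi(w)
    nll = nll_fn(preds, y)  # data-fit term
    kl = model.kl()         # complexity term KL(q_phi(w) || p(w))
    loss = nll + beta * kl  # F_b(phi); beta = 1/N_b recovers the ELBO per epoch
    return loss, nll.detach(), kl.detach()
```

Returning the NLL and KL terms separately also supports the diagnostics discussed in Section 6.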
5) Inference / Prediction Equations¶
Regression predictive moments:

\[
\hat\mu(x^*) = \frac{1}{S}\sum_{s=1}^{S} f_{w^{(s)}}(x^*), \qquad
\hat\sigma^2_{\mathrm{epi}}(x^*) = \frac{1}{S}\sum_{s=1}^{S} \left(f_{w^{(s)}}(x^*) - \hat\mu(x^*)\right)^2 .
\]
Classification predictive probabilities:

\[
\hat p(y = c \mid x^*) = \frac{1}{S}\sum_{s=1}^{S} \mathrm{softmax}\!\left(f_{w^{(s)}}(x^*)\right)_c .
\]
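These moments correspond to the mean and epistemic_var fields in the UQResult mapping below. The following is a plain-PyTorch sketch of their computation, a hypothetical helper rather than the predict_vi_uq call itself.

```python
import torch

@torch.no_grad()
def predictive_moments(model, x, n_samples=100, apply_softmax=False):
    """MC predictive mean and epistemic variance across weight samples."""
    draws = torch.stack([model(x) for _ in range(n_samples)])  # (S, batch, out)
    if apply_softmax:
        draws = torch.softmax(draws, dim=-1)  # per-sample class probabilities
    # unbiased=False matches the 1/S form of the variance equation above
    return draws.mean(dim=0), draws.var(dim=0, unbiased=False)
```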
6) Practical Implications¶
- Holding \(\beta\) fixed keeps the objective comparable when tracking ELBO trends across epochs.
- Larger Monte Carlo sample counts reduce estimator variance but increase compute.
- Mean-field VI is scalable but cannot represent full posterior correlations.
- Monitoring NLL and KL separately helps diagnose underfitting vs. over-regularization (see the sketch after this list).
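As a concrete example of the last point, the two terms returned by the elbo_step sketch above can be logged per epoch (model, loader, optimizer, nll_fn, and n_epochs are assumed to exist):

```python
for epoch in range(n_epochs):
    nll_total, kl_value = 0.0, 0.0
    for xb, yb in loader:
        loss, nll, kl = elbo_step(model, xb, yb, nll_fn, beta=1.0 / len(loader))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        nll_total += nll.item()
        kl_value = kl.item()
    # High, flat NLL with a small KL suggests underfitting; NLL rising while
    # the KL shrinks toward zero suggests over-regularization by the prior.
    print(f"epoch {epoch}: NLL={nll_total / len(loader):.4f} KL={kl_value:.4f}")
```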
UQResult Field Mapping¶
predict_vi_uq(...) returns:
| Field | Regression | Classification (apply_softmax=True) |
|---|---|---|
| mean | Predictive mean | Mean class probabilities |
| epistemic_var | MC variance across weight samples | Probability variance across samples |
| aleatoric_var | Optional user-supplied additive term | None |
| total_var | epistemic_var + aleatoric_var (if provided) | Probability variance |
| probs | None | Mean class probabilities |
| probs_var | None | Probability variance |
| metadata | Method/sample/task info | Method/sample/task info |
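A hedged usage sketch: only the return fields above are documented here, so the argument names (other than apply_softmax) and the attribute-style field access are assumptions.

```python
# Hypothetical call shape; only the field names below are documented.
result = predict_vi_uq(model, x_test, n_samples=100)  # regression task

print(result.mean.shape)           # predictive mean
print(result.epistemic_var.max())  # MC variance across weight samples
print(result.total_var.max())      # epistemic_var (+ aleatoric_var, if supplied)
print(result.metadata)             # method / sample-count / task info
```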
7) References¶
- Graves, A. (2011). Practical Variational Inference for Neural Networks. NeurIPS.
- Blundell, C., Cornebise, J., Kavukcuoglu, K., & Wierstra, D. (2015). Weight Uncertainty in Neural Networks. ICML (PMLR 37).
- Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. ICLR.
- Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., & Saul, L. K. (1999). An Introduction to Variational Methods for Graphical Models. Machine Learning, 37, 183-233. DOI: 10.1023/A:1007665907178
- Blei, D. M., Kucukelbir, A., & McAuliffe, J. D. (2017). Variational Inference: A Review for Statisticians. Journal of the American Statistical Association, 112(518), 859-877. DOI: 10.1080/01621459.2017.1285773