# Gaussian Processes
deepuq now includes a full GP family:
- exact GP regression (`GaussianProcessRegressor`)
- sparse variational GP regression (`SparseGaussianProcessRegressor`)
- binary GP classification (`GaussianProcessClassifier`)
- multiclass OvR GP classification (`OneVsRestGaussianProcessClassifier`)
- heteroscedastic GP regression (`HeteroscedasticGaussianProcessRegressor`)
- multi-task ICM GP regression (`MultiTaskGaussianProcessRegressor`)
- spectral mixture GP regression (`SpectralMixtureGaussianProcessRegressor`)
- deep kernel GP regression (`DeepKernelGaussianProcessRegressor`)
## 1) Motivation
Gaussian processes provide Bayesian function-space inference. They are a strong UQ baseline because posterior uncertainty expands in regions with weak data support.
In Deep-UQ, the GP suite is designed to cover:
- calibrated interpolation baselines (exact/sparse)
- classification boundary uncertainty
- input-dependent noise modeling
- correlated multi-output regression
- rich spectral structure
- learned representations through deep kernels
## 2) What Uncertainty Is Quantified
For regression, the predictive variance is decomposed as:

\[
\operatorname{Var}[y_* \mid x_*]
= \underbrace{\sigma_f^2(x_*)}_{\text{epistemic}}
+ \underbrace{\sigma_n^2(x_*)}_{\text{aleatoric}}
\]

where \(\sigma_f^2(x_*)\) is the latent GP posterior variance and \(\sigma_n^2(x_*)\) is the observation-noise variance (constant for homoscedastic models, input-dependent for the heteroscedastic model).
For classification, GP classifiers return class probabilities and probability spread near boundaries.
## 3) Core Models and Equations
### 3.1 Exact GP regression
Prior:

\[
f \sim \mathcal{GP}\big(0,\, k(x, x')\big)
\]

Posterior mean and covariance at test inputs \(X_*\), given training data \((X, y)\) with noise variance \(\sigma_n^2\):

\[
\mu_* = K_{*n}\big(K_{nn} + \sigma_n^2 I\big)^{-1} y,
\qquad
\Sigma_* = K_{**} - K_{*n}\big(K_{nn} + \sigma_n^2 I\big)^{-1} K_{n*}
\]
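These equations can be sketched in a few lines of plain numpy (an illustrative implementation, not the deepuq `GaussianProcessRegressor` API; the RBF kernel, lengthscale, and noise level here are arbitrary choices):

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0, variance=1.0):
    # k(x, x') = variance * exp(-||x - x'||^2 / (2 * lengthscale^2))
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, noise=1e-2):
    # Exact GP posterior mean and covariance at test inputs Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    V = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    cov = rbf(Xs, Xs) - V.T @ V
    return mean, cov

X = np.linspace(0.0, 1.0, 5)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
Xs = np.array([[0.5], [5.0]])          # one point on the data, one far away
mean, cov = gp_posterior(X, y, Xs)
# Posterior variance is near zero on the data and near the prior far from it.
```

The Cholesky-based solve is the standard numerically stable way to evaluate both expressions with a single factorization.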
### 3.2 Sparse variational GP regression
Inducing variables \(u = f(Z)\) with \(M \ll N\) and variational posterior \(q(u)\).
A common collapsed ELBO form (Titsias, 2009) is:

\[
\mathcal{L} = \log \mathcal{N}\!\big(y \mid 0,\; Q_{nn} + \sigma_n^2 I\big)
- \frac{1}{2\sigma_n^2} \operatorname{tr}\!\big(K_{nn} - Q_{nn}\big)
\]

with:

\[
Q_{nn} = K_{nm} K_{mm}^{-1} K_{mn}
\]
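The collapsed bound is cheap to evaluate; a plain-numpy sketch (illustrative only, with a fixed RBF kernel and unit signal variance so that the diagonal of \(K_{nn}\) is all ones):

```python
import numpy as np

def rbf(X1, X2, ls=0.2):
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / ls**2)

def titsias_elbo(X, y, Z, noise):
    # L = log N(y | 0, Qnn + noise*I) - tr(Knn - Qnn) / (2*noise),
    # with the Nystrom matrix Qnn = Knm Kmm^{-1} Kmn.
    Kmm = rbf(Z, Z) + 1e-6 * np.eye(len(Z))
    Knm = rbf(X, Z)
    Qnn = Knm @ np.linalg.solve(Kmm, Knm.T)
    S = Qnn + noise * np.eye(len(X))
    _, logdet = np.linalg.slogdet(S)
    fit = -0.5 * (len(X) * np.log(2.0 * np.pi) + logdet
                  + y @ np.linalg.solve(S, y))
    slack = -0.5 * (len(X) - np.trace(Qnn)) / noise  # diag(Knn) is all ones
    return fit + slack

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 1.0, 40))
y = np.sin(2.0 * np.pi * X) + 0.1 * rng.normal(size=40)
elbo_sparse = titsias_elbo(X, y, Z=X[::8], noise=0.01)  # 5 inducing points
elbo_full = titsias_elbo(X, y, Z=X, noise=0.01)         # Z = X: exact-GP limit
```

Taking `Z = X` drives the trace penalty to zero, which is the sense in which the bound recovers the exact marginal likelihood.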
### 3.3 GP classification (binary + OvR multiclass)
Binary latent function with Bernoulli likelihood:

\[
p(y = 1 \mid f(x)) = \sigma\big(f(x)\big)
\]

where \(\sigma(\cdot)\) is the logistic sigmoid.

Deep-UQ uses a Laplace approximation for the latent posterior, formed around its mode. OvR multiclass fits one binary GP per class and normalizes the scores.
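The Laplace step for the binary case can be sketched as a Newton iteration for the latent mode (a toy version in the spirit of GPML Algorithm 3.1, not the deepuq classifier; the kernel and data here are made up):

```python
import numpy as np

def rbf(X1, X2, ls=1.0):
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / ls**2)

def laplace_mode(X, y, iters=20):
    # Newton iteration for the mode of p(f | y) under a Bernoulli-logistic
    # likelihood, with labels y in {0, 1}.
    n = len(X)
    K = rbf(X, X) + 1e-6 * np.eye(n)
    f = np.zeros(n)
    for _ in range(iters):
        pi = 1.0 / (1.0 + np.exp(-f))
        W = pi * (1.0 - pi)                  # likelihood curvature
        b = W * f + (y - pi)
        # Newton update: f_new = K (I + W K)^{-1} (W f + (y - pi))
        f = K @ np.linalg.solve(np.eye(n) + W[:, None] * K, b)
    return f, 1.0 / (1.0 + np.exp(-f))

X = np.concatenate([np.linspace(-2.0, -0.5, 10), np.linspace(0.5, 2.0, 10)])
y = np.concatenate([np.zeros(10), np.ones(10)])
f_mode, p = laplace_mode(X, y)
# Probabilities fall below 0.5 on the 0-labelled cluster and above on the 1s.
```

Because the log posterior is concave for the logistic likelihood, this Newton iteration converges to the unique mode; an OvR multiclass classifier would run one such fit per class.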
### 3.4 Heteroscedastic GP regression
Noise depends on the input:

\[
y = f(x) + \varepsilon(x), \qquad \varepsilon(x) \sim \mathcal{N}\!\big(0,\, \sigma_n^2(x)\big)
\]
The implementation alternates between:
- mean GP fit
- noise GP fit on \(\log((y-\hat{f})^2 + \delta)\)
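That alternation can be sketched with two plain GP smoothers (a toy illustration under assumed hyperparameters, not the deepuq `HeteroscedasticGaussianProcessRegressor`; note that \(\log((y-\hat f)^2 + \delta)\) is a biased but monotone estimate of \(\log \sigma_n^2(x)\)):

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(X1, X2, ls=0.3):
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / ls**2)

def smooth(X, y, noise_diag):
    # Posterior mean of a zero-mean GP, evaluated at the training inputs.
    K = rbf(X, X) + np.diag(noise_diag) + 1e-8 * np.eye(len(X))
    return rbf(X, X) @ np.linalg.solve(K, y)

X = np.linspace(0.0, 1.0, 200)
true_sigma = 0.05 + 0.4 * X                 # noise std grows with x
y = np.sin(2.0 * np.pi * X) + rng.normal(0.0, true_sigma)

delta = 1e-6
noise = np.full_like(X, 1e-2)               # initial homoscedastic variance
for _ in range(3):                          # alternate: mean fit, then noise fit
    f_hat = smooth(X, y, noise)
    log_r2 = np.log((y - f_hat) ** 2 + delta)
    m = log_r2.mean()                       # center before smoothing the logs
    noise = np.exp(smooth(X, log_r2 - m, np.full_like(X, 1.0)) + m) + 1e-4
```

After a few rounds, `noise` is markedly larger at high `x` than at low `x`, mirroring the true input-dependent noise.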
### 3.5 Multi-task ICM GP regression
Intrinsic coregionalization uses:

\[
k\big((x, t), (x', t')\big) = B_{t t'}\, k_{\text{input}}(x, x')
\]

Equivalent matrix form:

\[
K = B \otimes K_{\text{input}}
\]

where \(B\) is a learned PSD task-covariance matrix.
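The equivalence between the elementwise and Kronecker forms is easy to check numerically (a sketch with a random PSD stand-in for the learned \(B\)):

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(X1, X2, ls=1.0):
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / ls**2)

A = rng.normal(size=(2, 2))
B = A @ A.T + 0.1 * np.eye(2)    # PSD task covariance (random stand-in)
X = np.linspace(0.0, 1.0, 4)
K_input = rbf(X, X)

# Full multi-task covariance over (task, input) pairs, shape (2*4, 2*4).
K_icm = np.kron(B, K_input)

# The (t, t') block is B[t, t'] * K_input, matching the elementwise form.
block_01 = K_icm[0:4, 4:8]
```

Because both \(B\) and \(K_{\text{input}}\) are PSD, their Kronecker product is PSD as well, so `K_icm` is a valid joint covariance.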
### 3.6 Spectral mixture GP regression
The spectral mixture kernel approximates stationary kernels using Gaussian mixtures in the spectral domain (1-D form, \(\tau = x - x'\)):

\[
k_{\text{SM}}(\tau) = \sum_{q=1}^{Q} w_q \exp\!\big(-2\pi^2 \tau^2 v_q\big) \cos\!\big(2\pi \mu_q \tau\big)
\]

with weights \(w_q\), spectral means \(\mu_q\), and spectral variances \(v_q\).
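The 1-D form is simple to evaluate directly (a sketch, not the deepuq `SpectralMixtureKernel`; the component parameters are arbitrary):

```python
import numpy as np

def spectral_mixture(tau, weights, means, variances):
    # k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi mu_q tau)
    tau = np.asarray(tau, dtype=float)
    k = np.zeros_like(tau)
    for w, mu, v in zip(weights, means, variances):
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * mu * tau)
    return k

# One component at spectral mean mu = 2 gives a quasi-periodic kernel with
# period 1/2; the Gaussian envelope (v = 0.01) damps it slowly with lag.
tau = np.linspace(0.0, 2.0, 401)
k = spectral_mixture(tau, weights=[1.0], means=[2.0], variances=[0.01])
```

Adding components with different \(\mu_q\) superimposes multiple (quasi-)periodicities, which is what makes the kernel suited to multi-frequency signals.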
### 3.7 Deep kernel GP regression
A feature map \(\phi_\psi(x)\) from an MLP is composed with an RBF GP head:

\[
k_{\text{deep}}(x, x') = k_{\text{RBF}}\big(\phi_\psi(x),\, \phi_\psi(x')\big)
\]

The parameters of \(\phi_\psi\) and the GP hyperparameters are optimized jointly by maximizing the marginal likelihood.
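The composition can be sketched with an untrained MLP standing in for \(\phi_\psi\) (random weights for illustration; in the real model they are fit jointly with the GP hyperparameters):

```python
import numpy as np

rng = np.random.default_rng(2)

def mlp_features(X, W1, b1, W2, b2):
    # Two-layer MLP feature map phi_psi(x); tanh keeps features bounded.
    return np.tanh(np.tanh(X @ W1 + b1) @ W2 + b2)

def deep_rbf(X1, X2, params, ls=1.0):
    # Standard RBF kernel applied in the learned feature space.
    P1 = mlp_features(X1, *params)
    P2 = mlp_features(X2, *params)
    d2 = ((P1[:, None, :] - P2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

d_in, d_hidden, d_feat = 3, 8, 2
params = (rng.normal(size=(d_in, d_hidden)), np.zeros(d_hidden),
          rng.normal(size=(d_hidden, d_feat)), np.zeros(d_feat))
X = rng.normal(size=(5, d_in))
K = deep_rbf(X, X, params)   # symmetric PSD kernel matrix on the raw inputs
```

Any deterministic feature map composed with a valid kernel yields a valid kernel, so `K` is PSD regardless of the MLP weights.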
## 4) Kernel Support
Deep-UQ GP kernels include:
- `RBFKernel` (scalar or ARD lengthscale)
- `MaternKernel` (`nu=1.5` or `2.5`)
- `RationalQuadraticKernel`
- `PeriodicKernel`
- `LinearKernel`
- `SpectralMixtureKernel`
- `SumKernel` via `k1 + k2`
- `ProductKernel` via `k1 * k2`
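Sum and product composition is valid because both operations preserve positive semi-definiteness; a quick numpy check (standalone RBF and periodic kernels mirroring the `k1 + k2` and `k1 * k2` semantics, not the deepuq classes):

```python
import numpy as np

def rbf(X1, X2, ls=1.0):
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / ls**2)

def periodic(X1, X2, period=1.0, ls=1.0):
    d = np.abs(X1[:, None] - X2[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / ls**2)

X = np.linspace(0.0, 3.0, 30)
K_sum = rbf(X, X) + periodic(X, X)             # analogue of k1 + k2
K_prod = rbf(X, X, ls=2.0) * periodic(X, X)    # analogue of k1 * k2
```

The product case follows from the Schur product theorem: the elementwise product of PSD matrices is PSD, which is why multiplying a broad RBF by a periodic kernel is a common way to build locally periodic models.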
## 5) UQResult Field Mapping
| Model Type | mean | epistemic_var | aleatoric_var | total_var | probs | probs_var |
|---|---|---|---|---|---|---|
| Regression GPs | Posterior mean | Latent posterior variance | Noise term (constant or input-dependent) | Sum of epi + alea | None | None |
| Classification GPs | Probability mean tensor | None | None | None | Class probabilities | Probability variance proxy |
## 6) Practical Notes
- Exact GP gives strongest calibration for small/medium datasets.
- Sparse GP is preferred when \(N\) grows and exact \(\mathcal{O}(N^3)\) cost is too high.
- Heteroscedastic GP is useful when sensor noise varies by operating regime.
- Multi-task ICM helps when outputs are correlated.
- Spectral mixture kernels help with multi-frequency or quasi-periodic signals.
- Deep kernel GP helps when raw input space is not kernel-friendly.
## 7) References
- Rasmussen, C. E., & Williams, C. K. I. (2006). *Gaussian Processes for Machine Learning*. MIT Press.
- Titsias, M. (2009). Variational Learning of Inducing Variables in Sparse Gaussian Processes. AISTATS (PMLR 5).
- Hensman, J., Fusi, N., & Lawrence, N. D. (2013). Gaussian Processes for Big Data. UAI.
- Williams, C. K. I., & Barber, D. (1998). Bayesian Classification with Gaussian Processes. IEEE TPAMI, 20(12), 1342-1351. DOI: 10.1109/34.735807
- Álvarez, M. A., Rosasco, L., & Lawrence, N. D. (2012). Kernels for Vector-Valued Functions: A Review. Foundations and Trends in Machine Learning, 4(3), 195-266. DOI: 10.1561/2200000036
- Wilson, A. G., & Adams, R. P. (2013). Gaussian Process Kernels for Pattern Discovery and Extrapolation. ICML (PMLR).
- Wilson, A. G., Hu, Z., Salakhutdinov, R., & Xing, E. P. (2016). Deep Kernel Learning. AISTATS (PMLR).