---
title: "PAC-Bayesian Generalization Bounds"
created: 2026-06-17
updated: 2026-06-17
type: concept
tags: [theory, generalization, pac-learning, bayesian]
sources: [raw/papers/ortega-phd-thesis-2026.md]
confidence: high
---
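# PAC-Bayesian Generalization Bounds
PAC-Bayesian bounds are the core tool of the theoretical framework in [[ortega-phd-thesis|Ortega (2026)]]: they bound the generalization error in terms of the **KL divergence between a prior distribution P and a posterior distribution Q**.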
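## Standard form
```
E_Q[L_test] ≤ E_Q[L_train] + sqrt( (KL(Q||P) + log(n/δ)) / (2n) )
```
- Q: the learned posterior distribution over the hypothesis space
- P: the prior distribution, chosen independently of the data
- KL(Q||P): penalizes how far the posterior moves away from the prior
- n, δ: the number of training samples and the confidence parameter; the bound holds with probability at least 1 − δ over the draw of the training set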
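
As a rough numerical illustration of the standard form, the sketch below evaluates the right-hand side of the bound, reading the square root as covering the whole ratio (KL(Q||P) + log(n/δ)) / (2n). The function name `pac_bayes_bound` and all input values are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
import math


def pac_bayes_bound(train_loss: float, kl: float, n: int, delta: float) -> float:
    """Evaluate E_Q[L_train] + sqrt((KL(Q||P) + log(n/delta)) / (2n)).

    train_loss: expected training loss under the posterior Q
    kl:         KL(Q||P), assumed to be computed elsewhere
    n:          number of training samples
    delta:      confidence parameter (bound holds with prob. >= 1 - delta)
    """
    if n <= 0 or not (0.0 < delta < 1.0) or kl < 0.0:
        raise ValueError("need n > 0, 0 < delta < 1, and kl >= 0")
    complexity = (kl + math.log(n / delta)) / (2 * n)
    return train_loss + math.sqrt(complexity)


# Hypothetical numbers: same training loss and KL, two sample sizes.
bound_small_n = pac_bayes_bound(train_loss=0.05, kl=50.0, n=1_000, delta=0.05)
bound_large_n = pac_bayes_bound(train_loss=0.05, kl=50.0, n=100_000, delta=0.05)
print(f"n=1e3: {bound_small_n:.3f}, n=1e5: {bound_large_n:.3f}")
```

With the KL term held fixed, the complexity term shrinks roughly like 1/sqrt(n), so the bound tightens as more data is available.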
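## Generalizations in the thesis
Ortega combines the PAC-Bayesian framework with **large deviation theory** to derive PAC-Chernoff bounds:
- **Non-asymptotic**: does not rely on the n→∞ limit
- **Valid in the interpolation regime**: still gives non-trivial bounds when L_train ≈ 0
- **Distribution-dependent**: does not assume i.i.d. data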
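## A PAC-Bayesian reading of three generalization mechanisms
| Mechanism | PAC-Bayesian interpretation |
|------|------------------|
| Diversity | Independent predictors reduce the posterior variance |
| Smoothness | Flat minima → small KL penalty |
| Stochasticity | SGD noise → implicit prior |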
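
The "flat minima → small KL penalty" row can be made concrete with a small sketch. It assumes isotropic Gaussian prior and posterior, and it assumes that a flatter minimum lets the posterior use a larger standard deviation without raising the training loss. This is a common textbook argument and not necessarily the construction used in the thesis. The function name `kl_isotropic_gaussians` and all numbers are hypothetical.

```python
import math


def kl_isotropic_gaussians(sq_dist: float, dim: int,
                           sigma_q: float, sigma_p: float) -> float:
    """KL( N(mu_q, sigma_q^2 I) || N(mu_p, sigma_p^2 I) ) in `dim` dimensions.

    sq_dist is the squared Euclidean distance ||mu_q - mu_p||^2.
    """
    ratio = (sigma_q / sigma_p) ** 2
    return 0.5 * (dim * ratio + sq_dist / sigma_p ** 2 - dim - dim * math.log(ratio))


# Hypothetical setting: same distance from the prior mean, different posterior widths.
dim, sq_dist, sigma_p = 1_000, 10.0, 1.0
kl_sharp = kl_isotropic_gaussians(sq_dist, dim, sigma_q=0.01, sigma_p=sigma_p)
kl_flat = kl_isotropic_gaussians(sq_dist, dim, sigma_q=0.5, sigma_p=sigma_p)
print(f"sharp minimum: KL = {kl_sharp:.1f}, flat minimum: KL = {kl_flat:.1f}")
```

Under these assumptions the wider posterior sits closer to the prior in KL, which is the sense in which flatness lowers the complexity term of the bound.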
## References
- [[generalization-bounds|Generalization bounds]]
- [[double-descent|Double descent]]
- [[amortized-variational-inference|Variational inference]]
- [[ortega-phd-thesis|Thesis]]