---
title: "Double Descent"
created: 2026-06-17
updated: 2026-06-17
type: concept
tags: [theory, generalization, deep-learning, phenomena]
sources: [raw/papers/ortega-phd-thesis-2026.md]
confidence: high
---
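# Double Descent
Double descent is an empirical phenomenon in deep learning: **as model complexity increases, test error first falls, then rises, then falls again**, which contradicts the intuition behind the classical U-shaped curve. [[ortega-phd-thesis|Ortega (2026)]] gives a quantitative explanation via PAC-Chernoff bounds.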
## Three phases
```
Test error
| ↓ (classical)
| ↑ (interpolation threshold)
| ↓ (overparameterized regime)
└──────────────→ Model complexity
```
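1. **Classical regime**: the bias-variance tradeoff applies; test error reaches the minimum of the U-shaped curve.
2. **Interpolation threshold**: the model just barely fits the training data → test error peaks.
3. **Overparameterized regime**: past the interpolation threshold → test error falls again.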
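The three phases can be reproduced in a toy setting. The sketch below is not taken from the thesis: it assumes a linear model with isotropic Gaussian inputs and fits minimum-norm least squares on the first `p` of `d` features, so `p` plays the role of model complexity and `p = n` is the interpolation threshold. The function name `double_descent_curve` and all parameter values are illustrative choices.

```python
import numpy as np


def double_descent_curve(n=40, d=200, sigma=0.5,
                         ps=(5, 10, 20, 40, 80, 200),
                         trials=20, seed=0):
    """Average test risk of min-norm least squares on the first p of d features."""
    rng = np.random.default_rng(seed)
    beta = rng.normal(size=d)
    beta /= np.linalg.norm(beta)            # true coefficients, unit norm
    risks = {}
    for p in ps:
        total = 0.0
        for _ in range(trials):
            X = rng.normal(size=(n, d))     # isotropic Gaussian inputs
            y = X @ beta + sigma * rng.normal(size=n)
            # lstsq returns the minimum-norm solution when p > n
            coef, *_ = np.linalg.lstsq(X[:, :p], y, rcond=None)
            beta_hat = np.zeros(d)
            beta_hat[:p] = coef             # unused features get weight 0
            # exact expected squared error on a fresh isotropic test point
            total += np.sum((beta_hat - beta) ** 2) + sigma ** 2
        risks[p] = total / trials
    return risks


if __name__ == "__main__":
    for p, risk in double_descent_curve().items():
        print(f"p={p:4d}  test risk={risk:.3f}")
```

Under these assumptions the risk should spike around `p = n = 40` and come back down for larger `p`; the exact numbers depend on the seed and on the noise level `sigma`.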
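## PAC-Chernoff explanation
Traditional bounds break down in the interpolation regime: L_train ≈ 0 → bound ≈ ∞. Ortega's large-deviation bound:
- **Non-asymptotic**: does not assume n→∞
- **Rate function**: captures the local geometry of the loss landscape
- At the interpolation point: the denominator of the rate function I(0) is the model's flexibility
- After overparameterization: more flexibility → larger rate function → tighter bound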
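To make the term "rate function" concrete, here is a minimal sketch of the generic Cramér-Chernoff construction that large-deviation bounds build on. It is not the thesis's exact theorem. It assumes a 0-1 loss that is Bernoulli(p) under the data distribution, so the population risk is L = p, and bounds P(L − L_train ≥ a) ≤ exp(−n·I(a)) with I(a) = sup_λ [λa − J(λ)] and J(λ) = ln E[exp(λ(L − ℓ))]. For this loss the closed form is I(a) = KL(p − a ‖ p), which the code uses as a check. All function names are hypothetical.

```python
import numpy as np


def rate_function(a, p, lambdas=None):
    """Numerical Cramer transform I(a) = sup_lambda [lambda*a - J(lambda)]
    for a Bernoulli(p) loss with population risk L = p."""
    if lambdas is None:
        lambdas = np.linspace(0.0, 20.0, 20001)   # grid over lambda >= 0
    # J(lambda) = ln E[exp(lambda * (L - loss))], loss ~ Bernoulli(p)
    J = lambdas * p + np.log(1.0 - p + p * np.exp(-lambdas))
    return float(np.max(lambdas * a - J))


def kl_bernoulli(q, p):
    """KL(Bernoulli(q) || Bernoulli(p)) for 0 < q < 1 and 0 < p < 1."""
    return float(q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p)))


def chernoff_bound(a, p, n):
    """Upper bound on P(L - L_train >= a) from n i.i.d. samples."""
    return float(np.exp(-n * rate_function(a, p)))


if __name__ == "__main__":
    p, a = 0.3, 0.1
    print("numerical I(a):", rate_function(a, p))
    print("closed form   :", kl_bernoulli(p - a, p))
    for n in (50, 200, 1000):
        print(f"n={n:5d}  bound={chernoff_bound(a, p, n):.3e}")
```

A larger rate function at the same gap `a` makes `exp(-n * I(a))` smaller, which is the sense in which a growing rate function tightens the bound.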
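## Relation to the three generalization mechanisms
- **Smoothness**: flat minima → steeper rate function → tighter bound
- **Diversity**: ensembling → smaller effective variance
- **Stochasticity**: SGD noise → natural bias toward flat minima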
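The diversity point can be illustrated with a small simulation. This is a sketch under strong assumptions, not the thesis's analysis: each of `m` ensemble members has a unit-variance prediction error, and every pair of members shares the same correlation `rho`. The variance of the averaged error is then `rho + (1 - rho) / m`, so averaging helps most when members are diverse (small `rho`). The name `ensemble_error_variance` is hypothetical.

```python
import numpy as np


def ensemble_error_variance(m, rho, n_samples=200_000, seed=0):
    """Empirical variance of the mean of m unit-variance errors
    with pairwise correlation rho (0 <= rho <= 1)."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=(n_samples, 1))    # component common to all members
    private = rng.normal(size=(n_samples, m))   # member-specific component
    errors = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * private
    return float(errors.mean(axis=1).var())


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        empirical = ensemble_error_variance(10, rho)
        theory = rho + (1.0 - rho) / 10
        print(f"rho={rho:.1f}  empirical={empirical:.4f}  theory={theory:.4f}")
```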
## References
- [[generalization-bounds|Generalization bounds]]
- [[pac-bayesian-bounds|PAC-Bayesian bounds]]
- [[ortega-phd-thesis|Thesis]]