20260617: 914 pages so far
49
concepts/double-descent.md
Normal file
@@ -0,0 +1,49 @@
---
title: "Double Descent (双下降)"
created: 2026-06-17
updated: 2026-06-17
type: concept
tags: [theory, generalization, deep-learning, phenomena]
sources: [raw/papers/ortega-phd-thesis-2026.md]
confidence: high
---
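
# Double Descent (双下降)

Double descent is the empirical phenomenon in deep learning where **test error first falls, then rises, then falls again as model complexity increases**, contradicting the intuition of the classical U-shaped curve. [[ortega-phd-thesis|Ortega (2026)]] gives a quantitative explanation via PAC-Chernoff bounds.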

## Three regimes

```
Test error
↑
|  ↓ (classical)
|      ↑ (interpolation threshold)
|          ↓ (overparameterized regime)
└──────────────→ Model complexity
```
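
1. **Classical regime**: bias-variance trade-off; test error reaches the bottom of the U-shaped curve
2. **Interpolation threshold**: the model just barely fits the training data exactly → test error peaks
3. **Overparameterized regime**: past the interpolation threshold → test error falls again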
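
The three regimes can be reproduced in a small simulation. The sketch below is not taken from the thesis: it assumes a linear model with isotropic Gaussian features, fitted by minimum-norm least squares, and treats the number of features used (`p`) as the model complexity. The function name and all parameter values are illustrative.

```python
import numpy as np

def double_descent_curve(n_train=20, n_total=60, noise_std=0.5,
                         n_trials=100, seed=0):
    """Median test error of min-norm least squares using the first p features.

    Assumed setup (illustrative only): y = x @ beta + noise, with x drawn
    from a standard Gaussian in n_total dimensions and ||beta|| = 1.
    """
    rng = np.random.default_rng(seed)
    errors = np.zeros((n_trials, n_total))
    for t in range(n_trials):
        beta = rng.normal(size=n_total)
        beta /= np.linalg.norm(beta)
        X = rng.normal(size=(n_train, n_total))
        y = X @ beta + noise_std * rng.normal(size=n_train)
        for p in range(1, n_total + 1):
            # The pseudo-inverse gives ordinary least squares when p < n_train
            # and the minimum-norm interpolating solution when p >= n_train.
            beta_hat = np.zeros(n_total)
            beta_hat[:p] = np.linalg.pinv(X[:, :p]) @ y
            # For isotropic Gaussian test inputs, the expected squared
            # prediction error equals ||beta_hat - beta||^2 + noise variance.
            errors[t, p - 1] = np.sum((beta_hat - beta) ** 2) + noise_std ** 2
    return np.median(errors, axis=0)

if __name__ == "__main__":
    curve = double_descent_curve()
    peak_p = int(np.argmax(curve)) + 1
    print("error peaks at p =", peak_p, "(training set size is 20)")
    print("error with all 60 features:", curve[-1])
```

In this setup the interpolation threshold sits at `p = n_train`, so the peak of the curve is expected near that value, with lower error on both sides. The median over trials is used because the error near the threshold is heavy-tailed.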

## PAC-Chernoff explanation

Traditional bounds break down in the interpolation regime (L_train ≈ 0 → bound ≈ ∞). Ortega's large-deviation bound:
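
- **Non-asymptotic**: does not assume n→∞
- **Rate function**: captures the local geometry of the loss landscape
- At the interpolation point: the denominator of the rate function I(0) is the model's flexibility
- In the overparameterized regime: more flexibility → larger rate function → tighter bound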
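
For orientation, the textbook Cramér-Chernoff form of a rate-function bound is sketched here. This is an assumption about the general shape of such a result, not a transcription of the thesis: the symbols $\ell$, $z$, $J_\theta$, $I_\theta$ and $\delta$, and the restriction to a single fixed model $\theta$ with i.i.d. training samples, are introduced only for illustration.

$$
J_\theta(\lambda) = \ln \mathbb{E}_z\!\left[e^{\lambda\,(L(\theta) - \ell(\theta; z))}\right],
\qquad
I_\theta(a) = \sup_{\lambda > 0}\,\bigl\{\lambda a - J_\theta(\lambda)\bigr\}
$$

$$
\Pr\!\left(L(\theta) - L_{\text{train}}(\theta) \ge a\right) \le e^{-n\, I_\theta(a)}
\quad\Longrightarrow\quad
L(\theta) \le L_{\text{train}}(\theta) + I_\theta^{-1}\!\left(\tfrac{1}{n}\ln\tfrac{1}{\delta}\right)
\ \text{with probability at least } 1-\delta
$$

The inequality holds for every finite $n$, and a pointwise larger $I_\theta$ gives a smaller $I_\theta^{-1}$, which is the sense in which a larger or steeper rate function tightens the bound.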
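
## Relation to the three generalization mechanisms

- **Smoothness**: flat minima → steeper rate function → tighter bound
- **Diversity**: ensembling → lower effective variance
- **Stochasticity**: SGD noise → natural bias toward flat minima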
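
The diversity point can be checked numerically in an idealized case. The sketch below assumes ensemble members whose prediction errors are independent with equal variance, so that averaging `n_members` of them divides the variance by `n_members`. Real ensemble members are usually correlated, so the reduction in practice is smaller. The function name and parameters are hypothetical.

```python
import numpy as np

def ensemble_variance(n_members, n_repeats=20000, member_std=1.0, seed=0):
    """Empirical variance of the average of n_members independent predictors.

    Each predictor's error is modelled as zero-mean Gaussian noise with
    standard deviation member_std (an idealizing assumption).
    """
    rng = np.random.default_rng(seed)
    member_errors = rng.normal(0.0, member_std, size=(n_repeats, n_members))
    ensemble_errors = member_errors.mean(axis=1)
    return ensemble_errors.var()

if __name__ == "__main__":
    for m in (1, 5, 10):
        print(m, "members -> variance", ensemble_variance(m))
```

Under these assumptions the printed variance should track `member_std**2 / n_members` closely.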

## References

- [[generalization-bounds|Generalization bounds]]
- [[pac-bayesian-bounds|PAC-Bayesian bounds]]
- [[ortega-phd-thesis|Thesis]]