---
title: "KL Order"
created: 2026-06-10
updated: 2026-06-10
type: concept
tags: ["singular-learning-theory", "information-geometry", "bridge-invariant"]
sources: ["[[dead-directions-geometric-singular-learning]]"]
---

# KL Order

The **KL order** k is the rate at which the KL divergence goes to zero along a path theta(t) that approaches the singular set at t = 0 along a [[dead-direction|Dead Direction]]:

```
K(theta(t)) = c · t^{2k} + O(t^{2k+1})
```

The KL divergence therefore has a zero of order 2k at t = 0, twice the KL order.
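
The definition suggests a simple numerical check: evaluate K along a path for small t and read 2k off the slope of a log-log fit. The sketch below does this for a toy model that is an assumption of this note, not taken from the source: a unit-variance Gaussian regression y ~ N(a · tanh(b · x), 1) with the zero function as truth, so that K = 0.5 · E[(a · tanh(b · x))^2]. The function names, the input grid, and the two paths are hypothetical, and whether either path counts as a Dead Direction in the source's sense is not checked here.

```python
import numpy as np

def kl_to_truth(a, b, x):
    """KL divergence between the truth y ~ N(0, 1) and the toy model
    y ~ N(a * tanh(b * x), 1), averaged over the inputs x.
    For unit-variance Gaussians this is half the mean squared mean gap."""
    return 0.5 * np.mean((a * np.tanh(b * x)) ** 2)

def estimate_kl_order(path, ts, x):
    """Fit log K(theta(t)) against log t; the slope estimates 2k."""
    ks = np.array([kl_to_truth(*path(t), x) for t in ts])
    slope, _ = np.polyfit(np.log(ts), np.log(ks), 1)
    return slope / 2.0

x = np.linspace(-1.0, 1.0, 201)   # fixed input grid (assumption)
ts = np.logspace(-3, -2, 20)      # small t, close to the singular set

k_diag = estimate_kl_order(lambda t: (t, t), ts, x)    # a and b shrink together
k_axis = estimate_kl_order(lambda t: (t, 1.0), ts, x)  # only a shrinks
print(f"estimated k along (t, t): {k_diag:.3f}")
print(f"estimated k along (t, 1): {k_axis:.3f}")
```

Along (t, t) the leading term of K is proportional to t^4, so the fit should land near k = 2; along (t, 1) it is proportional to t^2, giving k = 1. The fit is only meaningful when t is small enough that the O(t^{2k+1}) remainder is negligible.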

## Bridge Invariant

The KL order is one of the few invariants that can be computed in both [[singular-learning-theory|SLT]] and [[information-geometry|information geometry]]:

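- **SLT reading**: the exponent in the normal crossing form, directly related to the [[real-log-canonical-threshold|RLCT]] (lambda = 1/(2k))
- **Information-geometric reading**: it sets the rate at which the [[fisher-information-metric|Fisher metric]] degenerates (u^T F u ~ t^{2(k-1)})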
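
The relation lambda = 1/(2k) can be checked numerically in the simplest case. A standard characterization of the RLCT is the volume scaling V(eps) = vol{K < eps} ~ eps^lambda for small eps. The sketch below assumes the one-dimensional normal crossing form K(t) = t^{2k} on [0, 1], for which V(eps) = eps^{1/(2k)} exactly; the function name, the grid resolution, and the eps range are hypothetical choices made for illustration.

```python
import numpy as np

def volume_exponent(k, eps_values, n_grid=1_000_001):
    """Estimate lambda from V(eps) = vol{t in [0, 1] : t^(2k) < eps} ~ eps^lambda.
    The volume is approximated by the fraction of uniform grid points
    that satisfy K < eps."""
    t = np.linspace(0.0, 1.0, n_grid)
    K = t ** (2 * k)
    vols = np.array([np.mean(K < eps) for eps in eps_values])
    slope, _ = np.polyfit(np.log(eps_values), np.log(vols), 1)
    return slope

eps_values = np.logspace(-6, -3, 10)
for k in (1, 2, 3):
    lam = volume_exponent(k, eps_values)
    print(f"k = {k}: fitted lambda = {lam:.3f}, 1/(2k) = {1 / (2 * k):.3f}")
```

In higher dimensions the volume also picks up logarithmic factors and contributions from other coordinates, so this one-dimensional check illustrates the relation rather than a general recipe.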

## Relation to Dead Direction Fisher Decay

```
k = 1: Fisher decay rate 0 (regular direction)
k = 2: Fisher decay rate 2
k = 3: Fisher decay rate 4
```
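
The table follows the rule decay rate = 2(k - 1). As a rough numerical illustration of the k = 2 row, the sketch below uses a toy model that is an assumption of this note, not taken from the source: a unit-variance Gaussian regression y ~ N(a · tanh(b · x), 1), whose Fisher matrix is F = E_x[grad f grad f^T] with f = a · tanh(b · x). Along theta(t) = (t, t) the KL divergence to the zero function is proportional to t^4 at leading order, so k = 2 and the quadratic form u^T F u should decay like t^2. The function name and the input grid are hypothetical.

```python
import numpy as np

def directional_fisher(a, b, u, x):
    """u^T F u for y ~ N(a * tanh(b * x), 1), with F = E_x[grad f grad f^T]."""
    grad_a = np.tanh(b * x)                      # df/da
    grad_b = a * x / np.cosh(b * x) ** 2         # df/db
    directional = u[0] * grad_a + u[1] * grad_b  # u . grad f at each input
    return np.mean(directional ** 2)

x = np.linspace(-1.0, 1.0, 201)                  # fixed input grid (assumption)
ts = np.logspace(-3, -2, 20)                     # small t, close to the singular set
u = np.array([1.0, 1.0]) / np.sqrt(2.0)          # tangent of the path theta(t) = (t, t)

vals = np.array([directional_fisher(t, t, u, x) for t in ts])
decay_rate, _ = np.polyfit(np.log(ts), np.log(vals), 1)
print(f"fitted Fisher decay rate: {decay_rate:.3f}")
```

A fitted slope close to 2 is consistent with the k = 2 row; other rows would need a model whose KL divergence vanishes to a different order along the chosen path.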
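
## Computing from a Checkpoint

Shirodkar (2026) shows that the KL order can be computed with a single forward and backward pass, with no posterior sampling and no resolution of singularities. This brings SLT analysis of large-scale networks into practical reach for the first time.
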
## References
- [[dead-directions-geometric-singular-learning|Dead Directions]]
- [[dead-direction|Dead Direction]]
- [[real-log-canonical-threshold|RLCT]]
- [[watanabe-triple|Watanabe's Triple]]