20260625: lots of new content
60
concepts/state-space-models.md
Normal file
@@ -0,0 +1,60 @@
---
title: "State-Space Models"
created: 2026-06-18
updated: 2026-06-18
type: concept
tags: [ssm, recurrence, architecture, state-tracking]
sources:
- mozer-topological-trouble-transformers-2026
---
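
# State-Space Models

State-space models (SSMs) are a class of architectures that model sequences by **propagating a hidden state laterally** across time steps ([[step-recurrence|step-level recurrence]]). In the taxonomy of Mozer et al. (2026), they sit at the core of the step-recurrence axis.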

## Core form

Within each layer, an SSM maintains a hidden state that is carried from one time step to the next:
```
h_t = A * h_{t-1} + B * x_t    (state update)
y_t = C * h_t                  (output projection)
```
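
The two equations above can be unrolled as a plain loop. The following is a minimal sketch, not any particular library's implementation: it assumes a zero initial state `h_0 = 0`, fixed (input-independent) matrices, and hypothetical helper names `matvec` and `ssm_scan` chosen only for illustration.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def ssm_scan(A: Matrix, B: Matrix, C: Matrix,
             xs: Sequence[Sequence[float]]) -> List[Vector]:
    """Run h_t = A h_{t-1} + B x_t and y_t = C h_t over the inputs xs."""
    h = [0.0] * len(A)  # assumption: the state starts at zero
    ys = []
    for x in xs:
        Ah = matvec(A, h)
        Bx = matvec(B, x)
        h = [a + b for a, b in zip(Ah, Bx)]  # state update
        ys.append(matvec(C, h))              # output projection
    return ys


# Toy example: the first state dimension is a decaying sum (factor 0.5),
# the second is a plain running sum (factor 1.0).
A = [[0.5, 0.0], [0.0, 1.0]]
B = [[1.0], [1.0]]
C = [[1.0, 0.0], [0.0, 1.0]]
ys = ssm_scan(A, B, C, [[1.0], [1.0], [1.0]])
print(ys)
```

Only `h` is carried between iterations, so the memory needed to process the next input does not depend on how many inputs came before.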

## Main architectures

| Architecture | Key feature |
|------|------|
| **Linear attention** (Katharopoulos et al., 2020) | Kernelized attention; equivalent to a linear SSM |
| **Mamba** (Gu & Dao, 2024) | Input-dependent selective gating |
| **DeltaNet** (Schlag et al., 2021) | Delta-rule updates; fast weights |
| **RWKV-7** (Peng et al., 2025) | Linear attention + delta rule |
| **Canon Layers** (Allen-Zhu, 2025) | Lightweight layers that promote horizontal information flow across neighboring tokens |
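
To make the "kernelized attention is a linear SSM" row concrete, the sketch below computes causal linear attention twice: once in the quadratic form that revisits all earlier keys, and once as a recurrence over a fixed-size state. It uses the `elu(x) + 1` feature map from Katharopoulos et al. (2020); the function names and the toy inputs are assumptions made for illustration only.

```python
import math
from typing import List, Sequence

Vector = List[float]


def phi(v: Sequence[float]) -> Vector:
    """Feature map elu(x) + 1, which is strictly positive."""
    return [x + 1.0 if x > 0 else math.exp(x) for x in v]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_quadratic(qs, ks, vs) -> List[Vector]:
    """Causal kernelized attention: each step revisits all earlier keys."""
    out = []
    for t, q in enumerate(qs):
        fq = phi(q)
        weights = [dot(fq, phi(ks[s])) for s in range(t + 1)]
        norm = sum(weights)
        out.append([sum(w * vs[s][d] for s, w in enumerate(weights)) / norm
                    for d in range(len(vs[0]))])
    return out


def attention_recurrent(qs, ks, vs) -> List[Vector]:
    """Same computation as a recurrence over the state (S, z)."""
    dk, dv = len(ks[0]), len(vs[0])
    S = [[0.0] * dv for _ in range(dk)]  # S_t = S_{t-1} + phi(k_t) v_t^T
    z = [0.0] * dk                       # z_t = z_{t-1} + phi(k_t)
    out = []
    for q, k, v in zip(qs, ks, vs):
        fk = phi(k)
        for i in range(dk):
            z[i] += fk[i]
            for d in range(dv):
                S[i][d] += fk[i] * v[d]
        fq = phi(q)
        norm = dot(fq, z)
        out.append([sum(fq[i] * S[i][d] for i in range(dk)) / norm
                    for d in range(dv)])
    return out


qs = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
ks = [[0.2, 0.1], [-0.3, 0.5], [0.7, -0.4]]
vs = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0]]
out_quadratic = attention_quadratic(qs, ks, vs)
out_recurrent = attention_recurrent(qs, ks, vs)
```

The two outputs agree up to floating-point rounding. The recurrent version has the same shape as the state update in the core form, with the state transition fixed to the identity.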
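
## Expressivity limits

Key findings from Merrill et al. (2025):
- **SSMs with linear updates** are no more expressive than Transformers
- **Enhanced SSMs** (e.g. the negative-eigenvalue extension of DeltaNet, Grazzi et al., 2025) can exceed that limit
- Hybrids of gated linear attention and Transformers outperform either pure approach (Merrill et al., 2026)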
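
A toy illustration of why negative eigenvalues matter, using running parity as the state-tracking task. This is a hand-built scalar recurrence meant only to convey the intuition; it is not the construction used in the cited papers, and the names `run_diagonal_recurrence` and `parity_from_state` are hypothetical.

```python
from typing import List, Sequence


def run_diagonal_recurrence(bits: Sequence[int], a_zero: float, a_one: float,
                            h0: float = 1.0) -> List[float]:
    """Scalar recurrence h_t = a(x_t) * h_{t-1} with an input-dependent factor."""
    h = h0
    states = []
    for bit in bits:
        h = (a_one if bit else a_zero) * h
        states.append(h)
    return states


def parity_from_state(h: float) -> int:
    """Read parity off the sign: positive means an even number of ones so far."""
    return 0 if h > 0 else 1


bits = [1, 0, 1, 1, 0, 1]

# Transition factor -1 on a one: the sign of h flips on every one,
# so the state tracks parity over sequences of any length.
signed_states = run_diagonal_recurrence(bits, a_zero=1.0, a_one=-1.0)
parities = [parity_from_state(h) for h in signed_states]

# Factors restricted to [0, 1]: the state can only shrink and never changes
# sign, so this particular readout cannot recover parity.
positive_states = run_diagonal_recurrence(bits, a_zero=1.0, a_one=0.5)

print(parities)
print(positive_states)
```

With the factor `-1` available, one scalar of state is enough to track parity. With factors confined to `[0, 1]`, the state stays positive, so the sign-based readout always reports even parity.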
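
## Strengths and limitations

**Strengths**:
- O(1) memory in sequence length at inference: the fixed-size state replaces a KV cache that grows with the sequence
- Parallelizable at training time via an associative scan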
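
Both strengths can be seen in a scalar sketch of the recurrence `h_t = a_t * h_{t-1} + b_t`. Each step is an affine map, and composing affine maps is associative, so all prefix states can be computed in a logarithmic number of rounds. The code below is a simplified illustration under those assumptions (scalar state, a simple doubling scan, hypothetical function names), not a tuned kernel.

```python
from typing import List, Sequence, Tuple

Affine = Tuple[float, float]  # (a, b) stands for the map h -> a * h + b


def combine(earlier: Affine, later: Affine) -> Affine:
    """Compose two steps: apply `earlier` first, then `later`."""
    a1, b1 = earlier
    a2, b2 = later
    return (a2 * a1, a2 * b1 + b2)


def sequential_states(a: Sequence[float], b: Sequence[float],
                      h0: float = 0.0) -> List[float]:
    """Inference-style loop: only the current h is kept in memory."""
    h = h0
    out = []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def scan_states(a: Sequence[float], b: Sequence[float],
                h0: float = 0.0) -> List[float]:
    """Training-style inclusive scan with about log2(n) rounds."""
    elems: List[Affine] = list(zip(a, b))
    offset = 1
    while offset < len(elems):
        # Within one round every position is independent of the others,
        # so parallel hardware could compute the whole list at once.
        elems = [combine(elems[i - offset], elems[i]) if i >= offset else elems[i]
                 for i in range(len(elems))]
        offset *= 2
    # elems[t] is now the composition of steps 0..t; apply it to h0.
    return [a_t * h0 + b_t for a_t, b_t in elems]


a = [0.5, 2.0, -1.0, 0.25, 1.0]
b = [1.0, 0.0, 3.0, -2.0, 0.5]
seq = sequential_states(a, b)
par = scan_states(a, b)
print(seq)
print(par)
```

The sequential loop is what runs at inference time and needs constant memory. The scan yields the same states while exposing parallelism across the sequence during training.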

**Limitations**:
- The standard form cannot perform unbounded state tracking
- Selective gating (Mamba) adds expressive power, but that power remains limited

## References

- [[enhanced-state-space-models|Enhanced state-space models]]
- [[step-recurrence|Step-level recurrence]]
- [[state-tracking|State tracking]]
- [[feedforward-depth-limitation|Feedforward depth limitation]]
- [[mozer-topological-trouble-transformers-2026|Topological Trouble With Transformers]]
- [[gu-mamba|Mamba paper]] (Gu & Dao, 2024)
- [[dao-transformers-are-ssms-2024|Transformers are SSMs (Mamba-2)]] (Dao & Gu, 2024)
- [[selective-state-space-models|Selective state-space models]]
- [[selective-state-space|Selection mechanism (S6)]]