20260625: lots of new content
40
concepts/coarse-grained-recurrence.md
Normal file
@@ -0,0 +1,40 @@
---
title: "Coarse-Grained Recurrence"
created: 2026-06-18
updated: 2026-06-18
type: concept
tags: [transformers, recurrence, efficiency, chunking]
sources:
- mozer-topological-trouble-transformers-2026
---
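
# Coarse-Grained Recurrence

Coarse-grained recurrence is one of the promising directions proposed by Mozer et al. (2026): introduce recurrence at a **granularity coarser than a single token**, reducing the computational burden of token-level recurrence.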
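
## Core Idea

Per-token state updates (the standard RNN approach) create a **computational bottleneck**, because every token must be processed serially. Coarse-grained recurrence instead **compresses tokens into groups** and updates the state once per group, seeking a balance between efficiency and state tracking.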
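
To make the bottleneck concrete, the sketch below counts sequential state updates when the state is updated once per chunk instead of once per token. The function name `serial_steps`, the sequence length 4096, and the chunk size 512 are illustrative assumptions, not values from Mozer et al. (2026).

```python
import math


def serial_steps(num_tokens: int, chunk_size: int = 1) -> int:
    """Count sequential state updates when the state is updated once per chunk.

    chunk_size=1 corresponds to token-level recurrence (one update per token).
    """
    if num_tokens < 0 or chunk_size < 1:
        raise ValueError("num_tokens must be >= 0 and chunk_size must be >= 1")
    # The last chunk may be shorter than chunk_size, hence the ceiling.
    return math.ceil(num_tokens / chunk_size)


# Hypothetical sizes, chosen only for illustration.
token_level = serial_steps(4096)                  # 4096 sequential updates
block_level = serial_steps(4096, chunk_size=512)  # 8 sequential updates
```

Under these assumed sizes the serial depth drops from 4096 to 8 updates. The work inside each chunk still has to be done, but it no longer has to be done one token at a time.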

## Implementations

### Block-Recurrent

- **Block-Recurrent Transformers** (Hutchins et al., 2022): process fixed-length blocks of tokens in parallel and pass a compressed memory recurrently from one block to the next
- **Chevalier et al. (2023)**: block-level autoregressive training
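
The control flow of block-level recurrence can be sketched with a toy example. This is not the architecture of Hutchins et al. (2022): tokens are plain floats, the "memory" is a single scalar, and the update rule (an equal blend of the old memory and the block mean) is an assumption made purely for illustration. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_into_blocks(tokens: Sequence[float], block_size: int) -> List[List[float]]:
    """Cut the sequence into consecutive fixed-length blocks (the last may be shorter)."""
    if block_size < 1:
        raise ValueError("block_size must be >= 1")
    return [list(tokens[i:i + block_size]) for i in range(0, len(tokens), block_size)]


def process_block(block: List[float], memory: float) -> Tuple[List[float], float]:
    """Process one block given the memory carried over from the previous block."""
    # Each output depends only on its own token and the incoming memory,
    # so the per-token computations are independent and could run in parallel.
    outputs = [x + memory for x in block]
    # Toy compression (an assumption): summarise the block by its mean and
    # blend that summary with the old memory.
    summary = sum(block) / len(block)
    new_memory = 0.5 * memory + 0.5 * summary
    return outputs, new_memory


def block_recurrent_pass(
    tokens: Sequence[float], block_size: int, memory: float = 0.0
) -> Tuple[List[float], float, int]:
    """Run the recurrence over blocks; return outputs, final memory, and serial steps."""
    outputs: List[float] = []
    steps = 0
    for block in split_into_blocks(tokens, block_size):
        block_out, memory = process_block(block, memory)  # one recurrent step per block
        outputs.extend(block_out)
        steps += 1
    return outputs, memory, steps


outputs, final_memory, steps = block_recurrent_pass([1.0, 2.0, 3.0, 4.0], block_size=2)
```

The only sequential dependency is the `memory` value handed from one block to the next, so four tokens with a block size of 2 need two recurrent steps rather than four.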
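
### Chunking Driven by Linguistic Structure

- **Borazjanizadeh & McClelland (2025)**: sentence-level "thought" chunks, which model language as a sequence of discrete thoughts
- Sentence boundaries serve as natural dividers between recurrence steps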
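
As a rough illustration of using sentence boundaries as step dividers, the sketch below splits text with a naive regular expression and updates a state once per sentence. The splitter and the state update (a running word count) are placeholder assumptions; they are not the segmentation or the model used by Borazjanizadeh & McClelland (2025).

```python
import re
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_recurrence(text: str, state: int = 0) -> List[Tuple[str, int]]:
    """Update the state once per sentence instead of once per token."""
    trace = []
    for sentence in split_sentences(text):
        # Placeholder update (an assumption): accumulate the word count.
        # A real model would compress the sentence into a learned state.
        state = state + len(sentence.split())
        trace.append((sentence, state))
    return trace


trace = sentence_recurrence("Cats sleep. Dogs bark loudly! Why?")
```

Here the six words are consumed in three recurrent steps, one per sentence, and each step sees the state left behind by the previous sentence.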

## Advantages

- Fewer **serial steps** (token level → sentence or block level)
- Preserves **continuity of state propagation** (recurrence between blocks)
- Closer to the **concept-level** rhythm of human cognition

## References

- [[recurrence-taxonomy|Recurrence taxonomy]]
- [[step-recurrence|Step-level recurrence]]
- [[latent-thought-models|Latent thought models]]
- [[mozer-topological-trouble-transformers-2026]]