---
title: "Coarse-Grained Recurrence"
created: 2026-06-18
updated: 2026-06-18
type: concept
tags: [transformers, recurrence, efficiency, chunking]
sources:
- mozer-topological-trouble-transformers-2026
---
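# Coarse-Grained Recurrence
Coarse-grained recurrence is one of the promising directions proposed by Mozer et al. (2026): introducing recurrence at a **granularity coarser than a single token** to reduce the computational cost of token-level recurrence.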
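## Core Idea
Updating state token by token, as a standard RNN does, creates a **computational bottleneck**: every token must be processed serially. Coarse-grained recurrence **groups tokens and compresses each group**, trading off efficiency against state tracking.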
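## Implementations
### Block-Recurrent
- **Block-Recurrent Transformers** (Hutchins et al., 2022): process fixed-length blocks of tokens in parallel and pass a compressed memory recurrently from one block to the next
- **Chevalier et al. (2023)**: block-level autoregressive training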
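The block-recurrent pattern can be illustrated with a toy sketch. This is not the architecture from Hutchins et al. (2022): the scalar "tokens", the mean-based summary, the `decay` parameter, and the function names are all assumptions chosen only to show where the parallel and serial parts sit.

```python
from typing import List, Sequence, Tuple


def split_into_blocks(tokens: Sequence[float], block_len: int) -> List[List[float]]:
    """Split a sequence into consecutive fixed-length blocks (the last may be shorter)."""
    if block_len <= 0:
        raise ValueError("block_len must be positive")
    return [list(tokens[i:i + block_len]) for i in range(0, len(tokens), block_len)]


def block_recurrent_pass(
    tokens: Sequence[float], block_len: int, decay: float = 0.5
) -> Tuple[List[List[float]], float]:
    """Toy block-level recurrence (hypothetical, for illustration only).

    Returns the per-block outputs and the final compressed memory.
    """
    memory = 0.0  # compressed state carried from block to block
    outputs: List[List[float]] = []
    for block in split_into_blocks(tokens, block_len):
        # Within a block, every token reads the same incoming memory, so no
        # token depends on another token's result; this part could run in parallel.
        outputs.append([tok + memory for tok in block])
        # Between blocks, the block is compressed to one summary (here: its mean)
        # and folded into the memory. This is the only serial dependency.
        summary = sum(block) / len(block)
        memory = decay * memory + (1.0 - decay) * summary
    return outputs, memory


outputs, memory = block_recurrent_pass([1.0, 3.0, 5.0, 7.0, 9.0], block_len=2)
# outputs == [[1.0, 3.0], [6.0, 8.0], [12.5]], memory == 6.25
```

In a real model, the summary and memory update would be learned functions over vectors; the sketch only keeps the control flow.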
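### Chunking Driven by Linguistic Structure
- **Borazjanizadeh & McClelland (2025)**: sentence-level chunks treated as "thoughts", modeling language as a sequence of discrete thoughts
- Sentence boundaries serve as natural boundaries between recurrent steps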
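A minimal sketch of sentence boundaries acting as recurrent-step boundaries follows. It is not the method of Borazjanizadeh & McClelland (2025): the regex splitter is deliberately naive (it mis-splits abbreviations such as "e.g."), and the running token count is a placeholder for what would be a learned "thought" state. All function names are hypothetical.

```python
import re
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_level_recurrence(text: str) -> List[Tuple[str, int]]:
    """Take one recurrent step per sentence instead of one per token.

    The state here is a running whitespace-token count, a stand-in
    for a learned state update.
    """
    state = 0
    trace: List[Tuple[str, int]] = []
    for sentence in split_sentences(text):
        state += len(sentence.split())  # placeholder state update
        trace.append((sentence, state))
    return trace


trace = sentence_level_recurrence("The cat sat. It purred! Did it sleep?")
# 3 recurrent steps for 8 whitespace-separated tokens:
# [("The cat sat.", 3), ("It purred!", 5), ("Did it sleep?", 8)]
```

Unlike fixed-length blocks, the chunk length here varies with the text, so the number of serial steps depends on how the input is punctuated.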
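## Advantages
- Fewer **serial steps** (token level → sentence level or block level)
- Preserves **continuity of state propagation** (recurrence between blocks)
- Closer to the **concept-level** rhythm of human cognition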
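The reduction in serial steps is simple arithmetic: with chunks of fixed length, the number of recurrent steps is the sequence length divided by the chunk length, rounded up. The sequence and chunk lengths below are illustrative assumptions, not figures from the cited papers.

```python
def serial_steps(n_tokens: int, chunk_len: int) -> int:
    """Number of recurrent steps when state is updated once per chunk."""
    if n_tokens < 0 or chunk_len <= 0:
        raise ValueError("n_tokens must be >= 0 and chunk_len must be > 0")
    # Ceiling division using integers only.
    return (n_tokens + chunk_len - 1) // chunk_len


token_level = serial_steps(4096, 1)    # 4096 steps: one per token
block_level = serial_steps(4096, 512)  # 8 steps: one per 512-token block
```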
## References
- [[recurrence-taxonomy|Recurrence taxonomy]]
- [[step-recurrence|Step-level recurrence]]
- [[latent-thought-models|Latent thought models]]
- [[mozer-topological-trouble-transformers-2026]]