---
title: "Depth Recurrence"
created: 2026-06-18
updated: 2026-06-18
type: concept
tags: [transformers, recurrence, depth, inference-time-scaling]
sources:
- mozer-topological-trouble-transformers-2026
---
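# Depth Recurrence
Depth recurrence is the recurrence pattern along the **layer-depth axis** in the [[recurrence-taxonomy|recurrence taxonomy]]: activations flow back from deeper layers to shallower ones, forming a looped Transformer block (Mozer et al., 2026).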
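## Typical forms
These correspond to the unrolled pattern in Figure 5b of Mozer et al.:
- **Looped Transformer** (Giannou et al., 2023; Dehghani et al., 2019): a single layer or a group of layers is executed repeatedly
- **RINS** (Alabdulmohsin & Zhai, 2025): adaptive depth recurrence
- **Inference-time scaling**: Yang et al. (2024a), Chen et al. (2025b), Geiping et al. (2025), among others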
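
The Looped Transformer form above can be sketched in a few lines. This is a minimal illustration, not the method of any cited paper: `make_toy_block` is a hypothetical stand-in for a real Transformer block (a residual update with a `tanh` nonlinearity), and the weights are arbitrary. It shows that looping one block `num_loops` times gives the same result as unrolling it into a deeper stack whose layers share parameters.

```python
import math
from typing import Callable, List

Vector = List[float]
Block = Callable[[Vector], Vector]


def make_toy_block(weight: float, bias: float) -> Block:
    """Hypothetical stand-in for a Transformer block: residual + tanh."""
    def block(h: Vector) -> Vector:
        return [x + math.tanh(weight * x + bias) for x in h]
    return block


def looped_forward(h: Vector, block: Block, num_loops: int) -> Vector:
    """Depth recurrence: feed the block's output back into the same block."""
    for _ in range(num_loops):
        h = block(h)
    return h


def unrolled_forward(h: Vector, blocks: List[Block]) -> Vector:
    """Ordinary feed-forward stack: each layer is applied once, in order."""
    for block in blocks:
        h = block(h)
    return h


block = make_toy_block(weight=0.5, bias=0.1)  # arbitrary illustrative values
h0 = [0.0, 1.0, -1.0]

looped = looped_forward(h0, block, num_loops=4)
unrolled = unrolled_forward(h0, [block] * 4)  # same block, tied weights
```

Because `num_loops` is an ordinary argument, the same parameters can be run for more or fewer iterations at inference time, which is what makes this form a natural fit for inference-time scaling.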
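## Key limitation
Although depth recurrence increases expressive power (Saunshi et al., 2025), it **cannot achieve unbounded state tracking**:
> Because s(t+1) must sit at a higher layer than s(t). However many depth iterations are run, the state representation keeps shifting upward along the vertical (depth) axis.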
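
The constraint can be written out as a short bound. This is a sketch under stated assumptions, not a result quoted from the source: $\ell(t)$ (the layer at which $s(t)$ becomes available), $L$ (layers per looped block), and $K$ (number of loops) are notation introduced here for illustration.

$$
\ell(t+1) \ge \ell(t) + 1
\;\;\Longrightarrow\;\;
\ell(T) \ge \ell(0) + T ,
\qquad
\ell(T) \le K L
\;\;\Longrightarrow\;\;
T \le K L - \ell(0) \le K L .
$$

Under these assumptions, the number of sequential state updates that can be tracked is capped by the unrolled depth $KL$: adding loops raises the cap but never removes it.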
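## Applications
- **Inference-time compute scaling** (test-time compute scaling)
- **Fine-tuning adaptation**: a pretrained model plus depth-recurrent fine-tuning (Koishekenov et al., 2025)
- **Training-free looping**: purely inference-time methods that improve reasoning (Li et al., 2025b; Chen et al., 2026)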
## References
- [[step-recurrence|Step-level recurrence]]
- [[recurrence-taxonomy|Recurrence taxonomy]]
- [[coarse-grained-recurrence|Coarse-grained recurrence]]
- [[mozer-topological-trouble-transformers-2026]]