---
title: "Representational Alignment"
created: 2026-06-18
updated: 2026-06-18
type: concept
tags: [transformers, representation, residual-connections, depth]
sources:
- mozer-topological-trouble-transformers-2026
---
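# Representational Alignment
Representational alignment is a key architectural property identified by Mozer et al. (2026): the **residual connections** in a Transformer keep the representation spaces of different layers highly aligned, which facilitates cross-layer communication.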
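To make the claim concrete, here is a minimal toy sketch that compares the cosine similarity between the first and last hidden states of a stack with and without residual connections. It is not the analysis from Mozer et al.: the weights are random and untrained, and the names `run_stack`, `branch`, and `branch_scale` (a small multiplier on each layer's update) are assumptions introduced only for illustration.

```python
import math
import random

def random_matrix(rng, dim):
    """dim x dim matrix with N(0, 1/dim) entries (toy, untrained weights)."""
    std = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(dim)]

def branch(weights, h):
    """One toy layer branch: tanh(W h)."""
    return [math.tanh(sum(w * x for w, x in zip(row, h))) for row in weights]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def run_stack(h0, layers, residual, branch_scale=0.1):
    """Return all hidden states [h_0, ..., h_L] of the toy stack."""
    states = [h0]
    for weights in layers:
        h = states[-1]
        update = branch(weights, h)
        if residual:
            # Residual stream: the layer only adds a (scaled) update.
            h_next = [x + branch_scale * u for x, u in zip(h, update)]
        else:
            # No skip path: the layer output replaces the state entirely.
            h_next = update
        states.append(h_next)
    return states

rng = random.Random(0)
dim, depth = 64, 6
layers = [random_matrix(rng, dim) for _ in range(depth)]
h0 = [rng.gauss(0.0, 1.0) for _ in range(dim)]

with_residual = run_stack(h0, layers, residual=True)
without_residual = run_stack(h0, layers, residual=False)

sim_residual = cosine(with_residual[0], with_residual[-1])
sim_plain = cosine(without_residual[0], without_residual[-1])
print(f"cos(h_0, h_L) with residual:    {sim_residual:.3f}")
print(f"cos(h_0, h_L) without residual: {sim_plain:.3f}")
```

In this toy setting the residual stack keeps the last state close in direction to the first, while the stack without skip paths does not. How strongly this carries over to a trained Transformer is an empirical question that the sketch does not answer.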
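## Why it matters
- **Variable-depth models** (for example, choosing the number of layers dynamically at inference time) succeed with only fine-tuning, or even with no additional training at all. This suggests that residual connections have already pre-adapted the model for representational communication across layers.
- **Canon Layers** (Allen-Zhu & Li, 2025) exploit representational alignment across adjacent input steps to propagate state efficiently.
- **Cross-layer communication**: aligned representation spaces let information from deep layers influence shallow layers more effectively.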
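The variable-depth idea can be sketched as a residual stack that runs only its first `depth` layers and feeds the result to a readout shared by every depth. This is a hypothetical illustration, not a method from the cited papers: the weights are random, and `forward`, `predict`, `agreement`, and the shared linear `readout` are assumed names and design choices.

```python
import math
import random

def make_layer(rng, dim):
    """dim x dim matrix with N(0, 1/dim) entries (toy, untrained weights)."""
    std = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(dim)]

def matvec(matrix, vec):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def forward(h, layers, depth, branch_scale=0.1):
    """Run only the first `depth` residual layers; depth=0 returns a copy of h."""
    h = list(h)
    for weights in layers[:depth]:
        update = [math.tanh(z) for z in matvec(weights, h)]
        h = [x + branch_scale * u for x, u in zip(h, update)]
    return h

def predict(h, readout):
    """Argmax over a linear readout that is shared by every depth."""
    logits = matvec(readout, h)
    return max(range(len(logits)), key=logits.__getitem__)

def agreement(inputs, layers, readout, depth):
    """Fraction of inputs where the truncated and full-depth predictions match."""
    full = len(layers)
    same = sum(
        predict(forward(x, layers, depth), readout)
        == predict(forward(x, layers, full), readout)
        for x in inputs
    )
    return same / len(inputs)

rng = random.Random(0)
dim, n_layers, n_classes = 32, 6, 5
layers = [make_layer(rng, dim) for _ in range(n_layers)]
readout = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_classes)]
inputs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(50)]

for depth in range(n_layers + 1):
    print(depth, agreement(inputs, layers, readout, depth))
```

Because every layer writes into the same residual stream, the shared readout can be applied at any depth without a per-depth adapter. Whether the truncated predictions are actually useful in a trained model depends on the training procedure, which this sketch does not model.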
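## Relationship to recurrent architectures
Representational alignment is an **implicit prerequisite** for the success of recurrent architectures:
- If the representation spaces of different layers differed substantially, passing signals from deep layers back to shallow layers would require extensive transformation, which is costly to learn.
- Alignment lowers the cost of adapting a model to recurrent connections.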
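## Research directions
Mozer et al. speculate that there are further ways to exploit this alignment, such as:
- More efficient cross-layer attention mechanisms
- Reducing the amount of data needed for recurrent fine-tuning
- Using alignment to obtain more compact state representations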
## References
- [[depth-recurrence|Depth Recurrence]]
- [[recurrent-transformer-architectures|Recurrent Transformer Architectures]]
- [[mozer-topological-trouble-transformers-2026]]