20260617: currently 914 pages

2026-06-17 15:02:40 +08:00
parent e96b955fda
commit 91fac5b6fc
423 changed files with 20687 additions and 34 deletions
--- a/concepts/continuous-representation.md
+++ b/concepts/continuous-representation.md
@@ -0,0 +1,47 @@
+---
+title: "Continuous Representation"
+created: 2026-06-17
+updated: 2026-06-17
+type: concept
+tags: [architecture, representation-learning, latent-reasoning]
+sources: [raw/papers/zhang-tarpo-2026.md]
+confidence: high
+---
+
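+# Continuous Representation
+
+Continuous representation is the mathematical foundation of [[latent-reasoning|latent reasoning]]: in LLM reasoning, the reasoning state is represented by high-dimensional continuous vectors rather than discrete tokens.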
+
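+## Form
+
+In the Transformer architecture, a continuous hidden state `h_t ∈ R^d` is naturally present in the output of every layer. Standard [[chain-of-thought|CoT]] collapses each such d-dimensional continuous vector into a single discrete token, whereas latent reasoning operates directly on the continuous representations.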
+
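+## Two construction methods
+
+### Raw hidden state
+`u_continuous = h_t` (the output of the last Transformer layer)
+
+- Representative: [[coconut|COCONUT]]
+- Problem: representation manifold mismatch (the hidden-state space is not the token embedding space)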
+
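+### Token embedding mixture
+`u_continuous = sum_i w_i * E(v_i)` (a weighted mixture of the top-k token embeddings)
+
+- Representatives: [[tarpo|TARPO]], [[hrpo|HRPO]]
+- Advantage: stays consistent with the token embedding space, which mitigates the manifold mismatch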
+
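+## The determinism problem
+
+Continuous representations are **deterministic** (the same input always yields the same vector), which stands in fundamental contrast to discrete token sampling:
+
+- Discrete token sampling: `v ~ Categorical(logits)` → stochastic
+- Continuous representation: `u = f(h)` → deterministic
+
+This determinism limits exploration by the RL policy and has motivated two lines of work: [[reparameterization-exploration|reparameterized exploration]] and [[hybrid-reasoning|hybrid reasoning]].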
+
+## See also
+
+- [[latent-reasoning|latent reasoning]]
+- [[soft-token]]
+- [[hard-token]]
+- [[hybrid-reasoning|hybrid reasoning]]
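Editorial sketch (not part of the commit): the page contrasts a deterministic top-k embedding mixture, `u_continuous = sum w_i * E(v_i)`, with stochastic sampling, `v ~ Categorical(logits)`. The toy example below illustrates that contrast in plain Python. The page does not say how the weights `w_i` are computed, so renormalizing the softmax probabilities over the top-k tokens is an assumption made here. The function names, the 4-token vocabulary, and the 2-dimensional embedding table are hypothetical and are not taken from TARPO, HRPO, or COCONUT.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topk_embedding_mixture(logits, embedding, k):
    """Deterministic continuous representation: u = sum_i w_i * E(v_i).

    Assumption: w_i is the softmax probability of token v_i,
    renormalized so the k weights sum to 1.
    """
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    dim = len(embedding[0])
    u = [0.0] * dim
    for i in top:
        w = probs[i] / norm
        for j in range(dim):
            u[j] += w * embedding[i][j]
    return u


def sample_token(logits, rng):
    """Stochastic discrete choice: v ~ Categorical(softmax(logits))."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical toy vocabulary of 4 tokens with 2-dimensional embeddings.
embedding = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
logits = [2.0, 1.0, 0.5, -1.0]

# Same input, same vector: the mixture has no randomness to explore with.
u1 = topk_embedding_mixture(logits, embedding, k=2)
u2 = topk_embedding_mixture(logits, embedding, k=2)

# Repeated sampling from the same logits yields different tokens.
rng = random.Random(0)
samples = {sample_token(logits, rng) for _ in range(200)}
```

Calling `topk_embedding_mixture` twice on the same logits returns identical vectors, whereas repeated calls to `sample_token` visit several tokens. This is the gap that the reparameterized-exploration and hybrid-reasoning routes mentioned on the page aim to close.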