myWiki/concepts/continuous-representation.md

---
title: "Continuous Representation"
created: 2026-06-17
updated: 2026-06-17
type: concept
tags: [architecture, representation-learning, latent-reasoning]
sources: [raw/papers/zhang-tarpo-2026.md]
confidence: high
---

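# Continuous Representation

Continuous representation is the mathematical foundation of [[latent-reasoning|latent reasoning]]: during LLM reasoning, the reasoning state is represented by high-dimensional continuous vectors rather than by discrete tokens.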

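## Form

In the Transformer architecture, a continuous hidden state `h_t ∈ R^d` is naturally available at the output of every layer. Standard [[chain-of-thought|CoT]] collapses each of these d-dimensional continuous vectors into a single discrete token, whereas latent reasoning operates on the continuous representations directly.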
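
The contrast between collapsing `h_t` to a token and carrying it forward can be sketched in a few lines of plain Python. This is only a toy illustration: the dimensions, the matrix `W_unembed`, and the helper names are hypothetical, and greedy decoding stands in for whatever decoding rule a real CoT setup would use.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def project_to_logits(h, unembed):
    """Compute logits = W h, where `unembed` holds one row per vocabulary entry."""
    return [sum(w * x for w, x in zip(row, h)) for row in unembed]


# Hypothetical toy setup: hidden size d = 3, vocabulary of 4 tokens.
h_t = [0.2, -1.0, 0.7]
W_unembed = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5],
]

logits = project_to_logits(h_t, W_unembed)
probs = softmax(logits)

# CoT-style step: the d-dimensional state is reduced to a single token id
# (greedy choice here, purely for simplicity).
token_id = max(range(len(logits)), key=lambda i: logits[i])

# Latent-reasoning-style step: the continuous vector itself is carried forward.
latent_state = list(h_t)
```

After the collapse, only `token_id` survives; the remaining information in `h_t` (and in `probs`) is discarded, while `latent_state` still holds all d components.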

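## Two Construction Methods

### Raw Hidden State
`u_continuous = h_t` (the output of the last Transformer layer)

- Representative: [[coconut|COCONUT]]
- Problem: representation manifold mismatch (the hidden-state space ≠ the token embedding space)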
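
A minimal sketch of the raw-hidden-state construction follows, assuming a toy stand-in for the model. The function `toy_layer`, the matrices `E` and `W`, and the distance check are all hypothetical and are not taken from COCONUT; they only illustrate that a fed-back hidden state need not coincide with any row of the embedding table.

```python
import math


def toy_layer(x, weight):
    """Hypothetical stand-in for a Transformer forward pass: tanh(W x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weight]


def nearest_embedding_distance(vec, embedding_table):
    """Euclidean distance from `vec` to the closest token embedding."""
    return min(math.dist(vec, row) for row in embedding_table)


# Hypothetical toy embedding table (3 tokens, d = 2) and layer weights.
E = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
W = [[0.5, -0.3], [0.8, 0.1]]

u = E[0]  # start from a genuine token embedding
trajectory = []
for _ in range(3):
    h_t = toy_layer(u, W)
    u = h_t  # u_continuous = h_t: the hidden state becomes the next input
    trajectory.append(u)

# A positive gap means the fed-back vector is not any token's embedding.
gap = nearest_embedding_distance(trajectory[-1], E)
```

In this toy setting the inputs drift away from the embedding rows after the first step, which is one simple way to picture the mismatch noted in the list above.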

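### Token Embedding Mixture
`u_continuous = sum w_i * E(v_i)` (a weighted mixture of the top-k token embeddings)

- Representatives: [[tarpo|TARPO]], [[hrpo|HRPO]]
- Advantage: stays consistent with the token embedding space, which mitigates the manifold mismatch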
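
The mixture formula can be sketched as below. How the weights `w_i` are obtained is not specified here, so this sketch assumes they are a softmax renormalized over the top-k logits; that choice, the function name `topk_embedding_mixture`, and the toy values are assumptions rather than the TARPO or HRPO definition.

```python
import math


def topk_embedding_mixture(logits, embedding_table, k):
    """Return (top-k ids, weights, sum_i w_i * E(v_i)).

    Assumption: w_i is a softmax over the k largest logits only.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(embedding_table[0])
    mixed = [
        sum(w * embedding_table[i][j] for w, i in zip(weights, top))
        for j in range(dim)
    ]
    return top, weights, mixed


# Hypothetical toy values: 4 tokens, embedding dimension 2.
logits = [2.0, 0.5, 1.0, -1.0]
E = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [-1.0, -1.0]]

top_ids, weights, u_continuous = topk_embedding_mixture(logits, E, k=2)
```

Because the weights are non-negative and sum to 1, `u_continuous` is a convex combination of real token embeddings, which is why this construction stays close to the embedding space.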

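## The Determinism Problem

A continuous representation is **deterministic** (the same input always yields the same vector), which stands in fundamental contrast to discrete token sampling:

- Discrete token sampling: `v ~ Categorical(logits)` → stochastic
- Continuous representation: `u = f(h)` → deterministic

This determinism limits policy exploration in RL, and it has given rise to two lines of solutions: [[reparameterization-exploration|reparameterization-based exploration]] and [[hybrid-reasoning|hybrid reasoning]].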
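
The contrast can be seen in a small experiment, sketched here with hypothetical toy values. `continuous_step` is an arbitrary deterministic map standing in for `f`, and `sample_token` draws from a categorical distribution defined by the logits.

```python
import math
import random


def sample_token(logits, rng):
    """Draw v ~ Categorical(softmax(logits)) using the given RNG."""
    m = max(logits)
    weights = [math.exp(x - m) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def continuous_step(h):
    """Hypothetical deterministic map u = f(h)."""
    return [math.tanh(x) for x in h]


logits = [1.0, 0.8, 0.6]
h = [0.3, -0.2]
rng = random.Random(0)  # seeded only to make the sketch reproducible

# Repeating the discrete step gives different tokens across draws ...
sampled_tokens = {sample_token(logits, rng) for _ in range(200)}

# ... while repeating the continuous step always gives the same vector.
continuous_outputs = {tuple(continuous_step(h)) for _ in range(200)}
```

Here `continuous_outputs` contains a single vector, so repeated rollouts from the same state cannot diverge unless randomness is injected somewhere else.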

## References

- [[latent-reasoning|latent reasoning]]
- [[soft-token]]
- [[hard-token]]
- [[hybrid-reasoning|hybrid reasoning]]