20260625: a lot of new content
49
concepts/thinking-mode.md
Normal file
@@ -0,0 +1,49 @@
---
title: "Thinking Mode"
created: 2026-06-18
updated: 2026-06-18
type: concept
tags: [reasoning, cot, lrm]
sources:
- gan-thinking-based-non-thinking-2026
---
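
# Thinking Mode

Thinking mode is the mode in which [[large-reasoning-models|large reasoning models]] (LRMs) reason through a **long [[chain-of-thought|chain of thought]] (CoT)**. In the TNT framework, it is defined as a response whose thinking part is **non-empty**: `[y_1, ..., y_τ] ≠ ∅` (Gan et al., 2026).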

## Structure

A typical thinking-mode response has the following structure (using DeepSeek-R1 as an example):
```
<think>
Wait, let me analyze this carefully...
Maybe I should try another approach...
Let me verify this step...
</think>
The final answer is: 42
```
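
The structure above can be made concrete with a small parser. This is a minimal sketch, assuming responses use the `<think>`/`</think>` tags shown in the example; `ParsedResponse`, `parse_response`, and `is_thinking_mode` are hypothetical names introduced for illustration and do not come from the TNT paper or from any library.

```python
import re
from typing import NamedTuple


class ParsedResponse(NamedTuple):
    thinking: str  # text between <think> and </think>
    solution: str  # text after </think>


# Assumed tag format, following the DeepSeek-R1-style example above.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def parse_response(text: str) -> ParsedResponse:
    """Split a response into its thinking part and its solution part."""
    match = _THINK_RE.search(text)
    if match is None:
        # No think block at all: treat the whole text as the solution.
        return ParsedResponse(thinking="", solution=text.strip())
    return ParsedResponse(
        thinking=match.group(1).strip(),
        solution=text[match.end():].strip(),
    )


def is_thinking_mode(text: str) -> bool:
    """Thinking mode in the sense above: the thinking part is non-empty."""
    return parse_response(text).thinking != ""
```

Under this sketch, a response whose think block contains only whitespace counts as non-thinking, which mirrors the "non-empty thinking part" definition given earlier.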
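
## Key characteristics

- **Exploration**: tries multiple reasoning paths
- **Reflection**: questions and corrects itself ("Wait... that doesn't seem right")
- **Self-verification**: checks the correctness of intermediate results
- **Final convergence** to the solution after `</think>`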
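
## Key use in TNT

TNT's core insight: because the thinking mode of LRMs has been trained on large-scale data, it **ensures that the solution part after `</think>` contains no additional thinking**. This means the length of the solution part can serve as a **reliable upper-bound estimate** of the natural output length of [[non-thinking-mode|non-thinking mode]].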
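
The idea of measuring the solution part can be sketched as follows. This is an illustrative approximation only: whitespace splitting stands in for a real tokenizer, taking the maximum over several sampled responses is an assumed aggregation rather than the procedure from Gan et al. (2026), and `solution_length` and `estimate_non_thinking_upper_bound` are hypothetical helper names.

```python
from typing import Iterable

THINK_END = "</think>"  # assumed closing tag, as in the DeepSeek-R1-style format


def solution_length(response: str) -> int:
    """Length of the solution part in whitespace-separated tokens.

    Whitespace splitting is a stand-in for a real tokenizer (an assumption).
    """
    _, sep, solution = response.partition(THINK_END)
    if not sep:
        # No closing tag found: treat the whole response as the solution.
        solution = response
    return len(solution.split())


def estimate_non_thinking_upper_bound(thinking_responses: Iterable[str]) -> int:
    """Hypothetical aggregation: the largest solution length among the samples."""
    return max((solution_length(r) for r in thinking_responses), default=0)
```

In practice the token count would come from the model's own tokenizer, and how the resulting bound is applied depends on details of TNT that this note does not cover.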
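
## Choosing between thinking and non-thinking mode

The goal of [[hybrid-reasoning-models|hybrid reasoning models]] is to let the model weigh the trade-off on its own:
- Complex queries → thinking mode (accuracy first)
- Simple queries → non-thinking mode (efficiency first)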
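
The trade-off above can be illustrated with a toy external router. A hybrid reasoning model is meant to make this decision internally, so the code below is only a stand-in for the intended behavior: the `difficulty` score, the `threshold` value, and the `choose_mode` function are all assumptions made for illustration.

```python
from enum import Enum


class Mode(Enum):
    THINKING = "thinking"
    NON_THINKING = "non-thinking"


def choose_mode(difficulty: float, threshold: float = 0.5) -> Mode:
    """Pick a mode from a hypothetical difficulty score in [0, 1].

    Scores at or above the threshold go to thinking mode (accuracy first);
    lower scores go to non-thinking mode (efficiency first).
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must be in [0, 1]")
    return Mode.THINKING if difficulty >= threshold else Mode.NON_THINKING
```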

## References

- [[non-thinking-mode|Non-thinking mode]]
- [[hybrid-reasoning-models|Hybrid reasoning models]]
- [[large-reasoning-models|Large reasoning models]]
- [[gan-thinking-based-non-thinking-2026|TNT paper]]