---
title: "Thinking Mode"
created: 2026-06-18
updated: 2026-06-18
type: concept
tags: [reasoning, cot, lrm]
sources:
- gan-thinking-based-non-thinking-2026
---
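# Thinking Mode
Thinking mode is the mode in which [[large-reasoning-models|large reasoning models]] (LRMs) reason through a **long [[chain-of-thought|chain of thought]] (CoT)**. In the TNT framework, it is defined as a response whose thinking part is **non-empty**: `[y_1, ..., y_τ] ≠ ∅` (Gan et al., 2026).
## Structure
A typical thinking-mode response (using DeepSeek-R1 as an example):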
```
<think>
Wait, let me analyze this carefully...
Maybe I should try another approach...
Let me verify this step...
</think>
The final answer is: 42
```
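The structure above can be split mechanically into its thinking part and its solution part. The following is a minimal sketch, assuming the response is a plain string containing at most one `<think>...</think>` pair; the names `ParsedResponse`, `parse_response`, and `is_thinking_mode` are hypothetical helpers for illustration, not part of TNT or DeepSeek-R1.

```python
import re
from typing import NamedTuple


class ParsedResponse(NamedTuple):
    thinking: str  # text between <think> and </think>, stripped
    solution: str  # text after </think>, stripped


# Non-greedy match so only the first <think>...</think> pair is captured.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def parse_response(text: str) -> ParsedResponse:
    """Split a response into (thinking, solution).

    Assumption: a response without a <think> block is treated as
    having an empty thinking part.
    """
    match = THINK_RE.search(text)
    if match is None:
        return ParsedResponse("", text.strip())
    return ParsedResponse(match.group(1).strip(), text[match.end():].strip())


def is_thinking_mode(text: str) -> bool:
    """True when the thinking part is non-empty."""
    return parse_response(text).thinking != ""
```

Under this sketch, a response whose `<think>` block contains only whitespace counts as non-thinking, which mirrors the non-empty condition in the definition.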
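## Key Characteristics
- **Exploration**: trying multiple reasoning paths
- **Reflection**: self-questioning and error correction ("Wait... that doesn't seem right")
- **Self-verification**: checking the correctness of intermediate results
- **Final convergence** to the solution after `</think>`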
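## How TNT Exploits It
TNT's core insight: an LRM's thinking mode has been trained on large-scale data, which **ensures that the solution part after `</think>` contains no additional thinking**. The length of the solution part can therefore serve as a **reliable upper-bound estimate** of the natural output length of [[non-thinking-mode|non-thinking mode]].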
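As a rough illustration of this idea, the sketch below measures the solution part of a thinking-mode response and compares a non-thinking response against it. It is not the TNT implementation: the function names are hypothetical, and whitespace splitting is an assumed stand-in for the model's real tokenizer.

```python
from typing import Callable, List

Tokenizer = Callable[[str], List[str]]


def solution_length(thinking_response: str, tokenize: Tokenizer = str.split) -> int:
    """Token count of the text after </think>.

    Assumption: whitespace tokenization approximates the real tokenizer.
    If no </think> marker is present, the whole response is counted.
    """
    _, marker, after = thinking_response.partition("</think>")
    solution = after if marker else thinking_response
    return len(tokenize(solution))


def within_bound(
    non_thinking_response: str,
    thinking_response: str,
    tokenize: Tokenizer = str.split,
) -> bool:
    """Check a non-thinking response against the solution-length bound."""
    bound = solution_length(thinking_response, tokenize)
    return len(tokenize(non_thinking_response)) <= bound
```

A non-thinking response that exceeds the bound is longer than a pure solution would need to be, which may indicate that reasoning has leaked into an output that was supposed to skip it.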
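## Choosing Between Thinking and Non-Thinking Mode
The goal of [[hybrid-reasoning-models|hybrid reasoning models]] is to let the model weigh the trade-off on its own:
- Complex queries → thinking mode (accuracy first)
- Simple queries → non-thinking mode (efficiency first)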
## References
- [[non-thinking-mode|Non-thinking mode]]
- [[hybrid-reasoning-models|Hybrid reasoning models]]
- [[large-reasoning-models|Large reasoning models]]
- [[gan-thinking-based-non-thinking-2026|TNT paper]]