20260625: lots of new content
37
concepts/overthinking.md
Normal file
@@ -0,0 +1,37 @@
---
title: "Overthinking"
created: 2026-06-18
updated: 2026-06-18
type: concept
tags: [reasoning, efficiency, cot, thinking]
sources:
- gan-thinking-based-non-thinking-2026
---

# Overthinking

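Overthinking refers to the tendency of [[large-reasoning-models|large reasoning models]] (LRMs) to produce long, repetitive [[chain-of-thought|chain-of-thought]] (CoT) traces for **every query**, including simple ones, which substantially increases inference cost and latency (Sui et al., 2025; Qu et al., 2025).
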
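## The core problem

An LRM's CoT involves sustained exploration, reflection, and self-verification, which is key to its success on complex tasks such as competition math. But thinking at full length regardless of difficulty has costs:
- **Inference latency**: even simple questions go through the full multi-step reasoning process
- **Wasted tokens**: large amounts of unproductive content such as "Wait... Let me check..."
- **Compute cost**: FLOPs and memory consumption per request rise significantly
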
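As a rough illustration of the "wasted tokens" point, the sketch below counts reflection markers such as "Wait" or "Let me check" in a CoT trace. The marker list and the function name `reflection_stats` are assumptions made for this example. It is a crude proxy and not how the cited papers measure overthinking.

```python
import re

# Hypothetical marker list chosen for illustration only;
# real traces may need a different (or learned) set of phrases.
REFLECTION_MARKERS = ("wait", "let me check", "let me double-check", "hmm")


def reflection_stats(trace: str) -> dict:
    """Count reflection markers in a CoT trace (case-insensitive)."""
    lowered = trace.lower()
    counts = {
        marker: len(re.findall(r"\b" + re.escape(marker) + r"\b", lowered))
        for marker in REFLECTION_MARKERS
    }
    total_words = len(re.findall(r"\w+", lowered))
    total_markers = sum(counts.values())
    return {
        "counts": counts,
        "total_markers": total_markers,
        "total_words": total_words,
        # Markers per 100 words; 0.0 for an empty trace.
        "markers_per_100_words": (
            100.0 * total_markers / total_words if total_words else 0.0
        ),
    }


example_trace = "2 + 2 = 4. Wait, let me check. 2 + 2 is 4. Wait... yes, 4."
stats = reflection_stats(example_trace)
print(stats["counts"], stats["total_markers"])
```

A high marker density on an easy query is only a hint that the trace may be redundant. Some reflection steps do catch real mistakes, so this count should not be read as a quality score.
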
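## Relation to hybrid reasoning

Overthinking is the **core motivation** for [[hybrid-reasoning-models|hybrid reasoning models]]: the goal is not to eliminate thinking, but to have the model learn to answer simple questions directly and think only on complex ones.
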
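The trade-off behind "answer simple questions directly, think only on complex ones" can be sketched with some back-of-the-envelope arithmetic. The token counts, the `Query` type, and the `is_hard` label below are all made up for illustration. In particular, the sketch assumes an oracle difficulty label, whereas a real hybrid model has to learn that decision itself.

```python
from dataclasses import dataclass


@dataclass
class Query:
    is_hard: bool       # assumed oracle difficulty label (hypothetical)
    think_tokens: int   # tokens used if answered with a full CoT
    direct_tokens: int  # tokens used if answered directly


def always_think_cost(queries: list[Query]) -> int:
    """Total tokens when every query gets a full CoT."""
    return sum(q.think_tokens for q in queries)


def adaptive_cost(queries: list[Query]) -> int:
    """Total tokens when only hard queries get a full CoT."""
    return sum(q.think_tokens if q.is_hard else q.direct_tokens for q in queries)


def token_savings(queries: list[Query]) -> float:
    """Fraction of tokens saved by the adaptive policy."""
    baseline = always_think_cost(queries)
    return 1.0 - adaptive_cost(queries) / baseline if baseline else 0.0


# Illustrative workload: three easy queries and one hard one (numbers invented).
workload = [Query(False, 600, 60)] * 3 + [Query(True, 2000, 200)]

print(always_think_cost(workload))        # 3800
print(adaptive_cost(workload))            # 2180
print(round(token_savings(workload), 3))  # 0.426
```

The savings depend entirely on the mix of easy and hard queries and on how accurately the mode is chosen. Answering a hard query directly may save tokens but cost accuracy, which this sketch does not model.
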
## TNT's solution

TNT lets the model **choose on its own** between thinking and non-thinking modes. On math benchmarks, it ends up with:
- Token usage reduced by about **50%**
- Accuracy **improved at the same time**, by 4.1%
- A reward hacking rate below 10%

## References

- [[hybrid-reasoning-models|Hybrid reasoning models]]
- [[token-efficiency|Token efficiency]]
- [[gan-thinking-based-non-thinking-2026|TNT paper]]