20260514: Add new content

2026-05-14 13:54:52 +08:00
parent 56c4d3ef7c
commit b116710e4c
294 changed files with 10682 additions and 255 deletions
--- a/concepts/long-context-understanding.md
+++ b/concepts/long-context-understanding.md
@@ -0,0 +1,37 @@
+---
+title: Long-Context Understanding
+created: 2026-05-01
+updated: 2026-05-01
+type: concept
+tags: [llm, architecture, benchmark]
+sources: [papers/hunyuan-team-cl-bench-life.md]
+---
+
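+# Long-Context Understanding
+
+> The ability of a language model to retrieve information from, and reason over, very long input sequences (10K–1M+ tokens). Related to [[real-life-context-learning]] but not equivalent to it.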
+
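+## Definition
+
+Long-context understanding measures how a model performs on the following:
+- **Information retrieval**: whether it can find a specific fact at an arbitrary position in a long text (Needle-in-a-Haystack)
+- **Multi-hop reasoning**: whether it can combine pieces of information scattered across different positions
+- **Positional robustness**: whether performance changes with the position of the target information (e.g., [[lost-in-the-middle]])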
+
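+## Decoupling from Real-Life Context Learning
+
+A key finding of CL-bench Life: **long-context ability ≠ real-life context learning ability**:
+
+- CL-bench Life context lengths (5.4K–170.8K) fall within the context window of most frontier models
+- Task solve rate shows **no strong correlation** with context length
+- Reasoning quality over messy context is a bottleneck that is independent of context length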
+
+## Related Concepts
+- [[context-learning]]: general context learning
+- [[real-life-context-learning]]: real-life context learning
+- [[lost-in-the-middle]]: loss of information in the middle of the context
+- [[million-token-context]]: million-token context
+
+---
+
+*Last Updated: 2026-05-01*