20260625: lots of new content
57
concepts/fact-augmented-key-expansion.md
Normal file
@@ -0,0 +1,57 @@
---
title: "Fact-Augmented Key Expansion"
created: 2026-06-25
updated: 2026-06-25
type: concept
tags: ["memory", "indexing", "optimization", "rag"]
sources:
- "[[longmem-eval-2025]]"
---

# Fact-Augmented Key Expansion
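
Fact-Augmented Key Expansion is a memory-indexing optimization validated by the LongMemEval experiments: when a memory is stored, structured facts that an LLM extracts from the conversation are used as index keys, rather than the raw conversation text alone.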
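
## Motivation

Problems with using only the raw conversation text as the key:
- The raw text contains a lot of noise (small talk, transitional phrases)
- One conversation may contain several independent facts but gets only one key
- BM25 lexical matching depends on exact token overlap, and the raw text may express the same fact in different words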

## Approach

```
Conversation history
    ↓
LLM fact extraction → [{"fact": "User prefers PostgreSQL", "confidence": 0.92},
                       {"fact": "User lives in Shenzhen", "confidence": 0.95}, ...]
    ↓
Structure as keys → store in the index (linked to the raw-text value)
```
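
The flow above can be sketched in Python as follows. This is a minimal illustration, not the LongMemEval implementation: `Fact`, `MemoryIndex`, and `fake_extract` are hypothetical names, and `fake_extract` returns canned facts in place of a real LLM call. The sketch also assumes the raw text is kept as a key alongside the fact keys.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Fact:
    fact: str
    confidence: float


@dataclass
class MemoryIndex:
    # value_id -> raw conversation text (the value that retrieval returns)
    values: dict[int, str] = field(default_factory=dict)
    # (key text, confidence, value_id); several keys may point at one value
    keys: list[tuple[str, float, int]] = field(default_factory=list)

    def add(self, raw_text: str, extract: Callable[[str], list[Fact]]) -> int:
        value_id = len(self.values)
        self.values[value_id] = raw_text
        # Assumption: the raw text stays as a key; facts augment it.
        self.keys.append((raw_text, 1.0, value_id))
        for f in extract(raw_text):
            self.keys.append((f.fact, f.confidence, value_id))
        return value_id


def fake_extract(raw_text: str) -> list[Fact]:
    """Hypothetical stand-in for the LLM extraction step (canned output)."""
    return [
        Fact("User prefers PostgreSQL", 0.92),
        Fact("User lives in Shenzhen", 0.95),
    ]


index = MemoryIndex()
vid = index.add(
    "Honestly I only ever use Postgres. Oh, and I moved to Shenzhen last year.",
    fake_extract,
)
```

Each extracted fact becomes its own key, and every key carries the `value_id` of the conversation it came from, so a hit on any key resolves to the full raw text.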

## Results (LongMemEval experimental data)

| Metric | Raw-text key only | + Fact key | Gain |
|--------|-------------------|------------|------|
| Memory Recall@k | baseline | +9.4% | Significant |
| QA Accuracy | baseline | +5.4% | Significant |
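
## Why it works

1. **Structured facts remove ambiguity**: "I only use PostgreSQL" → "Database preference: PostgreSQL" matches more reliably under BM25 than the raw text
2. **Multi-fact splitting**: one conversation may contain 3 independent facts → 3 keys, each independently retrievable
3. **The `confidence` field** supports future filtering: low-confidence facts can be given a lower recall weight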
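
The three points above can be seen in a small retrieval sketch. It assumes a hand-rolled Okapi BM25 scorer (with the common defaults `k1=1.5`, `b=0.75`) and one possible way to use `confidence`, namely multiplying it into the score. The key strings, the `retrieve` helper, and the weighting scheme are illustrative assumptions, not something the source specifies.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every doc against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# (key text, confidence, value_id); value 0 has one raw key and two fact keys.
keys = [
    ("I only use PostgreSQL", 1.0, 0),             # raw-text key
    ("Database preference: PostgreSQL", 0.92, 0),  # fact key
    ("User lives in Shenzhen", 0.95, 0),           # fact key
    ("I had noodles for lunch", 1.0, 1),           # raw-text key of another memory
]


def retrieve(query: str, keys: list[tuple[str, float, int]]) -> list[tuple[int, float]]:
    """Rank value ids by their best confidence-weighted key score."""
    raw = bm25_scores(query, [text for text, _, _ in keys])
    best: dict[int, float] = {}
    for score, (_, confidence, value_id) in zip(raw, keys):
        weighted = score * confidence  # assumed weighting: low confidence ranks lower
        if weighted > best.get(value_id, 0.0):
            best[value_id] = weighted
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


hits = retrieve("database preference", keys)
```

Here the query shares no token with the raw text "I only use PostgreSQL", so only the fact key lets value `0` be found. Taking the maximum over a value's keys is one simple aggregation choice; summing or thresholding on `confidence` would be equally reasonable.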

## Relationship to Atlas Consolidation

Atlas's consolidation is essentially one implementation of Fact-Augmented Key Expansion:
- episodic → raw-text value
- consolidation → extract structured facts from episodic → store in the semantic index
- recall over the semantic index is therefore equivalent to the effect of fact-augmented key expansion
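
The mapping above can be made concrete with a small sketch. Nothing here reflects Atlas's actual API: `Episode`, `SemanticEntry`, `consolidate`, and `canned_extract` are hypothetical names chosen only to show how an episodic record plays the role of the value and a semantic entry plays the role of the fact key.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Episode:
    """Episodic record: the raw-text value."""
    episode_id: int
    text: str


@dataclass
class SemanticEntry:
    """Semantic record: a structured fact used as an index key."""
    fact: str
    confidence: float
    episode_id: int  # back-reference to the source episode


def consolidate(
    episodes: list[Episode],
    extract: Callable[[str], list[tuple[str, float]]],
) -> list[SemanticEntry]:
    """Produce one semantic entry per fact extracted from each episode."""
    entries = []
    for ep in episodes:
        for fact, confidence in extract(ep.text):
            entries.append(SemanticEntry(fact, confidence, ep.episode_id))
    return entries


def canned_extract(text: str) -> list[tuple[str, float]]:
    # Hypothetical stand-in for an LLM extraction call.
    return [("User prefers PostgreSQL", 0.92)] if "PostgreSQL" in text else []


episodes = [Episode(0, "I only use PostgreSQL"), Episode(1, "Nice weather today")]
semantic = consolidate(episodes, canned_extract)
```

A recall hit on a `SemanticEntry` can be resolved to its episode through `episode_id`, which mirrors the key-to-value link in the approach described earlier.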

## References

- [[longmem-eval-2025]]
- [[memory-indexing-retrieval-reading]]
- [[atlas-memory-system]]
- [[memory-consolidation]]