20260625:很多新内容

2026-06-25 14:08:47 +08:00
parent 91fac5b6fc
commit 6021dea160
375 changed files with 19263 additions and 251 deletions
--- a/articles/atlas-agent-memory-architecture-2026.md
+++ b/articles/atlas-agent-memory-architecture-2026.md
@@ -0,0 +1,89 @@
+---
+title: "Atlas Agent 记忆系统架构（2026）"
+created: 2026-06-24
+updated: 2026-06-24
+type: article
+tags: ["agent-memory", "elasticsearch", "hybrid-retrieval", "consolidation"]
+sources:
+  - "https://mp.weixin.qq.com/s/fypjVWJBQg_MZV9OMfPpIA"
+---
+
+# Atlas Agent 记忆系统架构
+
+> 基于 noamschwartz/atlas-memory-demo 的深度工程实践解析。核心主张：Agent 记忆不是 KV 存储问题，是多索引信息检索问题。
+
+## 问题
+
+`chat_history.append()` 把三种不同生命周期的信息塞进同一个数组——稳定事实、操作流程、时序事件——这是 Agent 永远在"忘记"的根因。真正的挑战是在查询瞬间穿过噪音找到对的那几条。
+
+## 核心架构：[[atlas-memory-system|三索引 + 公共]]
+
+[[agent-memory-taxonomy|四种记忆类型]]，各自独立的索引、字段和衰减策略：
+
+| 索引 | 存储内容 | 衰减源 | 写入频率 | 更新策略 |
+|------|---------|--------|---------|---------|
+| episodic | 原始消息+时间戳 | timestamp | 每回合 | 只写不改 |
+| semantic | 提炼后稳定事实 | last_used_at | consolidation | supersession 链 |
+| procedural | 多步操作流程 | 豁免 (1.0) | consolidation | 计数器更新 |
+| catalog | 公共共享知识 | timestamp | 手动 | 脚本覆盖 |
+
+## 检索管线：[[hybrid-recall-pipeline|混合召回]]
+
+```
+用户消息 → Verbatim Pre-Recall（不经 LLM 改写）
+         → BM25 词法 + Dense 语义 双通路并行
+         → RRF 融合 (rank_constant=30)
+         → Cross-encoder 重排序 (top-80 → top-K)
+         → 返回（reranker 失败时降级 RRF 顺序）
+```
+
+### 关键参数
+- **RECALL_OVER_FETCH_K=80** — consolidation 产生近重复 doc，候选池不足会挤掉 gold doc
+- **rank_constant=30** — 比默认 60 小，排名靠前的结果保持更强信号权重
+- **DECAY_SCALE=1825d** — 演示默认，客服应收紧至 60-180d
+
+### Ablation 数据 (168 QA, 3 persona, ~250 docs/user)
+
+| 配置 | R@10 |
+|------|------|
+| Full | **0.89** |
+| Dense-only | 0.845 |
+| BM25-only | 0.708 |
+| No-Reranker | -0.238 |
+
+dense 是主力，但 BM25 单腿 0.708 说明词法腿不可省略。reranker 最大单点贡献，但只在候选池足够宽时有用。
+
+## [[verbatim-pre-recall|Verbatim Pre-Recall]]
+
+在 `messages.append(user_msg)` 和 LLM 调用之间，用用户原话（不经改写）跑一次 recall。LLM 会把 "postgres v15.3 + pgvector 0.5.1" 泛化成 "PostgreSQL 数据库"——精确 token 丢失，BM25 词法匹配报废。Verbatim 绕过改写层，把最原始的 token 直接给 BM25。
+
+Ablation 证实：额外 query expansion（LLM paraphrase）反而降低性能——BM25 已捕获精确 token，dense 已捕获语义改写。
+
+## [[memory-consolidation|Consolidation（写后提炼）]]
+
+每回合结束后从最近 30 条 episodic 事件中提取稳定事实和操作流程。一次 LLM 调用同时输出三类结果：new_facts、new_procedures、procedural_updates。Production 建议改为后台日批模式——积累一天后在夜间统一跑，成本减半。
+
+## [[soft-supersession|Soft-Supersession]]
+
+非破坏性矛盾处理：用户说"搬家了"→ 创建新 doc + 标记旧 doc (superseded_by) + 召回时过滤旧版。链式追溯支持任意长度，旧记录永不删除（审计需要）。
+
+## [[gbrain-memory|与 GBrain 的对比]]
+
+| 维度 | Atlas (ES) | GBrain (Markdown+Git) |
+|------|-----------|----------------------|
+| 存储 | ES 搜索引擎 | Markdown 文件 + Git |
+| 多租户 | ES DLS（集群层） | 应用层 auth |
+| 矛盾处理 | Soft-Supersession 链 | Git 版本历史 |
+| 衰减 | [[per-index-time-decay|Per-index gauss]] | 无显式衰减 |
+| 透明度 | 仅 API | 直接打开文件 |
+
+个人助理 → GBrain（人可读信任优先）；多租户产品 → Atlas（ES 原生隔离）。
+
+## 三个通用设计原则
+
+1. **衰减曲线是领域性决策** — 先定义信息有效周期，再定衰减参数
+2. **BM25 + vector 互补，不可二选一** — BM25 抓精确术语，dense 抓语义意图
+3. **记忆需要后台提炼 + 矛盾处理** — 瓶颈从来不在数据库引擎，在分型逻辑和召回架构
+
+## 来源
+[原始存档](raw/articles/atlas-agent-memory-architecture-2026.md)