---
title: "Context Enriched Embeddings"
created: 2026-06-19
updated: 2026-06-19
type: concept
tags: [embeddings, context-enrichment, vector-retrieval, tool-discovery]
sources:
- https://arxiv.org/abs/2509.20386
---
# Context Enriched Embeddings
## Definition
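A key vector-retrieval optimization from the Dynamic ReAct paper: an LLM (Sonnet 4) **programmatically enriches tool descriptions**, generating descriptions of implicit capabilities and use cases, before the descriptions are embedded. Top-5 retrieval accuracy rises from 40% to 60% (a 50% relative improvement). The 40% figure is the OpenAI text-embedding-3-large baseline, so the gain also includes the switch to voyage-context-3.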
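The enrich-then-embed step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the prompt wording, the function names, and the choice to concatenate the original description with the generated text are all assumptions, and `fake_llm` stands in for a real model call so the sketch runs offline.

```python
from typing import Callable


def build_enrichment_prompt(tool_name: str, description: str) -> str:
    """Ask the LLM for the context that tool docs usually leave out (hypothetical prompt)."""
    return (
        f"Tool: {tool_name}\n"
        f"Documented description: {description}\n"
        "Describe the implicit capabilities this tool requires or provides, "
        "and list typical use cases."
    )


def enrich_for_embedding(
    tool_name: str,
    description: str,
    llm: Callable[[str], str],
) -> str:
    """Return the text to embed: original description plus LLM-generated context."""
    generated = llm(build_enrichment_prompt(tool_name, description))
    # Assumption: keep the original description and append the generated context.
    return f"{tool_name}: {description}\n{generated.strip()}"


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a fixed enrichment."""
    return "Implicit: needs SMTP access. Use cases: notifications, password resets."


text_to_embed = enrich_for_embedding("send_email", "Sends an email.", fake_llm)
print(text_to_embed)
```

The returned string, rather than the raw description, is what would be passed to the embedding model.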
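## Why Enrichment Is Needed
Tool documentation usually describes only **explicit functionality** (parameters, return types) and lacks:
- Implicit capabilities ("send email" implies a need for SMTP capability)
- Use-case context (the scenarios in which the tool is used)
- Relationships between tools (which tools this one is usually used with)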
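One way to picture the three missing pieces is as extra fields on a tool record that are rendered into the text sent to the embedding model. The class name, field names, and output layout below are hypothetical, chosen only to mirror the list above.

```python
from dataclasses import dataclass, field


@dataclass
class EnrichedToolDoc:
    """Hypothetical record: explicit docs plus the three kinds of added context."""
    name: str
    explicit_description: str
    implicit_capabilities: list[str] = field(default_factory=list)
    use_cases: list[str] = field(default_factory=list)
    related_tools: list[str] = field(default_factory=list)

    def to_embedding_text(self) -> str:
        """Render all non-empty sections into one string for the embedder."""
        sections = [
            ("Description", [self.explicit_description]),
            ("Implicit capabilities", self.implicit_capabilities),
            ("Use cases", self.use_cases),
            ("Related tools", self.related_tools),
        ]
        lines = [f"Tool: {self.name}"]
        for label, items in sections:
            if items:  # skip sections the enrichment step left empty
                lines.append(f"{label}: " + "; ".join(items))
        return "\n".join(lines)


doc = EnrichedToolDoc(
    name="send_email",
    explicit_description="Sends an email. Parameters: to, subject, body.",
    implicit_capabilities=["requires SMTP capability"],
    use_cases=["notify a user when a job finishes"],
    related_tools=["list_contacts"],
)
print(doc.to_embedding_text())
```

An unenriched tool renders only its name and description, so the same code path can embed both enriched and plain entries.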
## Experimental Results
| Strategy | Top-5 | Top-10 |
|------|-------|--------|
| OpenAI text-embedding-3-large (baseline) | 40% | 64% |
| voyage-context-3 | 48% | 68% |
| **voyage-context-3 + Sonnet context enrichment** | **60%** | 68% |
| + BM25 hybrid | 56% | 72% |
Sonnet enrichment adds **+12pp** Top-5 accuracy over voyage-context-3 alone. The BM25 hybrid improves recall (+4pp Top-10) but lowers precision (-4pp Top-5), because keyword overlap introduces false matches.
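The deltas quoted here follow directly from the table. A short check of the arithmetic, using the table's numbers (the dictionary keys and helper names are made up for this sketch):

```python
# (Top-5, Top-10) accuracy in percent, copied from the table.
results = {
    "openai_baseline": (40, 64),
    "voyage": (48, 68),
    "voyage_enriched": (60, 68),
    "voyage_enriched_bm25": (56, 72),
}


def pp_delta(new: float, old: float) -> float:
    """Absolute change in percentage points."""
    return new - old


def relative_gain(new: float, old: float) -> float:
    """Relative change in percent of the old value."""
    return (new - old) / old * 100


top5 = {name: scores[0] for name, scores in results.items()}
top10 = {name: scores[1] for name, scores in results.items()}

enrichment_pp = pp_delta(top5["voyage_enriched"], top5["voyage"])
overall_relative = relative_gain(top5["voyage_enriched"], top5["openai_baseline"])
enrichment_relative = relative_gain(top5["voyage_enriched"], top5["voyage"])
bm25_top5_pp = pp_delta(top5["voyage_enriched_bm25"], top5["voyage_enriched"])
bm25_top10_pp = pp_delta(top10["voyage_enriched_bm25"], top10["voyage_enriched"])

print(f"enrichment vs voyage alone: {enrichment_pp:+}pp Top-5 ({enrichment_relative:.0f}% relative)")
print(f"enriched vs OpenAI baseline: {overall_relative:.0f}% relative Top-5")
print(f"BM25 hybrid: {bm25_top5_pp:+}pp Top-5, {bm25_top10_pp:+}pp Top-10")
```

Note that the 50% relative figure compares against the OpenAI baseline; against voyage-context-3 alone, enrichment is a 25% relative gain.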
## Worked Example
Query: "send email"
- Baseline (OpenAI): resend__send_email #4, google_mail__send_email #6, outlook__send_mail not in the Top-10
- Optimized (Voyage + Context): outlook__send_mail #1, google_mail__send_email #2, resend__send_email #4; all three expected tools land in the Top-5
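The ranks in this example can be turned into a Top-k check as below. The ranks are taken from the two bullets above, with `None` marking a tool that missed the Top-10. The helper names and the strict "all expected tools found" criterion are assumptions for illustration; the paper's exact accuracy definition is not stated here.

```python
from typing import Optional

# Rank of each expected tool for the query "send email"; None = not in the Top-10.
baseline_ranks: dict[str, Optional[int]] = {
    "resend__send_email": 4,
    "google_mail__send_email": 6,
    "outlook__send_mail": None,
}
optimized_ranks: dict[str, Optional[int]] = {
    "outlook__send_mail": 1,
    "google_mail__send_email": 2,
    "resend__send_email": 4,
}


def hits_at_k(ranks: dict[str, Optional[int]], k: int) -> list[str]:
    """Expected tools retrieved at rank <= k, sorted by name."""
    return sorted(name for name, rank in ranks.items() if rank is not None and rank <= k)


def all_found_at_k(ranks: dict[str, Optional[int]], k: int) -> bool:
    """Strict criterion (assumed): every expected tool appears in the top k."""
    return len(hits_at_k(ranks, k)) == len(ranks)


for label, ranks in [("baseline", baseline_ranks), ("optimized", optimized_ranks)]:
    print(label, "Top-5 hits:", hits_at_k(ranks, 5), "all found:", all_found_at_k(ranks, 5))
```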
## References
- [[dynamic-react|Dynamic ReAct]]
- [[gaurav-dynamic-react-2025|Paper]]
- [[search-and-load|Search and Load]]