2026-06-01 10:46:01 +08:00
parent 2faf4bb002
commit e96b955fda
221 changed files with 10219 additions and 332 deletions

@@ -0,0 +1,52 @@
---
title: "Review: SkillOpt, a Text-Space Optimizer for Agent Skills"
created: 2026-05-29
type: review
paper: "yang-skillopt-2026"
arxiv: "2605.23904"
---
# 📌 Review: SkillOpt
**Paper**: SkillOpt: Executive Strategy for Self-Evolving Agent Skills
**Authors**: Yifan Yang, Ziyang Gong, Weiquan Huang et al. (15 authors)
**Affiliations**: Microsoft, SJTU, Tongji, Fudan
**arXiv**: 2605.23904 | **Category**: cs.AI | **Date**: 2026-05-29
---
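## 🎯 Core Concepts
1. **[[skillopt|SkillOpt]]**: the first systematic text-space optimizer for agent skills (52/52 best or tied)
2. **[[text-space-optimizer|Text-Space Optimizer]]**: models skill training as optimization in text space, forming an exact analogy with optimization in weight space
3. **[[textual-learning-rate|Textual Learning Rate]]**: the edit budget L_t controls the optimization step size
4. **[[held-out-validation-gate|Held-Out Validation Gate]]**: a candidate edit is accepted only if it improves results on the held-out set
5. **[[rejected-edit-buffer|Rejected-Edit Buffer]]**: negative feedback signal from failed edits (epoch-local)
6. **[[slow-meta-update|Slow/Meta Update]]**: the text-space counterpart of momentum, capturing regularities that persist across epochs
7. **[[skill-as-external-state|Skill as External State]]**: adaptation does not have to change weights; the skill itself is trainable external state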
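
To make concepts 3 and 4 concrete, here is a minimal sketch of a single gated update step. It is not the paper's implementation: `edit_size`, `heldout_score`, and `gated_step` are hypothetical names, the edit budget is assumed to be measured in changed lines, and the held-out metric is a toy keyword count standing in for a real evaluation.

```python
import difflib


def edit_size(old: str, new: str) -> int:
    """Count lines added or removed between two skill texts (assumed budget unit)."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return sum(1 for entry in diff if entry.startswith(("+ ", "- ")))


def heldout_score(text: str) -> int:
    """Toy stand-in for a held-out evaluation: counts useful instructions present."""
    lowered = text.lower()
    return sum(kw in lowered for kw in ("check units", "cite source"))


def gated_step(skill: str, candidate: str, budget: int):
    """Return (skill_after_step, accepted).

    The candidate is rejected if it exceeds the edit budget (the textual
    learning rate) or if it does not strictly improve the held-out score.
    """
    if edit_size(skill, candidate) > budget:
        return skill, False
    if heldout_score(candidate) > heldout_score(skill):
        return candidate, True
    return skill, False


skill = "Read the task.\nAnswer."
good = "Read the task.\nCheck units.\nAnswer."
neutral = "Read the task.\nAnswer quickly."
too_big = "Read the task.\nCheck units.\nAnswer.\nCite source.\nA\nB"
```

In this toy setting, `good` is accepted with `budget=2`, `neutral` is rejected because it does not improve the held-out score, and `too_big` is rejected for exceeding the budget even though its score would be higher. A real system would replace the keyword count with task performance on held-out examples.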
---
## 🔗 Concept Network
**Core chain**: `skillopt` → `text-space-optimizer` → `textual-learning-rate` → `held-out-validation-gate` → `slow-meta-update`
**Feedback loop**: `held-out-validation-gate` → `rejected-edit-buffer` → optimizer → `held-out-validation-gate`
**Overarching philosophy**: `skill-as-external-state` → connects to `model-harness-relationship` + `heuristic-learning`
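
The feedback loop can be sketched as a toy epoch loop. This is an illustration under stated assumptions, not the paper's algorithm: `POOL`, `propose`, `optimize`, and `heldout_score` are hypothetical, the proposer simply walks a fixed list of candidate lines, and the slow/meta update is reduced to carrying rejected lines forward as persistent lessons.

```python
from typing import List, Tuple

# Hypothetical pool of one-line edits a proposer might try, in order.
POOL = ["Guess quickly.", "Check units.", "Skip validation.", "Cite source."]


def heldout_score(text: str) -> int:
    """Toy stand-in for a held-out evaluation."""
    lowered = text.lower()
    return sum(kw in lowered for kw in ("check units", "cite source"))


def propose(skill: str, rejected: List[str], lessons: List[str]) -> str:
    """Pick the first edit not already applied, rejected this epoch, or ruled out."""
    for line in POOL:
        candidate = f"{skill}\n{line}"
        if line in skill or line in lessons or candidate in rejected:
            continue
        return candidate
    return skill  # nothing left to try


def optimize(skill: str, epochs: int, steps: int) -> Tuple[str, List[str]]:
    lessons: List[str] = []  # slow/meta state: persists across epochs
    for _ in range(epochs):
        rejected: List[str] = []  # rejected-edit buffer: reset every epoch
        for _ in range(steps):
            candidate = propose(skill, rejected, lessons)
            if candidate == skill:
                break
            if heldout_score(candidate) > heldout_score(skill):
                skill = candidate  # held-out gate passed
            else:
                rejected.append(candidate)  # negative feedback for the proposer
        # Distill this epoch's failures into lessons that survive the buffer reset.
        lessons.extend(c.splitlines()[-1] for c in rejected)
    return skill, lessons


final_skill, lessons = optimize("Read the task.", epochs=2, steps=2)
```

With these toy inputs, `final_skill` ends up containing both useful lines and `lessons` holds the two rejected ones. Without the persistent `lessons` list, the second epoch would propose "Guess quickly." again, because the epoch-local buffer has been cleared by then.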
---
## 📚 Wiki Integration
- **New pages**: 10 (1 raw + 1 paper + 7 concepts + 1 review)
- **Link integrity**: 100%, no broken links ✅
- **Total size**: 527 → 535 pages
---
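## 💡 Key Insights
**1. "The analogy is operational, not decorative."** The most elegant aspect of SkillOpt is that, in its analogy to deep learning optimizers, **every component has an operational counterpart**: learning rate → edit budget, validation → held-out gate, momentum → slow update. This is not a metaphor but an optimization framework translated in full. It may be the first time in AI history that "optimizing a natural-language artifact" has been treated this systematically.
**2. A paradigm shift from "changing parameters" to "changing documents."** SkillOpt states explicitly that adaptation ≠ weight update. Skill as trainable external state forms a complete narrative line with `model-harness-relationship`, `heuristic-learning`, and `compiled-ai-paradigm`, which are already being developed today: AI adaptation is migrating from the weights inside the model to artifacts outside it (skill/harness/code). This is a deep trend that closely matches the defining characteristics of the current GenAI wave (generative, general-purpose, unified).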