---
title: "Review: SkillOpt, a Text-Space Optimizer for Agent Skills"
created: 2026-05-29
type: review
paper: "yang-skillopt-2026"
arxiv: "2605.23904"
---
# 📌 Review: SkillOpt
**Paper**: SkillOpt: Executive Strategy for Self-Evolving Agent Skills
**Authors**: Yifan Yang, Ziyang Gong, Weiquan Huang et al. (15 authors)
**Affiliations**: Microsoft, SJTU, Tongji, Fudan
**arXiv**: 2605.23904 | **Field**: cs.AI | **Date**: 2026-05-29
---
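## 🎯 Core Concepts
1. **[[skillopt|SkillOpt]]**: the first systematic text-space optimizer for agent skills; 52/52 best or tied
2. **[[text-space-optimizer|Text-Space Optimizer]]**: models skill training as optimization in text space, forming a precise analogy with weight-space optimization
3. **[[textual-learning-rate|Textual Learning Rate]]**: the edit budget L_t controls the optimization step size
4. **[[held-out-validation-gate|Held-Out Validation Gate]]**: a candidate edit is accepted only if it improves performance on the held-out set
5. **[[rejected-edit-buffer|Rejected-Edit Buffer]]**: negative feedback signal from failed edits (epoch-local)
6. **[[slow-meta-update|Slow/Meta Update]]**: the text-space counterpart of momentum, capturing regularities that persist across epochs
7. **[[skill-as-external-state|Skill as External State]]**: adaptation does not have to change weights; the skill itself is trainable external state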
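
The edit budget, held-out gate, and rejected-edit buffer listed above can be sketched as a single-epoch loop. This is a minimal illustration under stated assumptions, not the paper's implementation: representing an edit as one appended line, measuring the budget in characters, and the names `optimize_epoch`, `propose`, `score_held_out`, and the toy proposer and scorer are all hypothetical. The slow/meta update across epochs is left out. With the toy inputs, calling `optimize_epoch("Answer briefly.", toy_propose, toy_score, budget=40, steps=10)` accepts the two edits that raise the held-out score and records the other two in the buffer.

```python
from typing import Callable, List, Optional, Tuple

# Assumption: an edit is a single line of text appended to the skill document.
Edit = str


def edit_cost(edit: Edit) -> int:
    """Edit size in characters; a stand-in for whatever unit the budget L_t uses."""
    return len(edit)


def apply_edit(skill: str, edit: Edit) -> str:
    return skill + "\n" + edit


def optimize_epoch(
    skill: str,
    propose: Callable[[str, List[Edit]], Optional[Edit]],
    score_held_out: Callable[[str], float],
    budget: int,
    steps: int,
) -> Tuple[str, float, List[Edit]]:
    """Run one epoch of gated text-space updates on a skill document."""
    best = score_held_out(skill)
    rejected: List[Edit] = []  # epoch-local: starts empty on every call
    for _ in range(steps):
        edit = propose(skill, rejected)  # the proposer sees past failures
        if edit is None:
            break
        if edit_cost(edit) > budget:  # step too large for the edit budget
            rejected.append(edit)
            continue
        candidate = apply_edit(skill, edit)
        score = score_held_out(candidate)
        if score > best:  # held-out gate: accept only strict improvement
            skill, best = candidate, score
        else:
            rejected.append(edit)
    return skill, best, rejected


# Toy proposer and scorer, purely illustrative.
CANDIDATES = [
    "Be nice.",             # no held-out gain, so the gate rejects it
    "Always check units.",  # improves the toy score
    "Cite sources. " * 20,  # exceeds the budget used in the usage example
    "Cite sources.",        # improves the toy score
]


def toy_propose(skill: str, rejected: List[Edit]) -> Optional[Edit]:
    """Return the first candidate that is neither rejected nor already applied."""
    for text in CANDIDATES:
        if text not in rejected and text not in skill:
            return text
    return None


def toy_score(skill: str) -> float:
    """Count how many of two target instructions the skill contains."""
    return float(("check units" in skill) + ("Cite sources" in skill))
```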
---
## 🔗 Concept Network
**Core chain**: `skillopt` → `text-space-optimizer` → `textual-learning-rate` → `held-out-validation-gate` → `slow-meta-update`
**Feedback loop**: `held-out-validation-gate` → `rejected-edit-buffer` → optimizer → `held-out-validation-gate`
**Higher-level philosophy**: `skill-as-external-state` → connects to `model-harness-relationship` + `heuristic-learning`
---
## 📚 Wiki Integration
- **New pages**: 10 (1 raw + 1 paper + 7 concepts + 1 review)
- **Link integrity**: 100%, no broken links ✅
- **Total size**: 527 → 535 pages
---
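## 💡 Key Insights
**1. "The analogy is operational, not decorative."** The most elegant aspect of SkillOpt is that, in its analogy to deep-learning optimizers, **every component has an operational counterpart**: learning rate → edit budget, validation → held-out gate, momentum → slow update. This is not a metaphor but an optimization framework translated in full. It may be the first time in AI history that "optimizing a natural-language artifact" has been treated this systematically.
**2. A paradigm shift from "changing parameters" to "changing documents."** SkillOpt states explicitly that adaptation ≠ weight update. Skill as trainable external state joins `model-harness-relationship`, `heuristic-learning`, and `compiled-ai-paradigm`, threads already being developed today, to form one complete narrative: AI adaptation is migrating from weights inside the model to artifacts outside it (skill/harness/code). This deep trend is highly consistent with the defining features of the current GenAI wave (generative, general-purpose, unified).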