20260601
45
concepts/text-space-optimizer.md
Normal file
@@ -0,0 +1,45 @@
---
title: "Text-Space Optimizer (文本空间优化器)"
created: 2026-05-29
updated: 2026-05-29
type: concept
tags: ["optimization", "text-space", "agent", "skill"]
sources: ["https://arxiv.org/abs/2605.23904"]
---

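# Text-Space Optimizer (文本空间优化器)

**Text-Space Optimizer** is the core paradigm introduced by [[skillopt|SkillOpt]]: it models the training of an agent skill as an **optimization problem in text space**, in precise structural analogy to deep-learning optimization in weight space.
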
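## Why text-space optimization is needed

Traditional ways of creating skills lack the basic properties of an optimizer:
- **Hand-written / one-shot generation**: no feedback loop
- **Loose self-correction**: no controls (learning rate, validation, momentum)
- **Lack of reproducibility**: results are unpredictable from run to run
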
## Mapping from weight space to text space

SkillOpt establishes the following precise analogy:

| Component | Weight space (θ) | Text space (Skill) |
|------|:---:|:---:|
| Optimization target | Floating-point tensors | Markdown document |
| Update operation | θ ← θ - η∇L | ADD/DELETE/REPLACE |
| Step-size control | Learning rate η | [[textual-learning-rate\|Edit budget L_t]] |
| Data split | Train/Val/Test | Rollout/Validation/Test |
| Overfitting prevention | Early stopping | [[held-out-validation-gate\|Validation gate]] |
| Negative feedback | Gradient descent | [[rejected-edit-buffer\|Rejected buffer]] |
| Momentum | EMA / Adam β | [[slow-meta-update\|Epoch-wise slow update]] |

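The mapping above can be made concrete with a small sketch. This is not SkillOpt's implementation: `apply_edits`, `TextSpaceOptimizer`, `score`, and `edit_budget` are hypothetical names chosen for illustration, the skill is simplified to a list of Markdown lines, and filtering proposals against the rejected buffer is an assumption about how such a buffer might be used. The epoch-wise slow update is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical edit format: (op, target_line, new_line),
# where op is "ADD", "DELETE", or "REPLACE".
Edit = Tuple[str, str, str]


def apply_edits(skill: List[str], edits: List[Edit]) -> List[str]:
    """Return a new skill document with the edits applied in order."""
    lines = list(skill)
    for op, target, new in edits:
        if op == "ADD":
            lines.append(new)
        elif op == "DELETE" and target in lines:
            lines.remove(target)
        elif op == "REPLACE" and target in lines:
            lines[lines.index(target)] = new
    return lines


@dataclass
class TextSpaceOptimizer:
    score: Callable[[List[str]], float]  # held-out validation score, higher is better
    edit_budget: int = 2                 # max edits per step (the textual "learning rate")
    rejected: List[Edit] = field(default_factory=list)  # rejected-edit buffer

    def step(self, skill: List[str], proposed: List[Edit]) -> List[str]:
        # Assumption: previously rejected edits are skipped, then the
        # remaining proposals are clipped to the edit budget.
        candidates = [e for e in proposed if e not in self.rejected][: self.edit_budget]
        if not candidates:
            return skill
        candidate_skill = apply_edits(skill, candidates)
        # Validation gate: accept only a strict improvement on held-out data.
        if self.score(candidate_skill) > self.score(skill):
            return candidate_skill
        self.rejected.extend(candidates)
        return skill


# Toy usage with a stand-in validation score (illustrative only).
def toy_score(doc: List[str]) -> float:
    return sum("verify" in line for line in doc) - sum("guess" in line for line in doc)


skill = ["# Skill", "- guess the answer"]
bad = ("ADD", "", "- guess more")
good = ("REPLACE", "- guess the answer", "- verify the answer")

opt = TextSpaceOptimizer(score=toy_score, edit_budget=1)
s1 = opt.step(skill, [bad, good])  # budget of 1 admits only `bad`; the gate rejects it
s2 = opt.step(s1, [bad, good])     # `bad` is now buffered, so `good` is tried
```

In this toy run the first step leaves the skill unchanged and records the rejected edit, and the second step accepts the replacement. A real setup would presumably obtain `proposed` from an LLM reading rollout traces and compute `score` on a validation split kept separate from those rollouts.
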
## Core insight

> "The deep-learning analogy is operational rather than decorative."

The analogy is more than a metaphor: every component has an **operational counterpart**. Skill optimization therefore stops being ad hoc tweaking and becomes a controlled, reproducible training process.

## Related

- [[skillopt]]: the concrete implementation in SkillOpt
- [[skill-as-external-state]]: why text can be optimized
- [[yang-skillopt-2026]]: the original paper