myWiki/concepts/text-space-optimizer.md
2026-06-01 10:46:01 +08:00

---
title: "Text-Space Optimizer"
created: 2026-05-29
updated: 2026-05-29
type: concept
tags: ["optimization", "text-space", "agent", "skill"]
sources: ["https://arxiv.org/abs/2605.23904"]
---
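# Text-Space Optimizer
**Text-Space Optimizer** is the core paradigm introduced by [[skillopt|SkillOpt]]: it models the training of an agent skill as an **optimization problem in text space**, forming a precise structural analogy with deep-learning optimization in weight space.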
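## Why text-space optimization is needed
None of the traditional ways of creating a skill has the basic characteristics of an optimizer:
- **Hand-written / one-shot generation**: no feedback loop
- **Loose self-correction**: no controls (learning rate, validation, momentum)
- **Lack of reproducibility**: the result is unpredictable from run to run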
## Mapping from weight space to text space
The precise analogy established by SkillOpt:
| Component | Weight space (θ) | Text space (skill) |
|------|:---:|:---:|
| Object being optimized | Floating-point tensors | Markdown document |
| Update operation | θ ← θ - η∇L | ADD/DELETE/REPLACE |
| Step-size control | Learning rate η | [[textual-learning-rate\|Edit budget L_t]] |
| Data split | Train/Val/Test | Rollout/Validation/Test |
| Overfitting prevention | Early stopping | [[held-out-validation-gate\|Validation gate]] |
| Negative feedback | Gradient descent | [[rejected-edit-buffer\|Rejected buffer]] |
| Momentum | EMA / Adam β | [[slow-meta-update\|Epoch-wise slow update]] |
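
To make the mapping concrete, here is a minimal Python sketch of one text-space update step. It is not SkillOpt's actual implementation. The names `Edit`, `apply_edit`, `optimize_step`, and `toy_score` are hypothetical, and so are the line-level edit format, the choice to count only accepted edits against the budget, and the toy scoring rule. The rejected buffer is reduced here to skipping edits that already failed, and the epoch-wise slow update is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Edit:
    """Hypothetical line-level edit; the real edit format may differ."""
    op: str          # "ADD", "DELETE", or "REPLACE"
    index: int       # line position in the current skill document
    text: str = ""   # new line content for ADD / REPLACE


def apply_edit(skill: List[str], edit: Edit) -> List[str]:
    """Return a new skill document with one edit applied."""
    lines = list(skill)
    if edit.op == "ADD":
        lines.insert(edit.index, edit.text)
    elif edit.op == "DELETE":
        del lines[edit.index]
    elif edit.op == "REPLACE":
        lines[edit.index] = edit.text
    else:
        raise ValueError(f"unknown edit op: {edit.op}")
    return lines


def optimize_step(
    skill: List[str],
    proposals: List[Edit],
    score_val: Callable[[List[str]], float],
    budget: int,
    rejected: List[Edit],
) -> Tuple[List[str], float]:
    """Apply at most `budget` edits, keeping only those that pass the gate."""
    best = score_val(skill)
    applied = 0
    for edit in proposals:
        if applied >= budget:        # edit budget: plays the role of step size
            break
        if edit in rejected:         # rejected buffer: skip known-bad edits
            continue
        candidate = apply_edit(skill, edit)
        score = score_val(candidate)
        if score > best:             # validation gate: keep strict improvements
            skill, best = candidate, score
            applied += 1
        else:
            rejected.append(edit)
    return skill, best


def toy_score(skill: List[str]) -> int:
    """Toy stand-in for a validation score (an assumption for this demo)."""
    text = "\n".join(skill).lower()
    hits = sum(kw in text for kw in ("cite sources", "check units"))
    todos = sum("todo" in line.lower() for line in skill)
    return hits - todos


skill = ["Answer briefly.", "TODO: fill in later"]
proposals = [
    Edit("ADD", 1, "Always cite sources."),
    Edit("REPLACE", 0, "Answer at length."),
    Edit("DELETE", 2),
    Edit("ADD", 0, "Check units."),
]
rejected: List[Edit] = []
new_skill, new_score = optimize_step(skill, proposals, toy_score, budget=2, rejected=rejected)
```

With `budget=2`, the first and third proposals are accepted, the second lands in the rejected buffer because it does not improve the score, and the fourth is never tried because the budget is already spent. Raising the budget allows larger changes per step, in the same way a larger learning rate allows larger weight updates.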
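## Core insight
> "The deep-learning analogy is operational rather than decorative."
The analogy is more than a metaphor: every component has an **operational counterpart**. This turns skill optimization from ad hoc tweaking into a controlled, reproducible training process.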
## Related
- [[skillopt]]: the concrete implementation, SkillOpt
- [[skill-as-external-state]]: why text can be optimized
- [[yang-skillopt-2026]]: the original paper