20260601
47
concepts/textual-learning-rate.md
Normal file
@@ -0,0 +1,47 @@
---
title: "Textual Learning Rate (文本学习率)"
created: 2026-05-29
updated: 2026-05-29
type: concept
tags: ["optimization", "skill", "learning-rate", "control"]
sources: ["https://arxiv.org/abs/2605.23904"]
---
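
# Textual Learning Rate (文本学习率)

**Textual Learning Rate** is the core mechanism in [[skillopt|SkillOpt]] for controlling the optimization step size: the maximum number of skill edits, L_t, that may be applied at each step. It is the direct text-space analogue of the learning rate η in deep learning.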

## Why it is needed

Unconstrained text rewriting can:
- delete useful rules
- introduce incompatible instructions
- overfit to local failures
- lose continuity with the earlier optimization history

## Scheduling strategies

SkillOpt supports four schedules for the edit budget:

| Strategy | Behavior |
|------|------|
| **Constant** | L_t stays fixed |
| **Linear** | L_t decays linearly |
| **Cosine** (default) | Large steps early → small steps late → convergence |
| **Autonomous** | The optimizer decides the budget itself |

The default cosine schedule starts with larger edits (exploration) and gradually decays to smaller consolidation steps (fine-tuning).
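
The three fixed schedules in the table can be sketched as a single budget function. This is a minimal illustration, not SkillOpt's implementation: the function name `edit_budget`, the bounds `l_max` and `l_min`, their default values, and the rounding rule are all assumptions made here. The Autonomous strategy is left out because the optimizer chooses that budget itself.

```python
import math


def edit_budget(step, total_steps, schedule="cosine", l_max=8, l_min=1):
    """Return the maximum number of skill edits allowed at `step` (0-indexed).

    All names and defaults are hypothetical; the source only states the
    qualitative shape of each schedule.
    """
    if total_steps < 1 or not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")
    # Fraction of the optimization run completed, in [0, 1].
    progress = step / (total_steps - 1) if total_steps > 1 else 0.0
    if schedule == "constant":
        budget = l_max
    elif schedule == "linear":
        # Straight-line decay from l_max down to l_min.
        budget = l_max - (l_max - l_min) * progress
    elif schedule == "cosine":
        # Half-cosine decay: slow at the start and end, fastest in the middle.
        budget = l_min + 0.5 * (l_max - l_min) * (1 + math.cos(math.pi * progress))
    else:
        raise ValueError(f"unknown schedule: {schedule}")
    # An edit count must be a whole number and never drop below l_min.
    return max(l_min, round(budget))


cosine_budgets = [edit_budget(t, 5) for t in range(5)]
linear_budgets = [edit_budget(t, 5, schedule="linear") for t in range(5)]
constant_budgets = [edit_budget(t, 5, schedule="constant") for t in range(5)]
```

With these assumed bounds, every decaying schedule starts at `l_max` edits and ends at `l_min`, so the early steps can restructure the skill while the last steps only make small consolidating changes.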

## Analogy with the learning rate

```
θ ← θ - η∇L  →  Skill ← Skill + bounded_edits(L_t)
```

Both control how far a single step may go: too large a step causes instability, too small a step slows convergence.
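
The bounded update on the right-hand side can be illustrated with a toy example. This sketch assumes a skill is a list of rule strings and that each proposed edit carries a `priority` score used to pick which edits fit in the budget. The `Edit` type, the `apply_bounded_edits` function, and the ranking rule are hypothetical; the source does not say how SkillOpt represents or selects edits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    op: str          # "add" or "remove"
    rule: str        # the rule text the edit targets
    priority: float  # hypothetical usefulness score, higher is better


def apply_bounded_edits(skill, proposed, budget):
    """Apply at most `budget` of the proposed edits, highest priority first."""
    ranked = sorted(proposed, key=lambda e: e.priority, reverse=True)
    chosen = ranked[:max(budget, 0)]
    rules = list(skill)  # copy so the caller's skill is not mutated
    for edit in chosen:
        if edit.op == "add":
            if edit.rule not in rules:
                rules.append(edit.rule)
        elif edit.op == "remove":
            if edit.rule in rules:
                rules.remove(edit.rule)
        else:
            raise ValueError(f"unknown edit op: {edit.op}")
    return rules


skill = ["cite sources", "answer briefly"]
proposed = [
    Edit("add", "check units", 0.9),
    Edit("remove", "answer briefly", 0.4),
    Edit("add", "use tables", 0.7),
]
updated = apply_bounded_edits(skill, proposed, budget=2)
```

With a budget of 2, only the two highest-priority edits are applied, so the low-priority removal of an existing rule is deferred. A budget of 0 leaves the skill unchanged, which mirrors η = 0 in gradient descent.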

## Related

- [[text-space-optimizer]]: the text-space optimization paradigm
- [[skillopt]]: the method that uses the textual learning rate
- [[yang-skillopt-2026]]: the original paper