20260625: lots of new content
46
concepts/convex-hull-relaxation.md
Normal file
@@ -0,0 +1,46 @@
---
title: "Convex-Hull Relaxation (KV Cache)"
created: 2026-06-18
updated: 2026-06-18
type: concept
tags: ["optimization", "kv-cache", "convex-relaxation"]
sources: ["https://arxiv.org/abs/2602.08585"]
---

# Convex-Hull Relaxation
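
## Definition

Convex-Hull Relaxation is the core technique LU-KV uses to solve [[global-combinatorial-optimization]]. It convexifies each attention head's non-convex, discrete loss sequence so that a global greedy algorithm attains the optimal solution of the relaxed problem.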
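
## Why it is needed

The raw [[oracle-importance]] eviction losses L(M^π(0)), ..., L(M^π(T)), viewed as a function of the integer budget, are **not convex**. As a result:

- A greedy algorithm cannot be applied directly (greedy has no optimality guarantee on a non-convex objective)
- Dynamic programming is feasible, but its cost is too high (unacceptable at profiling scale)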
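
To make the first point concrete, here is a minimal sketch of greedy failing on a non-convex loss. The two loss tables are made-up toy numbers (not taken from LU-KV), and the function names are hypothetical. Head 0 has an increasing marginal gain (1, then 7), which is exactly the kind of non-convexity that misleads a one-unit-at-a-time greedy.

```python
from itertools import product

def greedy_allocate(losses, budget):
    """Hand out `budget` units one at a time, each to the head whose
    next unit lowers its own loss the most."""
    alloc = [0] * len(losses)
    for _ in range(budget):
        best_head, best_gain = None, float("-inf")
        for h, table in enumerate(losses):
            if alloc[h] + 1 < len(table):
                gain = table[alloc[h]] - table[alloc[h] + 1]
                if gain > best_gain:
                    best_head, best_gain = h, gain
        if best_head is None:  # every head is already at full budget
            break
        alloc[best_head] += 1
    return alloc

def total_loss(losses, alloc):
    """Sum of per-head losses at the given per-head budgets."""
    return sum(table[b] for table, b in zip(losses, alloc))

def brute_force_allocate(losses, budget):
    """Exact optimum by enumerating every allocation (toy sizes only)."""
    best = None
    for alloc in product(*(range(len(t)) for t in losses)):
        if sum(alloc) == budget:
            cost = total_loss(losses, alloc)
            if best is None or cost < best[0]:
                best = (cost, list(alloc))
    return best[1]

# Toy tables: losses[h][b] is the loss of head h at budget b (hypothetical values).
losses = [[10, 9, 2], [10, 7, 5]]
greedy = greedy_allocate(losses, 2)       # [0, 2], total loss 15
exact = brute_force_allocate(losses, 2)   # [2, 0], total loss 12
```

Greedy never sees the large second-step gain of head 0 because the first step looks unattractive, so it ends at a total loss of 15 while the exact optimum is 12.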
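
## Method: PAVA isotonic regression

LU-KV uses the Pool Adjacent Violators Algorithm (PAVA) to perform isotonic regression:

1. Compute the sequence of **marginal decrements** of the raw loss, d(i) = L(i-1) - L(i) (which may be non-monotone)
2. Run isotonic regression on d(i), projecting it onto a non-negative, non-increasing sequence d̆(i) >= 0
3. Reconstruct the loss sequence from the projected marginal decrements: L̆(i) = L̆(i-1) - d̆(i)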

Result: L̆ is **convex and non-increasing**, i.e. the marginal gain g(i) = L̆(i-1) - L̆(i) >= 0 and is non-increasing in i.
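
The three steps can be sketched in plain Python as follows. This is an illustrative sketch, not the LU-KV implementation: the function names are hypothetical, all points are given equal weight, the non-negativity constraint is handled by clipping the PAVA output at zero, and the reconstruction is anchored at L̆(0) = L(0), which is an assumption the note does not state.

```python
def pava_non_increasing(values):
    """Equal-weight least-squares fit of a non-increasing sequence (PAVA)."""
    blocks = []  # each block is [sum, count]; its fitted value is sum / count
    for v in values:
        blocks.append([float(v), 1])
        # A block whose mean exceeds the mean of the block before it violates
        # the non-increasing order, so pool the two and check again.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

def convexify_loss(loss):
    """Return (relaxed loss sequence, projected marginal decrements)."""
    # Step 1: marginal decrements d(i) = L(i-1) - L(i).
    d = [loss[i - 1] - loss[i] for i in range(1, len(loss))]
    # Step 2: isotonic fit, then clip at zero to keep decrements non-negative.
    d_fit = [max(x, 0.0) for x in pava_non_increasing(d)]
    # Step 3: rebuild the loss; anchoring at loss[0] is an assumption.
    relaxed = [float(loss[0])]
    for step in d_fit:
        relaxed.append(relaxed[-1] - step)
    return relaxed, d_fit

# Toy loss (hypothetical values): decrements 1, 7, 1 are not monotone.
relaxed, d_fit = convexify_loss([10, 9, 2, 1])
# relaxed == [10.0, 6.0, 2.0, 1.0], d_fit == [4.0, 4.0, 1.0]
```

In this toy case the first two decrements (1 and 7) are pooled into their mean 4, and the relaxed sequence coincides with the lower convex hull of the points (i, L(i)).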
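
## Optimality guarantee

After convexification, the marginal gains g(i) are non-increasing → the greedy algorithm yields the optimal solution of the convex resource-allocation problem → **greedy = DP optimum**. Figure 2a of the paper verifies that the greedy solution matches the exact DP solution.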
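
The greedy-equals-DP claim can be checked on a small example. The sketch below uses two hypothetical, already-convex loss tables and illustrative function names; it is not the paper's code. It compares a heap-based greedy allocation against an exact dynamic program for every total budget.

```python
import heapq

def greedy_allocate(tables, budget):
    """Give each unit to the head with the largest next marginal gain."""
    alloc = [0] * len(tables)
    # Max-heap via negated gains: one entry per head for its next unit.
    heap = [(-(t[0] - t[1]), h) for h, t in enumerate(tables) if len(t) > 1]
    heapq.heapify(heap)
    for _ in range(budget):
        if not heap:
            break
        _, h = heapq.heappop(heap)
        alloc[h] += 1
        t, b = tables[h], alloc[h]
        if b + 1 < len(t):
            heapq.heappush(heap, (-(t[b] - t[b + 1]), h))
    return alloc

def dp_min_loss(tables, budget):
    """Exact minimum total loss when exactly `budget` units are allocated."""
    inf = float("inf")
    best = [0.0] + [inf] * budget  # best[u]: min loss so far using u units
    for t in tables:
        new = [inf] * (budget + 1)
        for used in range(budget + 1):
            if best[used] == inf:
                continue
            for b in range(min(len(t) - 1, budget - used) + 1):
                new[used + b] = min(new[used + b], best[used] + t[b])
        best = new
    return best[budget]

# Hypothetical convexified tables: gains are (4, 4, 1) and (3, 2, 1).
tables = [[10.0, 6.0, 2.0, 1.0], [10.0, 7.0, 5.0, 4.0]]
for budget in range(7):
    alloc = greedy_allocate(tables, budget)
    greedy_loss = sum(t[b] for t, b in zip(tables, alloc))
    assert greedy_loss == dp_min_loss(tables, budget)
```

The greedy pass needs one heap operation per allocated unit, whereas the DP loops over every head, every used budget, and every per-head budget, which reflects the cost gap that motivates the relaxation.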

## Related concepts

- [[global-combinatorial-optimization]]: the target problem that the convex relaxation solves
- [[marginal-utility]]: the ordered marginal gains obtained after convex relaxation
- [[offline-profiling]]: the convex relaxation is computed offline during profiling
- [[isotonic-regression]]: PAVA is an isotonic-regression method

## References

- [[tang-lukv|LU-KV]] (Tang et al., ICML 2026): Appendix A.1 gives the proof of non-convexity