SidneyZhang/myWiki

Sidney Zhang e96b955fda

20260601

2026-06-01 10:46:01 +08:00

title: AutoHarness
created: 2026-05-29
updated: 2026-05-29
type: concept
tags: agent, code-synthesis, game-playing, LLM
sources: https://arxiv.org/abs/2603.03329

AutoHarness

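AutoHarness is a method for augmenting LLM agents, proposed by Lou et al. (Google DeepMind, 2026): the LLM automatically synthesizes a code harness for its own use, in order to eliminate the illegal actions an agent takes in structured environments.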

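Background

LLM agents frequently produce illegal actions in games (78% of Gemini-2.5-Flash's chess losses stem from illegal moves). Traditional approaches:

Hand-written harness: brittle, and must be rewritten for every game
Fine-tuning: expensive, and degrades general capabilities
Inference-time search (e.g. Tree of Thoughts): relies on the LLM's internal world model, which tends to hallucinate which transitions are legal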

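Core idea

"Code as Harness": the LLM not only serves as the agent's "brain" but also writes its own "protective shell": an external, verifiable program that checks whether actions are legal.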

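As a rough illustration of what such an external legality check could look like, here is a minimal sketch for tic-tac-toe. The game, the `Board` type, and the function name `is_legal_action` are assumptions made for this example; they are not taken from the paper, where the harness code is synthesized by the LLM itself.

```python
from typing import List, Optional

# Hypothetical board encoding: 9 cells, each "X", "O", or None (empty).
Board = List[Optional[str]]


def is_legal_action(board: Board, action: str) -> bool:
    """Return True if `action` is a decimal cell index that names an empty cell."""
    text = action.strip()
    if not text.isdecimal():
        return False  # rejects free-form text such as "e4" or "top left"
    cell = int(text)
    return 0 <= cell < len(board) and board[cell] is None


board: Board = ["X", None, None, None, "O", None, None, None, None]
print(is_legal_action(board, "0"))   # cell 0 is occupied
print(is_legal_action(board, "1"))   # cell 1 is empty
print(is_legal_action(board, "e4"))  # not a cell index at all
```

Because the check is ordinary code, it can be tested against the environment directly instead of trusting the model's own belief about the rules.
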
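Search mechanism

thompson-sampling-code-search: maintains multiple code hypotheses, balancing exploration and exploitation
LLM as mutation operator: proposes code improvements based on environment feedback
Critic: provides structured feedback (action legality, reward)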
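
The interplay of the three components above can be sketched with a toy Beta-Bernoulli Thompson sampler. Everything here is an assumption for illustration only: a "code hypothesis" is reduced to a single integer `limit`, the `critic` compares the hypothesis with a hidden rule `TRUE_LIMIT`, and `mutate` is a random stub standing in for the LLM. The paper's actual search procedure may differ in structure and detail.

```python
import random
from dataclasses import dataclass
from typing import List

TRUE_LIMIT = 7  # hidden toy rule: actions 0..6 are legal in the environment


@dataclass
class Candidate:
    """One code hypothesis (reduced to `limit`) with success/failure counts."""
    limit: int
    successes: int = 0
    failures: int = 0

    def is_legal(self, action: int) -> bool:
        return 0 <= action < self.limit


def critic(candidate: Candidate, rng: random.Random) -> bool:
    """Toy critic: does the hypothesis agree with the environment on one action?"""
    action = rng.randrange(10)
    return candidate.is_legal(action) == (0 <= action < TRUE_LIMIT)


def mutate(parent: Candidate, rng: random.Random) -> Candidate:
    """Stub for the LLM mutation operator: nudge the limit by one."""
    return Candidate(limit=max(0, parent.limit + rng.choice([-1, 1])))


def search(iterations: int, seed: int = 0, max_pool: int = 20) -> List[Candidate]:
    rng = random.Random(seed)
    pool = [Candidate(limit=3)]
    for _ in range(iterations):
        # Thompson sampling: draw once from each Beta posterior, keep the best draw.
        chosen = max(
            pool,
            key=lambda c: rng.betavariate(c.successes + 1, c.failures + 1),
        )
        if critic(chosen, rng):
            chosen.successes += 1
        else:
            chosen.failures += 1
            if len(pool) < max_pool:
                pool.append(mutate(chosen, rng))  # feedback triggers a revision
    return pool


pool = search(iterations=500)
# Report the hypothesis with the highest posterior mean.
best = max(pool, key=lambda c: (c.successes + 1) / (c.successes + c.failures + 2))
print(best.limit, best.successes, best.failures)
```

Sampling from the posterior rather than always picking the current best is what gives rarely tried hypotheses a chance to be evaluated, which is the exploration/exploitation balance mentioned above.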

Three harness forms

Mode	Description	LLM called at inference
[[harness-as-action-verifier|Verifier]]	LLM proposes → code verifies → retry if illegal	✅
Action Filter	Code generates the legal set → LLM ranks	✅
[[harness-as-policy|Policy]]	Pure code decides	❌
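
The difference between the three modes can be sketched on a toy one-dimensional board. The board encoding, the function names, and the stub "LLM" callables below are all assumptions for illustration; in AutoHarness the harness code itself would be synthesized rather than hand-written like this.

```python
from typing import Callable, List, Sequence


def legal_actions(state: Sequence[int]) -> List[int]:
    """Toy harness code: a cell is a legal move if it is empty (value 0)."""
    return [i for i, cell in enumerate(state) if cell == 0]


def verifier_mode(
    state: Sequence[int],
    propose: Callable[[Sequence[int], int], int],
    max_retries: int = 3,
) -> int:
    """The LLM proposes, the code verifies, illegal proposals are retried."""
    legal = set(legal_actions(state))
    for attempt in range(max_retries):
        action = propose(state, attempt)
        if action in legal:
            return action
    raise RuntimeError("no legal action proposed within the retry budget")


def filter_mode(
    state: Sequence[int],
    rank: Callable[[Sequence[int], List[int]], List[int]],
) -> int:
    """The code enumerates the legal set, the LLM only ranks it (assumes a non-empty set)."""
    return rank(state, legal_actions(state))[0]


def policy_mode(state: Sequence[int]) -> int:
    """Pure code decides; here it simply takes the first legal cell."""
    return legal_actions(state)[0]


def stub_propose(state: Sequence[int], attempt: int) -> int:
    return attempt  # stand-in for an LLM: proposes 0, then 1, ...


def stub_rank(state: Sequence[int], legal: List[int]) -> List[int]:
    return sorted(legal, reverse=True)  # stand-in for an LLM ranking


state = [1, 0, 2, 0]  # cells 1 and 3 are empty
print(verifier_mode(state, stub_propose))  # 1 (the first proposal, 0, is rejected)
print(filter_mode(state, stub_rank))       # 3
print(policy_mode(state))                  # 1
```

Only `policy_mode` runs without any model call at decision time, which is why it corresponds to the row with no LLM invocation at inference.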

Results

100% legal-action rate across 145 TextArena games
Flash+Harness beats Flash+Pro (bare, without a harness)
Code-as-Policy outperforms GPT-5.2-High

Related

lou-autoharness-2026: original paper
code-as-harness: framework philosophy
harness-as-policy: ultimate form