20260601

2026-06-01 10:46:01 +08:00
parent 2faf4bb002
commit e96b955fda
221 changed files with 10219 additions and 332 deletions
--- a/raw/papers/agent-harness-engineering-survey-2026.md
+++ b/raw/papers/agent-harness-engineering-survey-2026.md
@@ -0,0 +1,27 @@
+---
+source_url: user-upload
+ingested: 2026-05-23
+sha256: unknown
+---
+
+# Agent Harness Engineering: A Survey
+
+## Metadata
+- **Authors**: Junjie Li^1,6^*, Xi Xiao^6^*, Yunbei Zhang^5^*, Chen Liu^2^*, Lin Zhao^4, Xiaoying Liao^3, Yingrui Ji^6, Janet Wang^6, Jianyang Gu^7, Yingqiang Ge^9, Weijie Xu^9, Xi Fang^9, Xiang Xu^9, Tianchen Zhao^9, Youngeun Kim^9, Tianyang Wang^6, Jihun Hamm^5, Smita Krishnaswamy^2, Jun Huan^9, Chandan K Reddy^8,9
+- **Institutions**: 1 CMU, 2 Yale, 3 JHU, 4 NEU, 5 Tulane, 6 UAB, 7 OSU, 8 Virginia Tech, 9 Amazon
+- **Venue**: Under review at TMLR (Transactions on Machine Learning Research), 2026
+- **Project Page**: Awesome-Agent-Harness
+
+## Abstract
+
+The rapid deployment of large language model (LLM) agents in production has revealed a recurring pattern: task execution reliability depends less on the underlying model than on the infrastructure layer that wraps it — the **agent execution harness**. This survey provides a practice-grounded, systematic treatment of agent harness engineering, organized around three claims:
+
+1. **Binding-Constraint Thesis**: The agent harness is an independent system layer whose engineering quality drives a large share of real-world reliability
+2. **ETCLOVG Taxonomy**: A seven-layer taxonomy (Execution environment, Tool interface, Context management, Lifecycle/Orchestration, Observability, Verification, Governance)
+3. **Ecosystem Mapping**: 170+ open-source projects mapped onto this taxonomy
+
+## Key Contributions
+
+- Three-phase engineering evolution: Prompt → Context → Harness Engineering
+- Cross-layer synthesis: Cost-Quality-Speed Trilemma, Capability-Control Tradeoff, Harness Coupling Problem
+- Open-problem agenda: hardening and scaling execution, maintaining reliable state, diagnosing from traces, standardizing handoffs, and adaptive simplification