myWiki/raw/papers/agent-harness-engineering-survey-2026.md
2026-06-01 10:46:01 +08:00


  • source_url: user-upload
  • ingested: 2026-05-23
  • sha256: unknown

Agent Harness Engineering: A Survey

Metadata

  • Authors: Junjie Li^1,6^, Xi Xiao^6^, Yunbei Zhang^5^, Chen Liu^2^, Lin Zhao^4^, Xiaoying Liao^3^, Yingrui Ji^6^, Janet Wang^6^, Jianyang Gu^7^, Yingqiang Ge^9^, Weijie Xu^9^, Xi Fang^9^, Xiang Xu^9^, Tianchen Zhao^9^, Youngeun Kim^9^, Tianyang Wang^6^, Jihun Hamm^5^, Smita Krishnaswamy^2^, Jun Huan^9^, Chandan K Reddy^8,9^
  • Institutions: 1 CMU, 2 Yale, 3 JHU, 4 NEU, 5 Tulane, 6 UAB, 7 OSU, 8 Virginia Tech, 9 Amazon
  • Venue: Under review at TMLR (Transactions on Machine Learning Research), 2026
  • Project Page: Awesome-Agent-Harness

Abstract

The rapid deployment of large language model (LLM) agents in production has revealed a recurring pattern: task execution reliability depends less on the underlying model than on the infrastructure layer that wraps it — the agent execution harness. This survey provides a practice-grounded, systematic treatment of agent harness engineering, organized around three claims:

  1. Binding-Constraint Thesis: The agent harness is an independent system layer whose engineering quality drives a large share of real-world reliability
  2. ETCLOVG Taxonomy: A seven-layer taxonomy (Execution environment, Tool interface, Context management, Lifecycle/Orchestration, Observability, Verification, Governance)
  3. Ecosystem Mapping: 170+ open-source projects mapped onto this taxonomy
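As a rough illustration of how the seven layers and the ecosystem mapping could be represented in code, the sketch below models the taxonomy as an enum and groups projects by layer. The names `HarnessLayer`, `acronym`, and `group_by_layer` and the placeholder project names are assumptions made for this example; they do not come from the survey, which may organize its mapping differently.

```python
from collections import defaultdict
from enum import Enum


class HarnessLayer(Enum):
    """The seven ETCLOVG layers, in the order the survey lists them."""
    EXECUTION_ENVIRONMENT = "Execution environment"
    TOOL_INTERFACE = "Tool interface"
    CONTEXT_MANAGEMENT = "Context management"
    LIFECYCLE_ORCHESTRATION = "Lifecycle/Orchestration"
    OBSERVABILITY = "Observability"
    VERIFICATION = "Verification"
    GOVERNANCE = "Governance"


def acronym() -> str:
    """Build the taxonomy acronym from the first letter of each layer name."""
    return "".join(layer.value[0] for layer in HarnessLayer)


def group_by_layer(project_layers):
    """Invert a {project: [layers]} mapping into {layer: [projects]}."""
    index = defaultdict(list)
    for name, layers in project_layers.items():
        for layer in layers:
            index[layer].append(name)
    return dict(index)


# Hypothetical placeholder projects, not entries from the survey's mapping.
example = {
    "project-a": [HarnessLayer.TOOL_INTERFACE],
    "project-b": [HarnessLayer.OBSERVABILITY, HarnessLayer.VERIFICATION],
}

by_layer = group_by_layer(example)
```

A project may touch several layers, so the sketch maps each project to a list of layers instead of a single one; whether the survey assigns one primary layer per project is not stated here.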

Key Contributions

  • Three-phase engineering evolution: Prompt → Context → Harness Engineering
  • Cross-layer synthesis: Cost-Quality-Speed Trilemma, Capability-Control Tradeoff, Harness Coupling Problem
  • Open-problem agenda: hardening and scaling execution, maintaining reliable state, diagnosing failures from traces, standardizing handoffs, and adaptively simplifying the harness