---
title: "Bare Adapter"
created: 2026-06-15
updated: 2026-06-15
type: concept
tags: [benchmark, evaluation, coding-agent]
sources: [raw/papers/zheng-claw-swe-bench-2026.md]
---
# Bare Adapter
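## Definition
Bare Adapter is the **diagnostic baseline** defined in Claw-SWE-Bench: it provides minimal Docker integration and requires the model to output a unified diff directly in its final response. It is not a component-by-component ablation of the Full Adapter. Instead, it tests whether minimal integration is enough to create a reliable SWE-bench evaluation target.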
## Comparison with Full Adapter
| Capability | Bare Adapter | Full Adapter |
|------|-------------|-------------|
| Docker entry | ✅ | ✅ |
| Sends the issue description | ✅ | ✅ |
| Disables unfair network retrieval | ✅ | ✅ |
| Workspace alignment | ❌ | ✅ |
| Future-commit cleanup | ❌ | ✅ |
| Shared stage prompt | ❌ | ✅ |
| Git-based patch extraction | ❌ | ✅ |
| Patch cleanup | ❌ | ✅ |
| Output method | Model outputs a unified diff | Runner exports the diff from repository state |
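
The last row of the table is the key difference, and a rough sketch may make it concrete. The code below is only an illustration under stated assumptions, not the Claw-SWE-Bench runner: `extract_diff_from_response` and `export_diff_from_repo` are hypothetical helper names, and the fenced-block heuristic is a guess at how a bare harness might locate a diff in free-form model output.

```python
import re
import subprocess
from typing import Optional

# Build the Markdown fence marker programmatically to keep this example readable.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"(?:diff|patch)?\n(.*?)" + FENCE, re.DOTALL)


def extract_diff_from_response(response: str) -> Optional[str]:
    """Bare-style path (hypothetical): pull a unified diff out of the model's text."""
    # Prefer a fenced block that contains at least one hunk header.
    for body in FENCE_RE.findall(response):
        if body.startswith("@@ ") or "\n@@ " in body:
            return body
    # Fall back to raw diff text starting at the first file header line.
    match = re.search(r"^(?:diff --git |--- )", response, re.MULTILINE)
    return response[match.start():] if match else None


def export_diff_from_repo(repo_dir: str) -> str:
    """Full-style path (hypothetical): ask git what actually changed on disk."""
    # Stage everything so that newly created files also appear in the diff.
    subprocess.run(["git", "add", "-A"], cwd=repo_dir, check=True)
    result = subprocess.run(
        ["git", "diff", "--cached", "HEAD"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

In the first path the patch is only as good as the text the model typed. In the second, git computes hunk headers and context lines from the real files, so the model never has to write diff syntax by hand.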
## Experimental Results
- **Pass@1:** 19.1% (GLM 5.1)
- **Apply Failed:** 69.1%
- **Bottleneck:** the model is able to edit code; what breaks is the fragility of generating unified-diff text directly
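
To illustrate why hand-written diff text is fragile, the sketch below checks one well-known failure mode: hunk headers whose line counts disagree with the hunk body. It is an assumption-laden example, not part of the benchmark. `find_hunk_mismatches` is a hypothetical helper, it assumes git-style diffs where each file section starts with `diff --git`, and the source does not break down which errors account for the apply failures.

```python
import re

# Matches "@@ -start[,count] +start[,count] @@" hunk headers.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def find_hunk_mismatches(diff_text: str) -> list:
    """Return one message per hunk whose header counts disagree with its body."""
    problems = []
    lines = diff_text.splitlines()
    i = 0
    while i < len(lines):
        m = HUNK_RE.match(lines[i])
        if m is None:
            i += 1
            continue
        header = lines[i]
        # An omitted count means 1 in the unified diff format.
        want_old = int(m.group(2)) if m.group(2) is not None else 1
        want_new = int(m.group(4)) if m.group(4) is not None else 1
        got_old = got_new = 0
        i += 1
        # Consume the hunk body up to the next hunk or the next file section.
        while i < len(lines) and not lines[i].startswith(("@@ ", "diff --git ")):
            line = lines[i]
            if line.startswith("+"):
                got_new += 1
            elif line.startswith("-"):
                got_old += 1
            elif line.startswith(" "):
                # Context lines count toward both the old and the new side.
                got_old += 1
                got_new += 1
            # Other lines (e.g. "\ No newline at end of file") are ignored.
            i += 1
        if (got_old, got_new) != (want_old, want_new):
            problems.append(
                f"{header}: header says -{want_old}/+{want_new}, "
                f"body has -{got_old}/+{got_new}"
            )
    return problems
```

A single miscounted header is enough for a strict patch tool to reject an otherwise sensible edit, which is the kind of error that exporting the diff from repository state avoids.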
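## Diagnostic Value
Bare Adapter shows that adapter design is **not engineering packaging but a prerequisite for reliable scoring**: minimal integration cannot make a general-purpose agent a reliable SWE-bench evaluation target.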
## References
- [[claw-swe-bench|Claw-SWE-Bench paper]]
- [[adapter-protocol|Adapter Protocol]]
- [[patch-based-evaluation|Patch-Based Evaluation]]