---
title: "MCP-tools Dataset"
created: 2026-06-19
updated: 2026-06-19
type: concept
tags: [dataset, mcp, tool-discovery, benchmark, evaluation]
sources:
- https://arxiv.org/abs/2506.01056
---
# MCP-tools Dataset
## Definition
MCP-tools is the **first retrieval-oriented tool discovery dataset**, built for the MCP-Zero paper. It collects 308 MCP servers and 2,797 tools from the official Model Context Protocol repository.
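## Differences from other MCP datasets
| Aspect | MCPBench | MCP-tools |
|---|---|---|
| Focus | Server availability and latency testing | Semantic tool discovery and retrieval |
| Goal | Infrastructure evaluation | Evaluating an agent's tool discovery ability |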
## Data structure
```json
{
  "server_name": "string",
  "server_description": "string",
  "server_summary": "string", // enhanced summary added by MCP-Zero
  "tools": [
    {
      "name": "string",
      "description": "string",
      "parameter": {
        "param1": "(type) description",
        "param2": "(Optional, type) description"
      }
    }
  ]
}
```
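The parameter strings in the schema above pack the type, an optional marker, and a description into one value. A minimal Python sketch of how an entry could be typed and how those strings could be split follows. The `Tool` and `Server` type names, the `parse_parameter` helper, the regular expression, and the `demo-files` entry are all assumptions made for illustration; none of them come from the dataset or the paper.

```python
import re
from typing import TypedDict


class Tool(TypedDict):
    name: str
    description: str
    parameter: dict[str, str]


class Server(TypedDict):
    server_name: str
    server_description: str
    server_summary: str
    tools: list[Tool]


# Assumed convention, read off the schema sketch:
# "(type) description" or "(Optional, type) description".
_PARAM_RE = re.compile(
    r"^\((?P<optional>Optional,\s*)?(?P<type>[^)]+)\)\s*(?P<desc>.*)$"
)


def parse_parameter(spec: str) -> dict:
    """Split one parameter string into type, optional flag, and description."""
    match = _PARAM_RE.match(spec.strip())
    if match is None:
        # No "(type)" prefix: keep the whole string as the description.
        return {"type": None, "optional": False, "description": spec.strip()}
    return {
        "type": match.group("type").strip(),
        "optional": match.group("optional") is not None,
        "description": match.group("desc"),
    }


# Hypothetical entry, invented for illustration; not taken from the dataset.
example: Server = {
    "server_name": "demo-files",
    "server_description": "Read local files.",
    "server_summary": "Read a text file and return its first lines.",
    "tools": [
        {
            "name": "read_file",
            "description": "Read a text file.",
            "parameter": {
                "path": "(string) Path to the file",
                "max_lines": "(Optional, integer) Maximum number of lines",
            },
        }
    ],
}

for tool in example["tools"]:
    for param_name, spec in tool["parameter"].items():
        print(param_name, parse_parameter(spec))
```

Strings that do not follow the assumed `(type)` prefix fall back to a plain description rather than raising, since real entries may not all use the same convention.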
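## Enhanced summaries
MCP-Zero builds an enhanced summary for every server, including comprehensive usage examples, to improve the precision of server-level semantic matching. This is a significant improvement over matching on the original description alone, which is usually a single sentence.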
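To see why a longer summary can help server-level matching, here is a toy Python sketch that ranks servers against a query using either field. It uses bag-of-words cosine similarity as a stand-in for a real semantic embedding model, so it illustrates the idea only and is not the MCP-Zero method. The two server entries, the query, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_servers(query: str, servers: list[dict], field: str) -> list[tuple[float, str]]:
    """Score every server on the chosen text field, best match first."""
    query_vec = bag_of_words(query)
    scored = [
        (cosine(query_vec, bag_of_words(server[field])), server["server_name"])
        for server in servers
    ]
    return sorted(scored, reverse=True)


# Hypothetical servers, invented for illustration; not taken from the dataset.
servers = [
    {
        "server_name": "demo-git",
        "server_description": "Git repository server.",
        "server_summary": (
            "Work with a git repository: list commits, create a branch, "
            "open a pull request, read a file."
        ),
    },
    {
        "server_name": "demo-weather",
        "server_description": "Weather server.",
        "server_summary": (
            "Look up the weather forecast and current temperature for a city."
        ),
    },
]

query = "create a branch and open a pull request"
by_description = rank_servers(query, servers, "server_description")
by_summary = rank_servers(query, servers, "server_summary")
print("description only:", by_description)
print("enhanced summary:", by_summary)
```

In this toy setup the one-sentence descriptions share no words with the query, so they cannot separate the two servers, while the summaries with usage examples do.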
## Key metrics
- 308 servers and 2,797 tools
- The full set of tool schemas is about **248.1K tokens**
- A single GitHub MCP server takes **4,600+ tokens** for its 26 tools
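A few averages follow directly from the figures above. The short Python sketch below derives them; the variable names are illustrative, the total is taken as 248,100 tokens from the rounded "248.1K", and the GitHub figure uses the stated lower bound of 4,600, so the results are approximate.

```python
SERVERS = 308
TOOLS = 2_797
TOTAL_SCHEMA_TOKENS = 248_100  # rounded figure: "about 248.1K tokens"
GITHUB_TOKENS = 4_600          # lower bound: "4,600+ tokens"
GITHUB_TOOLS = 26

tools_per_server = TOOLS / SERVERS
tokens_per_tool = TOTAL_SCHEMA_TOKENS / TOOLS
github_tokens_per_tool = GITHUB_TOKENS / GITHUB_TOOLS
github_share = GITHUB_TOKENS / TOTAL_SCHEMA_TOKENS

print(f"average tools per server:      {tools_per_server:.2f}")
print(f"average schema tokens per tool: {tokens_per_tool:.1f}")
print(f"GitHub server tokens per tool:  {github_tokens_per_tool:.1f}")
print(f"GitHub server share of total:   {github_share:.2%}")
```

On these numbers a server averages about nine tools, and the GitHub server's per-tool schema cost is roughly twice the dataset-wide average.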
## References
- [[mcp-protocol|MCP protocol]]
- [[fei-mcp-zero-2025|MCP-Zero paper]]
- [[active-tool-discovery|Active tool discovery]]