---
title: "OmniDocBench"
created: 2026-06-24
updated: 2026-06-24
type: concept
tags: ["benchmark", "ocr", "document-parsing", "evaluation"]
sources:
- "[[unlimited-ocr-works-2026]]"
---

# OmniDocBench

OmniDocBench is a comprehensive evaluation benchmark for document parsing, and Unlimited OCR uses it as its primary evaluation platform. Two versions are relevant here: v1.5 and v1.6.

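## Evaluation dimensions

Arrows mark the direction of improvement: ↓ means lower is better, ↑ means higher is better.

| Metric | Meaning |
|------|------|
| Text Edit Distance (↓) | Character-level text recognition accuracy |
| Formula CDM (↑) | Quality of mathematical formula recognition |
| Table TEDS (↑) | Table structure extraction (including cell content) |
| Table TEDS-S (↑) | Table structure extraction (structure only) |
| Read-order Edit Distance (↓) | Correctness of reading-order prediction |
| Overall (↑) | Weighted composite score |
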
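
The two edit-distance metrics in the table can be illustrated with a minimal sketch. It assumes a plain Levenshtein distance normalized by the length of the longer sequence; the benchmark's actual matching and normalization steps are not described in this note, so the function names and the normalization choice here are hypothetical.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(pred: Sequence, ref: Sequence) -> float:
    """Assumed normalization: divide by the longer length, giving a value in [0, 1]."""
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 0.0
    return levenshtein(pred, ref) / longest


# Text: compare predicted characters against the reference text.
text_score = normalized_edit_distance("abcd", "abcf")

# Reading order: compare predicted block indices against the reference order.
order_score = normalized_edit_distance([0, 2, 1, 3], [0, 1, 2, 3])
```

Because the same function accepts any sequence, it covers both characters (text recognition) and block indices (reading order). A score of 0 means an exact match, which is why both metrics carry a ↓.
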
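## Version differences

- **v1.5**: includes official metrics for classic models (the baselines reported by DeepSeek OCR), which makes performance comparison straightforward
- **v1.6**: adds 296 new test images, represents the latest evaluation standard, and includes current SOTA models
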
## Unlimited OCR results

- v1.5 Overall: 93.23% (+6.22 pp over DeepSeek OCR)
- v1.6 Overall: 93.54% (SOTA level)

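
The v1.5 figures above imply a baseline score for DeepSeek OCR, and they also show why percentage points (pp) differ from a relative percentage gain. The sketch below derives both from the two numbers in this note only; the variable names are illustrative, and the implied baseline is a derived value rather than a separately reported one.

```python
from decimal import Decimal

# Figures quoted in this note for OmniDocBench v1.5.
unlimited_ocr_overall = Decimal("93.23")
gain_pp = Decimal("6.22")  # absolute gain in percentage points

# Baseline implied by the stated gain (derived, not independently reported).
implied_baseline = unlimited_ocr_overall - gain_pp

# Relative improvement over that implied baseline, in percent.
relative_gain_percent = round(gain_pp / implied_baseline * 100, 2)

print(f"implied DeepSeek OCR Overall: {implied_baseline}")
print(f"relative gain: {relative_gain_percent}%")
```

`Decimal` is used instead of `float` so that the subtraction gives exactly 87.01 rather than a binary floating-point approximation.
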
## References
- [[unlimited-ocr-works-2026]]
- [[end-to-end-ocr]]