---
title: "OmniDocBench"
created: 2026-06-24
updated: 2026-06-24
type: concept
tags: ["benchmark", "ocr", "document-parsing", "evaluation"]
sources:
- "[[unlimited-ocr-works-2026]]"
---
# OmniDocBench
OmniDocBench is a comprehensive evaluation benchmark for document parsing, and Unlimited OCR uses it as its primary evaluation platform. This page covers two versions, v1.5 and v1.6.
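## Evaluation dimensions
| Metric | Meaning |
|------|------|
| Text Edit Distance (↓) | Character-level text recognition accuracy |
| Formula CDM (↑) | Mathematical formula recognition quality |
| Table TEDS (↑) | Table structure extraction (including cell content) |
| Table TEDS-S (↑) | Table structure extraction (structure only) |
| Read-order Edit Distance (↓) | Correctness of reading-order prediction |
| Overall (↑) | Weighted composite score |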
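
The two edit-distance metrics in the table are lower-is-better. As a minimal sketch of how a character-level score of that kind can be computed, the snippet below implements plain Levenshtein distance and divides it by the longer string's length. The function names, the normalization by the longer string, and the absence of any text preprocessing are assumptions made for illustration; this is not the benchmark's official scoring code.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    # Keep the shorter string on the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, ref: str) -> float:
    """Scale the distance into [0, 1]; 0 means an exact match.

    Dividing by the longer string's length is an assumed normalization.
    """
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 0.0
    return levenshtein(pred, ref) / longest


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
    print(normalized_edit_distance("OmniDocBench", "OmniDocBench"))  # 0.0
```

A score of 0.0 means the prediction matches the reference exactly, which is why the table marks these metrics with ↓.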
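## Version differences
- **v1.5**: includes official metrics for classic models (the baselines reported by DeepSeek OCR), which makes performance comparison easier
- **v1.6**: adds 296 test images, represents the latest evaluation standard, and includes current SOTA models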
## Unlimited OCR results
- v1.5 Overall: 93.23% (+6.22 pp over DeepSeek OCR)
- v1.6 Overall: 93.54% (SOTA level)
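
The v1.5 figure is stated as a gain in percentage points, so the DeepSeek OCR baseline it implies can be recovered by subtraction. The sketch below only rearranges the two numbers reported above; the variable names are illustrative, and the result is a derived value, not one quoted from the source.

```python
# Numbers reported on this page for OmniDocBench v1.5.
unlimited_ocr_v15 = 93.23      # Overall score, in percent
gain_over_deepseek_pp = 6.22   # reported gain, in percentage points

# Implied DeepSeek OCR Overall on v1.5 (derived, rounded to two decimals
# to avoid floating-point noise).
implied_deepseek_v15 = round(unlimited_ocr_v15 - gain_over_deepseek_pp, 2)

print(f"{implied_deepseek_v15:.2f}")  # 87.01
```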
## References
- [[unlimited-ocr-works-2026]]
- [[end-to-end-ocr]]