20260625: lots of new content
38
concepts/omnidocbench.md
Normal file
@@ -0,0 +1,38 @@
---
title: "OmniDocBench"
created: 2026-06-24
updated: 2026-06-24
type: concept
tags: ["benchmark", "ocr", "document-parsing", "evaluation"]
sources:
- "[[unlimited-ocr-works-2026]]"
---

# OmniDocBench

OmniDocBench is a comprehensive evaluation benchmark for document parsing, and Unlimited OCR uses it as its primary evaluation platform. Two versions are covered here: v1.5 and v1.6.

## Evaluation dimensions

| Metric | Meaning |
|------|------|
| Text Edit Distance (↓) | Character-level text recognition error |
| Formula CDM (↑) | Math formula recognition quality |
| Table TEDS (↑) | Table extraction, structure and cell content |
| Table TEDS-S (↑) | Table extraction, structure only |
| Read-order Edit Distance (↓) | Correctness of the predicted reading order |
| Overall (↑) | Weighted composite score |

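
The two edit-distance metrics in the table can be illustrated with a minimal sketch. It assumes the usual normalization (Levenshtein distance divided by the length of the longer sequence); the benchmark's own scripts may preprocess and normalize differently, and the function names here are hypothetical, not taken from the OmniDocBench code.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer sequence, keep each row short
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: Sequence, ref: Sequence) -> float:
    """Edit distance scaled to [0, 1]; 0 means identical (assumed normalization)."""
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 0.0
    return levenshtein(pred, ref) / longest
```

Because the functions accept any sequence, the same code covers both cases: pass two strings for a text-level score, or two lists of block indices for a reading-order score.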
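## Version differences

- **v1.5**: includes the official metrics of classic models (the baselines reported by DeepSeek OCR), which makes performance comparison straightforward
- **v1.6**: adds 296 new test images, represents the latest evaluation standard, and includes current SOTA models
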
## Unlimited OCR results

- v1.5 Overall: 93.23% (+6.22 pp over DeepSeek OCR)
- v1.6 Overall: 93.54% (SOTA level)

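
The v1.5 gain is stated in percentage points (pp), not as a relative percentage. The short sketch below works through that arithmetic using only the two figures quoted above; the helper names are hypothetical, and the derived baseline is an inference from those figures rather than a number reported in the source.

```python
def implied_baseline(score: float, gain_pp: float) -> float:
    """Baseline score implied by a score and an absolute gain in percentage points."""
    return round(score - gain_pp, 2)


def relative_gain_percent(score: float, gain_pp: float) -> float:
    """The same gain expressed relative to the implied baseline, in percent."""
    baseline = score - gain_pp
    return round(100 * gain_pp / baseline, 2)


baseline = implied_baseline(93.23, 6.22)       # 87.01
relative = relative_gain_percent(93.23, 6.22)  # roughly 7.1
```

The v1.5 and v1.6 scores are measured on different test sets, so their difference is not a like-for-like gain.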
## References
- [[unlimited-ocr-works-2026]]
- [[end-to-end-ocr]]