20260617: currently 914 pages
24
concepts/synthetic-data.md
Normal file
@@ -0,0 +1,24 @@
---
title: "Synthetic Data"
created: 2026-06-03
updated: 2026-06-03
type: concept
tags: [synthetic-data, training, data-generation]
status: placeholder
---

# Synthetic Data

> ⚠️ Placeholder page: to be completed

Synthetic data is artificial data generated by algorithms or models, used to augment or replace real training data. In LLM training it is widely used for:
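
- **Question generation**: e.g., the multi-turn reasoning data synthesis in [[mathchatsync-reasoning|MathChatSync]]
- **Instruction data**: strong models such as GPT-4 generate instruction-response pairs
- **Data augmentation**: filling in domains where real data is scarce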

## Related Concepts

- [[mathchatsync-reasoning|MathChatSync Reasoning]]
- [[synthetic-data-qa-generation|Synthetic Data QA Generation]]
- [[data-quality-over-scale|Data Quality over Scale]]