---
title: "Multi-Token Prediction (MTP)"
domain: "Deep Learning / Training"
tags: [training, prediction, transformer, efficiency]
sources: [[deepseek-v4-million-token-context]]
---
# Multi-Token Prediction (MTP)
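> **Type**: Concept (Tier 3, Placeholder)
> **Sources**: [[deepseek-v4-million-token-context]], DeepSeek-V3 (2024)
## Overview
MTP is a training strategy in which the model predicts several upcoming tokens at each step instead of only the next one, to improve training efficiency and downstream task performance. DeepSeek-V4 inherits its MTP configuration from DeepSeek-V3 without modification.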
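To make "predicting several upcoming tokens at each step" concrete, the following is a minimal sketch of how the training targets could be laid out. It assumes a plain Python list of token IDs; the function name `mtp_targets` and the parameter `num_future` are hypothetical and are not taken from the DeepSeek codebase. It only illustrates the target layout, not the DeepSeek-V3 module architecture.

```python
def mtp_targets(tokens, num_future):
    """Return (position, future_tokens) pairs for multi-token prediction.

    For each position i, the targets are tokens[i+1 : i+1+num_future].
    Positions that lack a full window of future tokens are skipped.
    With num_future=1 this reduces to ordinary next-token prediction.
    """
    if num_future < 1:
        raise ValueError("num_future must be at least 1")
    pairs = []
    for i in range(len(tokens) - num_future):
        pairs.append((i, tokens[i + 1 : i + 1 + num_future]))
    return pairs


if __name__ == "__main__":
    sequence = [10, 11, 12, 13, 14]  # toy token IDs (illustrative only)
    for position, targets in mtp_targets(sequence, num_future=2):
        print(position, targets)
```

In this toy layout each position is paired with two target tokens rather than one, which is the sense in which each step carries more than one prediction.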
## Core Content
*This page is a placeholder created to fix broken links in the wiki. Detailed content will be added later.*
## Relationship to DeepSeek-V4
- The MTP module in DeepSeek-V4 is identical to the one in V3.
- Additional MTP prediction heads increase the density of the training signal.
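One way to read "training signal density" is as the number of supervised targets obtained from a single sequence. The sketch below counts them under a stated assumption: extra head `k` (1-based) predicts the token `k` positions beyond the next token, so it has fewer valid positions near the end of the sequence. The function name `supervision_counts` is hypothetical, and the count says nothing about how the individual losses are weighted.

```python
def supervision_counts(seq_len, extra_heads):
    """Count supervised targets per sequence.

    Returns (next_token_only, with_extra_heads).
    Assumption: extra head k (1-based) predicts the token k positions
    beyond the next token, so it has seq_len - 1 - k valid positions.
    """
    base = max(0, seq_len - 1)  # ordinary next-token targets
    extra = sum(max(0, seq_len - 1 - k) for k in range(1, extra_heads + 1))
    return base, base + extra


if __name__ == "__main__":
    base, total = supervision_counts(seq_len=8, extra_heads=1)
    print(f"next-token only: {base}, with one extra head: {total}")
```

For an 8-token sequence, one extra head raises the count from 7 to 13 targets under this assumption, with no additional training data.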
## Related Concepts
- [[test-time-scaling]]: test-time scaling
---
*Last Updated: 2026-04-27*
*Status: Placeholder — to be completed*