20260429: Some new things
35
concepts/million-token-context.md
Normal file
@@ -0,0 +1,35 @@
---
title: "Million-Token Context"
domain: "Machine Learning / Long-Context Models"
tags: [long-context, efficiency, inference, kv-cache]
sources: [[deepseek-v4-million-token-context]]
---

# Million-Token Context

> **Type**: Concept (Tier 3: Placeholder)
> **Source**: [[deepseek-v4-million-token-context]]

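## Overview

Million-token context refers to a language model's ability to efficiently process sequences of up to 1,000,000 tokens. It is the core breakthrough of the DeepSeek-V4 series: through technical innovations such as [[hybrid-attention-architecture]], inference FLOPs at a million-token context are only 27% (Pro) or 10% (Flash) of those of DeepSeek-V3.2.
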
## Key Techniques

- [[compressed-sparse-attention]] + [[heavily-compressed-attention]] hybrid attention
- [[fp4-quantization-training]] FP4 quantization
- Heterogeneous KV cache and on-disk storage strategy

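To see why KV-cache compression and storage strategy matter at this sequence length, here is a rough back-of-the-envelope sketch of the size of a dense, uncompressed KV cache. The function name `kv_cache_bytes` and every model dimension below are hypothetical, chosen only for illustration; they are not DeepSeek-V4's actual configuration, and the formula is the standard dense-attention estimate rather than anything taken from the source.

```python
def kv_cache_bytes(seq_len, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Size in bytes of a dense (uncompressed) KV cache for one sequence."""
    # Factor 2: one key vector and one value vector per token, per KV head, per layer.
    return 2 * seq_len * num_layers * num_kv_heads * head_dim * bytes_per_value


# Hypothetical dimensions (assumption, NOT the real DeepSeek-V4 config),
# with 2-byte (16-bit) cache entries.
dense = kv_cache_bytes(seq_len=1_000_000, num_layers=60, num_kv_heads=8, head_dim=128)
print(f"dense KV cache at 1M tokens: {dense / 2**30:.1f} GiB")
```

Under these assumed dimensions a single million-token sequence would need a cache of a few hundred GiB, which suggests why a plain dense cache is impractical at this scale.
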
## Core Content

*This page is a placeholder created to fix broken links in the wiki. Detailed content will be added later.*

## Related Concepts

- [[hybrid-attention-architecture]]: hybrid attention architecture
- [[test-time-scaling]]: test-time scaling

---

*Last Updated: 2026-04-27*
*Status: Placeholder, to be completed*