20260625: lots of new content
concepts/autoregressive-video-generation.md
@@ -0,0 +1,40 @@
---
title: "Autoregressive Video Generation"
created: 2026-06-20
updated: 2026-06-20
type: concept
tags: ["generation", "video", "autoregressive", "causal"]
sources: ["https://arxiv.org/abs/2606.17800"]
---

# Autoregressive Video Generation

**Autoregressive video generation** models video synthesis as a causal, frame-by-frame or chunk-by-chunk process: each frame is conditioned only on previously generated frames and never accesses future information.
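
## Core Differences from Bidirectional Diffusion Models

Conventional [[diffusion-transformer|DiT]] video models use **bidirectional temporal attention**, so all frames depend on one another throughout generation. This causes two problems:
1. **Not real-time**: no intermediate frame can be output until denoising of the whole clip has finished
2. **Compute grows with length**: self-attention cost grows quadratically with sequence length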
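
To make the second point concrete, here is a minimal sketch that counts query-key pairs under full bidirectional attention. The function name, the frame counts, and the tokens-per-frame value are illustrative assumptions, not figures from any particular model.

```python
def bidirectional_attention_pairs(num_frames: int, tokens_per_frame: int) -> int:
    """Count query-key pairs when every token attends to every token."""
    seq_len = num_frames * tokens_per_frame
    return seq_len * seq_len


# Hypothetical clip sizes: doubling the number of frames quadruples the work.
short = bidirectional_attention_pairs(num_frames=16, tokens_per_frame=256)
long = bidirectional_attention_pairs(num_frames=32, tokens_per_frame=256)
print(long // short)  # 4
```

The count ignores attention heads and layers, which scale the total by a constant factor and do not change the quadratic trend.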
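
Autoregressive video generation addresses these problems with **causal attention**:
- Generates chunk by chunk, with each chunk depending only on the history
- Reuses past state through a [[kv-cache|KV-Cache]]
- Supports streaming output and real-time interaction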
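
The causal constraint can be pictured as an attention mask. The sketch below builds a block-causal mask in plain Python. It assumes that tokens inside one chunk may attend to each other freely while later chunks stay hidden; this is one common choice, not something the source note specifies. `block_causal_mask` is a hypothetical helper name.

```python
def block_causal_mask(num_chunks: int, tokens_per_chunk: int) -> list[list[bool]]:
    """Return mask[q][k], True when query token q may attend to key token k.

    Assumption: attention is unrestricted inside a chunk and causal across chunks.
    """
    seq_len = num_chunks * tokens_per_chunk
    mask = []
    for q in range(seq_len):
        q_chunk = q // tokens_per_chunk
        # A key is visible if its chunk is the query's chunk or an earlier one.
        row = [(k // tokens_per_chunk) <= q_chunk for k in range(seq_len)]
        mask.append(row)
    return mask


mask = block_causal_mask(num_chunks=3, tokens_per_chunk=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# xx....
# xx....
# xxxx..
# xxxx..
# xxxxxx
# xxxxxx
```

Because no row ever marks a future chunk as visible, a chunk can be emitted as soon as it is finished, which is what makes streaming output possible.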
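
## Key Techniques

- **Causal Streaming Generation**: generation follows causal temporal order, producing frames or chunks one after another
- **KV-Cache management**: a persistent cache whose size is capped to bound compute
- **Drift control**: long-horizon autoregressive generation tends to accumulate errors, so drift mitigation is needed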
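
A minimal sketch of the first two items, assuming a sliding-window policy that evicts the oldest chunk; the source does not say which eviction policy is used. `RollingKVCache`, `stream_chunks`, and `toy_generate_chunk` are hypothetical names, and the toy generator stands in for a real model forward pass. Drift mitigation is not modeled here.

```python
from collections import deque


class RollingKVCache:
    """Holds key/value entries for at most `max_chunks` recent chunks."""

    def __init__(self, max_chunks: int):
        # A deque with maxlen drops the oldest chunk automatically when full.
        self._chunks = deque(maxlen=max_chunks)

    def append(self, keys, values):
        self._chunks.append((keys, values))

    def context(self):
        """Flatten the cached chunks into one key list and one value list."""
        keys = [k for ks, _ in self._chunks for k in ks]
        values = [v for _, vs in self._chunks for v in vs]
        return keys, values

    def __len__(self):
        return len(self._chunks)


def stream_chunks(num_chunks, cache, generate_chunk):
    """Yield chunks in causal order, each conditioned only on cached history."""
    for index in range(num_chunks):
        past_keys, past_values = cache.context()
        chunk, new_keys, new_values = generate_chunk(index, past_keys, past_values)
        cache.append(new_keys, new_values)
        yield chunk  # available to the consumer before later chunks exist


def toy_generate_chunk(index, past_keys, past_values):
    # Placeholder for a model call: reports how much history it saw.
    chunk = f"chunk-{index} (saw {len(past_keys)} cached keys)"
    return chunk, [index, index], [index, index]


cache = RollingKVCache(max_chunks=2)
outputs = list(stream_chunks(4, cache, toy_generate_chunk))
print(outputs[-1])  # chunk-3 (saw 4 cached keys)
```

With `max_chunks=2`, the per-chunk context stops growing after the second chunk, so the attention cost per step stays bounded no matter how long the stream runs.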

## Representative Models

- **MaineCoon**: a real-time audio-video autoregressive model ([[maineCoon]]), 22B, 47.5 FPS
- Other streaming video generation models: VideoGPT, TATS, and others

## Related Concepts
- [[streaming-generation|Streaming generation]]
- [[audio-visual-generation|Joint audio-visual generation]]
- [[kv-cache]]
- [[causal-generation|Causal generation]]