20260617: currently 914 pages

2026-06-17 15:02:40 +08:00
parent e96b955fda
commit 91fac5b6fc
423 changed files with 20687 additions and 34 deletions
--- a/concepts/temporal-rollout.md
+++ b/concepts/temporal-rollout.md
@@ -0,0 +1,51 @@
+---
+title: "Temporal Rollout"
+created: 2026-06-13
+updated: 2026-06-13
+type: concept
+tags: [computer-vision, video-generation, inference-technique]
+sources: [raw/papers/cheng-flex4dhuman-2026.md]
+---
+
+# Temporal Rollout
+
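+Temporal rollout is the inference strategy Flex4DHuman uses to generate multi-view videos longer than its training window. It splits the sequence into overlapping chunks so that generation can continue to an arbitrary length.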
+
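+## Mechanism
+
+The full target sequence is split into chunks of T frames. Chunks are denoised one at a time, and adjacent chunks share overlapping frames that serve as history conditioning:
+
+1. **Iteration 0**: the reference-view tokens are the only clean condition; the model generates the first T-frame chunk for all target views.
+2. **Iteration 1+**: the window advances by T-O frames, and the **last O predicted frames** of the previous chunk become the clean history tokens of the current chunk.
+3. **Loop**: repeat until the target total number of frames is covered.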
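
The window schedule described in the steps above can be sketched in a few lines of Python. This is a minimal illustration only, not Flex4DHuman's code: the function name `rollout_windows` and the `[start, end)` window convention are hypothetical, and it assumes that a final chunk extending past the target length is simply cropped.

```python
def rollout_windows(total_frames: int, chunk: int, overlap: int) -> list[tuple[int, int]]:
    """Return [start, end) frame windows for a chunked rollout.

    Hypothetical helper: `chunk` plays the role of T and `overlap` of O.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if not 0 <= overlap < chunk:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk")

    stride = chunk - overlap          # the window advances by T - O frames
    windows = [(0, chunk)]            # iteration 0: no history, reference tokens only
    while windows[-1][1] < total_frames:
        start = windows[-1][0] + stride
        # The first `overlap` frames of this window repeat the last `overlap`
        # frames of the previous one; they would be fed back as clean history.
        windows.append((start, start + chunk))
    return windows


if __name__ == "__main__":
    for start, end in rollout_windows(total_frames=42, chunk=4, overlap=1):
        print(f"frames [{start}, {end})")
```

Under the cropping assumption, the last window may end beyond `total_frames`; only frames below that index would be kept.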
+
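+## Key parameters
+
+- **T**: frames per chunk (the maximum frame count seen during training, e.g. 4 or 16)
+- **O**: number of overlapping frames (typically O=1)
+- **n_history**: number of history tokens (= O × number of views)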
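
As a rough illustration of how `n_history` relates to O and the number of views, the sketch below builds a boolean mask with one entry per (view, frame) slot and counts the history slots. The per-view, per-frame layout, the helper names, and the choice of 8 views are assumptions made for the example; the note itself only states that n_history = O × number of views.

```python
def history_mask(num_views: int, chunk: int, overlap: int) -> list[list[bool]]:
    """Mark the first `overlap` frames of every view as clean history.

    Hypothetical layout: mask[v][t] is True when frame t of view v is a
    history slot. Applies to iteration 1+; iteration 0 has no history.
    """
    return [[t < overlap for t in range(chunk)] for _ in range(num_views)]


def count_history(mask: list[list[bool]]) -> int:
    """Count history slots, i.e. n_history under the assumed layout."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    mask = history_mask(num_views=8, chunk=4, overlap=1)  # 8 views is illustrative
    print(count_history(mask))  # equals overlap * num_views
```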
+
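+## Experimental findings
+
+In tests on DNA-Rendering (42-frame window):
+
+| Configuration | PSNR | Notes |
+|------|------|------|
+| T=4, O=1 (14 iterations) | 24.79 dB | More iterations, smaller chunks |
+| T=16, O=1 (3 iterations) | 24.86 dB | Fewer iterations, larger chunks |
+
+The two settings differ by only 0.07 dB in PSNR, which suggests that:
+- **Teacher-forced history is sufficient to support stable long-horizon rollout**
+- The short-chunk setting (T=4) offers a **more memory-friendly operating point for dense multi-view generation**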
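
The iteration counts in the table above are consistent with a simple count of windows. Writing $F$ for the total number of frames (a symbol introduced here, not in the source) and assuming the final chunk may overshoot the target and be cropped, the number of iterations $N$ is:

$$
N = 1 + \left\lceil \frac{F - T}{T - O} \right\rceil
$$

For $F = 42$, $O = 1$:

$$
T = 4:\quad N = 1 + \left\lceil \tfrac{38}{3} \right\rceil = 1 + 13 = 14
\qquad\qquad
T = 16:\quad N = 1 + \left\lceil \tfrac{26}{15} \right\rceil = 1 + 2 = 3
$$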
+
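+## Key design choices
+
+- **Clean history tokens**: history comes from the previous chunk's fully denoised clean frames rather than from the model's own noisy predictions, which avoids error accumulation
+- **Consistent with training**: [[teacher-forced-history|teacher-forced history]] training accustoms the model to clean history conditioning
+- **View synchronization**: all views advance in lockstep, preserving cross-view temporal consistency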
+
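+## References
+
+- [[flex4dhuman|Flex4DHuman]]: the model that introduces this mechanism
+- [[teacher-forced-history|Teacher-forced history]]: the training-side history conditioning strategy
+- [[clean-conditioning-mask|Clean conditioning mask]]: reused at inference time to mark history tokens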