SidneyZhang/myWiki

Sidney Zhang 91fac5b6fc

20260617: currently 914 pages

2026-06-17 15:02:40 +08:00

title: Teacher-Forced History (教师强制历史)
created: 2026-06-13
updated: 2026-06-13
type: concept
tags: computer-vision, video-generation, training-technique
sources: raw/papers/cheng-flex4dhuman-2026.md

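Teacher-Forced History

A key technique used in the three-stage-curriculum-training of Flex4DHuman: during training, clean (ground-truth) history-frame tokens are supplied to the model as conditioning, so that it learns to predict from a clean history.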

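Why it is needed

In standard diffusion model training, all tokens are denoised simultaneously. During temporal-rollout inference, however, the model must condition on frames it has already generated, not on frames that are being denoised at the same time.

If the model never sees the "clean history + noisy target" pattern during training, a train-inference distribution shift appears at inference time.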
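
The inference pattern described above can be sketched as a chunk-by-chunk loop. This is a minimal illustration, not the Flex4DHuman sampler: `temporal_rollout`, `generate_chunk`, and `toy_generate_chunk` are hypothetical names, and frames are plain integers so the control flow is easy to follow.

```python
def temporal_rollout(initial_history, num_chunks, chunk_len,
                     generate_chunk, n_history=1):
    """Generate num_chunks * chunk_len frames, one chunk at a time.

    generate_chunk(history, chunk_len) is a hypothetical stand-in for the
    model's sampler: it receives the history frames as clean conditioning
    and returns chunk_len newly denoised frames.
    """
    history = list(initial_history)
    frames = []
    for _ in range(num_chunks):
        new_frames = generate_chunk(history, chunk_len)
        frames.extend(new_frames)
        # The model's own outputs now play the role that ground-truth
        # history played during training.
        history = new_frames[-n_history:] if n_history > 0 else []
    return frames


def toy_generate_chunk(history, chunk_len):
    # Toy sampler for illustration only: each new frame is the previous
    # frame plus 1, starting from the last history frame (or 0).
    start = history[-1] if history else 0
    return [start + i + 1 for i in range(chunk_len)]


frames = temporal_rollout([0], num_chunks=3, chunk_len=4,
                          generate_chunk=toy_generate_chunk)
```

Each chunk after the first is conditioned only on frames produced by an earlier call, which is exactly the situation a model trained with all tokens denoised jointly has never encountered.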

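Implementation

During training, the ground-truth latents of the first O frames are written into the clean channel of the clean-conditioning-mask
Target tokens (the remaining frames) use noisy latents and keep the usual flow-matching denoising objective
Different n_history modes (0 history, 1 history) are configured across multiple V×T layouts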
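
The three steps above can be sketched in dependency-free Python. This is an illustration under stated assumptions, not the paper's code: the function names are hypothetical, each frame is a flat list of floats, the mask is one flag per frame, the flow-matching path is assumed to be the linear interpolation x_t = (1 - t) * x + t * noise with velocity target noise - x, and n_history is assumed to be drawn uniformly from the listed modes.

```python
import random


def sample_n_history(rng, modes=(0, 1)):
    # Assumption: history modes are sampled uniformly per training example.
    return rng.choice(modes)


def build_teacher_forced_example(gt_latents, n_history, t, rng):
    """Assemble one example with clean history frames and noisy target frames.

    gt_latents: list of frames, each a list of floats (ground-truth latents).
    n_history:  number of leading frames kept clean (the "O" in the text).
    t:          flow-matching time in [0, 1]; t=0 is data, t=1 is pure noise
                (an assumed convention).
    """
    inputs, clean_mask, velocity_targets = [], [], []
    for idx, frame in enumerate(gt_latents):
        if idx < n_history:
            # History frame: pass the ground-truth latent through unchanged
            # and flag it as clean; it contributes no denoising loss.
            inputs.append(list(frame))
            clean_mask.append(1)
            velocity_targets.append(None)
        else:
            # Target frame: interpolate toward Gaussian noise and regress
            # the velocity (noise - data), as in linear flow matching.
            noise = [rng.gauss(0.0, 1.0) for _ in frame]
            inputs.append([(1.0 - t) * x + t * e for x, e in zip(frame, noise)])
            clean_mask.append(0)
            velocity_targets.append([e - x for x, e in zip(frame, noise)])
    return inputs, clean_mask, velocity_targets


rng = random.Random(0)
gt = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
inputs, clean_mask, targets = build_teacher_forced_example(gt, 1, 0.5, rng)
```

With n_history set to 0 every frame is a noisy target, which reduces to ordinary joint denoising; mixing both modes lets one model handle the first chunk and the history-conditioned chunks of a rollout.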

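Results

Experiments show that with teacher-forced history training:

The T=4 and T=16 rollout configurations reach nearly identical PSNR (24.79 vs 24.86 dB)
Stable long-horizon rollout is achieved without additional drift-mitigation strategies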
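
For reference, PSNR is defined as 10 * log10(MAX^2 / MSE). The sketch below computes it over flat lists of pixel values; the peak value of 1.0 and the absence of any per-frame or per-view averaging are assumptions, since the evaluation protocol is not described here.

```python
import math


def psnr(reference, prediction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    if not reference or len(reference) != len(prediction):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale corresponds to roughly 20 dB.
example_db = psnr([0.0, 0.0], [0.1, 0.1])
```

On this scale the reported gap between the two configurations is about 0.07 dB.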

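Limitations and future directions

The paper notes that longer-horizon generation may require additional drift-mitigation strategies, such as:

Self-forcing [Huang et al., 2026]
Diffusion forcing [Chen et al., 2024]
Other long-term consistency objectives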

References

flex4dhuman: applies this method
temporal-rollout: the inference side, which uses clean history
clean-conditioning-mask: the carrier for clean history tokens