SidneyZhang/myWiki

Sidney Zhang 6021dea160

20260625: lots of new content

2026-06-25 14:08:47 +08:00

title	JEPA for Robotics
created	2026-06-24
updated	2026-06-24
type	concept
tags	jepa, robot-learning, world-model, embodied-ai
sources

JEPA for Robotics

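JEPA for Robotics is the paradigm of applying the Joint-Embedding Predictive Architecture (JEPA) to robot learning and vision-language-action (VLA) policies. VLA-JEPA is a representative work in this direction.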

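Why JEPA suits robotics

The problem with pixel prediction

In robot vision, pixel changes are driven mainly by appearance factors (lighting, texture, camera position) rather than by action-relevant state transitions. A pixel-prediction objective is therefore easily dominated by these factors.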
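
To make the point above concrete, here is a toy sketch in plain Python. The 16x16 grayscale frames, the `make_frame` and `pixel_mse` helpers, and every numeric value are illustrative assumptions, not anything taken from VLA-JEPA. The sketch compares the pixel error caused by a one-pixel object move (an action-relevant change) with the error caused by a uniform lighting shift (an appearance-only change).

```python
# Toy illustration only: hypothetical 16x16 grayscale frames with values near [0, 1].

def make_frame(obj_col, brightness=0.0, size=16):
    """Uniform dark background with a single bright 'object' pixel on the middle row."""
    frame = [[0.2 + brightness for _ in range(size)] for _ in range(size)]
    frame[size // 2][obj_col] = 0.9 + brightness
    return frame


def pixel_mse(frame_a, frame_b):
    """Mean squared error over all pixels of two equally sized frames."""
    n_pixels = len(frame_a) * len(frame_a[0])
    total = sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(frame_a, frame_b)
        for pa, pb in zip(row_a, row_b)
    )
    return total / n_pixels


base = make_frame(obj_col=2)
moved = make_frame(obj_col=3)                   # object shifts by one pixel
relit = make_frame(obj_col=2, brightness=0.2)   # same scene, brighter lighting

mse_motion = pixel_mse(base, moved)
mse_lighting = pixel_mse(base, relit)
print(f"motion MSE:   {mse_motion:.4f}")
print(f"lighting MSE: {mse_lighting:.4f}")
```

In this toy setup the lighting shift produces a pixel error of about 0.04, roughly ten times the error of about 0.004 from the object motion, so a pixel-level objective would be driven mostly by the lighting change.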

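Advantages of JEPA

Latent space prediction: naturally filters out pixel noise and focuses on semantic dynamics
Leakage-free design: future frames serve only as the prediction target, which eliminates shortcut learning
Camera robustness: latent representations are insensitive to camera motion and background changes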
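
The first two properties can be sketched as a forward pass in plain Python. This is a minimal illustration under stated assumptions, not the VLA-JEPA implementation: the linear encoders, the identity predictor, the function names, and the EMA-updated target encoder (a choice borrowed from other JEPA variants, not confirmed by this page) are all hypothetical. The key structural point is that the prediction path receives only the past observation, while the future observation enters only through the target path.

```python
# Minimal JEPA-style forward pass with hypothetical linear encoders (no autograd).

def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def predict_future_latent(ctx_weights, pred_weights, past_obs):
    """Prediction path: encodes ONLY the past observation, then predicts ahead."""
    z_context = matvec(ctx_weights, past_obs)
    return matvec(pred_weights, z_context)


def target_latent(tgt_weights, future_obs):
    """Target path: the future is encoded only here; in a real training loop
    this output would be treated as a constant (no gradient flows through it)."""
    return matvec(tgt_weights, future_obs)


def latent_loss(z_pred, z_target):
    """Mean squared error measured in latent space, not pixel space."""
    return sum((p - t) ** 2 for p, t in zip(z_pred, z_target)) / len(z_pred)


def ema_update(tgt_weights, ctx_weights, momentum=0.99):
    """Assumed target-encoder update: exponential moving average of the context encoder."""
    return [
        [momentum * t + (1.0 - momentum) * c for t, c in zip(t_row, c_row)]
        for t_row, c_row in zip(tgt_weights, ctx_weights)
    ]


ctx_weights = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]   # 3-dim observation -> 2-dim latent
tgt_weights = [row[:] for row in ctx_weights]        # target encoder starts as a copy
pred_weights = [[1.0, 0.0], [0.0, 1.0]]              # identity predictor for the demo

past_obs = [0.2, 0.4, 0.6]
future_obs = [0.3, 0.4, 0.6]

z_pred = predict_future_latent(ctx_weights, pred_weights, past_obs)
z_target = target_latent(tgt_weights, future_obs)
loss = latent_loss(z_pred, z_target)
tgt_weights = ema_update(tgt_weights, ctx_weights)
print("latent loss:", loss)
```

Because `predict_future_latent` has no argument for the future observation, the predictor cannot copy the answer from its input, which is the shortcut that the leakage-free design rules out.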

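From JEPA to Embodied JEPA

Dimension	Generic JEPA	Embodied JEPA
Input	Video frames	Multi-view robot video
Prediction target	Future representations	World-state transitions
Downstream task	Action recognition	Action policy generation
Fine-tuning	Classification / detection	Action head (flow matching)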
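
The last row of the table mentions a flow-matching action head. As a rough sketch of what that can look like, the snippet below uses one common flow-matching convention: a straight path from Gaussian noise at t=0 to the demonstrated action at t=1, with the constant velocity `action - noise` as the regression target. The function names, the `cond` argument (standing in for the latent produced by the JEPA backbone), and the Euler sampler are assumptions for illustration; this page does not specify how VLA-JEPA implements its head.

```python
import random


def interpolate(noise, action, t):
    """Point on the straight path from noise (t=0) to the action (t=1)."""
    return [(1.0 - t) * n + t * a for n, a in zip(noise, action)]


def target_velocity(noise, action):
    """Velocity of the straight path; this is the regression target."""
    return [a - n for n, a in zip(noise, action)]


def flow_matching_loss(velocity_fn, noise, action, t, cond):
    """Squared error between the predicted and target velocity at time t."""
    x_t = interpolate(noise, action, t)
    v_pred = velocity_fn(x_t, t, cond)
    v_tgt = target_velocity(noise, action)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_tgt)) / len(v_tgt)


def sample_action(velocity_fn, noise, cond, steps=10):
    """Euler-integrate dx/dt = v(x, t, cond) from t=0 to t=1, starting at noise."""
    x = list(noise)
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * dt, cond)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
demo_action = [0.1, -0.2, 0.3]   # hypothetical 3-DoF action


def ideal_velocity(x_t, t, cond):
    """Stand-in for a trained network: returns the exact target velocity
    for this single (noise, action) pair."""
    return target_velocity(noise, demo_action)


loss = flow_matching_loss(ideal_velocity, noise, demo_action, t=0.3, cond=None)
sampled = sample_action(ideal_velocity, noise, cond=None, steps=10)
print("loss:", loss)
print("sampled action:", sampled)
```

With the ideal velocity field the loss is zero and Euler integration lands on the demonstrated action. A trained head would replace `ideal_velocity` with a network conditioned on the backbone's latent.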

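Scalability

The authors of VLA-JEPA argue that the paradigm is highly scalable: it can generalize further by naturally incorporating more robot data and text-reasoning data.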

References