20260625: lots of new content
concepts/jepa-for-robotics.md
@@ -0,0 +1,43 @@
---
title: "JEPA for Robotics"
created: 2026-06-24
updated: 2026-06-24
type: concept
tags: ["jepa", "robot-learning", "world-model", "embodied-ai"]
sources:
- "[[vla-jepa-2026]]"
---

# JEPA for Robotics

JEPA for Robotics is the paradigm of applying the Joint-Embedding Predictive Architecture (JEPA) to robot learning and VLA policies. VLA-JEPA is a representative work in this direction.

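## Why JEPA Suits Robotics

### The Problem with Pixel Prediction

In robot vision, pixel changes are driven mainly by appearance factors (lighting, texture, camera position) rather than by action-relevant state transitions. A pixel-prediction objective is therefore easily dominated by these factors.

### Advantages of JEPA

1. **Latent space prediction**: naturally filters out pixel-level noise and focuses on semantic dynamics
2. **Leakage-free design**: the future serves only as the prediction target, which eliminates shortcut learning
3. **Camera robustness**: latent representations are insensitive to camera motion and background changes
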
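
The contrast between the two objectives can be illustrated with a toy sketch. Everything here is an assumption for illustration: `encode` is a hand-written stand-in (it subtracts the mean, so a global brightness shift disappears), whereas a real JEPA encoder is learned, and none of the names come from VLA-JEPA. The future frame enters only through the target, never as predictor input, which is the leakage-free part.

```python
from typing import List

Frame = List[float]  # a flattened toy "image"


def encode(frame: Frame) -> Frame:
    """Hypothetical toy encoder: removes the mean, so global brightness is ignored."""
    mean = sum(frame) / len(frame)
    return [p - mean for p in frame]


def mse(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pixel_loss(predicted_frame: Frame, future_frame: Frame) -> float:
    # Pixel objective: compares raw pixels, so lighting changes count as error.
    return mse(predicted_frame, future_frame)


def latent_loss(predicted_latent: Frame, future_frame: Frame) -> float:
    # Latent objective: the future frame is used ONLY to build the target.
    # In a trained model this target would be treated as a constant (no gradient).
    target = encode(future_frame)
    return mse(predicted_latent, target)


# Same scene, no state change; only the lighting gets brighter by 0.3.
current = [0.1, 0.5, 0.9, 0.5]
future = [p + 0.3 for p in current]

# A "nothing moved" prediction is correct here, yet the pixel loss penalizes it.
pixel = pixel_loss(current, future)             # close to 0.09
latent = latent_loss(encode(current), future)   # close to 0
print(f"pixel loss: {pixel:.4f}, latent loss: {latent:.2e}")
```

The pixel loss is non-zero even though nothing action-relevant changed, while the latent loss is essentially zero. How well a learned encoder achieves this kind of invariance is an empirical question that this sketch does not address.
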
## From [[jepa|Generic JEPA]] to Embodied JEPA

| Dimension | Generic JEPA | Embodied JEPA |
|------|----------|---------------|
| Input | Video frames | Multi-view robot video |
| Prediction target | Future representations | World-state transitions |
| Downstream task | Action recognition | Action policy generation |
| Fine-tuning | Classification / detection | Action Head (Flow-Matching) |

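
To make the "Action Head (Flow-Matching)" entry concrete, here is a minimal sketch of a flow-matching objective and sampler. It assumes the common linear-interpolation formulation (noise at t=0, action at t=1); this note does not say which variant VLA-JEPA uses. All names are hypothetical, and `oracle_velocity` stands in for a learned network that would also be conditioned on the JEPA latent.

```python
import random
from typing import Callable, List

Vec = List[float]
VelocityFn = Callable[[Vec, float], Vec]


def interpolate(noise: Vec, action: Vec, t: float) -> Vec:
    # Point on the straight path from noise (t=0) to the action (t=1).
    return [(1.0 - t) * n + t * a for n, a in zip(noise, action)]


def target_velocity(noise: Vec, action: Vec) -> Vec:
    # Along a straight path the velocity is constant: action - noise.
    return [a - n for n, a in zip(noise, action)]


def flow_matching_loss(velocity_fn: VelocityFn, noise: Vec, action: Vec, t: float) -> float:
    # Regress the predicted velocity at x_t onto the target velocity.
    x_t = interpolate(noise, action, t)
    pred = velocity_fn(x_t, t)
    target = target_velocity(noise, action)
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


def sample_action(velocity_fn: VelocityFn, noise: Vec, steps: int = 10) -> Vec:
    # Euler integration of dx/dt = v(x, t) from t=0 to t=1.
    x = list(noise)
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def oracle_velocity(action: Vec) -> VelocityFn:
    # Stand-in for a trained head: the exact velocity when there is one target action.
    # Valid for t < 1, which holds for every step the sampler takes.
    def fn(x: Vec, t: float) -> Vec:
        return [(a - xi) / (1.0 - t) for a, xi in zip(action, x)]
    return fn


rng = random.Random(0)
demo_action = [0.2, -0.1, 0.5]               # hypothetical 3-DoF action
noise = [rng.gauss(0.0, 1.0) for _ in demo_action]

head = oracle_velocity(demo_action)
loss = flow_matching_loss(head, noise, demo_action, t=0.3)
sampled = sample_action(head, noise, steps=10)
print(loss, sampled)
```

With the oracle in place the loss is numerically zero and the sampler recovers `demo_action`. In an actual policy the velocity function would be a trained network and the loss would be averaged over sampled actions, noise draws, and time steps.
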
## Scalability

The VLA-JEPA authors argue that this paradigm is highly scalable: it can generalize further by naturally incorporating more robot data and text-reasoning data.

## References

- [[vla-jepa-2026]]
- [[jepa]]
- [[vla-vision-language-action]]
- [[latent-world-model]]
- [[world-model-lecun]]