---
title: "RLHF (Reinforcement Learning from Human Feedback)"
created: 2026-06-03
updated: 2026-06-03
type: concept
tags: [RLHF, alignment, LLM, training]
status: placeholder
---
# RLHF (Reinforcement Learning from Human Feedback)
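> ⚠️ Placeholder page: to be completed

RLHF is a reinforcement-learning alignment method driven by human feedback, and the main post-training paradigm used as an alternative or complement to SFT. The typical pipeline is SFT → reward model training → PPO optimization.

The comparison with SFT is important background for the discussion in [[zhang-reconciling-sft-interaction-2026|Zhang et al. (2026)]].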
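
The two RLHF-specific stages of the pipeline above can be sketched in a few lines. This is a minimal, dependency-free illustration rather than a reference implementation. It assumes the reward model is trained with a Bradley-Terry style pairwise loss over (chosen, rejected) response pairs, and that the policy step uses the standard PPO clipped surrogate. The function names, the scalar inputs, and the default `clip_eps=0.2` are hypothetical choices made for this example.

```python
import math


def pairwise_reward_loss(r_chosen: float, r_rejected: float) -> float:
    """Assumed reward-model loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss shrinks as the reward model scores the human-preferred
    response higher than the rejected one.
    """
    margin = r_chosen - r_rejected
    # -log(sigmoid(m)) == softplus(-m); this form avoids overflow for large |m|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def ppo_clipped_objective(
    logp_new: float,
    logp_old: float,
    advantage: float,
    clip_eps: float = 0.2,  # hypothetical default, chosen for illustration
) -> float:
    """PPO clipped surrogate for a single action (a quantity to maximize)."""
    # Probability ratio between the updated policy and the sampling policy.
    ratio = math.exp(logp_new - logp_old)
    # Clamp the ratio to [1 - eps, 1 + eps] to limit the size of the update.
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum gives a pessimistic bound on the improvement.
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    # Reward model: ranking the preferred response higher gives lower loss.
    print(pairwise_reward_loss(2.0, 0.0) < pairwise_reward_loss(0.0, 2.0))
    # PPO: a ratio of 1.5 with positive advantage is capped at 1 + clip_eps.
    print(ppo_clipped_objective(math.log(1.5), 0.0, advantage=1.0))
```

In a real system the rewards and log-probabilities would come from neural networks evaluated over batches of token sequences, and the advantage would be estimated from reward-model scores. Scalars are used here only to keep the arithmetic visible.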
## Related concepts
- [[supervised-fine-tuning|SFT]]
- [[dpo]]