SidneyZhang/myWiki

Sidney Zhang 91fac5b6fc

20260617: currently 914 pages

2026-06-17 15:02:40 +08:00

title: Hallucination Mitigation in LLM Systems
type: concept
created: 2026-06-04
tags: hallucination, llm, safety, faithfulness

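Hallucination Mitigation

Definition: a set of techniques and methods for keeping an LLM system from generating information that is unfaithful to its input or to the facts.

Method categories

1. Retrieval grounding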

rag: supplies a factual basis through external retrieval
content-grounded-retrieval: restricts answers strictly to the provided content
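
The two retrieval-grounding ideas above can be sketched together: fetch supporting passages, then build a prompt that confines the model to them. This is a minimal sketch under stated assumptions, not the wiki's or any library's implementation. The word-overlap scorer, the `STOPWORDS` set, the toy corpus, and the function names `retrieve` and `build_grounded_prompt` are all hypothetical stand-ins for a real retriever and prompt template.

```python
import re

# Hypothetical stopword list so that function words do not count as matches.
STOPWORDS = {"the", "is", "a", "of", "in", "on", "where", "what"}

def tokenize(text: str) -> set[str]:
    """Lowercase content words; a stand-in for a real retriever's text analysis."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Rank passages by word overlap with the query; keep at most k with any overlap."""
    q = tokenize(query)
    scored = [(len(q & tokenize(p)), p) for p in passages]
    scored = [sp for sp in scored if sp[0] > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:k]]

def build_grounded_prompt(query: str, passages: list[str]) -> str:
    """Build a prompt that tells the model to answer only from the numbered passages."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using ONLY the passages below. "
        "If they do not contain the answer, reply 'not found'.\n\n"
        f"{context}\n\nQuestion: {query}"
    )

# Toy corpus, for illustration only.
corpus = [
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the highest mountain on Earth.",
    "Paris is the capital of France.",
]
hits = retrieve("Where is the Eiffel Tower?", corpus)
prompt = build_grounded_prompt("Where is the Eiffel Tower?", hits)
```

The explicit "reply 'not found'" instruction is what separates content-grounded retrieval from plain RAG here: the model is given a permitted way to decline instead of filling gaps from its own parameters.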

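2. Reasoning gates

sufficiency-check: assesses whether the evidence is sufficient before generating
Self-Consistency: samples multiple reasoning paths and takes a vote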
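
As a rough illustration of the Self-Consistency entry, the sketch below draws several answers and keeps the most common one, together with the share of samples that agreed. The `sample` callable is an assumption: in practice it would wrap a model call at non-zero temperature, and here it replays canned strings so the example is deterministic.

```python
from collections import Counter
from typing import Callable

def self_consistent_answer(sample: Callable[[], str], n: int = 5) -> tuple[str, float]:
    """Draw n answers, normalize them, and return the majority answer with its vote share."""
    answers = [sample().strip().lower() for _ in range(n)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n

# Canned samples standing in for five independent reasoning paths.
canned = iter(["42", "42", "41", "42 ", "40"])
answer, agreement = self_consistent_answer(lambda: next(canned), n=5)
```

A low vote share can itself serve as a gate signal: a caller may choose to abstain when agreement falls below some threshold rather than return the plurality answer.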

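3. Training-time mitigation

rlhf: reduces fabrication by aligning the model with human preferences
dpo: direct preference optimization
Factuality tuning: fine-tuning for factuality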
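
To make the dpo entry concrete, the per-example DPO objective can be computed from four sequence log-probabilities: the chosen and rejected responses under the policy and under a frozen reference model. The sketch below assumes those log-probabilities are already available as plain floats. The numbers in the usage lines are toy values, and `beta=0.1` is only an illustrative setting.

```python
import math

def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * (chosen margin - rejected margin)))."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) equals log(1 + exp(-z)), i.e. softplus(-z).
    return softplus(-z)

# Toy log-probabilities, for illustration only.
neutral = dpo_loss(0.0, 0.0, 0.0, 0.0)        # policy identical to the reference
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)   # policy favors the chosen response
```

For hallucination mitigation, the "chosen" response in each pair would be the more factual one, so lowering this loss pushes the policy toward faithful outputs relative to the reference.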

4. Inference-time control

Constrained decoding
Citation enforcement
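
Citation enforcement can be approximated with a post-generation check that flags any sentence lacking a valid source marker. The sketch below assumes citations are written as `[n]` before the sentence-final punctuation and that sources are numbered from 1. Both the format and the helper name `uncited_or_invalid` are hypothetical.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")

def uncited_or_invalid(answer: str, num_sources: int) -> list[str]:
    """Return sentences that carry no citation or cite a source number out of range."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    problems = []
    for sentence in sentences:
        ids = [int(m) for m in CITATION.findall(sentence)]
        if not ids or any(i < 1 or i > num_sources for i in ids):
            problems.append(sentence)
    return problems

# Toy answer checked against two available sources.
answer = (
    "Paris is the capital of France [1]. "
    "It has about two million residents. "
    "The tower opened in 1889 [3]."
)
problems = uncited_or_invalid(answer, num_sources=2)
```

A system could regenerate or drop the flagged sentences. Note that this check only verifies that a citation exists and points at a real source, not that the source actually supports the claim.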

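IntrAgent's contribution

intragent's sufficiency-check belongs to the reasoning-gate category: it explicitly assesses whether the available information is sufficient before the final answer is generated. Unlike RAG, whose reliability is implicit, this is an explicit quality-control step.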
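
The gate pattern described here can be sketched as a wrapper that runs a sufficiency judgment before any generation happens. This is not IntrAgent's actual implementation, which this page does not show. The `GateResult` type, the `answer_with_gate` function, and the keyword-based `keyword_judge` are hypothetical. A real judge would more plausibly be a model call.

```python
import re
from dataclasses import dataclass
from typing import Callable

@dataclass
class GateResult:
    answered: bool
    text: str

def answer_with_gate(
    question: str,
    evidence: list[str],
    is_sufficient: Callable[[str, list[str]], bool],
    generate: Callable[[str, list[str]], str],
) -> GateResult:
    """Generate only if the judge deems the evidence sufficient; otherwise abstain."""
    if not is_sufficient(question, evidence):
        return GateResult(False, "Insufficient evidence to answer.")
    return GateResult(True, generate(question, evidence))

def keyword_judge(question: str, evidence: list[str]) -> bool:
    """Toy judge: every question word longer than four letters must occur in the evidence."""
    words = {w for w in re.findall(r"[a-z]+", question.lower()) if len(w) > 4}
    joined = " ".join(evidence).lower()
    return all(w in joined for w in words)

# Toy generator that simply echoes the first evidence passage.
echo = lambda q, ev: ev[0]

question = "What is the boiling point of ethanol?"
ok = answer_with_gate(question, ["Ethanol has a boiling point of 78.37 C."], keyword_judge, echo)
blocked = answer_with_gate(question, ["Ethanol is a colorless liquid."], keyword_judge, echo)
```

Keeping the judge and the generator as separate callables mirrors the point made above: the sufficiency decision is an explicit, inspectable step rather than something left implicit in retrieval.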

Related concepts

sufficiency-check: the sufficiency check as a hallucination-mitigation gate
content-grounded-retrieval: content grounding to suppress hallucination
faithfulness-in-ai: AI faithfulness