20260514: Add new content
46
concepts/backtranslation-round-trip-relay.md
Normal file
@@ -0,0 +1,46 @@
---
title: "Backtranslation Round-Trip Relay"
created: 2026-05-14
type: concept
tags: ["evaluation-methodology", "backtranslation", "round-trip", "relay", "semantic-equivalence"]
sources: ["https://arxiv.org/abs/2604.15597"]
---

# Backtranslation Round-Trip Relay

Backtranslation Round-Trip Relay is the core evaluation methodology of the [[delegate-52]] benchmark. It measures how faithfully an LLM edits documents by chaining reversible editing tasks.

## Round-Trip Primitive

Given a seed document s and a pair of editing instructions (x→, x←):

1. **Forward edit**: t = σ(s) = LLM(s; x→)
2. **Backward edit**: ŝ = σ⁻¹(t) = LLM(t; x←)
3. **Reconstruction score**: sim(s, ŝ) ∈ [0, 1]

A perfect model yields sim(s, ŝ) = 1. Evaluation therefore reduces to measuring the semantic equivalence of s and ŝ, with no human-annotated reference answers required.
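
The three steps above can be sketched in Python. This is a minimal illustration, not the benchmark's implementation: the `Editor` signature, the `round_trip` helper, and the `toy_editor` are hypothetical names, and `sim` uses `difflib` character overlap as a stand-in because the note does not specify the actual similarity scorer.

```python
from difflib import SequenceMatcher
from typing import Callable, Tuple

# Assumption: an editor maps (document, instruction) to an edited document.
Editor = Callable[[str, str], str]


def sim(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; not the benchmark's real scorer."""
    return SequenceMatcher(None, a, b).ratio()


def round_trip(llm: Editor, seed: str, forward: str, backward: str) -> Tuple[str, float]:
    """Run one forward edit and one backward edit, then score the reconstruction."""
    t = llm(seed, forward)       # forward edit: t = LLM(s; x→)
    s_hat = llm(t, backward)     # backward edit: ŝ = LLM(t; x←)
    return s_hat, sim(seed, s_hat)


def toy_editor(doc: str, instruction: str) -> str:
    """Hypothetical deterministic editor used in place of a real LLM call."""
    if instruction == "uppercase":
        return doc.upper()
    if instruction == "lowercase":
        return doc.lower()
    raise ValueError(f"unknown instruction: {instruction}")


reconstructed, score = round_trip(toy_editor, "hello world", "uppercase", "lowercase")
```

Because the toy editor inverts its own edit exactly on an all-lowercase seed, the reconstruction matches the seed and the score is 1, which is the "perfect model" case described above.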
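
## Relay Composition

Apply n forward/backward instruction pairs in sequence (read right to left, so pair 1 is applied first):

ŝₙ = (σₙ⁻¹ ∘ σₙ) ∘ ... ∘ (σ₁⁻¹ ∘ σ₁)(s)

After each round trip, compute RS@k = sim(s, ŝ_{k/2}) and track the degradation curve. Here k counts individual edits, so k/2 is the number of completed round trips.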
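
A relay of this kind might be sketched as follows. The names `relay` and `lossy_editor` are hypothetical, `sim` is again a `difflib` stand-in for the unspecified scorer, and the lossy editor deliberately drops one character per backward edit to imitate gradual corruption.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

# Assumption: an editor maps (document, instruction) to an edited document.
Editor = Callable[[str, str], str]


def sim(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; not the benchmark's real scorer."""
    return SequenceMatcher(None, a, b).ratio()


def relay(llm: Editor, seed: str, pairs: List[Tuple[str, str]]) -> List[float]:
    """Return [RS@2, RS@4, ...]: one score per completed round trip."""
    doc = seed
    scores = []
    for forward, backward in pairs:
        doc = llm(doc, forward)        # each call sees only the current document
        doc = llm(doc, backward)
        scores.append(sim(seed, doc))  # always compare against the original seed
    return scores


def lossy_editor(doc: str, instruction: str) -> str:
    """Hypothetical editor whose backward edit loses one character."""
    if instruction == "reverse":
        return doc[::-1]
    if instruction == "unreverse":
        return doc[::-1][:-1]          # restores the order, drops the last character
    raise ValueError(f"unknown instruction: {instruction}")


scores = relay(lossy_editor, "abcdefghij", [("reverse", "unreverse")] * 3)
```

Scoring every intermediate document against the original seed, rather than against the previous step, is what makes the list of scores a degradation curve: small per-step losses accumulate and show up as a falling sequence.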
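
## Methodological Premises

- Every editing task must be reversible.
- The model **genuinely attempts to carry out the edit** rather than taking a shortcut (verified in Appendix A).
- Each interaction is an independent single-turn session.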

## Intellectual Origins

The approach originates from backtranslation in machine translation (Sennrich et al., 2015; Somers, 2005) and has recently been used to evaluate LLM consistency (Hong et al., 2025; Allamanis et al., 2024).

## Related Concepts

- [[delegate-52]]: the benchmark that uses this methodology
- [[round-trip-reconstruction-score]]: the RS@k metric
- [[semantic-equivalence]]: the theoretical basis for scoring
- [[document-degradation]]: the core phenomenon this method reveals