---
title: "Backtranslation Round-Trip Relay"
created: 2026-05-14
type: concept
tags: ["evaluation-methodology", "backtranslation", "round-trip", "relay", "semantic-equivalence"]
sources: ["https://arxiv.org/abs/2604.15597"]
---
# Backtranslation Round-Trip Relay
Backtranslation Round-Trip Relay is the core evaluation methodology of the [[delegate-52]] benchmark. It measures an LLM's document-editing fidelity by chaining reversible editing tasks.
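## Round-Trip Primitive
Given a seed document s and a pair of editing instructions (x→, x←):
1. **Forward edit**: t = σ(s) = LLM(s; x→)
2. **Backward edit**: ŝ = σ⁻¹(t) = LLM(t; x←)
3. **Reconstruction score**: sim(s, ŝ) ∈ [0, 1]
A perfect model should yield sim(s, ŝ) = 1. Evaluation therefore reduces to measuring semantic equivalence between s and ŝ, with no need for human-annotated reference answers.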
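
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's implementation: the `Editor` signature, the `toy_editor` stand-in for an LLM call, and the use of `difflib.SequenceMatcher` as `sim` are all assumptions made here for the example. The benchmark's actual similarity function is not specified in this note.

```python
from difflib import SequenceMatcher
from typing import Callable, Tuple

# Hypothetical interface: an editor maps (document, instruction) to an edited document.
Editor = Callable[[str, str], str]


def sim(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; an assumption, not the benchmark's metric."""
    return SequenceMatcher(None, a, b).ratio()


def round_trip(editor: Editor, seed: str, forward: str, backward: str) -> Tuple[str, str, float]:
    """Apply the forward edit, then the backward edit, and score the reconstruction."""
    t = editor(seed, forward)      # t = sigma(s)
    s_hat = editor(t, backward)    # s_hat = sigma^-1(t)
    return t, s_hat, sim(seed, s_hat)


def toy_editor(doc: str, instruction: str) -> str:
    """Toy stand-in for an LLM call: both instructions reverse the line order."""
    if instruction in ("reverse lines", "restore lines"):
        return "\n".join(reversed(doc.split("\n")))
    raise ValueError(f"unknown instruction: {instruction}")


t, s_hat, score = round_trip(toy_editor, "a\nb\nc", "reverse lines", "restore lines")
print(repr(t), repr(s_hat), score)
```

Because the toy editor inverts its own edit exactly, the reconstruction matches the seed. A real model would be plugged in through the same `Editor` interface.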
## Relay Composition
Apply N forward/backward instruction pairs in succession:
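ŝₙ = (σₙ⁻¹ ∘ σₙ ∘ ⋯ ∘ σ₁⁻¹ ∘ σ₁)(s), for n = 1, …, N
After each round trip, compute RS@k = sim(s, ŝ_{k/2}) to track the degradation curve. The index k/2 implies that k counts individual edits, two per round trip.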
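
A relay can be sketched as a loop over instruction pairs that records one score per round trip. The names `relay` and `lossy_editor`, and the `difflib`-based `sim`, are hypothetical stand-ins chosen for illustration. Keying the scores by k = 2n assumes that k counts individual edits, as the index k/2 suggests. Each editor call receives only the current document and one instruction, mirroring independent single-turn sessions.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: an editor maps (document, instruction) to an edited document.
Editor = Callable[[str, str], str]


def sim(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; an assumption, not the benchmark's metric."""
    return SequenceMatcher(None, a, b).ratio()


def relay(editor: Editor, seed: str, pairs: List[Tuple[str, str]]) -> Dict[int, float]:
    """Chain round trips and return {k: RS@k}, with k = 2 * (round trips completed)."""
    scores: Dict[int, float] = {}
    doc = seed
    for n, (forward, backward) in enumerate(pairs, start=1):
        doc = editor(doc, forward)    # forward edit, no conversation history passed
        doc = editor(doc, backward)   # backward edit, again a fresh call
        scores[2 * n] = sim(seed, doc)  # always compare against the original seed
    return scores


def lossy_editor(doc: str, instruction: str) -> str:
    """Toy imperfect editor: reverses the text but loses one character per call."""
    return doc[::-1][:-1]


curve = relay(lossy_editor, "abcdefghij", [("reverse", "restore")] * 3)
for k, score in curve.items():
    print(f"RS@{k} = {score:.3f}")
```

With this deliberately lossy editor the scores fall at every round trip, which is the kind of degradation curve the relay is designed to expose.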
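## Methodological Assumptions
- Every editing task must be reversible
- The model **genuinely attempts to perform the edit** rather than taking a shortcut (verified in Appendix A)
- Each interaction is an independent single-turn session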
## Intellectual Origins
The method derives from backtranslation in machine translation (Sennrich et al., 2015; Somers, 2005) and has recently been used to evaluate LLM consistency (Hong et al., 2025; Allamanis et al., 2024).
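## Related Concepts
- [[delegate-52]]: the benchmark that uses this methodology
- [[round-trip-reconstruction-score]]: the RS@k metric
- [[semantic-equivalence]]: the theoretical basis for scoring
- [[document-degradation]]: the core phenomenon this method reveals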