20260617: currently 914 pages

2026-06-17 15:02:40 +08:00
parent e96b955fda
commit 91fac5b6fc
423 changed files with 20687 additions and 34 deletions
--- a/reviews/geometric-sae-review-20260617.md
+++ b/reviews/geometric-sae-review-20260617.md
@@ -0,0 +1,49 @@
+---
+title: "Geometric SAE Paper Integration Review"
+created: 2026-06-17
+type: review
+---
+
+# 📌 Basic Information
+
+- **Paper**: A Geometric View for Understanding Concept Learning and Neuron Interpretation in Sparse Autoencoders
+- **Authors**: Chenhao Zhang, Chris Lin, Su-In Lee (University of Washington)
+- **Field**: cs.LG / Mechanistic Interpretability
+- **arXiv**: 2606.07007v1 (2026-06-05)
+
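+# 🎯 Core Concepts
+
+1. **[[sparse-autoencoder|SAE]]**: the core tool of mechanistic interpretability, an overcomplete sparse dictionary that disentangles superposed representations
+2. **[[polysemanticity|Polysemanticity / monosemanticity]]**: the central challenge and the central goal of neuron interpretability
+3. **[[concept-learning|Three levels of concept learning]]**: detection → separation → approximation, with progressively stronger geometric conditions
+4. **[[formal-concept-analysis|FCA]] / [[concept-lattice|concept lattice]]**: a mathematical framework for organizing the many-to-many relation between neurons and concepts
+5. **[[absolute-gating|Absolute vs. relative gating]]**: a classification of SAE architectures that determines their geometric properties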
+
+# 🔗 Concept Network
+
+**Core chain**:
+```
+Superposition → Polysemanticity → SAE → Absolute/Relative Gating
+    ↓                                    ↓
+Linear Rep. Hypothesis    ←→    Hyperplane Arrangements
+    ↓                                    ↓
+Concept Learning ←→ Formal Concept Analysis → Concept Lattice
+    ↓
+Feature Splitting / Absorption / Family
+```
+
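+**Links to existing knowledge**: [[linear-representation-hypothesis]] (already present) and [[superposition]] (newly added) bridge this paper to existing wiki concepts. This is the **first paper integration in the wiki to cover mechanistic interpretability**.
+
+# 📚 Wiki Integration
+
+- **New pages**: 14 (1 paper + 12 concepts + 1 raw)
+- **Total size**: 855 → 868 pages (+13; the review is not counted)
+- **Brand-new subfield**: mechanistic interpretability (mech interp), which previously had zero coverage in the wiki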
+
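+# 💡 Key Insights
+
+1. **Concept = set** is the most elegant starting point: drop the linear assumption that "concept = direction" and define a concept directly as a set of data points. This simple abstraction gives the whole SAE analysis geometric clarity: concept learning is set alignment, and neuron interpretation is set characterization.
+
+2. **The three-level learning hierarchy is an engineering guide**: Detection (coverage), Separation (exclusivity), and Approximation (tight enclosure) each correspond to a different application scenario and a different geometric condition. Theorem 5.8 (approximation ↔ convexity) is the fundamental bottleneck limiting what SAEs can do.
+
+3. **The concept lattice resolves the ambiguity of interpretation**: FCA reveals that concept learning and neuron interpretation are **not dual**, so the two need not coincide. The concept lattice organizes the many-to-many relation and avoids the information loss that comes from forcing a choice of a "single best match".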
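
The many-to-many organization described in insight 3 can be illustrated with a small formal-concept-analysis sketch. This is only a toy under stated assumptions: the neuron names (`n1`..`n3`), the concept labels, and the binary "fires on" relation are all hypothetical, and the brute-force enumeration below is a textbook FCA closure, not necessarily how the paper builds its lattice from SAE activations.

```python
from itertools import combinations

# Hypothetical toy relation: which concepts each neuron fires on.
# Names and data are invented for illustration only.
FIRES_ON = {
    "n1": {"dog", "cat"},
    "n2": {"dog"},
    "n3": {"cat", "car"},
}


def common_concepts(neurons, relation):
    """Concepts shared by every neuron in `neurons` (all concepts if empty)."""
    shared = set().union(*relation.values())
    for name in neurons:
        shared &= relation[name]
    return shared


def common_neurons(concepts, relation):
    """Neurons that fire on every concept in `concepts`."""
    return {name for name, fired in relation.items() if concepts <= fired}


def formal_concepts(relation):
    """Enumerate all (extent, intent) pairs by closing every neuron subset.

    Brute force over subsets, so only suitable for tiny examples.
    """
    found = set()
    names = sorted(relation)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_concepts(subset, relation)
            extent = common_neurons(intent, relation)
            found.add((frozenset(extent), frozenset(intent)))
    return found


LATTICE = formal_concepts(FIRES_ON)
for extent, intent in sorted(LATTICE, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), "<->", sorted(intent))
```

In this toy relation, `n1` appears in several formal concepts at once (alone with `{cat, dog}`, with `n2` for `{dog}`, and with `n3` for `{cat}`), which is the kind of information a "single best match" per neuron would discard.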