SidneyZhang/myWiki


Sidney Zhang 91fac5b6fc

20260617: currently 914 pages

2026-06-17 15:02:40 +08:00


title	created	type	paper
Review: Dead Directions — Geometric Singular Learning	2026-06-10	review	dead-directions-geometric-singular-learning

Review: Dead Directions — Geometric Singular Learning

📌 Basic Information

Paper: Dead Directions: Geometric Singular Learning
Author: Tejas Pradeep Shirodkar (IIIT Hyderabad)
Field: singular learning theory × information geometry × deep learning theory
arXiv: 2606.05957v1 [cs.LG, stat.ML], 2026 | 139 pages

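🎯 Core Contributions

Dead Direction as a bridging primitive: the same vector is simultaneously a Fisher degeneracy direction in Amari's sense and a tangent vector to the singular set in Watanabe's sense. The KL order can be recovered in the original coordinates from the decay rate of the Fisher curvature.
No Hironaka resolution needed: traditional SLT requires a blow-up, which is infeasible for networks with millions of parameters; this paper computes lambda directly in the original parameter coordinates.
Watanabe triple from a single checkpoint: (lambda, m, nu) is computed from one forward pass plus one backward pass, with no MCMC posterior sampling.
DDCAdam optimizer: standard Adam destroys the singular geometry; DDCAdam preserves G-equivariance, so the SLT signal stays readable along the training trajectory.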
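
To make the first item concrete, here is a minimal sketch on a toy model, not the paper's algorithm. It assumes the simplest reading of a dead direction as a null eigenvector of the Fisher matrix, and uses the two-parameter model y ~ N(a·b·x, 1), where every (a, b) with the same product a·b gives the same function. The names `fisher_matrix` and `dead_directions` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def fisher_matrix(a, b, xs):
    """Fisher information of y ~ N(a*b*x, 1) w.r.t. (a, b), averaged over inputs xs."""
    # With unit-variance Gaussian noise, the Fisher matrix is the average outer
    # product of the gradient of the mean a*b*x, which is (b*x, a*x).
    grads = np.stack([b * xs, a * xs], axis=1)  # shape (n, 2)
    return grads.T @ grads / len(xs)

def dead_directions(fisher, tol=1e-10):
    """Return unit eigenvectors of `fisher` whose eigenvalue is numerically zero."""
    eigvals, eigvecs = np.linalg.eigh(fisher)
    scale = max(eigvals.max(), 1.0)
    return [eigvecs[:, i] for i in range(len(eigvals)) if eigvals[i] <= tol * scale]

rng = np.random.default_rng(0)
xs = rng.normal(size=1000)

# At a generic point the Fisher matrix has rank 1, so there is one null direction.
a, b = 2.0, 0.5
dead = dead_directions(fisher_matrix(a, b, xs))

# Tangent to the level set {a*b = const} through (a, b): orthogonal to grad(ab) = (b, a).
tangent = np.array([a, -b]) / np.hypot(a, b)
alignment = abs(dead[0] @ tangent)  # close to 1 when the two directions coincide

# At the origin the Fisher matrix vanishes and every direction is degenerate.
dead_at_origin = dead_directions(fisher_matrix(0.0, 0.0, xs))
```

In this toy case the Fisher null vector and the tangent to the set of equivalent parameters coincide up to sign, which is the kind of dual reading the review attributes to the Dead Direction. How the paper recovers the KL order and lambda from the curvature decay rate is not reproduced here.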

🔗 Concept Network

Dead Direction ←→ KL Order ←→ RLCT (lambda)
      ↓                          ↓
Fisher Metric (Info Geometry) → Singular Learning Theory
      ↓                              ↓
  DDCAdam ←→ Gauge Quotient    Watanabe's Triple (lambda,m,nu)
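
For orientation, the symbols in the diagram can be tied to standard definitions. The block below is a sketch in conventional SLT notation, not formulas quoted from the paper. Two things are assumptions: that a dead direction can be read as a Fisher null vector, and that (lambda, m, nu) stands for the real log canonical threshold, its multiplicity, and the singular fluctuation.

```latex
% Fisher information matrix of a model p(y | x, w)
I(w) = \mathbb{E}_{x,y}\!\left[ \nabla_w \log p(y \mid x, w)\,
       \nabla_w \log p(y \mid x, w)^{\top} \right]

% Assumed reading of a dead direction v at w: a null vector of the Fisher matrix
I(w)\, v = 0, \qquad v \neq 0

% Watanabe's asymptotic expansion of the Bayes free energy,
% with RLCT \lambda and multiplicity m
F_n = n L_n(w_0) + \lambda \log n - (m - 1) \log\log n + O_p(1)
```

For a regular model with d parameters, lambda = d/2 and m = 1, so the expansion reduces to the familiar BIC penalty; singular models can have lambda < d/2.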

📊 Wiki Integration

New pages: 9 (1 paper + 8 concepts)
Link integrity: 100%
Total size: 729 → 738 pages

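💡 Key Insights

This 139-page paper addresses a problem that has troubled the field for twenty years: SLT and information geometry describe the same parameter space in almost disjoint vocabularies. The Dead Direction is the first mathematical object with a clear dual reading in both frameworks, and the KL order is the first bridging invariant that both sides can compute.

The practical significance is substantial: for the first time, SLT analysis becomes feasible on deep networks of realistic scale. The invariants of generalization theory are extracted directly from the gradient information of a single checkpoint, with no blow-up and no posterior sampling. This could have a foundational impact on understanding the generalization behavior of large models.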
