20260617: currently 914 pages
54
concepts/absolute-gating.md
Normal file
@@ -0,0 +1,54 @@
---
title: "Absolute vs Relative Gating"
created: 2026-06-17
updated: 2026-06-17
type: concept
tags: [architecture, sparse-autoencoder, gating-mechanism]
sources: [raw/papers/zhang-geometric-sae-2026.md]
confidence: high
---

# Absolute vs Relative Gating

[[geometric-sae-concepts|Zhang et al. (2026)]] classify [[sparse-autoencoder|SAE]] architectures by their gating mechanism: whether gating is absolute or relative determines an SAE's geometric properties and how tractable it is to analyze.
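
## Absolute Gating

Each neuron's activation is **determined solely by its own pre-activation**:

- **ReLU SAE**: `a_i = max(0, z_i)`; the activation region is the half-space `H_i^+`
- **Gated SAE**: adds a separate gating network
- **JumpReLU SAE**: `a_i = z_i · 1[z_i > τ_i]`

**Geometric properties**:

- The activation region of neuron i is `N_i = H_i^+` (exactly a half-space)
- The activation region of a multi-neuron unit is an intersection of half-spaces (a convex polyhedron)
- Theoretical analysis is direct, and the resulting conditions are simple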
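
The defining property above, that `a_i` depends only on `z_i`, can be checked numerically. The following is a minimal sketch in plain Python, not code from the paper: the function names `relu_gate` and `jumprelu_gate`, the pre-activation values, and the thresholds `tau` are all illustrative assumptions.

```python
def relu_gate(z):
    """ReLU SAE gate: a_i = max(0, z_i)."""
    return [max(0.0, z_i) for z_i in z]


def jumprelu_gate(z, tau):
    """JumpReLU SAE gate: a_i = z_i * 1[z_i > tau_i]."""
    return [z_i if z_i > t_i else 0.0 for z_i, t_i in zip(z, tau)]


# Hypothetical pre-activations for three neurons (illustrative values).
z = [0.8, -0.3, 0.2]
tau = [0.5, 0.5, 0.5]  # assumed per-neuron thresholds

relu_out = relu_gate(z)
jump_out = jumprelu_gate(z, tau)

# Perturb every neuron except neuron 0: under absolute gating,
# neuron 0's activation must stay the same.
z_perturbed = [z[0], 5.0, -7.0]
assert relu_gate(z_perturbed)[0] == relu_out[0]
assert jumprelu_gate(z_perturbed, tau)[0] == jump_out[0]

print(relu_out)  # [0.8, 0.0, 0.2]
print(jump_out)  # [0.8, 0.0, 0.0]
```

Because each output entry is computed from a single input entry, the set of inputs that activate neuron i is cut out by one inequality on `z_i`, which is why the activation region is a half-space when `z_i` is affine in the input.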
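
## Relative Gating

A neuron's activation **depends on a comparison with other neurons**:

- **Top-K SAE**: keeps only the k largest `z_i`
- **Matching Pursuit SAE**: iteratively selects the neuron with the largest contribution
- **SPaDE**: structured selection

**Geometric properties**:

- `N_i ⊆ H_i^+` (only a subset of the half-space)
- `N_i` is a union of regions of a hyperplane arrangement (a complex, non-convex shape)
- Even when `z_i > τ_i`, neuron i does not activate unless it ranks in the Top-K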
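
## Consequences of the Geometric Difference

| Property | Absolute gating | Relative gating |
|------|---------|---------|
| Activation region | Half-space (convex) | Union of polytopes (non-convex) |
| Concept separation condition | Conv(C) ∩ Conv(N) = ∅ | More complex |
| Sparsity | L1 regularization (soft constraint) | Hard constraint on k |
| Hierarchical concepts | Naturally supported (activating a child concept does not exclude its parent) | Competitive exclusion |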
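
The convex versus non-convex row of the table can be illustrated with a toy example. The sketch below assumes a made-up affine encoder on 2-D inputs (`preacts`), hypothetical helpers `topk_active` and `relu_active`, and Top-K with k = 2. It finds two inputs that both activate neuron 0 under Top-K while their midpoint does not, which cannot happen for a convex region.

```python
def preacts(x):
    """Hypothetical affine encoder: three neurons on 2-D inputs."""
    x1, x2 = x
    return [1.0 + 0.1 * (x1 + x2), x1, x2]


def topk_active(z, k):
    """Neurons that rank among the k largest pre-activations and are positive."""
    keep = sorted(range(len(z)), key=lambda i: -z[i])[:k]
    return {i for i in keep if z[i] > 0}


def relu_active(z):
    """Neurons whose own pre-activation is positive (absolute gating)."""
    return {i for i, z_i in enumerate(z) if z_i > 0}


a, b = (3.0, 0.0), (0.0, 3.0)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)  # (1.5, 1.5)

for name, x in [("a", a), ("b", b), ("mid", mid)]:
    z = preacts(x)
    print(name, 0 in relu_active(z), 0 in topk_active(z, k=2))
# a True True
# b True True
# mid True False
```

At the midpoint, neurons 1 and 2 both overtake neuron 0, so it falls out of the top 2 even though its own pre-activation is the same as at `a` and `b`. Under ReLU gating all three points lie in the half-space of neuron 0, consistent with convexity.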

## References

- [[sparse-autoencoder|SAE]]
- [[hyperplane-arrangements|Hyperplane arrangements]]
- [[geometric-sae-concepts|Geometric framework paper]]