20260625: a lot of new content
70
concepts/engram.md
Normal file
@@ -0,0 +1,70 @@
---
title: "Engram (Conditional Memory Module)"
created: 2026-06-25
updated: 2026-06-25
type: concept
tags: ["architecture", "memory", "transformer", "sparsity"]
sources:
  - "[[engram-conditional-memory-2026]]"
---

# Engram (Conditional Memory Module)

Engram is a conditional memory module proposed by DeepSeek-AI. It modernizes classic N-gram embeddings into a static knowledge-lookup primitive for Transformers.

## Architecture: Two-Stage Pipeline

### Stage 1: Sparse Retrieval

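**Tokenizer Compression**:
- Precompute a surjective function P: V → V' based on NFKC normalization plus lowercasing
- Map tokens that are semantically equivalent but have different token IDs (e.g. "Apple" vs " apple") to the same canonical ID
- Reduces the effective vocabulary of a 128k tokenizer by 23%
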
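The projection described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `canonical_form`, `build_projection`, and the toy vocabulary are hypothetical names and data, and the sketch applies only the two steps named in the list (NFKC, then lowercasing). It does not strip a leading-space marker, so it would not by itself merge "Apple" with " apple"; how that case is handled is not specified here.

```python
import unicodedata


def canonical_form(token: str) -> str:
    """NFKC-normalize, then lowercase (the two steps named above)."""
    return unicodedata.normalize("NFKC", token).lower()


def build_projection(vocab):
    """Return (projection, compressed_size); projection[i] is the canonical ID of token i."""
    canonical_ids = {}
    projection = []
    for token in vocab:
        key = canonical_form(token)
        # The first occurrence of a canonical form allocates the next compressed ID.
        if key not in canonical_ids:
            canonical_ids[key] = len(canonical_ids)
        projection.append(canonical_ids[key])
    return projection, len(canonical_ids)


# Toy vocabulary (hypothetical): three casings of one word,
# a ligature variant ("\ufb01" is the "fi" ligature), and a distinct word.
toy_vocab = ["Apple", "apple", "APPLE", "\ufb01le", "file", "banana"]
projection, compressed_size = build_projection(toy_vocab)
```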
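**Multi-Head Hashing**:
- Each N-gram order n ∈ {2, 3, ..., N} uses K independent hash heads
- A multiplicative-XOR hash 𝜑_{n,k} maps the compressed N-gram to an index z into the embedding table E_{n,k}, whose size M_{n,k} is prime
- All retrieved vectors are concatenated into the memory vector e_t ∈ R^{d_mem}
- Collisions are resolved by context gating
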
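The retrieval step above can be sketched as follows. Everything concrete here is an assumption made for illustration: the exact hash form (each token ID multiplied by a per-position odd constant, the products XOR-ed together, then reduced modulo the prime table size), the table sizes, the per-head width `D_HEAD`, the left-padding ID, and the names `make_head`, `build_tables`, and `retrieve`.

```python
import random

# Hypothetical configuration: N = 3, K = 2 heads per order, prime table sizes M_{n,k}.
TABLE_SIZES = {2: [101, 103], 3: [107, 109]}
D_HEAD = 4  # width of each retrieved vector (assumed)


def make_head(n, table_size, seed):
    """Build one hash head: multiply, XOR, then reduce modulo a prime."""
    rng = random.Random(seed)
    multipliers = [rng.randrange(1, 2**32, 2) for _ in range(n)]  # odd constants

    def hash_ngram(ngram):
        mixed = 0
        for token_id, mult in zip(ngram, multipliers):
            mixed ^= token_id * mult
        return mixed % table_size

    return hash_ngram


def build_tables(seed=0):
    """Create a hash head and a randomly initialized table for every (n, k)."""
    rng = random.Random(seed)
    heads, tables = {}, {}
    for n, sizes in TABLE_SIZES.items():
        for k, size in enumerate(sizes):
            heads[(n, k)] = make_head(n, size, seed=1000 * n + k)
            tables[(n, k)] = [[rng.gauss(0, 1) for _ in range(D_HEAD)]
                              for _ in range(size)]
    return heads, tables


def retrieve(compressed_ids, t, heads, tables, pad_id=0):
    """Concatenate one row per (n, k) for the N-grams ending at position t."""
    e_t = []
    for (n, k), head in heads.items():
        # Suffix N-gram ending at t; positions before the sequence start are padded.
        ngram = [compressed_ids[i] if i >= 0 else pad_id
                 for i in range(t - n + 1, t + 1)]
        e_t.extend(tables[(n, k)][head(ngram)])
    return e_t


heads, tables = build_tables()
e_t = retrieve([5, 7, 9], t=2, heads=heads, tables=tables)
```

In this sketch d_mem equals the number of heads times `D_HEAD`. Because the lookup depends only on the local suffix of compressed token IDs, the same trailing N-grams retrieve the same vector regardless of what precedes them.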
### Stage 2: Context-aware Fusion

**Gating**:
- h_t (hidden state, carrying global context) → Query
- e_t (static memory) → Key, Value (via learnable projections W_K, W_V)
- Scalar gate: α_t = σ(RMSNorm(h_t)^T · RMSNorm(k_t) / √d)
- Output ṽ_t = α_t · v_t: if the memory contradicts the context, the gate approaches 0

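A minimal sketch of the gate formula above, in plain Python so that each term is visible. The assumptions: RMSNorm is used without a learnable gain, `W_K` and `W_V` are given as lists of rows, and the helper names `rms_norm`, `matvec`, and `gate` are hypothetical. With gain-free normalization the score is bounded by ±√d, so the gate in this sketch stays within [σ(-√d), σ(√d)] rather than reaching exactly 0 or 1.

```python
import math


def rms_norm(x, eps=1e-6):
    """Root-mean-square normalization without a learnable gain (assumed)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def gate(h_t, e_t, W_K, W_V):
    """Return (alpha_t, gated value) for one position."""
    k_t = matvec(W_K, e_t)  # key projected from static memory
    v_t = matvec(W_V, e_t)  # value projected from static memory
    d = len(h_t)
    score = sum(a * b for a, b in zip(rms_norm(h_t), rms_norm(k_t))) / math.sqrt(d)
    alpha = 1.0 / (1.0 + math.exp(-score))  # sigmoid
    return alpha, [alpha * v for v in v_t]


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
h_t = [0.5, -1.0, 2.0, 0.25]

# Memory aligned with the context opens the gate; opposed memory closes it.
alpha_agree, _ = gate(h_t, h_t, identity, identity)
alpha_conflict, _ = gate(h_t, [-v for v in h_t], identity, identity)
```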
**Depthwise Causal Convolution**:
- Kernel size 4, dilation = max N-gram order, SiLU activation
- Expands the receptive field and adds nonlinearity
- Residual connection: Y = SiLU(Conv1D(RMSNorm(Ṽ))) + Ṽ

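The residual formula above can be sketched with explicit loops. This is an illustration under assumptions: the input is a list of T rows with C channels each, RMSNorm has no learnable gain, each channel has its own 4-tap kernel (which is what makes the convolution depthwise), and positions before the start of the sequence contribute zero. The names `depthwise_causal_conv` and `conv_block` and the example weights are hypothetical.

```python
import math


def rms_norm(x, eps=1e-6):
    """Root-mean-square normalization without a learnable gain (assumed)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def silu(x):
    return x / (1.0 + math.exp(-x))


def depthwise_causal_conv(X, weights, dilation):
    """X is T x C, weights is C x K; output at t reads only t, t-d, t-2d, ..."""
    T, C, K = len(X), len(X[0]), len(weights[0])
    out = [[0.0] * C for _ in range(T)]
    for t in range(T):
        for c in range(C):
            for j in range(K):
                src = t - j * dilation
                if src >= 0:  # implicit zero padding on the left
                    out[t][c] += weights[c][j] * X[src][c]
    return out


def conv_block(V, weights, dilation):
    """Y = SiLU(Conv1D(RMSNorm(V))) + V, applied row by row."""
    normed = [rms_norm(row) for row in V]
    conv = depthwise_causal_conv(normed, weights, dilation)
    return [[silu(c) + v for c, v in zip(conv_row, v_row)]
            for conv_row, v_row in zip(conv, V)]


# Example: 12 positions, 2 channels, kernel size 4, dilation 3 (max N-gram order 3).
V = [[float(t + c) for c in range(2)] for t in range(12)]
weights = [[0.5, 0.25, 0.125, 0.0625] for _ in range(2)]
Y = conv_block(V, weights, dilation=3)
```

With kernel size 4 and dilation 3, each output position sees taps at offsets 0, 3, 6, and 9, so the span covered is (4 - 1) · 3 + 1 = 10 positions.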
### Integration into the Transformer

```
H(ℓ) ← H(ℓ) + Y (residual)
→ Attention
→ MoE
```

**Not applied at every layer**: Engram is inserted only at specific layers; the exact positions are determined by system latency constraints.

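## Infrastructure-aware Design

- **Deterministic addressing**: unlike MoE's dynamic routing, Engram uses deterministic hashing, which enables runtime prefetching
- **Memory hierarchy**: large embedding tables can be offloaded to host memory, with prefetching used to overlap communication and computation
- **Overhead**: offloading a 100B-parameter embedding table to host memory costs <3% overhead
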
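The prefetching idea above can be illustrated with a toy sketch. It is an analogy only: a Python dict stands in for the host-memory table, a worker thread stands in for the host-to-device copy, and `slot_indices`, `fetch_rows`, `earlier_layers`, and `forward` are hypothetical names. The point it shows is that the addresses depend only on token IDs, so the fetch can be issued before any hidden state exists.

```python
from concurrent.futures import ThreadPoolExecutor

TABLE_SIZE = 997  # prime, hypothetical
HOST_TABLE = {i: [float(i)] * 4 for i in range(TABLE_SIZE)}  # stand-in for host memory


def slot_indices(token_ids):
    """Deterministic bigram addresses: a function of token IDs only."""
    return [((token_ids[t - 1] * 31) ^ token_ids[t]) % TABLE_SIZE
            for t in range(1, len(token_ids))]


def fetch_rows(indices):
    """Stand-in for copying the addressed rows from host memory."""
    return [HOST_TABLE[i] for i in indices]


def earlier_layers(token_ids):
    """Placeholder for the computation of layers before the Engram layer."""
    return sum(token_ids)


def forward(token_ids):
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Issue the fetch as soon as the token IDs are known.
        pending = pool.submit(fetch_rows, slot_indices(token_ids))
        hidden = earlier_layers(token_ids)  # runs while the fetch is in flight
        rows = pending.result()  # wait only when the rows are actually needed
    return hidden, rows


hidden, rows = forward([3, 5, 8])
```

A dynamically routed layer could not be scheduled this way, because its addresses would only be known after the hidden state at that layer has been computed.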
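## Key Design Points

1. **Static vs. dynamic separation**: the memory itself is static (N-gram embeddings), but it gains dynamic adaptivity through context gating
2. **Hash collisions are not a bug**: multi-head hashing and context gating together suppress collision noise
3. **Depth, not width**: Engram's value lies not in storing more facts but in freeing computational depth for reasoning
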
## References
- [[engram-conditional-memory-2026]]
- [[conditional-memory]]
- [[mixture-of-experts]]
- [[ngram-embedding]]
- [[sparsity-allocation]]