20260518-morning: add new content
46
concepts/adaptive-computation-time.md
Normal file
@@ -0,0 +1,46 @@
---
title: "Adaptive Computation Time (ACT)"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [neural-architecture, efficiency, computation]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Adaptive Computation Time (ACT)

**Adaptive Computation Time** is a family of techniques that let a neural network adjust how much computation it spends according to the difficulty of the input.

## Classic approaches

### ACT (Graves, 2016)
- Introduces a learnable halting unit
- Emits a halting probability at every recurrent step
- Stops once the cumulative halting probability exceeds 1−ε
- Needs a "ponder cost" regularization term to encourage efficiency
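
The halting rule above can be sketched in a few lines. This is a minimal illustration, assuming the halting probabilities have already been produced by the halting unit; the function name `act_halting` and the default `eps` are hypothetical, and the ponder cost is taken as steps used plus the remainder.

```python
def act_halting(halting_probs, eps=0.01):
    """Return (steps_used, step_weights, ponder_cost) for one input.

    halting_probs: per-step outputs of the halting unit, each in (0, 1).
    """
    cumulative = 0.0
    for n, h in enumerate(halting_probs, start=1):
        is_last = n == len(halting_probs)
        # Stop when the cumulative probability reaches 1 - eps,
        # or when the step budget runs out.
        if cumulative + h >= 1.0 - eps or is_last:
            remainder = 1.0 - cumulative            # mass left for the final step
            weights = list(halting_probs[: n - 1]) + [remainder]
            return n, weights, n + remainder        # ponder cost = N + R
        cumulative += h
    raise ValueError("halting_probs must not be empty")


steps, weights, ponder = act_halting([0.1, 0.3, 0.7, 0.9])
```

The step weights always sum to 1, so they can be used to average the per-step outputs into a single prediction.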
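
### PonderNet (Banino et al., 2021)
- Models halting as a geometric-style distribution over steps, built from a per-step halting probability
- Trains on the expected loss under that halting distribution, regularized toward a geometric prior
- At inference, samples the halting decision step by step

### Other variants
- **Early-Exit Networks**: add classifiers at intermediate layers and exit early once a condition is met
- **AdaTape**: dynamically extends the input sequence
- **Sparse Universal Transformer**: recurrent weight sharing + dynamic halting + MoE

## Native ACT in CTM

CTM achieves ACT naturally through the [[certainty-based-loss|Certainty-Based Loss]], with no explicit halting module:
- Certainty can serve as the stopping criterion
- Easy samples reach high certainty at early ticks
- In the ImageNet experiments, most samples can stop in fewer than 10 ticks (out of 50)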
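
A certainty-based stopping rule of this kind might look as follows. This is a sketch under stated assumptions: certainty is taken as 1 minus normalized entropy, the threshold value is arbitrary, and the function names are hypothetical rather than taken from the CTM code.

```python
import math


def certainty(probs):
    """1 - normalized entropy of a probability vector with at least 2 classes."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - entropy / math.log(len(probs))


def first_confident_tick(per_tick_probs, threshold=0.8):
    """Return the 1-based tick at which certainty first reaches the threshold.

    Falls back to the last tick when the threshold is never reached.
    """
    for t, probs in enumerate(per_tick_probs, start=1):
        if certainty(probs) >= threshold:
            return t
    return len(per_tick_probs)


ticks = [[0.5, 0.5], [0.9, 0.1], [0.99, 0.01]]
stop_at = first_confident_tick(ticks, threshold=0.8)
```

Lowering the threshold trades accuracy for fewer ticks, so it would normally be tuned on held-out data.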

## Key difference

In CTM, ACT is an **emergent property** rather than an explicit design: there is no halting module, no ponder cost, and no sampling of step counts. This reflects the core of its architectural philosophy, which is to design the loss function and the representation so that "intelligent" behavior emerges naturally.

## Sources

- Graves, "Adaptive Computation Time for Recurrent Neural Networks", 2016
- [[darlow-ctm-2025|CTM paper]] (NeurIPS 2025)
31
concepts/analytical-report-synthesizer.md
Normal file
@@ -0,0 +1,31 @@
---
title: "Analytical Report Synthesizer"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [nlp, llm, report-generation, interface]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Analytical Report Synthesizer

The **Analytical Report Synthesizer** is NeurIDA's output component. It turns prediction results into interpretable analytical reports.

## Function

- **Input**: prediction results + task profile + [[data-slice|Data Slice]]
- **Method**: a dedicated LLM-driven analysis agent with strong reasoning ability and domain knowledge
- **Output**: a structured analytical report containing contextual interpretation and an analysis summary

## Design value

In a traditional data-analysis workflow, a prediction result (such as "8 patients were flagged as high risk") needs further interpretation by a domain expert before it becomes an actionable decision. The Synthesizer automates this interpretation step, moving NeurIDA from "outputting numbers" to "outputting insights".

## Example

Input: ICU admission predictions for patients + patient features
Output: a structured report covering the number of high-risk patients, common driving factors, and suggested dimensions to watch

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
32
concepts/base-table-embedding.md
Normal file
@@ -0,0 +1,32 @@
---
title: "Base Table Embedding"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [machine-learning, embedding, tabular-data]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Base Table Embedding

**Base Table Embedding** is the first stage of the DIME pipeline. It produces an initial vector representation for every tuple in the [[data-slice|Data Slice]], capturing intra-table semantics.

## Dual-path encoding strategy

### Path 1: base model encoding
The base model m* chosen by the [[conditional-model-dispatcher|Dispatcher]] encodes its native features, preserving that model's inductive bias and modeling capacity.

### Path 2: Unified Tuple Encoder
A shared Unified Tuple Encoder maps tuples from heterogeneous schemas into a common d-dimensional representation space:
- Encodes each attribute value according to its data type (numerical, categorical, text, timestamp)
- Captures feature interactions with a Feature Tokenizer + Transformer Layer
- Produces compatible, unified representations that ease later cross-table modeling

## Design considerations

- **Compatibility**: representations from the unified encoder are compatible with the subsequent [[dynamic-relation-modeling|relation modeling]] and [[dynamic-model-fusion|fusion]] stages
- **Fidelity**: the base model path keeps the model's native capability; the outputs of both paths together form the tuple embedding

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
50
concepts/certainty-based-loss.md
Normal file
@@ -0,0 +1,50 @@
---
title: "Certainty-Based Loss"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [loss-function, adaptive-computation, training]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Certainty-Based Loss

**Certainty-Based Loss** is CTM's training loss. It achieves native adaptive computation by **dynamically selecting** among multiple internal ticks.

## Definition

CTM produces an output y_t (for example, class probabilities) at every [[internal-ticks|internal tick]] t. For each forward pass, two ticks are selected:

1. **t₁ = argmin_t(L_t)**: the tick with the lowest loss (the "best" prediction)
2. **t₂ = argmax_t(C_t)**: the tick with the highest certainty

Here C_t = 1 − normalized_entropy(y_t) measures how confident the prediction is.

Final loss:
```
L = (L_t₁ + L_t₂) / 2
```
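
The definition above can be traced through for a single sample. This is a minimal sketch assuming a classification task with cross-entropy as the per-tick loss L_t; the function name is hypothetical, tick indices are 0-based, and no gradients are involved.

```python
import math


def certainty_based_loss(per_tick_probs, target):
    """Return (loss, t1, t2) for one sample.

    per_tick_probs: list over ticks of class-probability vectors y_t.
    target: index of the true class; its probability must be positive.
    """
    num_classes = len(per_tick_probs[0])
    losses, certainties = [], []
    for probs in per_tick_probs:
        losses.append(-math.log(probs[target]))                  # cross-entropy L_t
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
        certainties.append(1.0 - entropy / math.log(num_classes))  # C_t
    t1 = min(range(len(losses)), key=losses.__getitem__)            # argmin_t L_t
    t2 = max(range(len(certainties)), key=certainties.__getitem__)  # argmax_t C_t
    return (losses[t1] + losses[t2]) / 2.0, t1, t2


# The second tick is confident but wrong, so t1 and t2 differ.
loss, t1, t2 = certainty_based_loss([[0.6, 0.4], [0.05, 0.95]], target=0)
```

When the most certain tick is also the wrong one, as in this example, the second term penalizes misplaced confidence.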

## Why does this design matter?

### Native adaptive computation
- The model is not required to stop at a fixed tick: the loss does not designate a "correct" tick
- The model can naturally learn to stop once it is certain enough
- Easy samples reach high certainty at early ticks → inference can stop early in practice

### Calibration alignment
- Optimizes loss minimization and certainty maximization together
- Pushes the model's confidence to line up with its accuracy (calibration)
- The ImageNet experiments show that CTM has naturally good calibration

## Comparison with ACT

| Dimension | ACT (Graves 2016) | CTM Certainty-Based Loss |
|------|-------------------|-------------------------|
| Halting mechanism | Explicit halting module + extra loss term | Arises from the design of the loss function |
| Compute penalty | Needs ponder cost regularization | Not needed |
| When to stop | Learned halting probability | Certainty threshold |

## Sources

- [[darlow-ctm-2025|CTM paper]]
41
concepts/composable-base-model-architecture.md
Normal file
@@ -0,0 +1,41 @@
---
title: "Composable Base Model Architecture"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [machine-learning, architecture, modular-design]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Composable Base Model Architecture

The **composable base model architecture** is the architectural foundation on which NeurIDA realizes [[dynamic-in-database-modeling|dynamic modeling]].

## Composition

```
Base model pool M = {m₁, m₂, ..., mₖ}
        +
Shared model components
├── Unified Tuple Encoder
├── Relation-Aware Message Passing module
└── Context-Aware Fusion module
```

## Base model pool

Covers four classes of heterogeneous models:
- **Traditional ML**: RF, CatBoost, LightGBM, Logistic Regression
- **Tuple Representation Models (TRM)**: FT-Transformer, ARM-Net, TabM, ResNet, DNN, DeepFM
- **Tabular Foundation Models**: TabPFN, TabICL
- **Large Tabular Models (LTM)**: TP-BERTa, Nomic, BGE

## Design principles

1. **Heterogeneity**: models with different architectures complement one another and suit different data distributions
2. **Composability**: the shared components expose a uniform interface and can be combined with any base model
3. **Query conditioning**: the [[conditional-model-dispatcher|Dispatcher]] picks a suitable m* for the task, and DIME assembles the model on demand

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
36
concepts/conditional-model-dispatcher.md
Normal file
@@ -0,0 +1,36 @@
---
title: "Conditional Model Dispatcher"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [machine-learning, model-selection, efficiency]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Conditional Model Dispatcher

The **Conditional Model Dispatcher** is NeurIDA's lightweight scheduling component. It makes two key decisions:

1. **Base model selection**: choose the best base model for the current task from the [[composable-base-model-architecture|model pool]]
2. **Conditional augmentation**: decide whether to invoke DIME for structural augmentation or to deploy the base model directly

## Decision mechanism

### Model selection
- Maintains a metadata dictionary recording each base model's historical EMA performance μᵢ
- Scores candidate models quickly with [[zero-cost-proxies|Zero-Cost Proxies (ZCP)]], yielding a proxy score sᵢ
- Selects the model m* with the highest proxy score s*

### Augmentation decision
- Computes a dynamic threshold: τ = (1 − ε) · μₘ*
- If s* ≥ τ: deploy m* directly (efficient)
- If s* < τ: invoke [[dime-dynamic-in-database-modeling-engine|DIME]] for structural augmentation
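
The two decisions can be combined into one small routine. This is a sketch only: the function name, the dictionary layout, and the example scores are hypothetical, and it assumes that proxy scores and EMA performance are on a comparable scale.

```python
def dispatch(proxy_scores, ema_performance, eps=0.1):
    """Pick the base model and decide whether DIME augmentation is needed.

    proxy_scores:    {model_name: ZCP score s_i for the current task}
    ema_performance: {model_name: historical EMA performance mu_i}
    """
    best = max(proxy_scores, key=proxy_scores.get)   # m* with the highest s_i
    s_star = proxy_scores[best]
    tau = (1.0 - eps) * ema_performance[best]        # tau = (1 - eps) * mu_{m*}
    use_dime = s_star < tau                          # below threshold: augment
    return best, use_dime


scores = {"catboost": 0.82, "tabpfn": 0.90}
history = {"catboost": 0.80, "tabpfn": 0.95}
model, needs_dime = dispatch(scores, history, eps=0.1)
```

A smaller ε raises the threshold, so DIME is invoked more often.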

## Design considerations

- **Lightweight**: ZCP scoring needs no full training run; a small batch of labeled data is enough
- **Adaptive**: the threshold adjusts dynamically with the historical EMA, avoiding wasted computation

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
38
concepts/continuous-thought-machine.md
Normal file
@@ -0,0 +1,38 @@
---
title: "Continuous Thought Machine (CTM)"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [neural-architecture, temporal-dynamics, biological-plausibility, sakana-ai]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Continuous Thought Machine (CTM)

The **Continuous Thought Machine (CTM)** is a neural network architecture proposed by Sakana AI that makes neural temporal dynamics its core computational principle.

## Core design principles

Unlike most modern neural networks, which reduce a neuron to a static activation function (ReLU/GELU/SiLU), CTM introduces two fundamental innovations:

1. **[[neuron-level-models|Neuron-Level Models]]**: each neuron has private parameters and produces complex temporal dynamics from its activation history
2. **[[neural-synchronization|Neural Synchronization]]**: the temporal correlation of neuron population activity is used directly as the latent representation

## Differences from existing architectures

| Dimension | Standard Transformer/CNN | RNN/LSTM | CTM |
|------|---------------------|----------|-----|
| Neuron model | Shared activation function | Shared activation function | Private NLMs |
| Temporal processing | Positional encoding | Hidden state | Internal ticks + synchronization |
| Source of representation | Single-step activation snapshot | Final hidden state | Temporal correlation of activation history |
| Adaptive computation | Needs an explicit module | Needs an explicit module | Emerges natively |

## Key properties

- **Native adaptive computation**: achieved naturally through the [[certainty-based-loss|Certainty-Based Loss]]
- **Interpretability**: synchronization representations and attention trajectories offer a natural route to interpretation
- **Emergent behaviors**: looking around, traveling waves, building an internal world model

## Sources

- [[darlow-ctm-2025|CTM paper (NeurIPS 2025)]]
38
concepts/data-slice.md
Normal file
@@ -0,0 +1,38 @@
---
title: "Data Slice"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [database, sql, data-management]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Data Slice

A **Data Slice** is a task-specific subset of a relational database and the central data object of the NeurIDA analysis pipeline.

## Formal definition

Given an analytical query q, the Data Slice D_q is a self-contained database subset derived from database D:

```
D_q = { T_{k,(q)} | T_k ∈ D, k ∈ K_q ⊆ {1, ..., K} }
```

Each Table Slice T_{k,(q)} is obtained with the relational-algebra selection and projection operators:
```
T_{k,(q)} = π_{J_{k,(q)}}(σ_{I_{k,(q)}}(T_k))
```

- σ: row selection (determined by the query's WHERE/JOIN conditions)
- π: column projection (determined by the [[query-intent-analyzer|Data Profiler]] filtering out irrelevant columns)
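
To make the selection and projection concrete, here is a small in-memory sketch. It assumes tables are lists of dicts and that the slice specification is already known; the table names, columns, and helper functions are hypothetical and do not come from NeurIDA.

```python
def table_slice(rows, predicate, columns):
    """Compute pi_columns(sigma_predicate(rows)) for one table."""
    selected = [row for row in rows if predicate(row)]           # sigma: row selection
    return [{c: row[c] for c in columns} for row in selected]    # pi: column projection


def data_slice(database, spec):
    """Build D_q from the tables named in spec (the index set K_q).

    database: {table_name: rows}
    spec:     {table_name: (row_predicate, kept_columns)}
    """
    return {name: table_slice(database[name], pred, cols)
            for name, (pred, cols) in spec.items()}


database = {
    "patients": [{"id": 1, "age": 70, "name": "a"}, {"id": 2, "age": 40, "name": "b"}],
    "visits": [{"patient_id": 1, "ward": "ICU", "note": "x"}],
    "logs": [{"msg": "boot"}],
}
spec = {
    "patients": (lambda row: row["age"] > 60, ["id", "age"]),
    "visits": (lambda row: True, ["patient_id", "ward"]),
}
slice_q = data_slice(database, spec)
```

Tables outside the specification (here `logs`) never enter the slice, which keeps it self-contained.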

## Role in NeurIDA

- The Data Slice is generated automatically by the [[query-intent-analyzer|Query Intent Analyzer]]
- It is converted into a [[relational-graph|relational graph]] (FK-PK edges), the data structure underlying DIME modeling
- All subsequent modeling runs on the Data Slice only, with no need to access the whole database

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
38
concepts/dime-dynamic-in-database-modeling-engine.md
Normal file
@@ -0,0 +1,38 @@
---
title: "DIME (Dynamic In-Database Modeling Engine)"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [database, machine-learning, engine, in-database-analytics]
sources: [raw/papers/zeng-neurida-2025.md]
---

# DIME: Dynamic In-Database Modeling Engine

**DIME** is NeurIDA's core execution engine. Once it receives an analysis task, it dynamically constructs a customized model and makes predictions.

## Four-stage pipeline

### 1. [[base-table-embedding|Base Table Embedding]]
Converts every tuple in the [[data-slice|Data Slice]] into a vector representation using **dual-path encoding**:
- **Base model path**: the base model selected by the Dispatcher produces native representations, preserving its inductive bias
- **Unified encoder path**: a shared Unified Tuple Encoder maps tuples from heterogeneous schemas into a unified representation space

### 2. [[dynamic-relation-modeling|Dynamic Relation Modeling]]
Runs relation-aware message passing over the [[relational-graph|relational graph]] (FK-PK edges), injecting cross-table structural information into the tuple embeddings to produce relational embeddings.

### 3. [[dynamic-model-fusion|Dynamic Model Fusion]]
Uses a context-aware fusion module to compute **recalibrated importance scores** for the context signals from related tables, adaptively fusing the most relevant context into the target-table tuple representations.

### 4. Task-Aware Prediction
Generates the final prediction from the fused embeddings with a task-specific prediction head (classification/regression).

## Key features

- **Query conditioning**: the whole pipeline is driven by the task profile and the data profile
- **Relation awareness**: models the FK-PK structure explicitly, unlike traditional methods that treat tuples as independent samples
- **Interpretability**: the fusion module's importance scores provide attribution for predictions

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
29
concepts/dynamic-in-database-modeling.md
Normal file
@@ -0,0 +1,29 @@
---
title: "Dynamic In-Database Modeling"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [database, machine-learning, paradigm, in-database-analytics]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Dynamic In-Database Modeling

**Dynamic in-database modeling** is the new paradigm proposed by NeurIDA. Its core idea is to **move from "training a fixed model for each task" to "assembling a customized model from shared components at query time"**.

## Paradigm comparison

| Dimension | Static modeling (traditional) | Dynamic modeling (NeurIDA) |
|------|----------------|-------------------|
| Model lifecycle | Train → deploy → fixed | Assembled on demand at query time |
| Task coverage | One-to-one (one model per task) | One-to-many (a shared architecture covers all tasks) |
| Data handling | Requires extraction, preprocessing, and export | Operates directly inside the database |
| Adaptability | A new task needs a pipeline built from scratch | Adapts to new tasks immediately |

## Implementation mechanism

Built on the [[composable-base-model-architecture|composable base model architecture]]: a set of pretrained heterogeneous base models plus shared model components (the Unified Tuple Encoder, the relational message passing module, and the context fusion module). At query time, [[dime-dynamic-in-database-modeling-engine|DIME]] dynamically selects and configures components from the pool according to the task profile.

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
36
concepts/dynamic-model-fusion.md
Normal file
@@ -0,0 +1,36 @@
---
title: "Dynamic Model Fusion"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [machine-learning, attention, interpretability, relational-data]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Dynamic Model Fusion

**Dynamic Model Fusion** is the third stage of the DIME pipeline. It selectively fuses context from related tables into the target-table tuple representations.

## Core mechanism

Uses a **Context-Aware Fusion Module**:

1. Computes a **recalibrated importance score** for each context signal from the related tables
2. Weights adaptively by task profile: the related tables most relevant to the task get higher weights
3. Injects the weighted relational context into the target tuples' [[dynamic-relation-modeling|relational embeddings]]
4. Output: a **Fused Embedding**, fed directly to the prediction head
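
One way to picture these four steps is a softmax-weighted sum. The sketch below is an assumption, not the paper's module: it scores each related table by a dot product with a hypothetical task vector, normalizes the scores with a softmax, and adds the weighted context to the target embedding.

```python
import math


def fuse(target, contexts, task):
    """Return (fused_embedding, importance) for one target tuple.

    target:   relational embedding of the target tuple (list of floats)
    contexts: {table_name: context vector from that related table}
    task:     task-profile vector used to score relevance (hypothetical)
    """
    names = list(contexts)
    logits = [sum(c * q for c, q in zip(contexts[n], task)) for n in names]
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]   # numerically stable softmax
    total = sum(exps)
    importance = {n: e / total for n, e in zip(names, exps)}
    fused = list(target)
    for n in names:
        for i, value in enumerate(contexts[n]):
            fused[i] += importance[n] * value             # inject weighted context
    return fused, importance


fused, importance = fuse(
    target=[0.0, 0.0],
    contexts={"UserInfo": [1.0, 0.0], "Orders": [0.0, 1.0]},
    task=[2.0, 0.0],
)
```

The `importance` dictionary is what would be read off for attribution: it sums to 1 and ranks the related tables by relevance to the task.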
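
## Interpretability

The importance scores produced by the fusion module support **prediction attribution** directly:
- For example, in user churn prediction the UserInfo table receives the highest importance (the user profile is the most predictive)
- In ad click prediction, the Search behavior signal carries the largest weight
- These scores agree closely with domain knowledge, so no extra interpretability component is needed

## Ablation

Removing Dynamic Model Fusion degrades performance more than removing Dynamic Relation Modeling, which indicates that **selective fusion** (rather than simply concatenating all relational information) is the key design choice.

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]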
35
concepts/dynamic-relation-modeling.md
Normal file
@@ -0,0 +1,35 @@
---
title: "Dynamic Relation Modeling"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [machine-learning, graph-neural-networks, relational-data]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Dynamic Relation Modeling

**Dynamic Relation Modeling** is the second stage of the DIME pipeline. It folds cross-table relational structure into the tuple representations.

## Mechanism

Runs **relation-aware message passing** over the [[relational-graph|relational graph]] (a tuple graph whose edges are FK-PK links):

1. Each tuple node is initialized with its [[base-table-embedding|Base Table Embedding]]
2. Messages travel along FK-PK edges and aggregate information from tuples in adjacent tables
3. Over several rounds of message passing, the tuple embeddings absorb semantic information from related tables
4. Output: a **Relational Embedding** that combines intra-table semantics with cross-table structure
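
The four steps can be illustrated with a stripped-down version of message passing. This sketch assumes a mean aggregator, undirected FK-PK edges, and an equal mix of self and neighborhood information; it has no learned weights or relation types, and all names are hypothetical.

```python
def message_passing(embeddings, edges, num_layers=2):
    """Return relational embeddings after num_layers rounds of mean aggregation.

    embeddings: {node_id: base table embedding (list of floats)}
    edges:      list of (u, v) FK-PK links, treated as undirected
    """
    neighbors = {node: [] for node in embeddings}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    current = {node: list(vec) for node, vec in embeddings.items()}
    for _ in range(num_layers):
        updated = {}
        for node, vec in current.items():
            linked = neighbors[node]
            if not linked:                      # isolated tuple: nothing to aggregate
                updated[node] = list(vec)
                continue
            mean = [sum(current[m][i] for m in linked) / len(linked)
                    for i in range(len(vec))]
            # Mix the tuple's own embedding with the neighborhood mean.
            updated[node] = [0.5 * a + 0.5 * b for a, b in zip(vec, mean)]
        current = updated
    return current


base = {"user:1": [1.0, 0.0], "order:7": [0.0, 1.0], "order:8": [0.0, 3.0]}
links = [("order:7", "user:1"), ("order:8", "user:1")]
relational = message_passing(base, links, num_layers=1)
```

After one round the user tuple already carries information from both of its orders, which is the cross-table signal the later stages rely on.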

## Configurable dimensions

- Aggregator type: min / max / mean / sum
- Number of message passing layers ℓ
- Number of encoder layers and attention heads

## Ablation

Removing Dynamic Relation Modeling causes a significant drop in performance, confirming that cross-table structural information is critical for prediction, especially when the target table's features are sparse.

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
36
concepts/in-database-analytics.md
Normal file
@@ -0,0 +1,36 @@
---
title: "In-Database Analytics"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [database, machine-learning, analytics]
sources: [raw/papers/zeng-neurida-2025.md]
---

# In-Database Analytics

**In-Database Analytics** means running ML/analytics tasks directly inside the database management system, without exporting data to an external compute environment.

## Core value

1. **Zero data movement**: removes ETL/export overhead and preserves data locality
2. **Timeliness**: analysis is tightly coupled to the data, which minimizes response latency
3. **Consistency**: the data used for analysis is consistent with the transactional data
4. **Governance**: the database's access control, auditing, and other security mechanisms extend naturally to analytics tasks

## Representative systems

- **NeurIDA**: an end-to-end autonomous system with dynamic modeling + a natural language interface
- **NeurDB**: an AI-powered autonomous database (CIDR 2025)
- **PostgresML** / **MindsDB**: ML inference embedded in SQL
- **Cerebro**: DL model selection inside the database

## Key challenges

- **Paradigm gap**: static models in traditional ML vs. the dynamic environment of a database
- **Schema heterogeneity**: the multi-table structure of relational data calls for specialized modeling
- **Query diversity**: must support several prediction types, such as classification and regression

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
40
concepts/internal-ticks.md
Normal file
@@ -0,0 +1,40 @@
---
title: "Internal Ticks"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [neural-architecture, recurrence, temporal-processing]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Internal Ticks

**Internal Ticks** are CTM's internal temporal dimension t ∈ {1, 2, ..., T}, fully decoupled from the data dimensions (sequence length, image size, and so on).

## Core idea

Traditional recurrent models unroll along **the sequence dimension inherent in the data** (such as token positions in text), whereas CTM unrolls along **self-generated "thought steps"**: even a static input (a single image) has an internal timeline.

## Role in CTM

In each internal tick:
1. The [[synapse-model|Synapse Model]] produces the pre-activations a_t
2. The [[neuron-level-models|NLMs]] produce the post-activations z_{t+1}
3. The [[neural-synchronization|synchronization matrix]] S^t is computed
4. The output y_t and the attention query q_t are generated
5. The attention output o_t is concatenated with z_{t+1} and passed to the next tick

## Relationship to adaptive computation

CTM does not require a fixed number of ticks: the [[certainty-based-loss|loss function]] dynamically selects the best tick for each sample. This means:
- Easy samples can terminate early (for example, in fewer than 10 ticks on ImageNet)
- Hard samples can use more ticks

## Related concepts

- Similar to Perceiver's iterative attention and PonderNet's halting mechanism
- But CTM's ticks are **an unrolling of neural dynamics**, not merely iterative refinement

## Sources

- [[darlow-ctm-2025|CTM paper]]
45
concepts/internal-world-model.md
Normal file
@@ -0,0 +1,45 @@
---
title: "Internal World Model"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [cognitive-science, planning, representation, world-models]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Internal World Model

An **internal world model** is a representation of the environment that an agent builds internally and uses to predict, plan, and reason without interacting with the environment directly.

## Classic definition (Ha & Schmidhuber, 2018)

A world model consists of three components:
1. **Vision (V)**: compresses observations into latent codes
2. **Memory (M)**: predicts future latent codes
3. **Controller (C)**: selects actions based on the latent codes

## Emergent world model in CTM

In the 2D maze task, CTM has **no positional encoding** yet must output the sequence of actions from start to goal. This implies:

- CTM must build a spatial representation (a "map") internally
- The representation forms naturally through [[neural-synchronization|neural synchronization]]
- No explicit design is required: it emerges from the architecture

### Evidence
- CTM trained on 39×39 mazes generalizes to 99×99 (by repeatedly applying the learned policy)
- The model can "keep exploring" beyond the number of steps used in training
- Attention visualizations show the model tracing the path in order

## Comparison with explicit world models

| Dimension | Explicit world models (Dreamer etc.) | CTM emergent world model |
|------|-------------------------|-----------------|
| Design | Explicitly separated V/M/C modules | Emergent property within a single architecture |
| Representation | Latent vector snapshot | Neural synchronization matrix (temporal) |
| Spatial encoding | Usually uses positional encoding | No positional encoding; built entirely by the model |

## Sources

- [[darlow-ctm-2025|CTM paper]]
- Ha & Schmidhuber, "World Models", 2018
46
concepts/neural-synchronization.md
Normal file
@@ -0,0 +1,46 @@
---
title: "Neural Synchronization as Representation"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [representation-learning, temporal-dynamics, synchronization, biological-plausibility]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Neural Synchronization as Representation

**Neural synchronization as representation** is CTM's second core innovation: the temporal correlation across the activity history of the neuron population is used directly as the latent representation, instead of a snapshot of activations at a single time point.

## Mathematical definition

Given the post-activation history of all neurons from tick 1 to t, Z^t ∈ R^{D×t}, the synchronization matrix is defined as:

```
S^t = Z^t · (Z^t)^⊺ ∈ R^{D×D}
```

That is, each entry is the **inner product** of the full activation histories of two neurons d_i and d_j, which measures how strongly the two are correlated over time.
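
As a small worked example of the formula, the sketch below computes S^t with plain Python lists for readability. The function name is hypothetical, and a real implementation would use a tensor library.

```python
def synchronization(history):
    """Return the D x D matrix S^t = Z^t (Z^t)^T.

    history: D rows (one per neuron), each holding t post-activations.
    """
    d = len(history)
    return [
        [sum(a * b for a, b in zip(history[i], history[j])) for j in range(d)]
        for i in range(d)
    ]


# Two neurons observed over two ticks.
Z = [[1.0, 2.0],
     [3.0, 4.0]]
S = synchronization(Z)
```

The result is symmetric, and its diagonal holds each neuron's inner product with itself.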
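
## Why synchronization rather than a snapshot?

The authors found that projecting z_t directly to the downstream task over-constrains the neurons: every neuron's activation is forced to encode task-relevant information, which limits the kinds of dynamics it can produce. The synchronization representation **decouples neuron dynamics from task demands**: neurons are free to produce rich temporal patterns, and only their correlations (not their specific values) need to encode task information.

## Subsampling: Neuron Pairing

The full S^t has O(D²) entries, which is too large. At the start of training, CTM randomly selects:
- D_out neuron pairs → the output synchronization representation S^t_out
- D_action neuron pairs → the action synchronization representation S^t_action (used for attention queries)

The selected pairs stay fixed throughout training.

## Timescale modulation

Each neuron pair (i,j) has a learnable exponential decay parameter r_ij:
- r_ij = 0: all past ticks are weighted equally
- large r_ij: recent ticks dominate

This lets CTM learn to synchronize over multiple timescales.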
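
The effect of the decay parameter can be seen in a single-pair sketch. The exact weighting is an assumption here: each tick is weighted by exp(−r_ij · age), where age is the number of ticks back from the current one, and the sum is normalized by the square root of the total weight. The function name is hypothetical.

```python
import math


def decayed_sync(z_i, z_j, r_ij):
    """Synchronization of one neuron pair with exponential decay over ticks.

    z_i, z_j: post-activation histories of the two neurons (oldest first).
    r_ij:     non-negative decay rate; 0 weights all ticks equally.
    """
    t = len(z_i)
    # The newest tick has age 0 and therefore weight 1.
    weights = [math.exp(-r_ij * (t - 1 - step)) for step in range(t)]
    weighted = sum(w * a * b for w, a, b in zip(weights, z_i, z_j))
    return weighted / math.sqrt(sum(weights))


equal_weighting = decayed_sync([1.0, 2.0], [3.0, 4.0], r_ij=0.0)
recent_only = decayed_sync([1.0, 2.0], [3.0, 4.0], r_ij=50.0)
```

With a very large decay rate only the latest tick contributes, so the value approaches the plain product of the two current activations.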

## Sources

- [[darlow-ctm-2025|CTM paper]]
33
concepts/neurida.md
Normal file
@@ -0,0 +1,33 @@
---
title: "NeurIDA"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [database, machine-learning, autonomous-system, in-database-analytics]
sources: [raw/papers/zeng-neurida-2025.md]
---

# NeurIDA

**NeurIDA** (Neural In-Database Analytics) is an autonomous end-to-end system for ML analytics inside a relational database. Its core innovation is **Dynamic In-Database Modeling**: at query time it assembles a customized model from composable shared components, rather than building a pipeline from scratch for every task.

## Architecture

```
NLQ → Query Intent Analyzer → Conditional Model Dispatcher → DIME → Analytical Report Synthesizer
```

- **Input**: natural language query
- **Output**: interpretable analytical report
- **Runtime environment**: operates directly on the RDBMS, with no data migration

## Core capabilities

1. **Task agnosticism**: the same system handles multiple prediction tasks, such as classification and regression
2. **Dynamic adaptation**: adjusts the model structure on the fly based on query semantics and the data profile
3. **Zero data movement**: all modeling happens inside the database
4. **Natural language interface**: NLQ input + LLM parsing + report generation

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
41
concepts/neuron-level-models.md
Normal file
@@ -0,0 +1,41 @@
---
title: "Neuron-Level Models (NLMs)"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [neural-architecture, biological-plausibility, temporal-processing]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Neuron-Level Models (NLMs)

**Neuron-Level Models (NLMs)** are CTM's first core innovation: each neuron is a model with **private parameters**, instead of all neurons sharing one activation function.

## Mechanism

For the d-th neuron:
```
z_{t+1}^d = g_{θ_d}(A_t^d)
```

where:
- A_t^d ∈ R^M is that neuron's **pre-activation history** over the most recent M steps
- g_{θ_d} is a depth-1 MLP (width d_hidden) with independent weights for every neuron
- z is the **post-activation**, that is, the neuron's firing state
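
The mechanism can be sketched with one tiny private MLP per neuron. This is illustrative only: the ReLU hidden layer, the tanh output, the random initialization, and the sizes are assumptions made here, and the helper names are hypothetical.

```python
import math
import random


def make_nlm(m, d_hidden, rng):
    """Create private weights theta_d for one neuron: M -> d_hidden -> 1."""
    w1 = [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(d_hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(d_hidden)]
    return w1, w2


def nlm_forward(theta, history):
    """Map one neuron's M-step history A_t^d to its post-activation z_{t+1}^d."""
    w1, w2 = theta
    hidden = [max(0.0, sum(w * a for w, a in zip(row, history))) for row in w1]
    return math.tanh(sum(w * h for w, h in zip(w2, hidden)))


rng = random.Random(0)
D, M, D_HIDDEN = 4, 5, 8
nlms = [make_nlm(M, D_HIDDEN, rng) for _ in range(D)]              # one model per neuron
A_t = [[rng.uniform(-1.0, 1.0) for _ in range(M)] for _ in range(D)]  # D x M history
z_next = [nlm_forward(theta, row) for theta, row in zip(nlms, A_t)]
```

Each neuron sees only its own row of the history and applies its own weights, which is what separates an NLM from a shared activation function.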
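
## Comparison with conventional activation functions

| Dimension | Conventional (ReLU/GELU) | NLMs |
|------|-------------------|------|
| Parameter sharing | Fully shared | Private to each neuron |
| Temporal dependence | None (current input only) | M-step history |
| Expressiveness | Low (a single pointwise nonlinearity) | High (temporal pattern detection) |
| Biological analogy | None | Resembles the spike-timing dependence of real neurons |

## Implications

NLMs amount to the thought experiment of "a neuron as a small temporal processor": each dimension of the D-dimensional latent space is treated as a "miniature brain" with its own temporal dynamics. This substantially increases the parameter count (× d_hidden × M) but also opens up a new dimension of capability.

## Sources

- [[darlow-ctm-2025|CTM paper]]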
43
concepts/neuron-pairing.md
Normal file
@@ -0,0 +1,43 @@
---
title: "Neuron Pairing"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [efficiency, synchronization, subsampling]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Neuron Pairing

**Neuron Pairing** is the subsampling strategy CTM uses to reduce the cost of computing the [[neural-synchronization|synchronization matrix]].

## Motivation

The synchronization matrix S^t = Z^t·(Z^t)^⊺ ∈ R^{D×D} has O(D²) entries, which for a typical D (hundreds to thousands) is too large to use directly downstream.

## Strategy

At the start of training, two sets of neuron pairs are selected at random and then fixed:
- **D_out pairs** → output synchronization representation S^t_out → projected to y_t (the prediction)
- **D_action pairs** → action synchronization representation S^t_action → projected to q_t (the attention query)

In addition, the following are kept:
- **D_self pairs** → diagonal entries (i,i), capturing each neuron's self-synchronization (its energy)
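
A minimal sketch of the strategy: sample the index pairs once, then compute synchronization only for those pairs. The seeding, the sampling with replacement, and the function names are assumptions for illustration.

```python
import random


def sample_pairs(d, count, rng):
    """Pick `count` random (i, j) neuron index pairs once; reuse them afterwards."""
    return [(rng.randrange(d), rng.randrange(d)) for _ in range(count)]


def paired_sync(history, pairs):
    """Synchronization values for the selected pairs only.

    history: D rows (one per neuron), each holding the post-activations so far.
    """
    return [sum(a * b for a, b in zip(history[i], history[j])) for i, j in pairs]


out_pairs = sample_pairs(d=8, count=5, rng=random.Random(0))   # fixed for all of training

Z = [[1.0, 2.0], [3.0, 4.0], [0.0, 1.0]]
values = paired_sync(Z, [(0, 1), (2, 2)])   # one off-diagonal pair, one self pair
```

The cost now grows with the number of sampled pairs rather than with D², and a pair such as `(2, 2)` plays the role of a diagonal entry.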

## Design considerations

- **Fixed pairs**: staying unchanged throughout training lets the projection matrices W_out and W_in be learned
- **Random selection**: avoids bias and ensures a diverse set of neuron interactions is sampled
- **Recovering snapshot dependence**: the diagonal pairs (i,i) retain a "snapshot"-like representational capacity

## Efficiency vs. expressiveness trade-off

| Dimension | Full S^t | Neuron Pairing |
|------|---------|----------------|
| Parameter count | O(D²) | O(D × (D_out + D_action)) |
| Information | Correlations of all pairs | Correlations of the subsampled pairs |
| Training stability | Projection matrix too large | Controllable dimensionality |

## Sources

- [[darlow-ctm-2025|CTM paper]]
42
concepts/pre-activation-history.md
Normal file
@@ -0,0 +1,42 @@
---
title: "Pre-Activation History"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [neural-architecture, temporal-processing, memory]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Pre-Activation History

The **pre-activation history** is a rolling buffer that CTM maintains for every neuron. It stores the pre-activation values of the most recent M steps for the [[neuron-level-models|NLM]] to process.

## Definition

```
A_t = [a_{t-M+1}, a_{t-M+2}, ..., a_t] ∈ R^{D×M}
```

Here a_t is the output of the [[synapse-model|Synapse Model]] (the pre-activation). A_t is updated in a rolling FIFO fashion.

For the d-th neuron:
```
A_t^d ∈ R^M → NLM g_{θ_d} → z_{t+1}^d
```
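
The FIFO update can be sketched with a fixed-length deque per neuron. The class name and the zero initialization are assumptions made for illustration; the source does not say how the buffer is initialized.

```python
from collections import deque


class PreActivationHistory:
    """Rolling buffer holding the last M pre-activations of each neuron."""

    def __init__(self, num_neurons, m):
        # One fixed-length buffer per neuron, filled with zeros (an assumption).
        self.buffers = [deque([0.0] * m, maxlen=m) for _ in range(num_neurons)]

    def push(self, a_t):
        """Append the newest pre-activation vector; the oldest entry drops out."""
        for buffer, value in zip(self.buffers, a_t):
            buffer.append(value)

    def rows(self):
        """Return A_t as D lists of length M, oldest value first."""
        return [list(buffer) for buffer in self.buffers]


history = PreActivationHistory(num_neurons=2, m=3)
for a_t in ([1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]):
    history.push(a_t)
A_t = history.rows()
```

Because each deque has `maxlen=M`, appending the fourth value silently discards the first, which is the rolling behavior described above.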

## Why does it matter?

The pre-activation history is what allows NLMs to produce **complex temporal dynamics**:
- Without history → the NLM degenerates into an ordinary element-wise transform
- With larger M → each neuron can detect patterns spanning M steps
- This resembles a convolution's receptive field, but along the time dimension and independent for each neuron

## Hyperparameter M

The authors found M ≈ 10-100 effective in initial exploration:
- Too small: not enough temporal context
- Too large: higher training cost, and recent signals may be diluted

## Sources

- [[darlow-ctm-2025|CTM paper]]
36
concepts/query-intent-analyzer.md
Normal file
@@ -0,0 +1,36 @@
---
title: "Query Intent Analyzer"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [nlp, llm, interface, database]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Query Intent Analyzer

The **Query Intent Analyzer** is NeurIDA's entry component. It turns a natural language query into a structured Task Profile and Data Profile.

## Two-step process

### 1. Task Parser
- **Input**: NLQ + DB Catalog (schema-level metadata)
- **Method**: an LLM-driven analysis agent whose output is guided by a carefully designed prompt (with examples)
- **Output**: a task profile in JSON (target table, task type, time window, and so on)
- **Validation**: a JSON Parser + Rule Checker ensure correct syntax and alignment with the schema
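
The validation step could look roughly like the sketch below. Every field name (`task_type`, `target_table`, `target_column`), the set of allowed task types, and the catalog layout are hypothetical; the source does not specify the task profile schema.

```python
import json

ALLOWED_TASK_TYPES = {"classification", "regression"}   # assumed set


def validate_task_profile(raw, catalog):
    """Parse LLM output and check it against the DB catalog.

    raw:     JSON string produced by the Task Parser
    catalog: {table_name: [column names]}
    Returns the parsed profile, or raises ValueError with the reason.
    """
    try:
        profile = json.loads(raw)                         # JSON Parser: syntax check
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(profile, dict):
        raise ValueError("task profile must be a JSON object")
    # Rule Checker: task type and schema alignment.
    if profile.get("task_type") not in ALLOWED_TASK_TYPES:
        raise ValueError("unknown task_type")
    table = profile.get("target_table")
    if table not in catalog:
        raise ValueError("target_table is not in the catalog")
    if profile.get("target_column") not in catalog[table]:
        raise ValueError("target_column is not in target_table")
    return profile


catalog = {"patients": ["id", "age", "icu_admission"]}
raw_profile = ('{"task_type": "classification", "target_table": "patients", '
               '"target_column": "icu_admission"}')
profile = validate_task_profile(raw_profile, catalog)
```

A failed check would typically be fed back to the LLM as an error message for another attempt, although the source does not describe the retry policy.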

### 2. Data Profiler
- **Input**: task profile + DB Catalog
- **Method**: an LLM agent using a multi-turn Chain-of-Thought interaction strategy
  - Round 1: identify the target table and related tables
  - Round 2: filter out irrelevant/redundant columns
- **Output**: a data profile in the form of SQL fragments, which defines the [[data-slice|Data Slice]]

## Design highlights

- **Zero manual feature engineering**: the conversion from NLQ to structured profiles is fully automated
- **DB schema awareness**: grounded in the real schema to avoid hallucination

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
35
concepts/relational-graph.md
Normal file
@@ -0,0 +1,35 @@
---
title: "Relational Graph"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [database, graph, relational-data]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Relational Graph

A **relational graph** is a data structure that represents the tuples of a relational database as a graph. It is the basis of [[dynamic-relation-modeling|dynamic relation modeling]] in DIME.

## Construction

In NeurIDA, the relational graph is built from the [[data-slice|Data Slice]]:

- **Nodes**: each tuple is a node, with its node type labeled by source table
- **Edges**: connections formed by primary key-foreign key (PK-FK) constraints
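
Building such a graph from tables and foreign-key declarations can be sketched as follows. The in-memory table layout, the tuple format for foreign keys, and the function name are hypothetical; a real system would read the constraints from the database catalog.

```python
def build_relational_graph(tables, foreign_keys):
    """Return (nodes, edges) for a set of tables.

    tables:       {table_name: [row dicts]}
    foreign_keys: [(child_table, fk_column, parent_table, pk_column)]
    nodes maps each node id (table_name, row_index) to its type (the table name).
    """
    nodes = {}
    for table, rows in tables.items():
        for index in range(len(rows)):
            nodes[(table, index)] = table                 # node type = source table
    edges = []
    for child, fk_col, parent, pk_col in foreign_keys:
        pk_index = {row[pk_col]: i for i, row in enumerate(tables[parent])}
        for i, row in enumerate(tables[child]):
            j = pk_index.get(row[fk_col])
            if j is not None:                             # skip dangling references
                edges.append(((child, i), (parent, j)))
    return nodes, edges


tables = {
    "users": [{"id": 1}, {"id": 2}],
    "orders": [{"id": 10, "user_id": 1}, {"id": 11, "user_id": 1}, {"id": 12, "user_id": 3}],
}
nodes, edges = build_relational_graph(tables, [("orders", "user_id", "users", "id")])
```

The third order references a user that is not in the slice, so it stays in the graph as a node but gets no edge.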

## Differences from generic graph neural networks

- The relational graph is a **heterogeneous graph** (multiple node types, one per table)
- Edges carry explicit **relational semantics** (defined by the schema, not learned)
- No graph-construction heuristics (such as kNN) are needed; the relations come straight from the database schema

## Role in DIME

1. Defines the neighborhood structure for message passing
2. Encodes cross-table dependencies into the tuple embeddings
3. Lets the model exploit the structured knowledge already encoded in the database

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
22
concepts/spiking-neural-networks.md
Normal file
@@ -0,0 +1,22 @@
---
title: "Spiking Neural Networks (SNN)"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [neural-architecture, biological-plausibility, event-driven]
confidence: medium
---

# Spiking Neural Networks (SNN)

**Spiking neural networks** are a class of neural networks that communicate with discrete spikes rather than continuous values, which brings them closer to biological neurons in temporal coding and event-driven computation.

## Relationship to CTM

CTM and SNNs share a **biological inspiration** but take different paths:
- **SNN**: discrete spikes + event-driven computation + spike-timing-dependent plasticity (STDP)
- **CTM**: continuous-valued NLMs + gradient-based optimization + synchronization as representation

Both treat **time as central to neural computation**, but CTM chooses a level of abstraction better suited to modern deep learning.

> 📝 Placeholder page: to be updated after a full ingest
44
concepts/synapse-model.md
Normal file
44
concepts/synapse-model.md
Normal file
@@ -0,0 +1,44 @@
---
title: "Synapse Model"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [neural-architecture, recurrence, connectivity]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Synapse Model

The **Synapse Model** is the recurrent interconnection structure in CTM; it shares information across neurons.

## Definition

```
a_t = f_θ_syn(concat(z_t, o_t)) ∈ R^D
```

where:
- z_t is the current post-activation state of the neurons
- o_t is the attention output from the previous tick (the result of interacting with the external data)
- f_θ_syn is a U-Net-style MLP (of depth k, with k even)

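The definition above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' code: the helpers `linear`, `apply`, `make_synapse`, and `synapse` are hypothetical, weights are random, skip connections are additive, and ReLU is an assumed nonlinearity. It only shows the shape of a U-Net-style MLP that maps `concat(z_t, o_t)` to `D` pre-activations.

```python
import random

def linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim x in_dim) and zero bias, for illustration."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    return w, [0.0] * out_dim

def apply(layer, x, relu=True):
    w, b = layer
    y = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [max(0.0, v) for v in y] if relu else y

def make_synapse(z_dim, o_dim, d_out, widths, rng):
    """widths: shrinking hidden sizes for the down path, e.g. [8, 4]."""
    dims = [z_dim + o_dim] + widths
    down = [linear(dims[i], dims[i + 1], rng) for i in range(len(widths))]
    # The up path mirrors the down path back to the widest hidden size.
    up = [linear(dims[i + 1], dims[i], rng) for i in range(len(widths) - 1, 0, -1)]
    head = linear(dims[1], d_out, rng)
    return down, up, head

def synapse(params, z_t, o_t):
    down, up, head = params
    h = z_t + o_t  # list concatenation: concat(z_t, o_t)
    skips = []
    for layer in down:
        h = apply(layer, h)
        skips.append(h)
    skips.pop()  # the bottleneck output has no skip partner
    for layer in up:
        h = apply(layer, h)
        h = [a + b for a, b in zip(h, skips.pop())]  # additive skip connection
    return apply(head, h, relu=False)  # pre-activations a_t

rng = random.Random(0)
params = make_synapse(z_dim=6, o_dim=3, d_out=6, widths=[8, 4], rng=rng)
a_t = synapse(params, [0.1] * 6, [0.2] * 3)
```

With `widths=[8, 4]` the sketch has two down layers, one up layer, and an output head, so the depth is 4, matching the "k even" constraint.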

## Why U-Net style?

The authors found that a U-Net-style design (an encoder-decoder MLP with skip connections) performed best, suggesting that **deeper, more flexible synaptic computation** helps information integration. This parallels the complexity of biological synapses (multiple receptor types, short-term plasticity, neurotransmitter dynamics).

## Position in the CTM pipeline

```
z_t ─┐
     ├→ Synapse → a_t → NLMs → z_{t+1}
o_t ─┘                          ↓
                 Sync → q_t, y_t → Attention → o_{t+1}
                          ↓
          concat(z_{t+1}, o_{t+1}) → next tick
```

The Synapse is the **engine of the neural dynamics**: it fuses external information (obtained through attention) with internal state and supplies the pre-activation input to each neuron's NLM.

## Sources

- [[darlow-ctm-2025|CTM paper]]
33
concepts/tabular-foundation-models.md
Normal file
@@ -0,0 +1,33 @@
---
title: "Tabular Foundation Models"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [machine-learning, foundation-models, tabular-data]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Tabular Foundation Models

**Tabular foundation models** are pretrained on large-scale tabular data and adapt to new tasks through in-context learning.

## Representative models

- **TabPFN** (Hollmann et al., ICLR 2023): a Transformer-based tabular classifier that produces predictions in a single forward pass
- **TabICL** (Qu et al., ICML 2025): a tabular foundation model whose in-context learning scales to large datasets
- **TP-BERTa** (Yan et al., ICLR 2024): adapts a pretrained language model to tabular prediction

## Performance in NeurIDA

As part of the [[composable-base-model-architecture|base model pool]]:
- TabPFN/TabICL perform best on small datasets (pretrained priors plus in-context adaptation)
- LTMs such as TP-BERTa perform poorly on relational databases, because table attributes lack natural-language semantics (e.g. UserAgentID, UserDeviceID)

## Key advantages

- **Training-free inference**: no per-task fine-tuning is needed after pretraining
- **Strong generalization**: the pretrained prior covers a broad range of tabular distributions

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
46
concepts/temporal-decay-neural.md
Normal file
@@ -0,0 +1,46 @@
---
title: "Temporal Decay (Neural)"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [temporal-processing, synchronization, learnable-parameters]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Temporal Decay (Neural)

**Temporal decay** is a learnable parameter r_ij ≥ 0 attached to each neuron pair (i,j) in CTM; it modulates the time scale of the [[neural-synchronization|synchronization]] computation.

## Definition

For a neuron pair (i,j), the decay vector is defined as:
```
R^t_ij = [exp(-r_ij(t-1)), exp(-r_ij(t-2)), ..., exp(0)]^⊺ ∈ R^t
```

The synchronization is then rescaled:
```
S^t_ij = (Z^t_i)^⊺ · diag(R^t_ij) · Z^t_j / √(Σ_τ R^t_ij[τ])
```

## Behavior

- **r_ij = 0**: all past ticks are weighted equally → long-horizon integration
- **Large r_ij**: recent ticks dominate → short-horizon response
- **Learnable r_ij**: CTM tunes the time scale of each neuron pair to the task

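The two regimes above can be checked numerically. The following sketch evaluates the formula for S^t_ij directly on two short activation histories; the function names `decay_weights` and `decayed_sync` and the example values are illustrative assumptions, and in CTM r_ij would be learned rather than set by hand.

```python
import math

def decay_weights(r_ij, t):
    """R^t_ij = [exp(-r_ij*(t-1)), ..., exp(0)]: oldest tick first, newest last."""
    return [math.exp(-r_ij * (t - 1 - tau)) for tau in range(t)]

def decayed_sync(z_i, z_j, r_ij):
    """S^t_ij = z_i^T diag(R) z_j / sqrt(sum(R)) for two equal-length histories."""
    t = len(z_i)
    weights = decay_weights(r_ij, t)
    weighted_dot = sum(w * a * b for w, a, b in zip(weights, z_i, z_j))
    return weighted_dot / math.sqrt(sum(weights))

z_i = [1.0, 2.0, 3.0]
z_j = [1.0, 0.0, 2.0]
s_long = decayed_sync(z_i, z_j, r_ij=0.0)    # every tick weighted equally
s_short = decayed_sync(z_i, z_j, r_ij=50.0)  # effectively only the latest tick
```

With r_ij = 0 the result is the plain inner product 7 divided by √3; with a large r_ij the weights of older ticks vanish and the result approaches the product of the latest activations, 3 × 2 = 6.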

## Experimental findings

The authors observe that CTM's use of r_ij is **task dependent**:
- **Maze task**: the model actively exploits multiple time scales
- **ImageNet**: decay is used much less

This suggests that different tasks call for different patterns of temporal integration.

## Biological analogy

This resembles **short-term plasticity** in biological synapses: some synapses are sensitive to recent activity (facilitation/depression), while others remain stable over long horizons.

## Sources

- [[darlow-ctm-2025|CTM paper]]
29
concepts/zero-cost-proxies.md
Normal file
@@ -0,0 +1,29 @@
---
title: "Zero-Cost Proxies (ZCP)"
created: 2026-05-15
updated: 2026-05-15
type: concept
tags: [machine-learning, neural-architecture-search, efficiency]
sources: [raw/papers/zeng-neurida-2025.md]
---

# Zero-Cost Proxies (ZCP)

**Zero-Cost Proxies** are a technique from Neural Architecture Search (NAS) that estimates how well a model will perform on a given task **without running full training**.

## Core idea

At initialization, cheap computable proxy metrics (such as gradient norm, activation patterns, or properties of the Jacobian) are used to predict the model's final performance. The cost is close to zero because no gradient-descent iterations are run.

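As a rough illustration of the idea, the sketch below scores candidates by the gradient norm of the loss on one mini-batch at initialization, one of the proxy families named above. Everything here is an assumption made for the example: the model is a single logistic unit, the data are random, and the candidate names `small_init` and `large_init` are hypothetical. It is not the scoring used by NeurIDA.

```python
import math
import random

def grad_norm_proxy(weights, batch):
    """L2 norm of the logistic-loss gradient w.r.t. weights on one mini-batch."""
    grad = [0.0] * len(weights)
    for x, y in batch:
        logit = sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-logit))
        for k, xi in enumerate(x):
            grad[k] += (p - y) * xi / len(batch)
    return math.sqrt(sum(g * g for g in grad))

rng = random.Random(0)
batch = [([rng.gauss(0, 1) for _ in range(4)], rng.randint(0, 1)) for _ in range(16)]
# Two hypothetical candidates that differ only in initialization scale.
candidates = {
    "small_init": [rng.gauss(0, 0.01) for _ in range(4)],
    "large_init": [rng.gauss(0, 1.0) for _ in range(4)],
}
scores = {name: grad_norm_proxy(w, batch) for name, w in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Each score costs a single gradient evaluation and no parameter update, which is why ranking a pool of candidates this way adds little latency.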

## Use in NeurIDA

The [[conditional-model-dispatcher|Conditional Model Dispatcher]] uses ZCP to score every candidate in the [[composable-base-model-architecture|base model pool]] quickly, which makes model selection lightweight. Because ZCP scoring is cheap, the Dispatcher can make its selection with almost no added latency.

## Key references

- Abdelfattah et al., "Zero-Cost Proxies for Lightweight NAS", ICLR 2021
- Shu et al., "NASI: Label- and Data-agnostic Neural Architecture Search at Initialization", ICLR 2022

## Sources

- [[zeng-neurida-2025|NeurIDA paper]]
76
index.md
@@ -1,10 +1,11 @@
|
|||||||
# LLM Wiki
|
# LLM Wiki
|
||||||
|
|
||||||
> 知识索引页面 — 自动生成
|
> 知识索引页面 — 自动生成
|
||||||
> 最后更新:2026-05-14 | 总页面数:300
|
> 最后更新:2026-05-15 | 总页面数:335
|
||||||
|
|
||||||
## Concepts
|
## Concepts
|
||||||
|
|
||||||
|
- [[adaptive-computation-time]] — 根据输入难度动态调整计算量的技术族(ACT, PonderNet 等)
|
||||||
- [[additive-combinatorics]]
|
- [[additive-combinatorics]]
|
||||||
- [[agent-communication-stack]]
|
- [[agent-communication-stack]]
|
||||||
- [[agent-mediated-deception]]
|
- [[agent-mediated-deception]]
|
||||||
@@ -17,20 +18,25 @@
|
|||||||
- [[ai-alignment]]
|
- [[ai-alignment]]
|
||||||
- [[ai-mathematics]]
|
- [[ai-mathematics]]
|
||||||
- [[ai-safety]]
|
- [[ai-safety]]
|
||||||
|
- [[analytical-report-synthesizer]] — LLM 驱动的预测结果→分析报告自动生成器
|
||||||
- [[api-key-authentication]]
|
- [[api-key-authentication]]
|
||||||
- [[attention-entropy-collapse]]
|
- [[attention-entropy-collapse]]
|
||||||
- [[attention-sinks]]
|
- [[attention-sinks]]
|
||||||
- [[automated-theorem-proving]]
|
- [[automated-theorem-proving]]
|
||||||
- [[backtranslation-round-trip-relay]] — 回译接力:通过可逆编辑链评估 LLM 文档编辑保真度
|
- [[backtranslation-round-trip-relay]] — 回译接力:通过可逆编辑链评估 LLM 文档编辑保真度
|
||||||
|
- [[base-table-embedding]] — DIME 第一阶段:双路径编码捕获表内语义
|
||||||
|
- [[behrouz-memory-caching-rnn]]
|
||||||
- [[bidirectional-trajectory-evaluation]]
|
- [[bidirectional-trajectory-evaluation]]
|
||||||
- [[bpf-syscall-interception]]
|
- [[bpf-syscall-interception]]
|
||||||
- [[cache-health-observability]]
|
- [[cache-health-observability]]
|
||||||
- [[cache-hit-ratio]]
|
- [[cache-hit-ratio]]
|
||||||
- [[cache-invalidation]]
|
- [[cache-invalidation]]
|
||||||
- [[cache-safe-forking]]
|
- [[cache-safe-forking]]
|
||||||
|
- [[caddy-reverse-proxy-auth]]
|
||||||
- [[caddy-web-server]]
|
- [[caddy-web-server]]
|
||||||
- [[cel-shading-style]]
|
- [[cel-shading-style]]
|
||||||
- [[centralized-agent-architecture]]
|
- [[centralized-agent-architecture]]
|
||||||
|
- [[certainty-based-loss]] — 通过 argmin(loss) + argmax(certainty) 双 tick 选择实现原生自适应计算
|
||||||
- [[certainty-based-rewards]]
|
- [[certainty-based-rewards]]
|
||||||
- [[chain-of-thought]]
|
- [[chain-of-thought]]
|
||||||
- [[chaitin-algorithmic-information-theory]]
|
- [[chaitin-algorithmic-information-theory]]
|
||||||
@@ -38,12 +44,15 @@
|
|||||||
- [[cl-bench-life]]
|
- [[cl-bench-life]]
|
||||||
- [[classifier-free-guidance-language]] — CFG 在语言扩散模型中的应用
|
- [[classifier-free-guidance-language]] — CFG 在语言扩散模型中的应用
|
||||||
- [[clawless]]
|
- [[clawless]]
|
||||||
|
- [[clawless-ai-agent-security]]
|
||||||
- [[coarse-grained-counting]]
|
- [[coarse-grained-counting]]
|
||||||
- [[cognitive-architecture]]
|
- [[cognitive-architecture]]
|
||||||
- [[completeness-logic]]
|
- [[completeness-logic]]
|
||||||
|
- [[composable-base-model-architecture]] — 基础模型池 + 共享组件的可组合建模框架
|
||||||
- [[compressed-sparse-attention]]
|
- [[compressed-sparse-attention]]
|
||||||
- [[computability-theory]]
|
- [[computability-theory]]
|
||||||
- [[computerized-adaptive-testing]]
|
- [[computerized-adaptive-testing]]
|
||||||
|
- [[conditional-model-dispatcher]] — ZCP + 历史 EMA 驱动的模型选择与条件增强调度器
|
||||||
- [[confidence-correctness-alignment]]
|
- [[confidence-correctness-alignment]]
|
||||||
- [[consistency-logic]]
|
- [[consistency-logic]]
|
||||||
- [[context-blue-clique]]
|
- [[context-blue-clique]]
|
||||||
@@ -51,27 +60,38 @@
|
|||||||
- [[context-learning]]
|
- [[context-learning]]
|
||||||
- [[context-misuse]]
|
- [[context-misuse]]
|
||||||
- [[continuous-diffusion-language-models]] — 连续嵌入空间中的扩散语言模型
|
- [[continuous-diffusion-language-models]] — 连续嵌入空间中的扩散语言模型
|
||||||
|
- [[continuous-thought-machine]] — CTM:以神经时序动态和同步为核心计算原理的新架构
|
||||||
- [[continuum-hypothesis]]
|
- [[continuum-hypothesis]]
|
||||||
- [[cramer-rao-lower-bound]]
|
- [[cramer-rao-lower-bound]]
|
||||||
- [[crawl4ai]]
|
- [[crawl4ai]]
|
||||||
|
- [[crawl4ai-open-source-web-crawler]]
|
||||||
- [[critical-failures]] — 关键失败:稀疏但严重的错误解释了约80%的文档退化
|
- [[critical-failures]] — 关键失败:稀疏但严重的错误解释了约80%的文档退化
|
||||||
- [[curvine-distributed-cache]]
|
- [[curvine-distributed-cache]]
|
||||||
|
- [[darlow-ctm-2025]] — CTM: 以神经同步为表示的持续思考机器 (NeurIPS 2025)
|
||||||
- [[darwin-godel-machine]]
|
- [[darwin-godel-machine]]
|
||||||
|
- [[data-slice]] — 任务特定的关系数据库子集,DIME 的核心数据对象
|
||||||
- [[decentralized-agent-architecture]]
|
- [[decentralized-agent-architecture]]
|
||||||
- [[deepseek-v4-flash]]
|
- [[deepseek-v4-flash]]
|
||||||
|
- [[deepseek-v4-million-token-context]]
|
||||||
- [[deepseek-vit]]
|
- [[deepseek-vit]]
|
||||||
- [[delegate-52]] — Microsoft 基准:310工作环境 × 52专业领域,评估LLM委托工作就绪性
|
- [[delegate-52]] — Microsoft 基准:310工作环境 × 52专业领域,评估LLM委托工作就绪性
|
||||||
- [[delegated-work]] — 委托工作:新兴LLM交互范式,用户监督模型代其完成任务
|
- [[delegated-work]] — 委托工作:新兴LLM交互范式,用户监督模型代其完成任务
|
||||||
- [[depth-scaling-signal-degradation]]
|
- [[depth-scaling-signal-degradation]]
|
||||||
- [[diagonal-ramsey-number]]
|
- [[diagonal-ramsey-number]]
|
||||||
- [[diagonalization-method]]
|
- [[diagonalization-method]]
|
||||||
|
- [[dime-dynamic-in-database-modeling-engine]] — DIME:NeurIDA 的核心动态建模引擎
|
||||||
- [[discrete-diffusion-language-models]] — 离散 token 空间中的扩散语言模型
|
- [[discrete-diffusion-language-models]] — 离散 token 空间中的扩散语言模型
|
||||||
- [[distractor-context]] — 干扰上下文:话题相关但无需编辑的文档,模拟不完美检索精度
|
- [[distractor-context]] — 干扰上下文:话题相关但无需编辑的文档,模拟不完美检索精度
|
||||||
- [[document-degradation]] — 文档退化:LLM在长委托工作流中静默破坏文档内容的现象
|
- [[document-degradation]] — 文档退化:LLM在长委托工作流中静默破坏文档内容的现象
|
||||||
- [[domain-knowledge-reasoning]]
|
- [[domain-knowledge-reasoning]]
|
||||||
- [[domain-specific-evaluation]] — 领域特定评估:每个领域自定义解析器和语义等价评分的评估方法
|
- [[domain-specific-evaluation]] — 领域特定评估:每个领域自定义解析器和语义等价评分的评估方法
|
||||||
|
- [[dou-cl-bench]]
|
||||||
- [[duo-attention]]
|
- [[duo-attention]]
|
||||||
|
- [[dynamic-in-database-modeling]] — 从共享组件在查询时装配定制模型的新范式
|
||||||
- [[dynamic-mode-decomposition]]
|
- [[dynamic-mode-decomposition]]
|
||||||
|
- [[dynamic-model-fusion]] — 上下文感知的选择性关系融合模块
|
||||||
|
- [[dynamic-relation-modeling]] — 跨表关系结构感知的消息传递
|
||||||
|
- [[elf-embedded-language-flows]] — ELF: 连续嵌入空间中的 Flow Matching 语言扩散模型 (2026)
|
||||||
- [[embedded-language-flows]] — ELF: 连续嵌入流匹配语言模型
|
- [[embedded-language-flows]] — ELF: 连续嵌入流匹配语言模型
|
||||||
- [[eml-operator]]
|
- [[eml-operator]]
|
||||||
- [[empirical-discovery-simulation]]
|
- [[empirical-discovery-simulation]]
|
||||||
@@ -96,9 +116,11 @@
|
|||||||
- [[geometric-ramsey-theory]]
|
- [[geometric-ramsey-theory]]
|
||||||
- [[glitch-art-style]]
|
- [[glitch-art-style]]
|
||||||
- [[godel-incompleteness-theorems]]
|
- [[godel-incompleteness-theorems]]
|
||||||
|
- [[godel-incompleteness-tutorial]]
|
||||||
- [[godel-numbering]]
|
- [[godel-numbering]]
|
||||||
- [[goodsteins-theorem]]
|
- [[goodsteins-theorem]]
|
||||||
- [[gpt-image2]]
|
- [[gpt-image2]]
|
||||||
|
- [[gpt-image2-prompt-collection]]
|
||||||
- [[gravitino-unified-metadata]]
|
- [[gravitino-unified-metadata]]
|
||||||
- [[greedy-context-screening]]
|
- [[greedy-context-screening]]
|
||||||
- [[green-tao-theorem]]
|
- [[green-tao-theorem]]
|
||||||
@@ -106,15 +128,20 @@
|
|||||||
- [[grouped-query-attention]]
|
- [[grouped-query-attention]]
|
||||||
- [[halftone-print-style]]
|
- [[halftone-print-style]]
|
||||||
- [[halting-problem]]
|
- [[halting-problem]]
|
||||||
|
- [[he-urlvr-sharpening-2026]]
|
||||||
- [[heavily-compressed-attention]]
|
- [[heavily-compressed-attention]]
|
||||||
- [[hilberts-program]]
|
- [[hilberts-program]]
|
||||||
- [[human-agent-trust]]
|
- [[human-agent-trust]]
|
||||||
- [[human-centered-ai]]
|
- [[human-centered-ai]]
|
||||||
|
- [[hunyuan-team-cl-bench-life]]
|
||||||
- [[hybrid-attention-architecture]]
|
- [[hybrid-attention-architecture]]
|
||||||
- [[hyperagents]]
|
- [[hyperagents]]
|
||||||
- [[hypergraph-ramsey-number]]
|
- [[hypergraph-ramsey-number]]
|
||||||
- [[identity-reference-resolution]]
|
- [[identity-reference-resolution]]
|
||||||
- [[image-generation-prompt-design]]
|
- [[image-generation-prompt-design]]
|
||||||
|
- [[in-database-analytics]] — 在 DBMS 内部直接执行 ML/分析任务的方法论
|
||||||
|
- [[internal-ticks]] — 与数据维度解耦的内部时序,CTM 的「思考步骤」展开维度
|
||||||
|
- [[internal-world-model]] — agent 内部构建的环境表征,在 CTM 迷宫任务中涌现
|
||||||
- [[intrinsic-rewards-sharpening]]
|
- [[intrinsic-rewards-sharpening]]
|
||||||
- [[jagged-frontier]] — 锯齿前沿:AI模型能力在不同领域间不均衡、不可预测的分布
|
- [[jagged-frontier]] — 锯齿前沿:AI模型能力在不同领域间不均衡、不可预测的分布
|
||||||
- [[klein-blue]]
|
- [[klein-blue]]
|
||||||
@@ -125,10 +152,15 @@
|
|||||||
- [[koopman-theory]]
|
- [[koopman-theory]]
|
||||||
- [[kv-cache-bottleneck]]
|
- [[kv-cache-bottleneck]]
|
||||||
- [[kvcache-transfer]]
|
- [[kvcache-transfer]]
|
||||||
|
- [[laban-llms-corrupt-documents-delegate]] — "LLMs Corrupt Your Documents When You Delegate" — DELEGATE-52
|
||||||
- [[length-extrapolation]] — 长度外推:让 LLM 处理超出预训练窗口的序列长度
|
- [[length-extrapolation]] — 长度外推:让 LLM 处理超出预训练窗口的序列长度
|
||||||
|
- [[li-amd-human-perception]]
|
||||||
- [[linear-attention-methods]]
|
- [[linear-attention-methods]]
|
||||||
|
- [[liu-koopa-2023]]
|
||||||
- [[llm-applications]]
|
- [[llm-applications]]
|
||||||
|
- [[llm-attention-survey-2026]]
|
||||||
- [[llm-evaluation-benchmarks]]
|
- [[llm-evaluation-benchmarks]]
|
||||||
|
- [[log]] — 变更日志
|
||||||
- [[long-context-understanding]]
|
- [[long-context-understanding]]
|
||||||
- [[long-horizon-evaluation]] — 长视界评估:通过延长交互揭示短评估中不可见的退化模式
|
- [[long-horizon-evaluation]] — 长视界评估:通过延长交互揭示短评估中不可见的退化模式
|
||||||
- [[lost-in-the-middle]]
|
- [[lost-in-the-middle]]
|
||||||
@@ -156,15 +188,23 @@
|
|||||||
- [[multimodal-large-language-model]]
|
- [[multimodal-large-language-model]]
|
||||||
- [[muon-optimizer]]
|
- [[muon-optimizer]]
|
||||||
- [[native-sparse-attention]]
|
- [[native-sparse-attention]]
|
||||||
|
- [[neural-synchronization]] — 将神经元激活历史的时序相关性直接用作潜在表示
|
||||||
|
- [[neurida]] — Neural In-Database Analytics:自主端到端库内分析系统
|
||||||
|
- [[neuron-level-models]] — 每个神经元拥有私有参数的 MLP,替代统一激活函数
|
||||||
|
- [[neuron-pairing]] — 对 O(D²) 同步矩阵的子采样策略,用于效率与表达力平衡
|
||||||
- [[neuroscience]]
|
- [[neuroscience]]
|
||||||
|
- [[nikolopoulos-spurious-predictability]]
|
||||||
- [[non-stationary-time-series]]
|
- [[non-stationary-time-series]]
|
||||||
- [[ntk-aware-interpolation]]
|
- [[ntk-aware-interpolation]]
|
||||||
|
- [[odrzywolek-eml-single-operator]]
|
||||||
- [[on-policy-distillation]]
|
- [[on-policy-distillation]]
|
||||||
|
- [[oppo-multimodal-data-lake]]
|
||||||
- [[paley-graph]]
|
- [[paley-graph]]
|
||||||
- [[paris-harrington-theorem]]
|
- [[paris-harrington-theorem]]
|
||||||
- [[path-tracing]]
|
- [[path-tracing]]
|
||||||
- [[peano-arithmetic]]
|
- [[peano-arithmetic]]
|
||||||
- [[perception-gap]]
|
- [[perception-gap]]
|
||||||
|
- [[pre-activation-history]] — 每个神经元维护的滚动前激活缓冲区,NLM 的输入
|
||||||
- [[prefill-as-a-service]]
|
- [[prefill-as-a-service]]
|
||||||
- [[prefill-decode-disaggregation]]
|
- [[prefill-decode-disaggregation]]
|
||||||
- [[prefix-matching]]
|
- [[prefix-matching]]
|
||||||
@@ -173,21 +213,28 @@
|
|||||||
- [[procedural-task-execution]]
|
- [[procedural-task-execution]]
|
||||||
- [[program-synthesis]]
|
- [[program-synthesis]]
|
||||||
- [[prompt-caching]]
|
- [[prompt-caching]]
|
||||||
|
- [[prompt-caching-architecture]]
|
||||||
- [[prompt-layering]]
|
- [[prompt-layering]]
|
||||||
- [[prompt-reverse-engineering]]
|
- [[prompt-reverse-engineering]]
|
||||||
|
- [[qin-prfaas-cross-datacenter]]
|
||||||
|
- [[query-intent-analyzer]] — LLM 驱动的 NLQ 解析器,输出结构化任务/数据画像
|
||||||
- [[rag-systems]]
|
- [[rag-systems]]
|
||||||
- [[ramsey-context-cache]]
|
- [[ramsey-context-cache]]
|
||||||
|
- [[ramsey-context-construction]]
|
||||||
- [[ramsey-context-graph]]
|
- [[ramsey-context-graph]]
|
||||||
- [[ramsey-context-template]]
|
- [[ramsey-context-template]]
|
||||||
- [[ramsey-numbers]]
|
- [[ramsey-numbers]]
|
||||||
|
- [[ramsey-numbers-survey]]
|
||||||
- [[ramsey-theory]]
|
- [[ramsey-theory]]
|
||||||
- [[ramsey-theory-applications]]
|
- [[ramsey-theory-applications]]
|
||||||
- [[random-graph-theory]]
|
- [[random-graph-theory]]
|
||||||
|
- [[README]] — Wiki 说明
|
||||||
- [[real-life-context-learning]]
|
- [[real-life-context-learning]]
|
||||||
- [[rectified-flows]] — Flow Matching 中的直线插值路径
|
- [[rectified-flows]] — Flow Matching 中的直线插值路径
|
||||||
- [[recursive-self-improvement]]
|
- [[recursive-self-improvement]]
|
||||||
- [[reference-gap]]
|
- [[reference-gap]]
|
||||||
- [[reinforcement-learning-trading]]
|
- [[reinforcement-learning-trading]]
|
||||||
|
- [[relational-graph]] — 以 FK-PK 为边的元组图,关系建模的数据结构基础
|
||||||
- [[reverse-proxy-authentication]]
|
- [[reverse-proxy-authentication]]
|
||||||
- [[reward-hacking-llm]]
|
- [[reward-hacking-llm]]
|
||||||
- [[reward-model]]
|
- [[reward-model]]
|
||||||
@@ -199,6 +246,7 @@
|
|||||||
- [[rule-system-application]]
|
- [[rule-system-application]]
|
||||||
- [[russells-paradox]]
|
- [[russells-paradox]]
|
||||||
- [[russian-constructivism]]
|
- [[russian-constructivism]]
|
||||||
|
- [[SCHEMA]] — Wiki 结构规范
|
||||||
- [[sde-sampler-language]] — 语言扩散中的随机微分方程采样器
|
- [[sde-sampler-language]] — 语言扩散中的随机微分方程采样器
|
||||||
- [[secure-containers]]
|
- [[secure-containers]]
|
||||||
- [[seer-attention]]
|
- [[seer-attention]]
|
||||||
@@ -211,19 +259,27 @@
|
|||||||
- [[singularity]]
|
- [[singularity]]
|
||||||
- [[sink-token]] — 可学习汇 Token:预训练时添加专用 Token 作为唯一注意力汇
|
- [[sink-token]] — 可学习汇 Token:预训练时添加专用 Token 作为唯一注意力汇
|
||||||
- [[softmax-off-by-one]] — SoftMax₁:允许丢弃多余注意力的 SoftMax 变体
|
- [[softmax-off-by-one]] — SoftMax₁:允许丢弃多余注意力的 SoftMax 变体
|
||||||
|
- [[song-agent-network-taxonomy]]
|
||||||
- [[sparse-attention-patterns]]
|
- [[sparse-attention-patterns]]
|
||||||
- [[specialist-training-pipeline]]
|
- [[specialist-training-pipeline]]
|
||||||
- [[specialized-rl]]
|
- [[specialized-rl]]
|
||||||
- [[specialized-sft]]
|
- [[specialized-sft]]
|
||||||
|
- [[spiking-neural-networks]] — 使用离散脉冲和事件驱动计算的生物启发神经网络
|
||||||
- [[spurious-predictability]]
|
- [[spurious-predictability]]
|
||||||
|
- [[streaming-llm]] — StreamingLLM: 基于注意力汇的无限长流式语言模型推理框架 (ICLR 2024)
|
||||||
- [[stub-pattern]]
|
- [[stub-pattern]]
|
||||||
- [[subquadratic-transformer-alternatives]]
|
- [[subquadratic-transformer-alternatives]]
|
||||||
- [[symbolic-regression]]
|
- [[symbolic-regression]]
|
||||||
|
- [[synapse-model]] — CTM 的 U-Net 风格循环突触结构,神经元间信息共享引擎
|
||||||
- [[system-2-thinking]]
|
- [[system-2-thinking]]
|
||||||
- [[system-message-abuse]]
|
- [[system-message-abuse]]
|
||||||
- [[szemerédi-regularity-lemma]]
|
- [[szemerédi-regularity-lemma]]
|
||||||
|
- [[tabular-foundation-models]] — 大规模表格数据预训练的基础模型(TabPFN, TabICL 等)
|
||||||
|
- [[tao-klowden-ai-mathematical-methods]]
|
||||||
|
- [[temporal-decay-neural]] — 每对神经元可学习的指数衰减参数,控制同步的时间尺度
|
||||||
- [[test-time-scaling]]
|
- [[test-time-scaling]]
|
||||||
- [[test-time-training-rl]]
|
- [[test-time-training-rl]]
|
||||||
|
- [[thinking-with-visual-primitives]]
|
||||||
- [[time-variant-dynamics]]
|
- [[time-variant-dynamics]]
|
||||||
- [[token-efficiency]]
|
- [[token-efficiency]]
|
||||||
- [[tool-registry]]
|
- [[tool-registry]]
|
||||||
@@ -236,11 +292,19 @@
|
|||||||
- [[window-attention]] — 窗口注意力:仅缓存最近 Token 的朴素方案,因驱逐注意力汇而崩溃
|
- [[window-attention]] — 窗口注意力:仅缓存最近 Token 的朴素方案,因驱逐注意力汇而崩溃
|
||||||
- [[worst-case-threat-model]]
|
- [[worst-case-threat-model]]
|
||||||
- [[x-prediction-parameterization]] — Flow Matching 中直接预测干净数据的参数化
|
- [[x-prediction-parameterization]] — Flow Matching 中直接预测干净数据的参数化
|
||||||
|
- [[xing-trails-2024]] — Trails: 数据库原生的深度神经网络模型选择 (VLDB 2024)
|
||||||
|
- [[zeng-dynamic-model-slicing-2024]] — 数据库内的动态模型切片技术 (VLDB 2024)
|
||||||
|
- [[zeng-neurida-2025]] — NeurIDA: 动态库内建模实现有效的关系数据库分析
|
||||||
|
- [[zero-cost-proxies]] — 无需完整训练即可估计模型性能的 NAS 技术
|
||||||
|
- [[zhang-hyperagents]]
|
||||||
|
- [[zhao-neurdb-2025]] — NeurDB: AI 驱动的自主数据库 (CIDR 2025)
|
||||||
|
- [[zhu-moda-mixture-of-depths]]
|
||||||
|
|
||||||
## Papers
|
## Papers
|
||||||
|
|
||||||
- [[behrouz-memory-caching-rnn]]
|
- [[behrouz-memory-caching-rnn]]
|
||||||
- [[clawless-ai-agent-security]]
|
- [[clawless-ai-agent-security]]
|
||||||
|
- [[darlow-ctm-2025]] — CTM: 以神经同步为表示的持续思考机器 (NeurIPS 2025)
|
||||||
- [[deepseek-v4-million-token-context]]
|
- [[deepseek-v4-million-token-context]]
|
||||||
- [[dou-cl-bench]]
|
- [[dou-cl-bench]]
|
||||||
- [[elf-embedded-language-flows]] — ELF: 连续嵌入空间中的 Flow Matching 语言扩散模型 (2026)
|
- [[elf-embedded-language-flows]] — ELF: 连续嵌入空间中的 Flow Matching 语言扩散模型 (2026)
|
||||||
@@ -259,7 +323,11 @@
|
|||||||
- [[streaming-llm]] — StreamingLLM: 基于注意力汇的无限长流式语言模型推理框架 (ICLR 2024)
|
- [[streaming-llm]] — StreamingLLM: 基于注意力汇的无限长流式语言模型推理框架 (ICLR 2024)
|
||||||
- [[tao-klowden-ai-mathematical-methods]]
|
- [[tao-klowden-ai-mathematical-methods]]
|
||||||
- [[thinking-with-visual-primitives]]
|
- [[thinking-with-visual-primitives]]
|
||||||
|
- [[xing-trails-2024]] — Trails: 数据库原生的深度神经网络模型选择 (VLDB 2024)
|
||||||
|
- [[zeng-dynamic-model-slicing-2024]] — 数据库内的动态模型切片技术 (VLDB 2024)
|
||||||
|
- [[zeng-neurida-2025]] — NeurIDA: 动态库内建模实现有效的关系数据库分析
|
||||||
- [[zhang-hyperagents]]
|
- [[zhang-hyperagents]]
|
||||||
|
- [[zhao-neurdb-2025]] — NeurDB: AI 驱动的自主数据库 (CIDR 2025)
|
||||||
- [[zhu-moda-mixture-of-depths]]
|
- [[zhu-moda-mixture-of-depths]]
|
||||||
|
|
||||||
## Articles
|
## Articles
|
||||||
@@ -275,4 +343,8 @@
|
|||||||
|
|
||||||
- [[SCHEMA]] — Wiki 结构规范
|
- [[SCHEMA]] — Wiki 结构规范
|
||||||
- [[log]] — 变更日志
|
- [[log]] — 变更日志
|
||||||
- [[README]] — Wiki 说明
|
- [[README]] — Wiki 说明
|
||||||
|
|
||||||
|
## Reviews
|
||||||
|
|
||||||
|
- [[ctm-review-20260515]] — CTM 论文集成 Review (2026-05-15)
|
||||||
17
log.md
@@ -7,6 +7,23 @@
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
## 2026-05-15 — ingest | Continuous Thought Machines (arXiv:2505.05522, NeurIPS 2025)
- Added paper [[darlow-ctm-2025]]: "Continuous Thought Machines", a new architecture that uses neural synchronization as its representation; its two main innovations are NLMs and Neural Synchronization
- Added 11 concept pages: [[continuous-thought-machine]], [[neuron-level-models]], [[neural-synchronization]], [[internal-ticks]], [[synapse-model]], [[certainty-based-loss]], [[adaptive-computation-time]], [[internal-world-model]], [[neuron-pairing]], [[temporal-decay-neural]], [[pre-activation-history]]
- Core innovation: a private NLM per neuron replaces the shared activation function, and inner products of activation histories serve as the synchronization representation
- Experimental highlights: maze generalization (39×39 → 99×99), native adaptive computation on ImageNet, interpretable strategies on Parity
- Authors include Llion Jones (co-author of "Attention Is All You Need"); affiliation: Sakana AI
- Source: https://arxiv.org/abs/2505.05522
|
||||||
|
|
||||||
|
## 2026-05-15 — ingest | NeurIDA (arXiv:2512.08483v3, cs.DB 2025)
- Added paper [[zeng-neurida-2025]]: "NeurIDA: Dynamic Modeling for Effective In-Database Analytics", an autonomous end-to-end in-database analytics system that bridges the gap between static ML models and dynamic RDBMS workloads by assembling customized models on the fly
- Added 15 concept pages: [[neurida]], [[dynamic-in-database-modeling]], [[dime-dynamic-in-database-modeling-engine]], [[composable-base-model-architecture]], [[query-intent-analyzer]], [[conditional-model-dispatcher]], [[zero-cost-proxies]], [[dynamic-relation-modeling]], [[dynamic-model-fusion]], [[data-slice]], [[base-table-embedding]], [[in-database-analytics]], [[relational-graph]], [[analytical-report-synthesizer]], [[tabular-foundation-models]]
- Core innovation: the four-stage DIME pipeline (table embedding → relation modeling → context fusion → prediction), which assembles a customized model from shared components at query time
- Experiments: 5 datasets, 10 tasks; AUC-ROC up by as much as 12%, MAE down by as much as 25%, latency overhead of only 1.1×–2.1×
- Source: https://arxiv.org/abs/2512.08483v3
|
||||||
|
|
||||||
## 2026-05-12 — ingest | TBA (arXiv:2503.18929, NeurIPS 2025)
|
## 2026-05-12 — ingest | TBA (arXiv:2503.18929, NeurIPS 2025)
|
||||||
- 添加论文 [[bartoldson-tba-2025]]: "Trajectory Balance with Asynchrony" — GFlowNet TB 目标 × 异步分布式 RL
|
- 添加论文 [[bartoldson-tba-2025]]: "Trajectory Balance with Asynchrony" — GFlowNet TB 目标 × 异步分布式 RL
|
||||||
- 新增 8 个概念页: [[tba]], [[trajectory-balance-objective]], [[asynchronous-rl-llm]], [[off-policy-llm-post-training]], [[gflownet-fine-tuning]], [[replay-buffer-rl-llm]], [[searcher-trainer-decoupling]], [[reward-recency-sampling]]
|
- 新增 8 个概念页: [[tba]], [[trajectory-balance-objective]], [[asynchronous-rl-llm]], [[off-policy-llm-post-training]], [[gflownet-fine-tuning]], [[replay-buffer-rl-llm]], [[searcher-trainer-decoupling]], [[reward-recency-sampling]]
|
||||||
|
|||||||
88
papers/darlow-ctm-2025.md
Normal file
@@ -0,0 +1,88 @@
---
title: "Continuous Thought Machines (CTM)"
created: 2026-05-15
updated: 2026-05-15
type: paper
tags: [neural-architecture, temporal-dynamics, biological-plausibility, synchronization, recurrence]
sources: [raw/papers/darlow-ctm-2025.md]
---

# Continuous Thought Machines (CTM)

**Luke Darlow, Ciaran Regan, Sebastian Risi, Jeffrey Seely, Llion Jones** (Sakana AI, University of Tsukuba, IT University of Copenhagen)
**arXiv:** [2505.05522v4](https://arxiv.org/abs/2505.05522) | cs.LG | **NeurIPS 2025**

> Llion Jones is one of the co-authors of "Attention Is All You Need".

## Core question

**Biological brains** process information through complex temporal neural dynamics, whereas **artificial neural networks** deliberately abstract away the temporal complexity of individual neurons to simplify large-scale training. The abstraction works, but it leaves a gap between flexible human cognition and the capabilities of current AI.

CTM's central bet: **bringing time back into neural computation is key to advancing AI**.

## Two innovations

### 1. [[neuron-level-models|Neuron-Level Models (NLMs)]]
Each neuron has its **own private weights** (an MLP of depth 1) that process its M-step [[pre-activation-history|pre-activation history]] to produce complex temporal dynamics. This contrasts sharply with conventional activation functions (ReLU/GELU, identical for all neurons).

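To make the contrast with a shared activation function concrete, here is a minimal sketch of the idea. It is not the paper's code: the class `NeuronLevelModel`, the helper `nlm_step`, the random weights, and the omission of any nonlinearity are assumptions made for illustration. Each neuron owns a separate weight vector over its last M pre-activations.

```python
import random

class NeuronLevelModel:
    """One neuron's private depth-1 MLP: M past pre-activations -> one post-activation."""

    def __init__(self, history_len, rng):
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(history_len)]
        self.bias = rng.uniform(-1.0, 1.0)

    def __call__(self, history):
        # history: the neuron's last M pre-activations, oldest first
        return sum(w * a for w, a in zip(self.weights, history)) + self.bias

def nlm_step(nlms, histories, new_pre_activations):
    """Roll each neuron's history buffer by one tick and apply its own NLM."""
    next_histories = [h[1:] + [a] for h, a in zip(histories, new_pre_activations)]
    post = [nlm(h) for nlm, h in zip(nlms, next_histories)]
    return post, next_histories

rng = random.Random(0)
D, M = 3, 4  # number of neurons and history length
nlms = [NeuronLevelModel(M, rng) for _ in range(D)]
histories = [[0.0] * M for _ in range(D)]
z, histories = nlm_step(nlms, histories, [0.5, -0.2, 1.0])
```

Two neurons that receive the same history still produce different outputs here, because their weights are not shared; that is the property a uniform ReLU or GELU cannot provide.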

### 2. [[neural-synchronization|Neural Synchronization as Representation]]
The temporal correlations of the neuron population's activity (the **inner products** of activation histories) are used directly as the latent representation, feeding the attention query (qt) and the output prediction (yt). This differs fundamentally from the single-time-point "snapshot" representations of conventional networks.

## Architecture

```
Input → FeatureExtractor → Cross-Attention ← qt (from sync)
                                  ↓ ot
Synapse Model → at (pre-activations) → NLMs → zt+1 (post-activations)
      ↑                                         ↓
      └────────── concat(zt, ot) ←──────────────┘
                                                → Sync Matrix St
```

### Key components

| Component | Role |
|------|------|
| [[internal-ticks|Internal Ticks]] | Internal time axis t∈{1,...,T}, decoupled from the data dimensions, enabling iterative refinement |
| [[synapse-model|Synapse Model]] | U-Net-style MLP; the recurrent structure that shares information across neurons |
| [[neuron-level-models|NLMs]] | Private per-neuron MLPs that process the pre-activation history |
| [[neural-synchronization|Sync Matrix]] | Inner products of activation histories, S^t = Z^t·(Z^t)⊺ |
| [[neuron-pairing|Neuron Pairing]] | Subsampling strategy over the O(D²) synchronization matrix, selecting Dout/Daction pairs |
| [[temporal-decay-neural|Temporal Decay r_ij]] | Learnable exponential decay per neuron pair, controlling the time scale |

### [[certainty-based-loss|Certainty-Based Loss]]

Rather than always using the output of one fixed internal tick, the loss selects ticks dynamically:
- **t₁ = argmin(L)**: the tick with the lowest loss
- **t₂ = argmax(C)**: the tick with the highest certainty

L = (L_t₁ + L_t₂) / 2, which yields **native adaptive computation** (no separate halting module is needed).

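The tick selection above can be traced on a toy example. This sketch assumes cross-entropy as the per-tick loss and defines certainty as one minus the normalized entropy of the prediction; the page itself does not spell out the certainty measure, so treat that definition, the function names, and the example probabilities as assumptions.

```python
import math

def cross_entropy(probs, target):
    return -math.log(probs[target])

def certainty(probs):
    """Assumed definition: 1 minus the entropy normalized by log(num_classes)."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - entropy / math.log(len(probs))

def certainty_based_loss(per_tick_probs, target):
    losses = [cross_entropy(p, target) for p in per_tick_probs]
    certainties = [certainty(p) for p in per_tick_probs]
    t1 = min(range(len(losses)), key=losses.__getitem__)            # argmin loss
    t2 = max(range(len(certainties)), key=certainties.__getitem__)  # argmax certainty
    return (losses[t1] + losses[t2]) / 2.0, t1, t2

# Predicted class probabilities at three internal ticks (illustrative values).
per_tick_probs = [
    [0.40, 0.30, 0.30],
    [0.70, 0.20, 0.10],
    [0.05, 0.90, 0.05],
]
loss, t1, t2 = certainty_based_loss(per_tick_probs, target=0)
```

In this example the lowest-loss tick is the second one, while the most certain tick is the third, which is confidently wrong; averaging the two losses therefore penalizes misplaced certainty as well as rewarding the best tick.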
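
## Experimental highlights

### 🧩 2D Mazes (39×39 → 99×99)
- **No positional encoding**, so the model must build an [[internal-world-model|internal world model]]
- Clearly outperforms the LSTM/FF baselines
- **Emergent generalization**: trained on paths of 100 steps, it generalizes to longer paths and to larger 99×99 mazes

### 🖼️ ImageNet-1K classification
- Native [[adaptive-computation-time|adaptive computation]]: easy samples can stop in <10 ticks
- **Natural calibration**: the model is well calibrated without any dedicated technique
- Emergent "look around" behavior: the model learns to scan the image sequentially without any training signal for it

### 🧮 Parity
- Learns **interpretable algorithmic strategies** (e.g. periodic resets, look-ahead prediction)
- CTM clearly outperforms LSTM on 64-bit sequences

## Key insights

1. **From a shared activation function to private neuron models**: this is more than an architectural change; it rethinks the level of abstraction at which a neuron is modeled
2. **Synchronization as representation**: using temporal correlations directly as the representation opens up a high-cardinality representation space that naturally captures the temporal character of "thinking"
3. **No positional encoding**: CTM builds its spatial understanding entirely through internal dynamics, hinting that time may be a more fundamental representational dimension than space
4. **Rich emergent properties**: adaptive computation, calibration, looking around, and traveling waves all emerge from the same core architecture without being designed in

## Related concepts

- Classical approaches to [[adaptive-computation-time|Adaptive Computation Time (ACT)]] need an explicit halting module; CTM gets the behavior from its loss design
- [[internal-world-model|Internal World Models]]: the classic concept from Ha & Schmidhuber (2018)
- Relationship to [[spiking-neural-networks|SNN]]: shared biological inspiration but different routes; CTM uses continuous values with gradient-based optimization, while SNNs use discrete spikes with event-driven computation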
20
papers/xing-trails-2024.md
Normal file
@@ -0,0 +1,20 @@
---
title: "Trails: Database Native Model Selection (VLDB 2024)"
created: 2026-05-15
updated: 2026-05-15
type: paper
tags: [database, machine-learning, model-selection]
confidence: medium
---

# Trails: Database Native Model Selection

**Naili Xing, Shaofeng Cai, Gang Chen, Zhaojing Luo, Beng Chin Ooi, Jian Pei** (VLDB 2024)

A database-native system for selecting deep neural network models and one of the precursors of NeurIDA. It explores how to automate the choice of the best deep neural network model inside a database system.

## Relationship to NeurIDA

Trails focuses on **model selection** (picking the best model from a candidate pool), whereas [[zeng-neurida-2025|NeurIDA]] goes a step further to **dynamic model construction** (assembling a customized model from shared components).

> 📝 Placeholder page: to be updated after a full ingest
20
papers/zeng-dynamic-model-slicing-2024.md
Normal file
@@ -0,0 +1,20 @@
---
title: "Powering In-Database Dynamic Model Slicing for Structured Data Analytics (VLDB 2024)"
created: 2026-05-15
updated: 2026-05-15
type: paper
tags: [database, machine-learning, dynamic-modeling]
confidence: medium
---

# Powering In-Database Dynamic Model Slicing

**Lingze Zeng, Naili Xing, Shaofeng Cai, Gang Chen, Beng Chin Ooi, Jian Pei, Yuncheng Wu** (VLDB 2024)

A precursor of NeurIDA that proposes Dynamic Model Slicing inside the database to power structured data analytics.

## Relationship to NeurIDA

This work is an early exploration of, and the foundation for, the [[dynamic-in-database-modeling|dynamic in-database modeling]] paradigm used in the [[zeng-neurida-2025|NeurIDA]] system.

> 📝 Placeholder page: to be updated after a full ingest
75
papers/zeng-neurida-2025.md
Normal file
@@ -0,0 +1,75 @@
|
|||||||
|
---
|
||||||
|
title: "NeurIDA: Dynamic Modeling for Effective In-Database Analytics"
|
||||||
|
created: 2026-05-15
|
||||||
|
updated: 2026-05-15
|
||||||
|
type: paper
|
||||||
|
tags: [database, machine-learning, tabular-data, autonomous-system, in-database-analytics]
|
||||||
|
sources: [raw/papers/zeng-neurida-2025.md]
|
||||||
|
---
|
||||||
|
|
||||||
|
# NeurIDA: Dynamic Modeling for Effective In-Database Analytics
|
||||||
|
|
||||||
|
**Lingze Zeng, Naili Xing, Shaofeng Cai, Peng Lu, Gang Chen, Jian Pei, Beng Chin Ooi** — NUS, Zhejiang University, Duke University
|
||||||
|
**arXiv:** [2512.08483v3](https://arxiv.org/abs/2512.08483v3) | cs.DB | 2025-12-15
|
||||||
|
|
||||||
|
## 核心问题
|
||||||
|
|
||||||
|
RDBMS 的核心设计是**动态性**(支持多样且演化的分析查询),而传统 ML 模型是**静态的**(为单一任务训练、部署后固定)。当一个新分析任务到来时,现有模型往往无法直接适应,必须从头构建数据到模型的 pipeline,开发成本高、无法规模化。
|
||||||
|
|
||||||
|
**根本矛盾**:ML 模型的「刚性」vs RDBMS 环境的「动态性」。
|
||||||
|
|
||||||
|
## 方法论贡献
|
||||||
|
|
||||||
|
NeurIDA 是一个**自主端到端系统**,通过 **动态库内建模(Dynamic In-Database Modeling)** 范式解决上述矛盾:
|
||||||
|
|
||||||
|
```
|
||||||
|
NLQ → Query Intent Analyzer → Conditional Model Dispatcher → DIME → Analytical Report Synthesizer → 报告
|
||||||
|
```
|
||||||
|
|
||||||
|
### 四大组件
|
||||||
|
|
||||||
|
1. **[[query-intent-analyzer|Query Intent Analyzer]]**: LLM-driven natural-language query parsing; extracts a structured task profile and data profile from the NLQ
2. **[[conditional-model-dispatcher|Conditional Model Dispatcher]]**: selects the best base model using [[zero-cost-proxies|ZCP]] plus a historical EMA, and triggers model augmentation on demand
3. **[[dime-dynamic-in-database-modeling-engine|DIME]]**: the core engine; dynamically assembles a customized model from the [[composable-base-model-architecture|composable base model architecture]]
4. **[[analytical-report-synthesizer|Analytical Report Synthesizer]]**: an LLM turns prediction results into an interpretable analytical report
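
To make the dispatcher idea more concrete, here is a minimal Python sketch of how a zero-cost proxy score might be blended with an exponential moving average (EMA) of past performance. The class name `EmaDispatcher`, the `alpha` and `weight` parameters, the model names, and the linear blending rule are all assumptions made for illustration; the notes above only state that ZCP and a historical EMA are used to pick the base model.

```python
from dataclasses import dataclass, field


@dataclass
class EmaDispatcher:
    """Hypothetical dispatcher: blends a ZCP score with an EMA of past performance."""
    alpha: float = 0.3   # EMA smoothing factor (assumed value)
    weight: float = 0.5  # blend between ZCP score and EMA (assumed value)
    ema: dict = field(default_factory=dict)

    def update(self, model: str, observed_score: float) -> None:
        # Standard EMA update; the first observation initializes the average.
        prev = self.ema.get(model)
        if prev is None:
            self.ema[model] = observed_score
        else:
            self.ema[model] = self.alpha * observed_score + (1 - self.alpha) * prev

    def select(self, zcp_scores: dict) -> str:
        # Models with no history fall back to their ZCP score alone.
        def blended(model: str) -> float:
            zcp = zcp_scores[model]
            hist = self.ema.get(model, zcp)
            return self.weight * zcp + (1 - self.weight) * hist
        return max(zcp_scores, key=blended)


dispatcher = EmaDispatcher()
dispatcher.update("ft_transformer", 0.80)
dispatcher.update("ft_transformer", 0.90)
dispatcher.update("tabpfn", 0.70)
chosen = dispatcher.select({"ft_transformer": 0.60, "tabpfn": 0.75})
```

The point of the blend is that a model with a strong track record can still lose to one whose proxy score fits the current task better, which is one plausible reading of "lightweight selection".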
|
||||||
|
|
||||||
|
### DIME's Four-Stage Modeling
|
||||||
|
|
||||||
|
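| Stage | Role | Shared component |
|------|------|----------|
| [[base-table-embedding|Base Table Embedding]] | Captures intra-table semantics with dual-path encoding (base model + unified tuple encoder) | Unified tuple encoder |
| [[dynamic-relation-modeling|Dynamic Relation Modeling]] | Incorporates cross-table relational structure via message passing over the FK-PK graph | Relation-aware message passing module |
| [[dynamic-model-fusion|Dynamic Model Fusion]] | Context-aware fusion that injects related-table information into target-table tuples | Context-aware fusion module |
| Task-Aware Prediction | Final prediction based on the fused embeddings | Task-specific prediction head |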
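
The message-passing stage in the table can be illustrated with a small, dependency-free Python sketch. It performs one round of mean aggregation over an undirected FK-PK graph. The function name, the tuple identifiers, and the "average own embedding with the neighbor mean" update rule are assumptions chosen for simplicity, not the relation-aware module described in the paper.

```python
def message_passing_round(embeddings, fk_pk_edges):
    """One round of mean-aggregation message passing over an undirected FK-PK graph.

    embeddings: dict mapping tuple_id -> list of floats
    fk_pk_edges: iterable of (child_tuple_id, parent_tuple_id) pairs
    """
    neighbors = {node: [] for node in embeddings}
    for child, parent in fk_pk_edges:
        neighbors[child].append(parent)
        neighbors[parent].append(child)

    updated = {}
    for node, own in embeddings.items():
        if not neighbors[node]:
            # Isolated tuples keep their embedding unchanged.
            updated[node] = list(own)
            continue
        dim = len(own)
        mean_msg = [
            sum(embeddings[n][d] for n in neighbors[node]) / len(neighbors[node])
            for d in range(dim)
        ]
        # Simple residual-style update: average own embedding with the neighbor mean.
        updated[node] = [(own[d] + mean_msg[d]) / 2 for d in range(dim)]
    return updated


embeddings = {
    "orders:1": [1.0, 0.0],
    "orders:2": [0.0, 1.0],
    "customers:7": [0.0, 0.0],
}
edges = [("orders:1", "customers:7"), ("orders:2", "customers:7")]
updated = message_passing_round(embeddings, edges)
```

After one round the customer tuple carries information from both of its orders, which is the sense in which related-table signal reaches tuples whose own features are sparse.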
|
||||||
|
|
||||||
|
### Key Design Elements
|
||||||
|
|
||||||
|
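- **[[composable-base-model-architecture|Composable base model architecture]]**: heterogeneous base model pool + shared model components → dynamic assembly at query time
- **[[data-slice|Data Slice]]**: a task-specific subset of the database, generated automatically via SQL
- **[[relational-graph|Relational Graph]]**: a tuple graph with FK-PK references as edges; the data-structure foundation of DIME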
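
As a rough illustration of the data slice and relational graph ideas, the sketch below uses Python's built-in `sqlite3` module. The `customers`/`orders` schema, the sample rows, and the hand-written query are hypothetical; the notes say NeurIDA generates the slice SQL automatically, whereas here the query is written by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 35.0), (12, 2, 15.0);
""")

# Data slice: the task-specific subset, here orders placed by EU customers.
slice_sql = """
    SELECT o.id, o.customer_id
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE c.region = ?
    ORDER BY o.id
"""
rows = conn.execute(slice_sql, ("EU",)).fetchall()

# Relational graph: one node per tuple, one edge per FK-PK reference.
graph_edges = [(f"orders:{oid}", f"customers:{cid}") for oid, cid in rows]
conn.close()
```

The resulting edge list is the kind of input a message-passing stage over FK-PK links would consume.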
|
||||||
|
|
||||||
|
## Experimental Results
|
||||||
|
|
||||||
|
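- 5 real-world datasets, 10 analytical tasks
- **Classification**: AUC-ROC improved by up to 12%
- **Regression**: relative MAE reduction of 10%–25%
- Ablation studies confirm the independent contributions of Dynamic Relation Modeling and Dynamic Model Fusion
- Latency overhead is only 1.1×–2.1×, with a 1.2×–2.5× increase in parameter count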
|
||||||
|
|
||||||
|
## Key Insights
|
||||||
|
|
||||||
|
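1. **From "training one model per task" to "assembling models from shared components"**: a paradigm shift in ML-RDBMS integration
2. **Relational structure is a predictive signal**: relation modeling over related tables yields the largest gains when the target table's features are sparse
3. **The LLM as a unified interface**: NLQ input → structured profile → execution → natural-language report, lowering the barrier to use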
|
||||||
|
|
||||||
|
## Related Concepts
|
||||||
|
|
||||||
|
- [[in-database-analytics|In-Database Analytics]] is the broader field context
- The base model pool covers [[tabular-foundation-models|Tabular Foundation Models]] (TabPFN, TabICL), TRM (FT-Transformer, ARM-Net, TabM), LTM (TP-BERTa), and others
- [[zero-cost-proxies|Zero-Cost Proxies]] originate from Neural Architecture Search
|
||||||
|
|
||||||
|
## Related Papers
|
||||||
|
|
||||||
|
- [[xing-trails-2024]]: Database Native Model Selection (VLDB 2024)
|
||||||
|
- [[zeng-dynamic-model-slicing-2024]]: Powering In-Database Dynamic Model Slicing (VLDB 2024)
|
||||||
|
- [[zhao-neurdb-2025]]: NeurDB — AI-powered Autonomous Database (CIDR 2025)
|
||||||
20
papers/zhao-neurdb-2025.md
Normal file
@@ -0,0 +1,20 @@
|
|||||||
|
---
|
||||||
|
title: "NeurDB: On the Design and Implementation of an AI-powered Autonomous Database (CIDR 2025)"
|
||||||
|
created: 2026-05-15
|
||||||
|
updated: 2026-05-15
|
||||||
|
type: paper
|
||||||
|
tags: [database, ai, autonomous-system]
|
||||||
|
confidence: medium
|
||||||
|
---
|
||||||
|
|
||||||
|
# NeurDB: AI-powered Autonomous Database
|
||||||
|
|
||||||
|
**Zhanhao Zhao, Shaofeng Cai, Haotian Gao, Hexiang Pan, Siqi Xiang, Naili Xing, Gang Chen, Beng Chin Ooi, Yanyan Shen, Yuncheng Wu, Meihui Zhang** — CIDR 2025
|
||||||
|
|
||||||
|
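The design and implementation of an AI-powered autonomous database; one of the important systems works in the [[in-database-analytics|in-database analytics]] direction.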
|
||||||
|
|
||||||
|
## Relationship to NeurIDA
|
||||||
|
|
||||||
|
NeurDB and [[zeng-neurida-2025|NeurIDA]] come from the same research group (NUS/ZJU) and jointly advance the vision of deep integration between AI and databases.
|
||||||
|
|
||||||
|
> 📝 Placeholder page: to be updated after full ingest
|
||||||
17
raw/papers/darlow-ctm-2025.md
Normal file
@@ -0,0 +1,17 @@
|
|||||||
|
---
|
||||||
|
source_url: https://arxiv.org/abs/2505.05522
|
||||||
|
ingested: 2026-05-15
|
||||||
|
sha256: d2d4c033d6f7dd91d6af9eeefa3878c8cee14ac8c1e7acb762b5772c1cf8d004
|
||||||
|
---
|
||||||
|
|
||||||
|
# Continuous Thought Machines
|
||||||
|
|
||||||
|
**Authors:** Luke Darlow, Ciaran Regan, Sebastian Risi, Jeffrey Seely, Llion Jones (Sakana AI, Tokyo; University of Tsukuba; IT University of Copenhagen)
|
||||||
|
|
||||||
|
**arXiv:** 2505.05522v4 | **Category:** cs.LG | **Venue:** NeurIPS 2025
|
||||||
|
|
||||||
|
## Abstract
|
||||||
|
|
||||||
|
Biological brains demonstrate complex neural activity, where neural dynamics are critical to how brains process information. Most artificial neural networks ignore the complexity of individual neurons. We challenge that paradigm. By incorporating neuron-level processing and synchronization, we reintroduce neural timing as a foundational element. We present the Continuous Thought Machine (CTM), a model designed to leverage neural dynamics as its core representation. The CTM has two innovations: (1) neuron-level temporal processing, where each neuron uses unique weight parameters to process incoming histories; and (2) neural synchronization as a latent representation. The CTM aims to strike a balance between neuron abstractions and biological realism.
|
||||||
|
|
||||||
|
We demonstrate the CTM's performance across a range of tasks, including solving 2D mazes, ImageNet-1K classification, parity computation, and more. Beyond displaying rich internal representations and offering a natural avenue for interpretation, the CTM can perform tasks that require complex sequential reasoning. The CTM can also leverage adaptive compute, where it can stop earlier for simpler tasks, or keep computing when faced with more challenging instances.
|
||||||
19
raw/papers/zeng-neurida-2025.md
Normal file
@@ -0,0 +1,19 @@
|
|||||||
|
---
|
||||||
|
source_url: https://arxiv.org/abs/2512.08483v3
|
||||||
|
ingested: 2026-05-15
|
||||||
|
sha256: 27f965211912f0ca663f249e72f2a2090294ca4538ada72e5dfc0ccf25f6523d
|
||||||
|
---
|
||||||
|
|
||||||
|
# NeurIDA: Dynamic Modeling for Effective In-Database Analytics
|
||||||
|
|
||||||
|
**Authors:** Lingze Zeng (NUS), Naili Xing (NUS), Shaofeng Cai (NUS), Peng Lu (ZJU), Gang Chen (ZJU), Jian Pei (Duke), Beng Chin Ooi (ZJU)
|
||||||
|
|
||||||
|
**arXiv:** 2512.08483v3 | **Category:** cs.DB | **Date:** 2025-12-15
|
||||||
|
|
||||||
|
## Abstract
|
||||||
|
|
||||||
|
Relational Database Management Systems (RDBMS) manage complex, interrelated data and support a broad spectrum of analytical tasks. With the growing demand for predictive analytics, the deep integration of machine learning (ML) into RDBMS has become critical. However, a fundamental challenge hinders this evolution: conventional ML models are static and task-specific, whereas RDBMS environments are dynamic and must support diverse analytical queries. Each analytical task entails constructing a bespoke pipeline from scratch, which incurs significant development overhead and hence limits wide adoption of ML in analytics.
|
||||||
|
|
||||||
|
We present NeurIDA, an autonomous end-to-end system for in-database analytics that dynamically "tweaks" the best available base model to better serve a given analytical task. In particular, we propose a novel paradigm of dynamic in-database modeling to pre-train a composable base model architecture over the relational data. Upon receiving a task, NeurIDA formulates the task and data profile to dynamically select and configure relevant components from the pool of base models and shared model components for prediction. For friendly user experience, NeurIDA supports natural language queries; it interprets user intent to construct structured task profiles, and generates analytical reports with dedicated LLM agents. By design, NeurIDA enables ease-of-use and yet effective and efficient in-database AI analytics.
|
||||||
|
|
||||||
|
**Results:** Up to 12% improvement in AUC-ROC and 25% relative reduction in MAE across ten tasks on five real-world datasets.
|
||||||
87
reviews/ctm-review-20260515.md
Normal file
@@ -0,0 +1,87 @@
|
|||||||
|
---
|
||||||
|
title: "Continuous Thought Machines Paper Integration Review"
|
||||||
|
created: 2026-05-15
|
||||||
|
type: review
|
||||||
|
tags: [review, wiki-integration, neural-architecture, temporal-dynamics]
|
||||||
|
---
|
||||||
|
|
||||||
|
# 📌 Basic Information
|
||||||
|
|
||||||
|
| Field | Value |
|------|------|
| Paper | Continuous Thought Machines (CTM) |
| Authors | Luke Darlow, Ciaran Regan, Sebastian Risi, Jeffrey Seely, **Llion Jones** |
| Affiliations | Sakana AI (Tokyo), University of Tsukuba, IT University of Copenhagen |
| arXiv | 2505.05522v4 |
| Category | cs.LG |
| Venue | NeurIPS 2025 |
| Integration date | 2026-05-15 |
|
||||||
|
|
||||||
|
> Llion Jones is one of the co-authors of "Attention Is All You Need".
|
||||||
|
|
||||||
|
# 🎯 Core Concepts
|
||||||
|
|
||||||
|
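1. **[[neuron-level-models|Neuron-Level Models (NLMs)]]**: overturns the paradigm that "all neurons share the same activation function"; each neuron has an MLP with private parameters that produces complex temporal dynamics from its pre-activation history

2. **[[neural-synchronization|Neural Synchronization as Representation]]**: uses the temporal correlations (inner products) of the neuron population's activity histories directly as the latent representation, replacing the conventional single-step activation snapshot

3. **[[internal-ticks|Internal Ticks]]**: an internal timeline fully decoupled from the data dimensions; the CTM unfolds its neural dynamics along self-generated "thought steps"

4. **[[certainty-based-loss|Certainty-Based Loss]]**: dynamically selects two ticks, argmin(loss) and argmax(certainty), yielding **native adaptive computation** (no halting module needed)

5. **[[continuous-thought-machine|CTM architecture]]**: a recurrent pipeline of Synapse Model → NLMs → Synchronization → Cross-Attention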
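
The tick selection in the certainty-based loss can be sketched in a few lines of plain Python. Two details here are assumptions rather than statements from this review: certainty is taken to be one minus the entropy normalized by its maximum, and the final loss is taken to be the average of the losses at the two selected ticks. The function names and the toy probabilities are illustrative only.

```python
import math


def cross_entropy(probs, target):
    return -math.log(probs[target])


def certainty(probs):
    # Assumed definition: 1 minus entropy normalized by its maximum, log(num_classes).
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


def certainty_based_loss(probs_per_tick, target):
    losses = [cross_entropy(p, target) for p in probs_per_tick]
    certainties = [certainty(p) for p in probs_per_tick]
    t_min_loss = min(range(len(losses)), key=losses.__getitem__)
    t_max_cert = max(range(len(certainties)), key=certainties.__getitem__)
    # Assumed aggregation: average the loss at the two selected ticks.
    loss = (losses[t_min_loss] + losses[t_max_cert]) / 2
    return loss, t_min_loss, t_max_cert


probs_per_tick = [
    [0.40, 0.30, 0.30],  # tick 0: uncertain
    [0.70, 0.20, 0.10],  # tick 1: best on the true class
    [0.05, 0.90, 0.05],  # tick 2: most certain, but wrong
]
loss, t1, t2 = certainty_based_loss(probs_per_tick, target=0)
```

In this toy case the most certain tick is confidently wrong, so it contributes a large loss term; under the stated assumptions, that is the pressure that would push certainty to line up with correctness.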
|
||||||
|
|
||||||
|
# 🔗 Concept Network
|
||||||
|
|
||||||
|
## Core Connections
|
||||||
|
|
||||||
|
```
CTM
├── Innovation 1: Neuron-Level Models
│   ├── Pre-Activation History (M-step rolling buffer)
│   └── Private per-neuron MLP g_{θ_d}
├── Innovation 2: Neural Synchronization
│   ├── S^t = Z^t · (Z^t)^⊺ (inner product of histories)
│   ├── Neuron Pairing (subsampling for efficiency)
│   └── Temporal Decay r_ij (multi-scale)
├── Infrastructure
│   ├── Internal Ticks (decoupled timeline)
│   └── Synapse Model (U-Net MLP)
└── Training
    └── Certainty-Based Loss → native Adaptive Computation Time
```
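
The synchronization formula in the tree, S^t = Z^t · (Z^t)^⊺, can be written out as a small pure-Python sketch, with neuron pairing shown as simple index selection. The function names and the toy activation histories are assumptions for illustration, and the temporal decay term r_ij is deliberately left out to keep the example short.

```python
def synchronization(history):
    """Compute S = Z · Zᵀ, where history[d] holds neuron d's activations over ticks.

    S[i][j] is the inner product of the histories of neurons i and j.
    """
    n = len(history)
    return [
        [sum(a * b for a, b in zip(history[i], history[j])) for j in range(n)]
        for i in range(n)
    ]


def sample_pairs(sync, pairs):
    # Neuron pairing: keep only the selected (i, j) entries as the representation.
    return [sync[i][j] for i, j in pairs]


Z = [
    [1.0, 0.0, 1.0],  # neuron 0
    [1.0, 0.0, 1.0],  # neuron 1: fires together with neuron 0
    [0.0, 1.0, 0.0],  # neuron 2: active only when the others are silent
]
S = synchronization(Z)
representation = sample_pairs(S, [(0, 1), (0, 2)])
```

Because the full matrix grows quadratically with the number of neurons, selecting a subset of pairs keeps the representation at a fixed, manageable size.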
|
||||||
|
|
||||||
|
## Emergent Properties
|
||||||
|
|
||||||
|
What is most fascinating about the CTM is **emergence**: none of the following behaviors was specifically designed in:
|
||||||
|
|
||||||
|
| Property | Manifestation |
|------|------|
| [[adaptive-computation-time|Adaptive computation]] | Simple samples stop in <10 ticks |
| Calibration | Naturally well calibrated on ImageNet |
| Look Around | Sequentially scans the image before classifying |
| [[internal-world-model|Internal world model]] | No positional encoding; generalizes to larger mazes |
| Traveling Waves | Low-frequency traveling waves in UMAP visualizations |
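
The adaptive-computation row can be illustrated with a toy inference loop that stops as soon as the prediction is certain enough. The threshold value, the certainty definition (one minus normalized entropy), and the `toy_step` stand-in for a model are all assumptions; only the idea of stopping early on easy inputs comes from the notes.

```python
import math


def certainty(probs):
    # Assumed definition: 1 minus entropy normalized by log(num_classes).
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


def run_with_early_stop(step_fn, max_ticks=50, threshold=0.8):
    """Run internal ticks until the prediction is certain enough or ticks run out."""
    probs = None
    for tick in range(1, max_ticks + 1):
        probs = step_fn(tick)
        if certainty(probs) >= threshold:
            return probs, tick
    return probs, max_ticks


def toy_step(tick):
    # Stand-in for a model: confidence in class 0 grows with each tick.
    p0 = min(0.5 + 0.1 * tick, 0.99)
    return [p0, 1.0 - p0]


probs, ticks_used = run_with_early_stop(toy_step)
```

No halting module appears anywhere in the loop; the stopping rule only reads a quantity the model already produces.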
|
||||||
|
|
||||||
|
## Extended Connections
|
||||||
|
|
||||||
|
- Links to [[spiking-neural-networks|SNN]] (placeholder page): shared biological inspiration, different implementation paths
- Network size: 320 → **334** pages (+14: 1 paper + 11 concepts + 1 SNN placeholder + 1 review)
|
||||||
|
|
||||||
|
# 📚 Wiki Integration
|
||||||
|
|
||||||
|
| Metric | Value |
|------|------|
| New pages | 14 (1 paper + 12 concepts, counting the SNN placeholder, + 1 review) |
| Wiki size | 320 → 334 |
| Link integrity | ✅ 100%, no broken links |
|
||||||
|
|
||||||
|
# 💡 Key Insights
|
||||||
|
|
||||||
|
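1. **The radical bet of "neuron as processor"**: the CTM upgrades each neuron from a "simple nonlinear gate" to a "small temporal processor", a redesign of deep learning's lowest-level computational primitive. If this path works out, it could trigger a wave of innovation in "neuron architecture", similar to what happened when the Transformer displaced the RNN.

2. **Synchronization = decoupling = emergence**: using the synchronization matrix rather than z_t as the representation appears to add a layer of indirection, but it actually **frees the neurons**: they are no longer forced to encode task information directly and can instead produce rich dynamics. This design philosophy of "not requiring neurons to be directly useful" may be key to the emergent capabilities.

3. **A directional signal from Sakana AI**: a team that includes an original Transformer author (Llion Jones) is betting on "neural dynamics". That is a strong signal; they may see the Next Big Thing beyond the attention paradigm.

4. **An unexpected link to StreamingLLM**: the CTM's pre-activation history (M-step buffer) is structurally similar to StreamingLLM's [[attention-sinks|Attention Sink]] + rolling KV cache: both maintain temporal computation with a finite history window. This suggests that "temporal representation over a finite history" may be a convergence point for multiple subfields.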
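
The "finite history window" shared by both designs is easy to picture as a fixed-length rolling buffer. The sketch below uses `collections.deque` with `maxlen`; the class name `PreActivationHistory` and the choice of M = 3 are hypothetical and only illustrate the data structure, not the CTM implementation.

```python
from collections import deque


class PreActivationHistory:
    """Keeps only the most recent M pre-activations of one neuron."""

    def __init__(self, m):
        self.buffer = deque(maxlen=m)  # the oldest entry is dropped automatically

    def push(self, pre_activation):
        self.buffer.append(pre_activation)

    def window(self):
        return list(self.buffer)


history = PreActivationHistory(m=3)
for value in [0.1, 0.2, 0.3, 0.4, 0.5]:
    history.push(value)
window = history.window()
```

Memory stays constant no matter how many ticks are run, which is the property the comparison with a rolling KV cache rests on.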
|
||||||
70
reviews/neurida-review-20260515.md
Normal file
@@ -0,0 +1,70 @@
|
|||||||
|
---
|
||||||
|
title: "NeurIDA Paper Integration Review"
|
||||||
|
created: 2026-05-15
|
||||||
|
type: review
|
||||||
|
tags: [review, wiki-integration, database, machine-learning]
|
||||||
|
---
|
||||||
|
|
||||||
|
# 📌 Basic Information
|
||||||
|
|
||||||
|
| Field | Value |
|------|------|
| Paper | NeurIDA: Dynamic Modeling for Effective In-Database Analytics |
| Authors | Lingze Zeng, Naili Xing, Shaofeng Cai, Peng Lu, Gang Chen, Jian Pei, Beng Chin Ooi |
| Affiliations | NUS, Zhejiang University, Duke University |
| arXiv | 2512.08483v3 |
| Category | cs.DB |
| Integration date | 2026-05-15 |
|
||||||
|
|
||||||
|
# 🎯 Core Concepts
|
||||||
|
|
||||||
|
1. **[[dynamic-in-database-modeling|Dynamic In-Database Modeling]]**: a new paradigm that moves from "training a fixed model per task" to "assembling a customized model from shared components at query time", resolving the fundamental tension between static ML and the dynamic RDBMS

2. **[[dime-dynamic-in-database-modeling-engine|DIME]]**: a four-stage pipeline engine: Base Table Embedding → Dynamic Relation Modeling → Dynamic Model Fusion → Task-Aware Prediction

3. **[[composable-base-model-architecture|Composable Base Model Architecture]]**: a heterogeneous base model pool (traditional ML + TRM + Tabular Foundation + LTM) plus three shared model components

4. **[[conditional-model-dispatcher|Conditional Model Dispatcher]]**: lightweight model selection and conditional augmentation scheduling based on ZCP + historical EMA

5. **[[query-intent-analyzer|Query Intent Analyzer]]**: an LLM-driven parser from NLQ to structured profiles, with zero manual feature engineering
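
The "assemble at query time" idea behind these concepts can be sketched as choosing a list of stage functions from a shared registry. Everything specific in this example is assumed for illustration: the registry `SHARED_STAGES`, the placeholder stages that only record their own name, the `related_tables` field of the task profile, and the rule that relation stages are skipped for single-table tasks.

```python
from typing import Callable, Dict, List

Stage = Callable[[dict], dict]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(state: dict) -> dict:
        return {**state, "trace": state["trace"] + [name]}
    return stage


# Hypothetical registry of shared components, keyed by stage name.
SHARED_STAGES: Dict[str, Stage] = {
    name: make_stage(name)
    for name in (
        "base_table_embedding",
        "dynamic_relation_modeling",
        "dynamic_model_fusion",
        "task_aware_prediction",
    )
}


def assemble(task_profile: dict) -> List[Stage]:
    """Choose stages at query time (assumed rule: skip relation stages for single-table tasks)."""
    names = ["base_table_embedding"]
    if task_profile.get("related_tables"):
        names += ["dynamic_relation_modeling", "dynamic_model_fusion"]
    names.append("task_aware_prediction")
    return [SHARED_STAGES[name] for name in names]


def run(stages: List[Stage], state: dict) -> dict:
    for stage in stages:
        state = stage(state)
    return state


multi = run(assemble({"related_tables": ["customers"]}), {"trace": []})
single = run(assemble({"related_tables": []}), {"trace": []})
```

The same registry serves both queries; only the selected stages differ, which is the contrast with training one fixed model per task.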
|
||||||
|
|
||||||
|
# 🔗 Concept Network
|
||||||
|
|
||||||
|
## Core Connections
|
||||||
|
|
||||||
|
```
NeurIDA
├─ DIME (core engine)
│   ├── Base Table Embedding → unified tuple encoder
│   ├── Dynamic Relation Modeling → message passing over the relational graph
│   └── Dynamic Model Fusion → context-aware selective fusion
├─ Query Intent Analyzer → Task Profile + Data Profile
├─ Conditional Model Dispatcher → ZCP + EMA
└─ Analytical Report Synthesizer → LLM report generation
```
|
||||||
|
|
||||||
|
## Extended Network
|
||||||
|
- Linked **15 new concepts** + **3 related works** (placeholder pages)
- Connects to the broader [[in-database-analytics|In-Database Analytics]] field
- Introduces the [[tabular-foundation-models|Tabular Foundation Models]] concept
- Introduces [[zero-cost-proxies|Zero-Cost Proxies]] (a crossover from the NAS field)
|
||||||
|
|
||||||
|
## Broken-Link Fixes
|
||||||
|
- Created placeholder pages for 3 related papers: Trails (VLDB 2024), Dynamic Model Slicing (VLDB 2024), NeurDB (CIDR 2025)
|
||||||
|
|
||||||
|
# 📚 Wiki Integration
|
||||||
|
|
||||||
|
| Metric | Value |
|------|------|
| New pages | 19 (1 paper + 15 concepts + 3 literature placeholders) |
| Total pages | 300 → 319 |
| Link integrity | 100% (0 broken links) |
| Average links per concept | ~5 outbound links per page |
|
||||||
|
|
||||||
|
# 💡 Key Insights
|
||||||
|
|
||||||
|
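1. **An instructive paradigm shift**: NeurIDA's core idea, "from training models to assembling models", is not limited to databases. It hints at a more general trend: from **fixed models** to **conditional model factories**. This has implications for AI agent design too: should an agent's capabilities also move from a "predefined toolset" to "capability modules assembled dynamically at query time"?

2. **The LLM's three roles**: in NeurIDA, the LLM plays three different roles: (a) interface (NLQ parsing), (b) reasoning (the Data Profiler's CoT decisions), and (c) output (report generation). This "one model, multiple interfaces" design pattern is a useful reference for agent system design.

3. **Accelerating database + AI convergence**: the paper comes from the NUS/ZJU research group (the same team behind NeurDB), which indicates that the AI-ification of databases is moving from "ML as a plugin" to "AI as a first-class citizen".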
|
||||||