---
title: "TPP Training Methods"
created: 2026-06-16
updated: 2026-06-16
type: concept
tags: [temporal-point-process, training, estimation, optimization]
sources: [raw/papers/advances-temporal-point-processes-2026.md]
---
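# TPP Training Methods
Parameter estimation for TPP models faces a distinctive challenge: the continuous-time likelihood typically involves an integral of the intensity function. The available training objectives make fundamentally different trade-offs between computational cost and statistical efficiency.
## Four training objectives
### 1. Maximum likelihood estimation (MLE)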
```
theta_hat = arg max_theta [ sum_{n=1}^{N} log lambda*(t_n) - ∫_0^T lambda*(tau) dtau ]
```
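- **Divergence type**: KL divergence
- **Requires likelihood**: yes
- **Requires numerical integration**: yes (integral of the intensity)
- **Statistical efficiency**: asymptotically optimal (minimum variance)
- **Representative work**: Ozaki (1979), Paninski (2004)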
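
To make the cost of the integral term concrete, here is a minimal sketch of the negative log-likelihood for one event sequence on `[0, T]`, with the compensator approximated by the trapezoidal rule. The names `neg_log_likelihood` and `hawkes_intensity`, the exponential-kernel Hawkes intensity, and the grid size are illustrative assumptions, not taken from the source.

```python
import numpy as np

def hawkes_intensity(mu, alpha, beta):
    """Hypothetical example intensity: exponential-kernel Hawkes process."""
    def intensity(t, history):
        # Baseline rate plus exponentially decaying excitation from past events.
        return mu + alpha * np.sum(np.exp(-beta * (t - history)))
    return intensity

def neg_log_likelihood(event_times, intensity, T, n_grid=2001):
    """Negative log-likelihood of one sequence observed on [0, T].

    `intensity(t, history)` returns lambda*(t) given the events strictly before t.
    """
    event_times = np.asarray(event_times, dtype=float)
    # First term: sum of log-intensities at the observed events.
    log_term = sum(
        np.log(intensity(t, event_times[event_times < t])) for t in event_times
    )
    # Second term: integral of lambda* over [0, T], by the trapezoidal rule.
    grid = np.linspace(0.0, T, n_grid)
    values = np.array([intensity(t, event_times[event_times < t]) for t in grid])
    integral = np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid))
    return -(log_term - integral)

events = [0.5, 1.2, 3.0]
print(neg_log_likelihood(events, hawkes_intensity(0.5, 0.3, 1.0), T=4.0))
```

Every evaluation of the objective calls the intensity `n_grid` times for the integral alone, which is the overhead the likelihood-free objectives below avoid. For this particular kernel the integral also has a closed form; the quadrature is shown because a general (for example neural) intensity usually has none.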
### 2. Wasserstein distance training
```
theta_hat = arg min_theta W(f_data, f_theta)
```
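- **Divergence type**: Wasserstein distance
- **Requires likelihood**: no
- **Requires numerical integration**: no
- **Statistical efficiency**: consistent, but with higher variance
- **Representative work**: Xiao et al. (2017a, 2018)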
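
A Wasserstein objective needs a ground distance between individual event sequences. The sketch below shows one such distance: events are matched in time order, and each surplus event in the longer sequence is matched to the end of the window `T`. The source does not say which ground distance is used, so this choice and the name `sequence_distance` are assumptions. The full training loop (typically a generator and a critic) is not shown.

```python
import numpy as np

def sequence_distance(seq_a, seq_b, T):
    """Distance between two event sequences on [0, T] (illustrative choice).

    Events are paired in sorted order; surplus events in the longer
    sequence are paired with the window end T.
    """
    a = np.sort(np.asarray(seq_a, dtype=float))
    b = np.sort(np.asarray(seq_b, dtype=float))
    if len(a) > len(b):
        a, b = b, a  # make `a` the shorter sequence
    n = len(a)
    matched = np.sum(np.abs(a - b[:n]))   # cost of the ordered pairing
    unmatched = np.sum(T - b[n:])         # cost of moving surplus events to T
    return float(matched + unmatched)

print(sequence_distance([1.0, 2.0], [1.5, 2.5, 3.5], T=4.0))
```

Neither the intensity nor its integral appears anywhere in this computation, which is why the objective needs only samples from the model.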
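### 3. Noise-contrastive estimation (NCE)
Recasts parameter learning as a binary classification problem: distinguishing real sequences from noise sequences.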
```
Loss = -E_{T~real}[log sigma(s_theta(T))] - E_{T~noise}[log(1 - sigma(s_theta(T)))]
```
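- **Divergence type**: classification-based
- **Requires likelihood**: no
- **Requires numerical integration**: no
- **Statistical efficiency**: consistent, but with higher variance
- **Representative work**: Guo et al. (2018), Mei et al. (2020)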
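
As a rough illustration, the classification objective can be written as a binary cross-entropy over model scores, that is, the negative of the classifier's log-likelihood, so that lower is better. The function names are hypothetical, and the scores `s_theta(T)` are assumed to be precomputed by some model that is not shown.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigma(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def nce_loss(scores_real, scores_noise):
    """Binary cross-entropy between real and noise sequences.

    `scores_real` and `scores_noise` hold s_theta(T) for each sequence.
    """
    scores_real = np.asarray(scores_real, dtype=float)
    scores_noise = np.asarray(scores_noise, dtype=float)
    # log(1 - sigma(s)) equals log(sigma(-s)), which avoids cancellation.
    real_term = np.mean(log_sigmoid(scores_real))
    noise_term = np.mean(log_sigmoid(-scores_noise))
    return float(-(real_term + noise_term))

print(nce_loss([2.0, 1.5], [-1.0, -0.5]))
```

A model that scores real sequences high and noise sequences low drives the loss toward zero; a model that cannot tell them apart (all scores zero) sits at `2 * log(2)`.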
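### 4. Score matching / Fisher divergence
Measures the discrepancy between distributions through the score function, i.e. the gradient of the log-density.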
```
theta_hat = arg min_theta E_f[||∇log f(T) - ∇log f_theta(T)||^2]
```
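- **Divergence type**: Fisher divergence
- **Requires likelihood**: no (only the score function)
- **Requires numerical integration**: no
- **Statistical efficiency**: consistent, but with higher variance
- **Representative work**: Sahani et al. (2016), Li et al. (2023), Cao et al. (2024)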
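
The data score `∇log f` is unknown in practice. Under standard regularity conditions, Hyvärinen's identity replaces the Fisher divergence with an equivalent objective (up to a constant and a factor of 1/2) that uses only the model score and its derivative. The sketch below applies this to a deliberately simple toy case: log inter-event times modeled as a Gaussian `N(m, s^2)`. The model choice, the synthetic data, and the function name are assumptions made for illustration only.

```python
import numpy as np

def score_matching_objective(x, m, s):
    """Implicit score-matching objective for a Gaussian N(m, s^2) model.

    Computes the sample mean of 0.5 * score(x)^2 + d/dx score(x).
    """
    score = -(x - m) / s**2     # d/dx log f_theta(x)
    score_grad = -1.0 / s**2    # d^2/dx^2 log f_theta(x)
    return float(np.mean(0.5 * score**2 + score_grad))

# Synthetic inter-event times (assumed log-normal for this toy example).
rng = np.random.default_rng(0)
inter_event_times = rng.lognormal(mean=0.3, sigma=0.8, size=5000)
x = np.log(inter_event_times)

# For this model family the minimizer is the sample mean and standard deviation.
m_hat, s_hat = x.mean(), x.std()
print(score_matching_objective(x, m_hat, s_hat))
print(score_matching_objective(x, m_hat + 0.1, s_hat))
```

The objective never evaluates a normalizing constant or an integral of the intensity; it touches the model only through derivatives of its log-density.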
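## Key trade-offs
- MLE is statistically optimal, but the integral is expensive to compute
- The alternative objectives avoid the integral at the cost of higher asymptotic variance
- The choice depends on how the application weighs scale against accuracy
- [[intensity-free-modeling|Intensity-free parameterization]] is another strategy for bypassing the integral (while retaining the efficiency advantage of MLE)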
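
To illustrate the last point, the sketch below fits inter-event times directly with a density, so the likelihood is available in closed form and no intensity integral is needed. The single log-normal density, the function name, and the example data are assumptions for illustration. For brevity the sketch also omits the survival term for the interval after the last event.

```python
import numpy as np

def lognormal_log_likelihood(inter_event_times, mu, sigma):
    """Log-likelihood of inter-event times under a log-normal density."""
    tau = np.asarray(inter_event_times, dtype=float)
    z = (np.log(tau) - mu) / sigma
    # log f(tau) = -log(tau * sigma * sqrt(2*pi)) - z^2 / 2
    return float(np.sum(-np.log(tau * sigma * np.sqrt(2.0 * np.pi)) - 0.5 * z**2))

event_times = np.array([0.4, 1.1, 1.5, 3.2, 3.9])
tau = np.diff(event_times, prepend=0.0)   # inter-event times

# Closed-form maximum likelihood estimates for this density.
mu_hat = np.log(tau).mean()
sigma_hat = np.log(tau).std()
print(lognormal_log_likelihood(tau, mu_hat, sigma_hat))
```

The objective is still a likelihood, so it keeps the statistical behavior of MLE while costing one density evaluation per event.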
## References
- [[temporal-point-process|Temporal point process]]
- [[conditional-intensity-function|Conditional intensity function]]
- [[neural-temporal-point-process|Neural TPP]]
- [[intensity-free-modeling|Intensity-free modeling]]
- [[advances-temporal-point-processes-2026|TPP survey]]