# Spurious Predictability in Financial Machine Learning

**Authors:** Sotirios D. Nikolopoulos

**arXiv ID:** 2604.15531v1

**Published:** 2026-04-16

**Categories:** q-fin.ST, stat.ME, stat.ML

**Comments:** 49 pages, 10 figures. The QuantAudit R package and full replication scripts will be made publicly available upon journal publication

**Subjects:** Statistical Finance (q-fin.ST); Methodology (stat.ME); Machine Learning (stat.ML)

**MSC classes:** 91G70, 62P20, 62M20, 68T05

**DOI:** https://doi.org/10.48550/arXiv.2604.15531

## Abstract

Adaptive specification search generates statistically significant backtests even under martingale-difference nulls. We introduce a falsification audit testing complete predictive workflows against synthetic reference classes, including zero-predictability environments and microstructure placebos. Workflows generating significant walk-forward evidence in these environments are falsified. For passing workflows, we quantify selection-induced performance inflation using an absolute magnitude gap linking optimized in-sample evidence to disjoint walk-forward realizations, adjusted for effective multiplicity. Simulations validate extreme-value scaling under correlated searches and demonstrate detection power under genuine structure. Empirical case studies confirm that many apparent findings represent methodological artifacts rather than genuine predictability.

## Key Concepts

### 1. Spurious Predictability

The phenomenon in which adaptive specification search (data mining, model selection, hyperparameter tuning) produces statistically significant backtest results even when the data-generating process has no predictive structure, that is, under a martingale-difference null.

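
A minimal simulation sketch of this effect, under assumptions that are not taken from the paper: returns are i.i.d. Gaussian noise (a martingale-difference sequence by construction), the search space is a set of random long/short signals, and the function names are hypothetical. Selecting the best of many specifications by in-sample t-statistic tends to produce an apparently significant result even though no signal has any predictive content.

```python
import math
import random
import statistics


def t_stat(xs):
    """t-statistic of the sample mean against zero."""
    return statistics.fmean(xs) / (statistics.stdev(xs) / math.sqrt(len(xs)))


def best_of_search(n_obs=500, n_specs=200, seed=0):
    """Return the largest in-sample t-stat found across random signals.

    Assumption: returns are i.i.d. N(0, 1), so no signal can truly predict them.
    """
    rng = random.Random(seed)
    returns = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    best = -math.inf
    for _ in range(n_specs):
        # Each "specification" is a random long/short signal, independent of returns.
        signal = [rng.choice((-1.0, 1.0)) for _ in range(n_obs)]
        pnl = [s * r for s, r in zip(signal, returns)]
        best = max(best, t_stat(pnl))
    return best


if __name__ == "__main__":
    single = best_of_search(n_specs=1)
    searched = best_of_search(n_specs=200)
    print(f"t-stat with one specification: {single:.2f}")
    print(f"best t-stat after searching 200 specifications: {searched:.2f}")
```

Because both calls share a seed, the single specification is also the first one examined in the larger search, so the searched maximum can only be equal or higher.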
### 2. Falsification Audit

A methodological framework for testing complete predictive workflows against synthetic reference classes:

- **Zero-predictability environments**: Simulated data with no genuine predictive structure
- **Microstructure placebos**: Realistic but non-predictive market microstructure features

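
As an illustration only, the two reference classes might be simulated as below. The specific generators are assumptions, not the paper's: the zero-predictability environment is i.i.d. Gaussian returns, and the microstructure placebo is a Roll-style bid-ask bounce, in which the efficient log price is a random walk but observed trades land on the bid or the ask at random. The bounce induces negative first-order autocorrelation in observed returns without any forecastable movement in the efficient price.

```python
import random
import statistics


def zero_predictability_returns(n, sigma=0.01, seed=0):
    """i.i.d. Gaussian returns: a martingale-difference sequence by construction."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def bid_ask_bounce_returns(n, sigma=0.01, half_spread=0.005, seed=0):
    """Observed log-price changes when trades hit bid or ask at random.

    Assumed placebo: the efficient log price is a random walk and the
    observed log price adds +/- half_spread on each trade.
    """
    rng = random.Random(seed)
    efficient = 0.0
    prev_observed = rng.choice((-1, 1)) * half_spread
    out = []
    for _ in range(n):
        efficient += rng.gauss(0.0, sigma)
        observed = efficient + rng.choice((-1, 1)) * half_spread
        out.append(observed - prev_observed)
        prev_observed = observed
    return out


def lag1_autocorr(xs):
    """Sample first-order autocorrelation."""
    mu = statistics.fmean(xs)
    num = sum((a - mu) * (b - mu) for a, b in zip(xs, xs[1:]))
    den = sum((x - mu) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    print("noise  :", round(lag1_autocorr(zero_predictability_returns(20000)), 3))
    print("placebo:", round(lag1_autocorr(bid_ask_bounce_returns(20000)), 3))
```

A workflow that reports a tradable edge on the placebo series is likely exploiting the bounce rather than any real signal.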
### 3. Selection-Induced Performance Inflation

The performance inflation introduced by model selection and optimization, quantified as an absolute magnitude gap, adjusted for effective multiplicity, between:

- Optimized in-sample performance
- Out-of-sample (walk-forward) performance on disjoint data

### 4. Effective Multiplicity

Adjustment for the multiple comparisons problem in adaptive specification search, accounting for correlated search paths and dependencies between model specifications.

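
One common heuristic for the effective number of independent trials, shown here as an assumption rather than the paper's estimator, is the participation ratio of the correlation matrix `R` of the candidate strategies: `N_eff = tr(R)^2 / tr(R^2)`. The sketch below computes it and plugs it into the standard leading-order approximation `sqrt(2 ln N)` for the expected maximum of `N` independent standard normal statistics. The helper names are hypothetical.

```python
import math


def effective_trials(corr):
    """Participation ratio (tr R)^2 / tr(R^2) of a correlation matrix.

    Equals N for uncorrelated trials and 1 for perfectly correlated ones.
    """
    n = len(corr)
    trace = sum(corr[i][i] for i in range(n))
    # tr(R^2) of a symmetric matrix is the sum of its squared entries.
    trace_sq = sum(corr[i][j] ** 2 for i in range(n) for j in range(n))
    return trace ** 2 / trace_sq


def expected_max_z(n_eff):
    """Leading-order expected maximum of n_eff independent N(0, 1) draws."""
    return math.sqrt(2.0 * math.log(n_eff)) if n_eff > 1 else 0.0


def equicorrelated(n, rho):
    """Helper: n x n correlation matrix with constant off-diagonal rho."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        n_eff = effective_trials(equicorrelated(100, rho))
        print(f"rho={rho}: N_eff={n_eff:.1f}, expected max z ~ {expected_max_z(n_eff):.2f}")
```

The more correlated the search paths, the fewer effective trials there are and the lower the bar that a selected backtest statistic must clear to stand out from the null maximum.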
## Methodology

### Falsification Framework

1. **Reference class construction**: Create synthetic environments with known properties
2. **Workflow testing**: Apply the complete predictive workflow to each reference class
3. **Falsification criteria**: Reject (falsify) workflows that produce significant walk-forward evidence in zero-predictability environments or microstructure placebos

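
The three steps above can be sketched as a small audit loop. Everything here is a hypothetical stand-in rather than the QuantAudit API: the workflow is any function mapping a return series to a walk-forward t-statistic, the reference class is i.i.d. Gaussian noise, and the rejection rule (flag the workflow if it clears the critical value in more than a tolerated fraction of null environments) is an assumed criterion.

```python
import math
import random
import statistics


def momentum_workflow(returns, n_train=250):
    """Toy workflow: choose a lookback in-sample, then score it walk-forward.

    Returns the t-statistic of the strategy's out-of-sample P&L.
    """
    def pnl(lookback, start, stop):
        out = []
        for t in range(max(start, lookback), stop):
            # The signal uses only returns strictly before time t.
            signal = 1.0 if sum(returns[t - lookback:t]) > 0 else -1.0
            out.append(signal * returns[t])
        return out

    # Specification search uses only the training window.
    best = max((5, 10, 20, 60), key=lambda k: statistics.fmean(pnl(k, 0, n_train)))
    oos = pnl(best, n_train, len(returns))
    return statistics.fmean(oos) / (statistics.stdev(oos) / math.sqrt(len(oos)))


def falsification_audit(workflow, n_envs=100, n_obs=500, crit=1.645, tol=0.10, seed=0):
    """Run the workflow on null environments and report its rejection rate."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_envs):
        env = [rng.gauss(0.0, 0.01) for _ in range(n_obs)]  # zero-predictability data
        if workflow(env) > crit:
            hits += 1
    rate = hits / n_envs
    return {"rejection_rate": rate, "falsified": rate > tol}


if __name__ == "__main__":
    print(falsification_audit(momentum_workflow))
```

A workflow whose selection step sees only training data should reject at roughly the nominal rate on these null environments, whereas one with look-ahead leakage would typically reject far more often and be flagged.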
### Performance Gap Quantification

For workflows that pass falsification tests:

1. **In-sample optimization**: Measure the optimized performance on the training data
2. **Walk-forward validation**: Test the selected specification on disjoint out-of-sample periods
3. **Gap calculation**: Compute the absolute magnitude gap between the two, adjusted for effective multiplicity

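
A minimal sketch of the gap calculation, with the assumptions stated up front: performance is measured by the t-statistic of mean P&L, the gap is the absolute difference between the selected specification's in-sample and walk-forward statistics, and the multiplicity benchmark is the leading-order expected maximum `sqrt(2 ln N_eff)` of `N_eff` independent null statistics. The paper's exact estimator may differ, and all names are hypothetical.

```python
import math
import statistics


def t_stat(xs):
    """t-statistic of the sample mean against zero."""
    return statistics.fmean(xs) / (statistics.stdev(xs) / math.sqrt(len(xs)))


def selection_gap(pnl_by_spec, n_train, n_eff):
    """Compare the in-sample winner with its own walk-forward realization.

    pnl_by_spec: dict mapping a specification name to its per-period P&L list.
    n_train: number of leading periods used for selection.
    n_eff: assumed effective number of independent specifications (>= 1).
    """
    in_sample = {k: t_stat(v[:n_train]) for k, v in pnl_by_spec.items()}
    winner = max(in_sample, key=in_sample.get)
    t_is = in_sample[winner]
    t_wf = t_stat(pnl_by_spec[winner][n_train:])  # disjoint walk-forward window
    null_max = math.sqrt(2.0 * math.log(n_eff)) if n_eff > 1 else 0.0
    return {
        "winner": winner,
        "in_sample_t": t_is,
        "walk_forward_t": t_wf,
        "gap": abs(t_is - t_wf),
        "in_sample_excess_over_null_max": t_is - null_max,
    }


if __name__ == "__main__":
    pnl = {
        "a": [1.0, 2.0, 1.0, 2.0, -1.0, 1.0, -1.0, 1.0],
        "b": [-1.0, -2.0, -1.0, -2.0, 1.0, 2.0, 1.0, 2.0],
    }
    print(selection_gap(pnl, n_train=4, n_eff=2))
```

A large gap alongside a walk-forward statistic near zero suggests the in-sample result was mostly produced by selection.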
## Empirical Findings

### Case Studies

The paper presents empirical case studies demonstrating that many apparent findings in financial machine learning represent methodological artifacts rather than genuine predictability.

### Implications

1. **Methodological rigor**: Financial machine learning needs robust validation frameworks
2. **Publication bias**: Positive results tend to be published without proper falsification
3. **Replication crisis**: The field faces challenges similar to those in other empirical sciences

## Technical Contributions

### 1. QuantAudit R Package

The author plans to release QuantAudit, an R package implementing the falsification audit framework, together with full replication scripts, upon journal publication.

### 2. Statistical Framework

- Extreme-value theory for correlated searches
- Effective multiplicity adjustments
- Walk-forward validation protocols

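
For the walk-forward protocol, a generic rolling-origin split generator is sketched below. It is a standard construction, not code from the paper, and the parameter names are hypothetical. Each test window starts strictly after its training window ends, so in-sample selection and out-of-sample evaluation use disjoint data.

```python
def walk_forward_splits(n_obs, train_size, test_size, expanding=False):
    """Yield (train_indices, test_indices) pairs in time order.

    With expanding=True the training window always starts at 0;
    otherwise it rolls forward with a fixed length of train_size.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train_end = start + train_size
        train_start = 0 if expanding else start
        yield range(train_start, train_end), range(train_end, train_end + test_size)
        start += test_size


if __name__ == "__main__":
    for train, test in walk_forward_splits(n_obs=10, train_size=4, test_size=2):
        print(list(train), list(test))
```

Stepping the origin forward by `test_size` keeps the test windows non-overlapping, so each observation is scored out of sample at most once.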
### 3. Simulation Studies

Simulations validate the extreme-value scaling under correlated searches and demonstrate the framework's detection power when genuine predictive structure is present.

## Related Concepts

- [[cramer-rao-lower-bound]] - Theoretical bounds on parameter estimation
- [[computerized-adaptive-testing]] - Adaptive testing methodologies
- [[symbolic-regression]] - Machine learning for discovering mathematical expressions
- [[formal-verification]] - Formal methods for validation

## References

- arXiv: https://arxiv.org/abs/2604.15531
- PDF: https://arxiv.org/pdf/2604.15531
- HTML: https://arxiv.org/html/2604.15531v1

## BibTeX

```bibtex
@article{nikolopoulos2026spurious,
  title={Spurious Predictability in Financial Machine Learning},
  author={Nikolopoulos, Sotirios D.},
  journal={arXiv preprint arXiv:2604.15531},
  year={2026}
}
```