Spurious Predictability in Financial Machine Learning

Authors: Sotirios D. Nikolopoulos
arXiv ID: 2604.15531v1
Published: 2026-04-16
Categories: q-fin.ST, stat.ME, stat.ML
Comments: 49 pages, 10 figures. The QuantAudit R package and full replication scripts will be made publicly available upon journal publication
Subjects: Statistical Finance (q-fin.ST); Methodology (stat.ME); Machine Learning (stat.ML)
MSC classes: 91G70, 62P20, 62M20, 68T05
DOI: https://doi.org/10.48550/arXiv.2604.15531

Abstract

Adaptive specification search generates statistically significant backtests even under martingale-difference nulls. We introduce a falsification audit testing complete predictive workflows against synthetic reference classes, including zero-predictability environments and microstructure placebos. Workflows generating significant walk-forward evidence in these environments are falsified. For passing workflows, we quantify selection-induced performance inflation using an absolute magnitude gap linking optimized in-sample evidence to disjoint walk-forward realizations, adjusted for effective multiplicity. Simulations validate extreme-value scaling under correlated searches and demonstrate detection power under genuine structure. Empirical case studies confirm that many apparent findings represent methodological artifacts rather than genuine predictability.

Key Concepts

1. Spurious Predictability

The tendency of adaptive specification search (data mining, model selection, hyperparameter tuning) to produce statistically significant backtests even when the data-generating process has no predictive structure, that is, when returns follow a martingale-difference null.
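The effect is easy to reproduce in a toy setting. The sketch below is not taken from the paper: it assumes i.i.d. Gaussian returns as the martingale-difference null and uses random long/short position series as stand-ins for "specifications" (both are illustrative choices, and `best_of_search_tstat` is a hypothetical name).

```python
import numpy as np

def best_of_search_tstat(n_obs=1000, n_specs=200, seed=0):
    """t-statistics of many random long/short rules applied to pure-noise returns."""
    rng = np.random.default_rng(seed)
    # i.i.d. noise: a simple martingale-difference sequence (assumed null)
    returns = rng.standard_normal(n_obs)
    # Each hypothetical "specification" is a random +1/-1 position series
    # drawn independently of the returns, so none has real skill.
    positions = rng.choice([-1.0, 1.0], size=(n_specs, n_obs))
    pnl = positions * returns  # per-period strategy returns, shape (n_specs, n_obs)
    return pnl.mean(axis=1) / (pnl.std(axis=1, ddof=1) / np.sqrt(n_obs))

t_stats = best_of_search_tstat()
print("one pre-specified rule, |t| =", abs(t_stats[0]))
print("best of 200 rules,      |t| =", np.abs(t_stats).max())
```

Reporting only the best rule turns an ordinary draw from the null into an apparently significant backtest, which is the inflation the audit is designed to expose.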

2. Falsification Audit

A methodological framework for testing complete predictive workflows against synthetic reference classes:

  • Zero-predictability environments: Simulated data with no genuine predictive structure
  • Microstructure placebos: Realistic but non-predictive market microstructure features
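As a rough illustration of what such reference classes could look like, the sketch below builds one of each kind. The specific generators are assumptions, not the paper's: the zero-predictability environment is i.i.d. Gaussian noise, and the placebo is a Roll-style bid-ask bounce, in which observed returns show negative autocorrelation even though the efficient price is a random walk. All function names and parameter values are hypothetical.

```python
import numpy as np

def zero_predictability_returns(n_obs, vol=0.01, seed=0):
    """i.i.d. Gaussian returns: no predictive structure by construction."""
    rng = np.random.default_rng(seed)
    return vol * rng.standard_normal(n_obs)

def bid_ask_bounce_returns(n_obs, vol=0.01, half_spread=0.005, seed=0):
    """Observed returns when trades hit the bid or ask at random (Roll-style bounce)."""
    rng = np.random.default_rng(seed)
    efficient = np.cumsum(vol * rng.standard_normal(n_obs + 1))  # random-walk log price
    side = rng.choice([-1.0, 1.0], size=n_obs + 1)               # i.i.d. trade direction
    observed = efficient + half_spread * side
    return np.diff(observed)

def lag1_autocorr(x):
    """Sample first-order autocorrelation."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

null = zero_predictability_returns(20_000, seed=1)
placebo = bid_ask_bounce_returns(20_000, seed=1)
print("lag-1 autocorrelation, null:   ", lag1_autocorr(null))
print("lag-1 autocorrelation, placebo:", lag1_autocorr(placebo))
```

A workflow that "discovers" mean reversion in the placebo is picking up the bounce rather than a property of the efficient price, which is why such environments are useful as controls.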

3. Selection-Induced Performance Inflation

The bias introduced by model selection and optimization, quantified as the gap between:

  • Optimized in-sample performance
  • Out-of-sample (walk-forward) performance on disjoint data

4. Effective Multiplicity

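An adjustment for the multiple-comparisons problem in adaptive specification search. It accounts for correlated search paths and dependencies between model specifications, so that the multiplicity used for inference reflects the effective number of trials rather than the raw count of specifications.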
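The summary does not state how the paper estimates effective multiplicity. One generic heuristic, shown here purely as an assumption, is the participation ratio of the eigenvalues of the correlation matrix of the candidate strategies' returns. The companion function gives the leading-order extreme-value benchmark sqrt(2 ln N) for the largest of N independent standard normal statistics. Both function names are hypothetical.

```python
import numpy as np

def effective_trials(strategy_returns):
    """Participation-ratio estimate of independent trials.

    strategy_returns: array of shape (n_obs, n_specs), one column per specification.
    """
    corr = np.corrcoef(strategy_returns, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)
    return float(eigvals.sum() ** 2 / np.sum(eigvals ** 2))

def null_max_benchmark(n_eff):
    """Leading-order size of the largest of n_eff independent N(0, 1) statistics."""
    return float(np.sqrt(2.0 * np.log(max(n_eff, 1.0))))

rng = np.random.default_rng(0)
n_obs, n_specs, rho = 2000, 50, 0.8
independent = rng.standard_normal((n_obs, n_specs))
common = rng.standard_normal((n_obs, 1))  # shared factor inducing pairwise correlation rho
correlated = np.sqrt(rho) * common + np.sqrt(1 - rho) * rng.standard_normal((n_obs, n_specs))

for name, data in [("independent", independent), ("correlated", correlated)]:
    n_eff = effective_trials(data)
    print(name, "n_eff =", round(n_eff, 2), "benchmark =", round(null_max_benchmark(n_eff), 2))
```

With 50 nearly independent specifications the estimate stays close to 50, while a strong common factor collapses it toward a handful of effective trials and lowers the benchmark accordingly.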

Methodology

Falsification Framework

  1. Reference class construction: Create synthetic environments with known properties
  2. Workflow testing: Apply the complete predictive workflow to reference classes
  3. Falsification criteria: Reject workflows that show significant predictive power in zero-predictability environments
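The three steps above can be sketched as a small audit loop. Everything concrete here is an assumption made for illustration: the null is i.i.d. Gaussian noise, the two workflows are toy momentum searches, and the decision rule (reject when the null rejection rate exceeds the nominal level by more than four binomial standard errors) is a hypothetical tolerance, not the paper's criterion.

```python
import numpy as np

LOOKBACKS = (1, 2, 3, 5, 8, 13, 21, 34)  # hypothetical search grid

def t_stat(x):
    return x.mean() / (x.std(ddof=1) / np.sqrt(len(x)))

def momentum_pnl(returns, lookback):
    """Sign-of-trailing-sum rule; the position at t uses only returns before t."""
    n = len(returns)
    csum = np.concatenate(([0.0], np.cumsum(returns)))
    trailing = csum[lookback:n] - csum[:n - lookback]  # sum of returns[t-lookback:t]
    return np.sign(trailing) * returns[lookback:]

def honest_workflow(returns):
    """Select the lookback on the first half, report the t-stat on the disjoint second half."""
    half = len(returns) // 2
    train, test = returns[:half], returns[half:]
    best = max(LOOKBACKS, key=lambda k: t_stat(momentum_pnl(train, k)))
    return t_stat(momentum_pnl(test, best))

def leaky_workflow(returns):
    """Select the lookback on the evaluation half itself, so the reported t-stat is a maximum."""
    test = returns[len(returns) // 2:]
    return max(t_stat(momentum_pnl(test, k)) for k in LOOKBACKS)

def audit(workflow, n_sims=400, n_obs=600, crit=1.645, nominal=0.05, seed=0):
    """Run the whole workflow on zero-predictability data and count 'significant' results."""
    rng = np.random.default_rng(seed)
    hits = sum(workflow(rng.standard_normal(n_obs)) > crit for _ in range(n_sims))
    rate = hits / n_sims
    tolerance = 4.0 * np.sqrt(nominal * (1.0 - nominal) / n_sims)  # assumed slack
    return rate, bool(rate > nominal + tolerance)

for wf in (honest_workflow, leaky_workflow):
    rate, falsified = audit(wf)
    print(wf.__name__, "null rejection rate:", rate, "falsified:", falsified)
```

The point of auditing the complete workflow, rather than the final model alone, is that the leak sits in the selection step, which a test of the chosen specification in isolation would not see.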

Performance Gap Quantification

For workflows that pass falsification tests:

  1. In-sample optimization: Record the optimized (best-of-search) performance on the in-sample data
  2. Walk-forward validation: Test on disjoint out-of-sample periods
  3. Gap calculation: Compute absolute magnitude difference adjusted for effective multiplicity
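A minimal sketch of the first two steps and an unadjusted gap follows. It assumes an expanding-window walk-forward scheme and a per-period Sharpe ratio as the performance measure; both are illustrative choices, and `walk_forward_gap` is a hypothetical helper. The effective-multiplicity adjustment is left out because its exact form is not given in this summary.

```python
import numpy as np

def sharpe(x):
    """Per-period Sharpe ratio (mean over standard deviation)."""
    return x.mean() / x.std(ddof=1)

def walk_forward_gap(pnl, n_folds=8):
    """Average gap between optimized in-sample and walk-forward Sharpe ratios.

    pnl: array of shape (n_obs, n_specs) with per-period returns of each candidate.
    """
    n_obs = pnl.shape[0]
    edges = np.linspace(0, n_obs, n_folds + 2, dtype=int)  # first block is train-only
    gaps = []
    for start, stop in zip(edges[1:-1], edges[2:]):
        train, test = pnl[:start], pnl[start:stop]  # expanding window, disjoint test block
        in_sample = train.mean(axis=0) / train.std(axis=0, ddof=1)
        best = int(np.argmax(in_sample))            # specification chosen by the search
        gaps.append(in_sample[best] - sharpe(test[:, best]))
    return float(np.mean(gaps))

rng = np.random.default_rng(0)
noise_pnl = rng.standard_normal((900, 200))  # 200 candidates with no real edge
print("mean Sharpe gap on pure noise:", walk_forward_gap(noise_pnl))
```

On pure noise the gap is expected to be positive, because the in-sample figure is a maximum over candidates while the walk-forward figure is an ordinary draw.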

Empirical Findings

Case Studies

The paper presents empirical case studies demonstrating that many apparent findings in financial machine learning represent methodological artifacts rather than genuine predictability.

Implications

  1. Methodological rigor: Need for robust validation frameworks
  2. Publication bias: Tendency to publish positive results without proper falsification
  3. Replication crisis: Challenges similar to those in other empirical sciences

Technical Contributions

1. QuantAudit R Package

The author plans to release QuantAudit, an R package implementing the falsification audit framework, together with full replication scripts, upon journal publication.

2. Statistical Framework

  • Extreme-value theory for correlated searches
  • Effective multiplicity adjustments
  • Walk-forward validation protocols

3. Simulation Studies

Simulations validate the extreme-value scaling under correlated searches and demonstrate the framework's detection power when genuine predictive structure is present.
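A toy version of such a power check is sketched below. The data-generating process (an AR(1) return series), the fixed rule (follow yesterday's sign), and the sample sizes are all assumptions chosen for illustration and are not the paper's simulation design.

```python
import numpy as np

def ar1_returns(n_sims, n_obs, phi, rng):
    """Simulate n_sims independent AR(1) return paths with coefficient phi."""
    eps = rng.standard_normal((n_sims, n_obs))
    r = np.empty_like(eps)
    r[:, 0] = eps[:, 0]
    for t in range(1, n_obs):
        r[:, t] = phi * r[:, t - 1] + eps[:, t]
    return r

def rejection_rate(phi, n_sims=300, n_obs=1000, crit=1.645, seed=0):
    """Share of simulated paths on which a fixed one-lag rule looks significant."""
    rng = np.random.default_rng(seed)
    r = ar1_returns(n_sims, n_obs, phi, rng)
    pnl = np.sign(r[:, :-1]) * r[:, 1:]  # yesterday's sign applied to today's return
    t = pnl.mean(axis=1) / (pnl.std(axis=1, ddof=1) / np.sqrt(pnl.shape[1]))
    return float(np.mean(t > crit))

print("rejection rate under the null (phi = 0):  ", rejection_rate(0.0))
print("rejection rate with structure (phi = 0.1):", rejection_rate(0.1))
```

A procedure worth trusting should keep the first number near its nominal level while the second rises well above it; an audit that rejected everything would pass the first check and fail the second.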

References

BibTeX

@article{nikolopoulos2026spurious,
  title={Spurious Predictability in Financial Machine Learning},
  author={Nikolopoulos, Sotirios D.},
  journal={arXiv preprint arXiv:2604.15531},
  year={2026}
}