It Can’t All Be Data Snooping?

December 7, 2018 • Posted in Big Ideas

Could all 300+ published factors that predict stock returns (such as size, value, profitability, investment, momentum…) derive from data snooping? In his October 2018 paper entitled “The Limits of Data Mining: A Thought Experiment”, Andrew Chen estimates how much data snooping would be required to “discover” all these factors by pure luck. Specifically, he calibrates a pure-luck model which assumes that the probability of publishing a factor discovery increases with how convincing the discovery is, as measured by its t-statistic. Using this model, he estimates the number of unpublished factor studies required for the published set to be attributable to pure luck. He considers two sets of factor t-statistics: (1) 156 from factor replications via equal-weighted long-short portfolios of the extreme fifths (quintiles) of factor stock sorts; and, (2) a hand-collected set of 316 from published factor studies. Using this approach and these two sets of t-statistics, he finds that: (more…)
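To give a rough sense of the arithmetic behind such a thought experiment, here is a minimal sketch. It is not the model from the paper. It assumes that pure-luck t-statistics are independent draws from a standard normal distribution, and it replaces the smooth publication probability described above with a hard cutoff: a study is "published" only if its absolute t-statistic exceeds a threshold. The function names and the example thresholds are hypothetical, chosen only for illustration.

```python
import math


def two_sided_tail_prob(t_threshold):
    """P(|Z| > t_threshold) for a standard normal Z (the pure-luck assumption)."""
    return math.erfc(t_threshold / math.sqrt(2.0))


def attempts_needed(n_published, t_threshold):
    """Expected number of pure-luck factor tests needed so that n_published
    of them clear the cutoff (ASSUMPTION: hard cutoff, independent tests)."""
    return n_published / two_sided_tail_prob(t_threshold)


if __name__ == "__main__":
    # 316 is the size of the hand-collected set of published t-statistics.
    for t in (2.0, 3.0, 4.0):
        print(f"cutoff |t| > {t}: about {attempts_needed(316, t):,.0f} tests")
```

Under these simplifying assumptions, the required number of tests grows very quickly as the cutoff rises, which is the intuition the thought experiment relies on.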
