Objective research to aid investing decisions

Search Results

Showing results 11 - 20 of 32 for the search term: overfitting.

True vs. Snooped Sharpe Ratios

Data snooping bias is pervasive in published research and quantitative investment strategies. Should investors resign themselves to the consequence that investment managers/funds offer products picked mostly on past luck? In his May 2018 presentation package entitled “How the Sharpe Ratio Died, and Came Back to Life”, Marcos Lopez de Prado introduces an approach to Sharpe ratio estimation via backtesting that would enable academia, regulators and investors to distinguish between strategies that probably work and those that probably do not. Based on the evolution of Sharpe ratio estimation approaches, he concludes that: Keep Reading
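As a rough illustration of why snooped Sharpe ratios overstate true ones, the sketch below simulates strategies with no edge at all and reports the best in-sample Sharpe ratio among them. It is not the estimation approach from the presentation. The function names, the 4% monthly volatility and the 120-month sample are assumptions chosen only for illustration.

```python
import numpy as np

def annualized_sharpe(returns, periods_per_year=12):
    """Annualized Sharpe ratio of a series of excess returns."""
    returns = np.asarray(returns, dtype=float)
    return np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1)

def best_snooped_sharpe(n_strategies, n_periods, rng):
    """Best in-sample Sharpe ratio among strategies that have zero true edge."""
    # Each row is one hypothetical strategy with pure-noise monthly returns.
    noise = rng.normal(loc=0.0, scale=0.04, size=(n_strategies, n_periods))
    sharpes = np.array([annualized_sharpe(row) for row in noise])
    return sharpes.max()

rng = np.random.default_rng(0)
single = best_snooped_sharpe(1, 120, rng)      # one honest trial
snooped = best_snooped_sharpe(1000, 120, rng)  # best of 1,000 trials
print(f"one trial: {single:.2f}, best of 1000 trials: {snooped:.2f}")
```

Every simulated strategy has a true Sharpe ratio of zero, so any attractive value reported for the best of many trials reflects selection alone.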

Data Perturb/Replay to Test Strategy Sensitivities

How can investment advisors apply historical asset performance data to address client views regarding future market/economic conditions? In their February 2018 paper entitled “Matching Market Views and Strategies: A New Risk Framework for Optimal Selection”, Adil Reghai and Gaël Riboulet present an approach for quantitatively relating historical asset return statistics to investor views. They intend this approach to address the widespread problem of backtest overfitting, whereby researchers discover good performance by fitting strategy features to noise in an historical dataset. Specifically, they:

  1. Collect historical return data for assets of interest and run backtests of alternative strategies on these data.
  2. Perturb historical average return, volatility, skewness and pairwise correlations up or down for these assets and rerun backtests of alternative strategies on multiple perturbations.
  3. Analyze relationships between directions of these perturbations and performance of alternative strategies.
  4. Match investor views first to directions of perturbations and then to strategies responding favorably (or least unfavorably) to these directions.

They apply this approach to generic algorithmic strategies (equal weight, momentum, mean reversion and carry). Based on mathematical derivations and examples, they conclude that: Keep Reading
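The four steps above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' framework: it perturbs only average return and volatility (not skewness or correlations), uses synthetic data in place of real history, and the `perturb`, `equal_weight` and `momentum` helpers and the scenario names are hypothetical.

```python
import numpy as np

def perturb(returns, mean_shift=0.0, vol_scale=1.0):
    """Shift each asset's average return and rescale its volatility."""
    mu = returns.mean(axis=0)
    return mu + mean_shift + vol_scale * (returns - mu)

def equal_weight(returns):
    """Monthly returns of an equally weighted portfolio."""
    return returns.mean(axis=1)

def momentum(returns, lookback=12):
    """Each month hold the asset with the best trailing-lookback return."""
    out = []
    for t in range(lookback, len(returns)):
        winner = returns[t - lookback:t].sum(axis=0).argmax()
        out.append(returns[t, winner])
    return np.array(out)

def sharpe(r):
    """Per-period (not annualized) Sharpe ratio."""
    return r.mean() / r.std(ddof=1)

# Step 1: historical data (synthetic stand-in: 240 months, 4 assets).
rng = np.random.default_rng(1)
history = rng.normal(0.005, 0.04, size=(240, 4))

# Step 2: rerun each strategy on each perturbation (mean_shift, vol_scale).
scenarios = {"base": (0.0, 1.0), "higher vol": (0.0, 1.5), "lower mean": (-0.003, 1.0)}
strategies = {"equal weight": equal_weight, "momentum": momentum}
results = {
    (s_name, v_name): sharpe(strategy(perturb(history, *params)))
    for s_name, strategy in strategies.items()
    for v_name, params in scenarios.items()
}

# Steps 3 and 4: match a view (here, "higher vol") to the best-responding strategy.
view = "higher vol"
best = max(strategies, key=lambda s: results[(s, view)])
print(view, "->", best)
```

An investor who expects higher volatility would, in this toy setup, pick whichever strategy holds up best under the "higher vol" perturbation.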

Industry Rotation Based on Advanced Regression Techniques

Can advanced regression techniques identify monthly cross-industry lead-lag return relationships that usefully indicate an industry rotation strategy? In their January 2018 paper entitled “Dynamic Return Dependencies Across Industries: A Machine Learning Approach”, David Rapach, Jack Strauss, Jun Tu and Guofu Zhou examine dynamic relationships between past and future returns (lead-lag) across 30 U.S. industries. To guard against overfitting the data, they employ a machine learning regression approach that combines a least absolute shrinkage and selection operator (LASSO) and ordinary least squares (OLS). Their approach allows each industry’s return to respond to lagged returns of all 30 industries. They assess economic value of findings via a long-short industry rotation hedge portfolio that is each month long (short) the fifth, or quintile, of industries with the highest (lowest) predicted returns for the next month based on inception-to-date monthly calculations. They consider three benchmark hedge portfolios based on: (1) historical average returns of the industries; (2) an OLS-only approach; and, (3) a cross-sectional, or relative, momentum approach that is each month long (short) the quintile of industries with the highest (lowest) returns over the past 12 months. Using monthly returns for 30 value-weighted U.S. industry groups during 1960 through 2016, they find that:

Keep Reading
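One common way to combine LASSO and OLS is to let LASSO choose which lagged industry returns matter and then re-estimate the chosen coefficients by OLS. The sketch below assumes that two-stage reading and does not reproduce the paper's exact procedure. The coordinate-descent solver, the penalty `alpha=0.1` and the synthetic data with two true lead-lag links are all illustrative assumptions.

```python
import numpy as np

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + alpha*||b||_1 by cyclic coordinate descent.

    Assumes the columns of X and y are already demeaned (no intercept).
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual that excludes predictor j's current contribution.
            resid = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ resid / n
            # Soft-thresholding sets weak coefficients exactly to zero.
            beta[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
    return beta

def lasso_then_ols(X, y, alpha):
    """Select predictors with LASSO, then refit the survivors by OLS."""
    selected = np.flatnonzero(lasso_coordinate_descent(X, y, alpha))
    beta = np.zeros(X.shape[1])
    if selected.size:
        beta[selected] = np.linalg.lstsq(X[:, selected], y, rcond=None)[0]
    return beta, selected

rng = np.random.default_rng(2)
n_months, n_industries = 300, 30
lagged = rng.normal(size=(n_months, n_industries))  # synthetic lagged industry returns
lagged -= lagged.mean(axis=0)
true_beta = np.zeros(n_industries)
true_beta[[3, 17]] = [0.5, -0.4]                    # only two true lead-lag links
target = lagged @ true_beta + rng.normal(scale=0.5, size=n_months)
target -= target.mean()

beta_hat, selected = lasso_then_ols(lagged, target, alpha=0.1)
print("selected predictors:", selected)
```

The OLS refit removes the shrinkage that LASSO applies to the coefficients it keeps, while the selection step still limits how many of the 30 lagged returns can influence the forecast.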

Chess, Jeopardy, Poker, Go and… Investing?

How can machine investors beat humans? In “Financial Machine Learning as a Distinct Subject”, the introductory chapter of his January 2018 book “Advances in Financial Machine Learning”, Marcos Lopez de Prado prescribes success factors for machine learning as applied to finance. He intends that the book: (1) bridge the divide between academia and industry by sharing experience-based knowledge in a rigorous manner; (2) promote a role for finance that suppresses guessing and gambling; and, (3) unravel the complexities of using machine learning in finance. He intends that investment professionals with a strong machine learning background apply the knowledge to modernize finance and deliver actual value to investors. Based on 20 years of experience, including management of several multi-billion dollar funds for institutional investors using machine learning algorithms, he concludes that: Keep Reading

10 Steps to Becoming a Better Quant

Want your machine to excel in investing? In his January 2018 paper entitled “The 10 Reasons Most Machine Learning Funds Fail”, Marcos Lopez de Prado examines common errors made by machine learning experts when tackling financial data and proposes correctives. Based on more than two decades of experience, he concludes that: Keep Reading

Predicted Factor/Smart Beta Alphas

Which equity factors have high and low expected returns? In their February 2017 paper entitled “Forecasting Factor and Smart Beta Returns (Hint: History Is Worse than Useless)”, Robert Arnott, Noah Beck and Vitali Kalesnik evaluate attractiveness of eight widely used stock factors. They measure alpha for each factor conventionally via a portfolio that is long (short) stocks with factor values having high (low) expected returns, reformed systematically. They compare factor alpha forecasting abilities of six models:

  1. Factor return for the last five years.
  2. Past return over the very long term (multiple decades), a conventionally used assumption.
  3. Simple relative valuation (average valuation of long-side stocks divided by average valuation of short-side stocks), comparing current level to its past average.
  4. Relative valuation with shrunk parameters to moderate forecasts by dampening overfitting to past data.
  5. Relative valuation with shrunk parameters and variance reduction, further moderating Model 4 by halving its outputs.
  6. Relative valuation with look-ahead full-sample calibration to assess limits of predictability. 

They employ simple benchmark forecasts of zero factor alphas. Using 24 years of specified stock data (January 1967 – December 1990) for model calibrations, about 20 years of data (January 1991 – October 2011) to generate forecasts and the balance of data (through December 2016) to complete forecast accuracy measurements, they find that: Keep Reading

Seven Habits of Highly Ineffective Quants

Why don’t machines rule the financial world? In his September 2017 presentation entitled “The 7 Reasons Most Machine Learning Funds Fail”, Marcos Lopez de Prado explores causes of the high failure rate of quantitative finance firms, particularly those employing machine learning. He then outlines fixes for those failure modes. Based on more than two decades of experience, he concludes that: Keep Reading

Brute Force Stock Trading Signal Discovery

How serious is the snooping bias (p-hacking) derived from brute force mining of stock trading strategy variations? In their August 2017 paper entitled “p-Hacking: Evidence from Two Million Trading Strategies”, Tarun Chordia, Amit Goyal and Alessio Saretto test a large number of hypothetical trading strategies to estimate an upper bound on the seriousness of p-hacking and to estimate the likelihood that a researcher can discover a truly abnormal trading strategy. Specifically, they:

  • Collect historical data for 156 firm accounting and stock price/return variables as available for U.S. common stocks in the top 80% of NYSE market capitalizations with price over $3.
  • Exhaustively construct about 2.1 million trading signals from these variables based on their levels, changes and certain combination ratios.
  • Calculate three measures of trading signal effectiveness:
    1. Gross 6-factor alphas (controlling for market, size, book-to-market, profitability, investment and momentum) of value-weighted, annually reformed hedge portfolios that are long the value-weighted tenth, or decile, of stocks with the highest signal values and short the decile with the lowest.
    2. Linear regressions that test ability of the entire distribution of trading signals to explain future gross returns based on linear relationships.
    3. Gross Sharpe ratios of the hedge portfolios used for alpha calculations.
  • Apply three multiple hypothesis testing methods that account for cross-correlations in signals and returns (family-wise error rate, false discovery rate and false discovery proportion).

They deem a signal effective if it survives both statistical hurdles (alpha t-statistic 3.79 and regression t-statistic 3.12) and has a monthly Sharpe ratio higher than that of the market (0.12). Using monthly values of the 156 specified input variables during 1972 through 2015, they find that:

Keep Reading
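To see why multiple hypothesis testing raises t-statistic hurdles well above the conventional 1.96, the sketch below applies two textbook corrections to simulated signals: Bonferroni (family-wise error rate) and Benjamini-Hochberg (false discovery rate). These are simpler stand-ins, not the methods used in the paper, which account for cross-correlations. The 10,000 simulated t-statistics, the 10 planted real signals and the normal approximation for p-values are assumptions for illustration.

```python
import numpy as np
from math import erfc, sqrt

def two_sided_p(t_stat):
    """Two-sided p-value for a t-statistic under a normal approximation."""
    return erfc(abs(t_stat) / sqrt(2.0))

def bonferroni(p_values, level=0.05):
    """Family-wise error rate control: reject only if p <= level / m."""
    p = np.asarray(p_values)
    return p <= level / p.size

def benjamini_hochberg(p_values, level=0.05):
    """False discovery rate control via the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p_values)
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= level * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.flatnonzero(passed).max()  # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject

rng = np.random.default_rng(3)
# 9,990 signals with no true effect plus 10 planted signals with t = 6.
t_stats = np.concatenate([rng.normal(size=9990), np.full(10, 6.0)])
p_values = np.array([two_sided_p(t) for t in t_stats])

print("naive |t| > 1.96:", int((np.abs(t_stats) > 1.96).sum()))
print("Bonferroni:", int(bonferroni(p_values).sum()))
print("Benjamini-Hochberg:", int(benjamini_hochberg(p_values).sum()))
```

The naive count is dominated by false discoveries from the no-effect signals, whereas both corrections keep the planted signals and discard nearly everything else.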

The Right Math for Analysis of Financial Markets?

Where should investors look for methodological edges in 21st century financial markets? In his brief August 2016 paper entitled “Mathematics and Economics: A Reality Check”, Marcos Lopez de Prado advises finance students (and practitioners) what mathematical/analytical expertise to acquire for successful 21st century investing and trading. Based on his experience with what kinds of analysts and mathematics are most successful in financial markets, he concludes that: Keep Reading

Best Way to Guard Against Investment Strategy Flame-outs?

Can investors avoid strategy flame-outs associated with overly enthusiastic backtesting (overfitting)? In his July 2016 paper entitled “Limitations of Quantitative Claims About Trading Strategy Evaluation”, Michael Harris presents two examples that demonstrate a key limitation of trading strategy backtesting:

  1. U.S. stock market trend following.
  2. U.S. stock market mean reversion.

Specifically, he compares performances of such strategies before and after 1997 to illustrate the interaction of backtesting and change in market conditions. Using daily S&P 500 Index returns (excluding dividends) during January 1950 through December 2015, he finds that: Keep Reading