Which Predictors Make Machine Learning Work?
December 12, 2023 - Investing Expertise
With stock portfolio construction increasingly based on “black box” machine learning models with very large numbers of inputs, how can investors decide whether portfolio recommendations make sense? In their November 2023 paper entitled “The Anatomy of Machine Learning-Based Portfolio Performance”, Philippe Coulombe, David Rapach, Christian Montes Schütte and Sander Schwenk-Nebbe describe a way to use Shapley values to estimate contributions of groups of related inputs to machine learning-based portfolio performance. Their approach applies to any fitted prediction model (or ensemble of models) used to forecast asset returns and construct a portfolio based on the forecasts. They illustrate their approach on an XGBoost machine learning model that each month:
- Uses 207 firm characteristics to forecast next-month returns of associated stocks.
- Excludes stocks in the bottom 20% of NYSE market capitalizations.
- Sorts surviving stocks into fifths, or quintiles, based on forecasted returns.
- Re-forms a hedge portfolio that is long (short) the value-weighted top (bottom) quintile.
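The monthly sort-and-hedge steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the `Stock` fields, the `is_nyse` flag, the helper names and the toy numbers are all hypothetical, the return forecasts are taken as given rather than produced by XGBoost, and the size screen is read as "drop stocks below the 20th percentile of NYSE market capitalizations."

```python
from dataclasses import dataclass


@dataclass
class Stock:
    ticker: str
    market_cap: float     # used for the size screen and for value weights
    forecast: float       # model forecast of next-month return (taken as given)
    realized: float       # realized next-month return
    is_nyse: bool = True  # NYSE stocks define the size breakpoint (assumption)


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 1]) of a non-empty list."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def value_weighted_return(stocks):
    """Market-cap-weighted average of realized returns."""
    total_cap = sum(s.market_cap for s in stocks)
    return sum(s.market_cap * s.realized for s in stocks) / total_cap


def long_short_return(stocks, size_cutoff=0.20, n_buckets=5):
    """One month's return of the top-minus-bottom quintile hedge portfolio."""
    # Step 1: size screen based on the NYSE market-cap breakpoint.
    nyse_caps = [s.market_cap for s in stocks if s.is_nyse]
    breakpoint_cap = percentile(nyse_caps, size_cutoff)
    survivors = [s for s in stocks if s.market_cap >= breakpoint_cap]

    # Step 2: rank survivors by forecasted return and cut into quintiles.
    ranked = sorted(survivors, key=lambda s: s.forecast)
    k = len(ranked) // n_buckets
    if k == 0:
        raise ValueError("not enough stocks to form quintiles")
    bottom, top = ranked[:k], ranked[-k:]

    # Step 3: long the value-weighted top quintile, short the bottom one.
    return value_weighted_return(top) - value_weighted_return(bottom)


# Toy universe: ten hypothetical stocks, all treated as NYSE-listed.
universe = [
    Stock(f"S{i}", market_cap=float(i), forecast=0.01 * i, realized=0.01 * i)
    for i in range(1, 11)
]
print(round(long_short_return(universe), 4))
```

Repeating this calculation each month with fresh forecasts yields the return series whose performance the authors then decompose by input group.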
They then assign each of the 207 inputs to one of 20 groups based on similarities and estimate the contribution of each input group to portfolio performance. Using 207 monthly firm/stock characteristics for all listed U.S. firms and the monthly risk-free rate during January 1960 through December 2021, with portfolio testing commencing January 1973, they find that: