Which economic variables are most important for predicting stock returns? In their October 2018 paper entitled “Sparse Macro Factors”, David Rapach and Guofu Zhou apply machine learning, in the form of sparse principal component analysis (PCA), to isolate which of 120 economic variables from the FRED-MD database most influence stocks. These variables span output/income, labor market, housing, consumption, orders/inventories, money/credit, yields/exchange rates and inflation. As a preliminary step, they adjust raw economic variables by, where necessary: (1) transforming them to produce stationary series; and, (2) adjusting for reporting lags of one or two months. They next execute sparse PCA, which sets small component weights to zero, thereby facilitating interpretation of results without sacrificing much predictive power. For comparison, they also extract the first 10 conventional principal components from the same variables. Finally, they use 202 stock portfolios to estimate the influence of sparse and conventional principal components on the cross section of stock returns. Using monthly data for the 120 economic variables and 202 stock portfolios during February 1960 through June 2018, *they find that:*

- Conventional principal components are each linear combinations of all 120 economic variables, making them difficult to interpret.
- In contrast, sparse principal components are relatively simple, with the first ten clearly relating to yields, inflation, production, housing, employment, yield spreads, wages, optimism, money and credit.
- Only one conventional principal component significantly influences the cross section of stock returns.
- Three sparse principal components significantly influence the cross section of stock returns, as follows:
  - Yields (nominal interest rates).
  - Housing (housing starts, new private housing permits and real estate loans).
  - Optimism (real personal income and consumption, retail sales, help wanted, overtime and new orders for durable goods).

- Mimicking factor portfolios for these three sparse principal components generate gross annualized Sharpe ratios of 0.85, 1.02 and 0.53, respectively.
- A 4-factor model of stock returns comprising the market factor and mimicking portfolio returns for the yields, housing and optimism factors performs similarly to or better than:
  - The Carhart 4-factor model (market, size, book-to-market, momentum).
  - The Fama-French 5-factor model (market, size, book-to-market, profitability, investment).
  - The q-factor model (market, size, investment and profitability).
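
To make the contrast between conventional and sparse principal components concrete, the following is a minimal Python sketch on synthetic data. It is not the estimator used in the paper: it substitutes a simple power iteration with soft-thresholding for a full sparse PCA routine, and the data, the function names, the 12-variable panel and the `threshold` value are all assumptions chosen for illustration. Four of the synthetic variables share a common driver and the rest are pure noise.

```python
import numpy as np


def standardize(X):
    """Scale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def leading_pc(X):
    """Weights of the first conventional principal component (via SVD)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[0]


def leading_sparse_pc(X, threshold, n_iter=100):
    """Illustrative sparse component: power iteration on the correlation
    matrix, soft-thresholding the weights at every step so that small
    weights become exactly zero. A stand-in, not the paper's method."""
    corr = X.T @ X / len(X)
    w = leading_pc(X)  # warm start from the conventional component
    for _ in range(n_iter):
        w = corr @ w
        w = np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)
        norm = np.linalg.norm(w)
        if norm == 0.0:
            raise ValueError("threshold removed every weight; lower it")
        w = w / norm
    return w


def explained_variance(X, w):
    """Variance of the component formed with unit-length weights w."""
    return float(w @ (X.T @ X / len(X)) @ w)


# Synthetic panel (an assumption, not FRED-MD): 600 months, 12 variables,
# the first 4 of which share one common driver.
rng = np.random.default_rng(0)
n_months, n_vars, n_linked = 600, 12, 4
common = rng.standard_normal((n_months, 1))
X = rng.standard_normal((n_months, n_vars))
X[:, :n_linked] = common + 0.5 * X[:, :n_linked]
X = standardize(X)

dense_w = leading_pc(X)
sparse_w = leading_sparse_pc(X, threshold=0.5)

print("nonzero weights, conventional:", np.count_nonzero(dense_w))
print("nonzero weights, sparse:      ", np.count_nonzero(sparse_w))
print("variance retained by sparse:  ",
      explained_variance(X, sparse_w) / explained_variance(X, dense_w))
```

In this toy setting, the conventional component spreads small weights across the noise variables, while the thresholded version keeps only the variables that move together, which is the interpretability benefit described above. How much variance is given up depends on the threshold, so that value should be treated as a tuning choice rather than a recommendation.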

In summary, *evidence from U.S. data indicates that carefully curated and screened economic variables related to yields, housing and optimism are as effective as widely used stock/firm characteristics in explaining/predicting stock returns.*

Cautions regarding findings include:

- Some economic variables may have short-term revisions to replace the most recent estimates and/or long-term revisions for adjustments such as seasonality. In other words, source economic data may impound some look-ahead bias.
- The stock portfolios used for testing are essentially indexes, which do not account for trading frictions required for monthly portfolio reformation or for ongoing shorting costs.
- Nor do mimicking factor portfolios for sparse principal components of economic variables account for trading frictions and shorting costs required for monthly reformation, such that the net factor Sharpe ratios would be lower than the reported gross Sharpe ratios.
- The methodology described is beyond the reach of most investors, who would bear fees for delegating the process to an investment/fund manager.
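
The gap between gross and net Sharpe ratios noted in the cautions above can be illustrated with a short Python sketch. The monthly returns, turnover, trading cost and shorting fee below are hypothetical numbers, not values from the paper, and the sketch assumes a constant monthly cost drag, which is a simplification.

```python
import math
import statistics


def annualized_sharpe(monthly_excess_returns):
    """Annualized Sharpe ratio from monthly excess returns:
    mean / sample standard deviation, scaled by sqrt(12)."""
    mean = statistics.fmean(monthly_excess_returns)
    stdev = statistics.stdev(monthly_excess_returns)
    return mean / stdev * math.sqrt(12)


def net_of_costs(monthly_excess_returns, monthly_turnover,
                 cost_per_unit_turnover, monthly_short_fee):
    """Subtract an assumed constant monthly drag from each return.
    Real frictions vary over time; this is a simplification."""
    drag = monthly_turnover * cost_per_unit_turnover + monthly_short_fee
    return [r - drag for r in monthly_excess_returns]


# Hypothetical gross monthly excess returns for a long-short factor portfolio.
gross = [0.021, -0.012, 0.015, 0.004, -0.008, 0.018,
         0.009, -0.015, 0.012, 0.006, -0.003, 0.011]

# Hypothetical frictions: 50% monthly turnover at 0.2% per unit traded,
# plus a 0.05% monthly shorting fee.
net = net_of_costs(gross, monthly_turnover=0.5,
                   cost_per_unit_turnover=0.002, monthly_short_fee=0.0005)

print("gross annualized Sharpe:", round(annualized_sharpe(gross), 2))
print("net annualized Sharpe:  ", round(annualized_sharpe(net), 2))
```

Because a constant drag lowers the mean return without changing volatility, the net Sharpe ratio is always below the gross one in this sketch. The size of the reduction depends entirely on the assumed cost inputs.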

For other approaches, see “Enhancing Stock Market Prediction with Distilled Economic Variables” and “Following the ‘Hot’ Economic Indicators”.