Does P/E10 (or Cyclically Adjusted Price-Earnings ratio, CAPE) usefully predict U.S. stock market returns? Per Robert Shiller’s data set, P/E10 is the inflation-adjusted S&P Composite Index level divided by the average of monthly inflation-adjusted 12-month trailing earnings of index companies over the last ten years. To investigate its usefulness, we consider in-sample regression/ranking and out-of-sample cumulative performance tests. Using monthly values for the nominal and real S&P Composite Index (calculated as the average of daily closes during the month), associated dividends (smoothed), 12-month trailing real earnings (smoothed) and interest rates during January 1871 through July 2017, *we find that:*

The following chart tracks behavior of the S&P Composite Index (excluding dividends) and P/E10 starting in January 1881 (the first month there is enough historical data to calculate P/E10). We use data only through March 2017, for which a full 10 years of earnings are available. P/E10 ranges from a low of 4.78 in December 1920 to a high of 44.20 in December 1999. Visual inspection suggests that low values of P/E10 are mostly better times to invest than high values, but it is not obvious what values of P/E10 are good entry/exit thresholds. In other words, it is not obvious that P/E10 reverts to some constant average value.
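The P/E10 definition given at the outset can be sketched in a few lines. This is a minimal illustration rather than the calculation in the Shiller spreadsheet: the function name `pe10`, the toy inputs and the choice to include the current month in the 120-month earnings window are all assumptions.

```python
def pe10(real_price, real_earnings, years=10):
    """Return P/E10 per month, or None until enough earnings history exists.

    real_price: monthly inflation-adjusted index levels.
    real_earnings: monthly inflation-adjusted 12-month trailing earnings.
    Assumption: the averaging window ends with (and includes) the current month.
    """
    window = years * 12
    ratios = []
    for t in range(len(real_price)):
        if t + 1 < window:
            ratios.append(None)  # fewer than `window` months of earnings so far
            continue
        avg_earnings = sum(real_earnings[t + 1 - window:t + 1]) / window
        ratios.append(real_price[t] / avg_earnings)
    return ratios


# Toy inputs (hypothetical numbers, not Shiller data): flat price and earnings.
toy_price = [100.0] * 121
toy_earnings = [5.0] * 121
toy_ratios = pe10(toy_price, toy_earnings)
print(toy_ratios[118], toy_ratios[119])  # None 20.0
```

With earnings starting in January 1871, the first month with a full window falls ten years later, consistent with the January 1881 start of the chart.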

To quantify the relationship, we first consider in-sample regression.

The following scatter plot relates 10-year future capital gains (no dividends) for the S&P Composite Index to P/E10 based on monthly data over the available sample period. In general, the higher the P/E10, the lower the 10-year future return. The Pearson correlation for the relationship is -0.39, and the R-squared statistic is 0.15, indicating that variation in P/E10 explains 15% of the variation in 10-year future return. Using a logarithmic or quadratic, rather than linear, best-fit relationship increases R-squared to 0.17.
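For a simple linear fit, R-squared is the square of the Pearson correlation, which is why -0.39 corresponds to about 0.15. The sketch below shows one way to compute future capital gains and the correlation from monthly series. The helper names (`future_gain`, `pearson`) and the toy numbers are assumptions for illustration, not the code behind the scatter plot.

```python
import math


def future_gain(levels, horizon):
    """Capital gain from month t to month t + horizon; None where the future is unknown."""
    return [
        levels[t + horizon] / levels[t] - 1 if t + horizon < len(levels) else None
        for t in range(len(levels))
    ]


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# R-squared of a linear fit is the squared correlation reported in the text.
r = -0.39
print(round(r * r, 2))  # 0.15

# Toy check (hypothetical values): a perfectly inverse relationship gives -1.
toy_signal = [10.0, 20.0, 30.0]
toy_return = [0.9, 0.6, 0.3]
toy_corr = pearson(toy_signal, toy_return)
toy_gains = future_gain([100.0, 110.0, 121.0], 1)
```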

Given the large number of low and negative 10-year returns at fairly low values of P/E10, using P/E10 as a valuation indicator does not preclude poor outcomes.

The clumpiness of the data derives from large month-to-month overlaps of P/E10 and future return calculation intervals. This overlap means that the effective sample size is much smaller than indicated by the number of points on the plot. There are only about 14 completely independent 10-year intervals in the sample.
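The count of independent intervals follows from simple arithmetic, sketched below. The month-counting helper `months_inclusive` is a hypothetical name, and the endpoints (January 1881 through March 2017) are taken from the sample description above.

```python
def months_inclusive(start_year, start_month, end_year, end_month):
    """Number of months from start to end, counting both endpoints."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


sample_months = months_inclusive(1881, 1, 2017, 3)
horizon_months = 10 * 12
independent_intervals = sample_months / horizon_months
print(sample_months, round(independent_intervals, 1))  # 1635 13.6
```

Roughly 13.6 back-to-back decades fit in the sample, which rounds to the "about 14" cited above, even though the plot contains well over a thousand monthly points.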

To evaluate any non-linearity in the relationship, we calculate average future returns by range of P/E10.

The next chart summarizes S&P Composite Index average 10-year future capital gains by ranked tenth (decile) of initial P/E10 over the available sample period. Results indicate a somewhat systematic relationship: the lower the initial P/E10, the higher the average future return.
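The ranking behind this kind of chart can be sketched as follows: sort observations by initial P/E10, split them into ten nearly equal groups and average the future return in each. The function name `decile_averages` and the toy data are assumptions, and handling of ties or uneven group sizes in the actual chart may differ.

```python
def decile_averages(signal, outcome, bins=10):
    """Average outcome within each ranked group of the signal (lowest signal first).

    Requires at least `bins` observations so that no group is empty.
    """
    pairs = sorted(zip(signal, outcome))
    n = len(pairs)
    averages = []
    for b in range(bins):
        lo = b * n // bins
        hi = (b + 1) * n // bins
        group = [o for _, o in pairs[lo:hi]]
        averages.append(sum(group) / len(group))
    return averages


# Toy data (hypothetical): future return falls linearly as the signal rises.
toy_signal = list(range(1, 21))
toy_outcome = [1.0 - 0.02 * s for s in toy_signal]
toy_deciles = decile_averages(toy_signal, toy_outcome)
print(round(toy_deciles[0], 2), round(toy_deciles[-1], 2))  # 0.97 0.61
```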

However, even for the highest deciles of initial P/E10, average 10-year future return is substantially positive, so exiting the stock market after moderately high values of P/E10 may not beat a buy-and-hold approach.

Is there a level of P/E10 that strongly indicates poor future returns?

The next chart summarizes average monthly P/E10 by decile of S&P Composite Index 10-year future capital gains over the available sample period (again in-sample). While there is some progression in the relationship, differences across deciles 2-7 and across deciles 8-10 offer little basis for setting a P/E10 threshold. Values of P/E10 in the lowest return decile range from 7.3 to 44.2, with median 18.2.

How might an investor apply P/E10 in a simple out-of-sample investment strategy?

One approach to applying P/E10 as a valuation indicator is to buy (sell) when P/E10 crosses below (above) some threshold related to its average over the long term. To avoid look-ahead bias, an investor can use only past data to estimate the threshold. For example, an investor making a decision in 1971 can know average P/E10 for 1881-1970, but not for 1881-2017.

We consider two binary signal strategies commencing at the end of January 1921 that buy (sell) the S&P Composite Index when P/E10 crosses below (above) its:

- Inception-to-date (ITD) average since 1881.
- Average over a rolling 40-year window (in case the average is unstable). In other words, this approach discards data more than 40 years old as insufficiently relevant to current market environment.
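The two threshold rules can be sketched with a single helper that uses only data available at each decision point. The name `in_stocks_flags`, the inclusion of the current month in the historical average and the toy values are assumptions. The sketch also tolerates partial windows early in the series, whereas the strategies above start in 1921 with a full 40 years of history.

```python
def in_stocks_flags(pe, window=None):
    """Per month: True when P/E10 is below its historical average, else False.

    window=None averages all data since inception (ITD);
    otherwise only the most recent `window` months are averaged.
    The average at month t uses values through month t only (no look-ahead).
    """
    flags = []
    for t in range(len(pe)):
        start = 0 if window is None else max(0, t + 1 - window)
        history = pe[start:t + 1]
        flags.append(pe[t] < sum(history) / len(history))
    return flags


# Toy P/E10 values (hypothetical) where the two rules disagree.
toy_pe = [10.0, 20.0, 30.0, 25.0, 26.0]
itd_flags = in_stocks_flags(toy_pe)
rolling_flags = in_stocks_flags(toy_pe, window=2)
print(itd_flags)      # [False, False, False, False, False]
print(rolling_flags)  # [False, False, False, True, False]
```

A shorter window lets the threshold drift with recent valuations, so the rolling rule re-enters stocks in the toy example while the ITD rule stays out.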

The following two charts show when strategy 1 is in stocks (pink area of upper chart) and when strategy 2 is in stocks (light blue area of lower chart), along with the behavior of the S&P Composite Index on a logarithmic scale. For both strategies, the investor would be out of stocks for all of the 1960s and nearly all of the last two decades. Results suggest that these two strategies may not keep up with a buy-and-hold benchmark.

How do these strategies translate to cumulative performance based on total returns?

Modeling assumptions for this backtest are complex, as follows:

- Funds go to stocks (cash) at the ends of months when P/E10 crosses under (over) its historical average, ignoring the several month lag in availability of monthly P/E10 measurements. As a sensitivity test, we consider a 20-year historical average, as well as the ITD and 40-year historical averages specified above.
- Cash earns each month one twelfth the long-term interest rate in the Shiller data set less 1.49%, the average difference between the 10-year Treasury note yield and the 3-month Treasury bill yield since April 1953 (the earliest month available for both series). This estimate may not be representative for pre-1950s data.
- Dividends accrue while in stocks, with one twelfth the annual yield in the Shiller data set paid each month and frictionless reinvestment. This dividend smoothing assumption could affect results, and frictionless reinvestment of dividends is optimistic.
- There are no trading frictions for moving between stocks and cash. Since there are not many trades, this assumption is mildly optimistic (but frictions may have been high for much of the sample period).
- Ignore tax implications of trading (tax rules change considerably over this long sample period).
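A minimal sketch of how the assumptions listed above could combine into a monthly backtest loop follows. It reads the cash rule as one twelfth of (long-term rate minus 1.49%), applies the flag set at the end of month t to the return over the following month and uses the dividend yield observed at month t. These readings, the function name `strategy_growth` and the toy inputs are assumptions, not the exact model behind the charts.

```python
TERM_SPREAD = 0.0149  # 1.49% average term spread cited in the assumptions


def strategy_growth(levels, div_yield, long_rate, in_stocks, spread=TERM_SPREAD):
    """Cumulative value of $1.00 with no trading frictions or taxes.

    levels: monthly index levels.
    div_yield, long_rate: annualized rates as fractions, one per month.
    in_stocks: flag per month; the flag at month t governs month t to t + 1.
    """
    value = 1.0
    path = [value]
    for t in range(len(levels) - 1):
        if in_stocks[t]:
            # Capital gain plus one twelfth of the annual dividend yield.
            monthly = levels[t + 1] / levels[t] - 1 + div_yield[t] / 12
        else:
            # Cash earns one twelfth of the long rate less the term spread.
            monthly = (long_rate[t] - spread) / 12
        value *= 1 + monthly
        path.append(value)
    return path


# Toy inputs (hypothetical): one month in stocks, then one month in cash.
toy_path = strategy_growth(
    levels=[100.0, 110.0, 110.0],
    div_yield=[0.024, 0.024, 0.024],
    long_rate=[0.0509, 0.0509, 0.0509],
    in_stocks=[True, False, False],
)
print([round(v, 4) for v in toy_path])  # [1.0, 1.102, 1.1053]
```

Passing `spread=0.0` corresponds to the case in which cash earns the prevailing long-term yield.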

The approximations in these assumptions, as well as those of smoothing and interpolation methods used in constructing the early part of the source data set and use of daily averages for monthly index levels, make this analysis more a concept exploration than a strategy test.

The next chart compares on a logarithmic scale cumulative values of $1.00 initial investments for buying and holding the S&P Composite Index and the three specified market timing strategies over the available sample period. The three timing strategies, which spend considerable time in cash, substantially underperform buying and holding the index. The compound annual growth rate for Buy and Hold is 10.5%, compared to 7.7%, 7.7% and 7.8% for Binary ITD Average, Binary 40-year Average and Binary 20-year Average, respectively. The timing strategies still underperform by 1.8% to 1.9% per year with the term spread set to zero (assuming that cash earns the prevailing long-term bond yield, without risk of capital loss).

Maximum drawdown for Buy and Hold is -81% (June 1932), compared to -62%, -56% and -51% for Binary ITD Average, Binary 40-year Average and Binary 20-year Average, respectively. The timing strategies do moderately suppress maximum drawdowns, but drawdowns remain large.
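The two summary statistics used here, compound annual growth rate and maximum drawdown, can be computed from a cumulative value path as sketched below. The function names and toy paths are assumptions for illustration.

```python
def cagr(path, periods_per_year=12):
    """Compound annual growth rate of a cumulative value path sampled monthly."""
    years = (len(path) - 1) / periods_per_year
    return (path[-1] / path[0]) ** (1 / years) - 1


def max_drawdown(path):
    """Largest peak-to-trough decline, as a negative fraction (0.0 if none)."""
    peak = path[0]
    worst = 0.0
    for value in path:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1)
    return worst


# Toy paths (hypothetical): 21% growth over 24 months, and a 50% peak-to-trough fall.
toy_growth = cagr([1.0] + [1.21] * 24)
toy_drawdown = max_drawdown([1.0, 1.5, 0.75, 1.2])
print(round(toy_growth, 4), toy_drawdown)  # 0.1 -0.5
```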

For another perspective, we look at average monthly returns.

The next chart compares average monthly total returns, with one standard deviation variability ranges, for buying and holding the S&P Composite Index and the three market timing strategies over the available sample period. While the timing strategies suppress volatility, they sacrifice considerable average return.

Might some earnings history longer or shorter than ten years work better than P/E10?

The final chart shows (in-sample) correlations between P/EN and future S&P Composite Index capital gains at horizons of one month, one year, three years, five years and 10 years over the available sample period, where N is the number of years of historical earnings (ranging from one year to 15 years) used to calculate P/E. All relationships start in January 1886 to accrue an initial 15 years of earnings data. For each return horizon, the most negative correlation indicates the P/EN most effective in discriminating between good and bad future returns.

All correlations are negative, indicating that a P/E-based strategy may have merit (at least in-sample). The optimal Ns are 11, 10, 9, 8 and 11 for return horizons of one month, one year, three years, five years and 10 years, respectively. However, for a 10-year return horizon, the value of N makes almost no difference.
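The search over N can be sketched as a loop that computes P/EN for each candidate N, aligns all series to a common start (so every N sees the same months) and picks the most negative correlation with future gains. All names here (`pe_n`, `correlation_by_n`) and the synthetic sine-wave data are assumptions. The toy output says nothing about actual markets.

```python
import math


def pe_n(price, earnings, years):
    """Price over average earnings for the trailing `years`; None until available."""
    window = years * 12
    out = []
    for t in range(len(price)):
        if t + 1 < window:
            out.append(None)
        else:
            out.append(price[t] / (sum(earnings[t + 1 - window:t + 1]) / window))
    return out


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_by_n(price, earnings, horizon, n_values):
    """Correlation of P/EN with `horizon`-month future gains for each N."""
    start = max(n_values) * 12 - 1   # first month with earnings for every N
    stop = len(price) - horizon      # months with a known future gain
    gains = [price[t + horizon] / price[t] - 1 for t in range(start, stop)]
    return {n: pearson(pe_n(price, earnings, n)[start:stop], gains) for n in n_values}


# Synthetic 120-month series (hypothetical): cyclical price, slowly rising earnings.
toy_price = [100 + 10 * math.sin(t / 7) + 0.1 * t for t in range(120)]
toy_earnings = [5 + 0.01 * t for t in range(120)]
toy_corr = correlation_by_n(toy_price, toy_earnings, horizon=12, n_values=[1, 2, 3])
best_n = min(toy_corr, key=toy_corr.get)  # N with the most negative correlation
```

Because every N and horizon is tried on the same data, the best combination found this way is subject to the data snooping bias noted in the cautions below.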

The fact that the shapes of the curves vary and the optimal N differs across return horizons suggests randomness, but P/E10 is a good compromise. Return on cash and dividends (and tax implications) may also be important for selecting N.

In summary, *evidence from simple tests on long-run data indicates that P/E10 has in-sample predictive power for long-term future stock market returns, but this predictive power does not straightforwardly translate to effective market timing.*

Two possible interpretations of the underperformance of P/E10-based market timing strategies are:

- As noted above, P/E10 may not be mean reverting (it may have no stable average).
- The sample period is not long enough to discover the stable average for P/E10. Said differently, P/E10 mean reversion is so slow that an investing lifetime is not long enough to exploit it with confidence.

Cautions regarding findings include:

- As noted above, the number of completely independent P/E10 and 10-year future return measurement intervals is only about 14. Using monthly earnings and returns with nearly 10-year overlaps confounds simple statistics.
- As noted, the Shiller data set involves different kinds of smoothing/interpolation among the numerical series presented and is likely of limited use for short-term analyses.
- As noted, the regression, ranking and P/EN analyses above are in-sample. An investor operating in real time based strictly on past data (out-of-sample) may draw different conclusions at different times.
- As noted, tests of market timing based on P/E10 ignore trading frictions. Incorporating estimates of such frictions would slightly depress associated outcomes.
- Also, as noted, the out-of-sample cumulative performance analysis has some look-ahead bias, not accounting for a several month delay in availability of earnings. At most times, omitting a few months of earnings from a 10-year average has modest effect. This bias may be material when earnings crash.
- Application of different models/parameter values/return intervals on the same data set introduces data snooping bias, thereby overstating expectations for the best combination. Experimentation with different strategies would increase this bias.

See also “Exploiting P/E10 to Time the U.S. Stock Market”.
