
Guru Grades

Can equity market experts, whether self-proclaimed or endorsed by others (such as publications), reliably provide stock market timing guidance? Do some experts clearly show better market timing intuition than others?

To investigate, from 2005 through 2012 we collected 6,582 forecasts for the U.S. stock market offered publicly by 68 experts, bulls and bears employing technical, fundamental and sentiment indicators. Because collection includes archived forecasts, the oldest forecast in the sample dates from the end of 1998. For this final report, we have graded all these forecasts.

We assess forecasting acumen of stock market gurus as a group and rank them as individuals according to accuracy. The assessment of aggregate guru stock market forecasting performance is much more reliable, based on sample size and duration, than the evaluations of individuals.

We have performed completeness and integrity checks on the forecast grades, but some errors may remain.

Following the advice of experts is an investment technique. We intend this study as an aid to investors in learning mode regarding how much attention (and funds) this technique merits. This study is not a test of whether guru opinions and arguments are interesting, stimulating or useful in ways other than anticipating the behavior of the broad U.S. stock market. The kind of forecasting ability tested here is different from, but may be related to, stock picking expertise.

Contents: Picking Gurus | Grading Methodology | Aggregate Results | Individual Results | Related Studies | Other Guru Reviews

Picking Gurus to Grade

We restricted reviews to publicly available forecasts for the U.S. stock market (freely available on the web). This approach: (1) avoids intellectual property issues; (2) when the forecast series is on the forecaster’s web site, inhibits manipulation because regular readers would likely notice changes/deletions/additions to the record; and, (3) inhibits mis-grading since anyone can verify forecast wording/context and contest grades. At this point, however, links to archives for some gurus are defunct.

We began selecting experts based on web searches for public archives with enough forecasts spanning enough market conditions to gauge accuracy. Readers subsequently helped identify additional gurus. A few individuals proposed themselves as gurus, but they generally lacked archive transparency per above and/or a long enough record for reasonable evaluations. A long enough record depends to a degree on the nature of forecasts. A relatively short sample period may be sufficient for weekly forecasts based principally on short-term technical conditions. A relatively long sample period may be necessary for forecasts emphasizing economic fundamentals. Rules of thumb are a minimum of about 20 reasonably independent forecasts spanning a minimum of about two years to encompass different market conditions.

The selected public records are sometimes on the web sites of the gurus themselves and sometimes on web sites of other parties (for example, the business media). Especially for the former, we looked for archives that are clearly dated and not retrospectively filtered (cherry-picked).

The following chart shows the cumulative number of gurus tracked over time based on initial forecast dates. The number of gurus actively issuing forecasts at any point is generally less than the cumulative total, because some gurus stop forecasting (or media stop covering them) over time.

[Chart: cumulative number of gurus being tracked]

Potential sources of bias in this approach include:

  • There may be data availability bias in the overall sample, because some types of experts may be more likely to offer frequent public commentary than others.
  • The media may tend to start quoting an expert after some dramatically correct forecast and stop after a series of incorrect forecasts (but it is not obvious that such attention bias affects long-term evaluations).

Grading Methodology

The essential grading methodology is to compare forecasts for the U.S. stock market (whether quantified or qualitative) to S&P 500 index returns over the future interval(s) most relevant to the forecast horizon. However, many forecasts contain ambiguities about degree and timing, equivocations and/or conditions. In general, we:

  • Exclude forecasts that are too vague to grade and forecasts that include conditions requiring consideration of data other than stock market returns.
  • Match the frequency of a guru’s commentaries (such as weekly or monthly) to the forecast horizon, unless the forecast specifies some other timing.
  • Detrend forecasts by considering the long-run empirical behavior of the S&P 500 Index, which indicates that future returns over the next week, month, three months, six months and year are “normally” about 0.1%, 0.6%, 2%, 4% and 8%, respectively. For example, if a guru says investors should be bullish on U.S. stocks over the next six months, and the S&P 500 Index is up by only 1% over that interval, we would judge the call incorrect (see the sketch after this list).
  • Grade complex forecasts with elements proving both correct and incorrect as both right and wrong (not half right and half wrong).
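A minimal sketch of the detrending rule above, assuming each graded forecast reduces to a bullish or bearish stance over a standard horizon; this is illustrative only, not the project's actual grading code:

```python
# Illustrative only: grade a stance against the "normal" S&P 500 return for
# its horizon, per the long-run figures cited above.
NORMAL_RETURN = {
    "week": 0.001,      # ~0.1%
    "month": 0.006,     # ~0.6%
    "3 months": 0.02,   # ~2%
    "6 months": 0.04,   # ~4%
    "year": 0.08,       # ~8%
}

def grade_forecast(stance: str, horizon: str, realized_return: float) -> bool:
    """Return True if a bullish/bearish call proves correct after detrending.

    realized_return is the S&P 500 return from the close on the publication
    date through the end of the forecast horizon (e.g. 0.01 for +1%).
    """
    benchmark = NORMAL_RETURN[horizon]
    if stance == "bullish":
        return realized_return > benchmark
    if stance == "bearish":
        return realized_return < benchmark
    raise ValueError("forecasts too vague to classify are excluded, not graded")

# Example from the text: a bullish six-month call with the index up only 1%
# grades as incorrect, because +1% falls short of the ~4% six-month norm.
assert grade_forecast("bullish", "6 months", 0.01) is False
```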

Weaknesses in the methodology include:

  • Some forecasts may be more important than others, but all are comparably weighted. In other words, measuring forecast accuracy is unlike measuring portfolio returns.
  • Consecutive forecasts by a given guru often are not independent, in that the forecast publishing interval is shorter than the forecast horizon (suggesting that the guru repetitively uses similar information to generate forecasts). This serial correlation of forecasts effectively reduces sample size (see the rough adjustment sketched after this list).
  • In a few cases, for gurus with small samples, we include forecasts not explicitly tied to future stock market returns. There are not enough of these exceptions to affect aggregate findings.
  • Grading vague forecasts requires judgment. Random judgment errors tend to cancel over time, but judgment biases could affect findings. Detailed grades are available via links below to individual guru records. Within those records are further links to source commentaries and articles (some links are defunct). Readers can therefore inspect forecast grades and (in many cases) forecast selection/context.
  • S&P 500 Index return measurements for grading commence at the close on forecast publication dates, resulting in some looseness in grading because forecast publication may be before the open or after the close. Very few forecast grades are sensitive to a one-day return, and we try to take looseness into account in grading any forecasts that focus on the very short term.
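The sample-size effect of serially correlated forecasts can be roughed out with a standard first-order (AR(1)) adjustment; the AR(1) assumption and the numbers in the example are ours, not part of the grading methodology:

```python
def effective_sample_size(n: int, rho: float) -> float:
    """Approximate number of independent forecasts when consecutive forecasts
    have first-order autocorrelation rho (standard AR(1) adjustment)."""
    return n * (1 - rho) / (1 + rho)

# For example, 100 weekly forecasts with autocorrelation 0.5 carry roughly
# the information of 33 independent forecasts.
print(round(effective_sample_size(100, 0.5)))  # 33
```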

Neither CXO Advisory Group LLC, nor any of its members personally, received any payments from the gurus graded. A few gurus offered access to private forecasts, but we either did not accept or did not utilize such access. Early in the project, we did solicit and use descriptions of market timing methods from three gurus with relatively high accuracies (see “Related Studies” below).

Aggregate Grading Results

The following chart tracks the inception-to-date accuracy of all 6,582 graded forecasts in the sample. The extreme values early in the sample period relate to small cumulative samples. Terminal accuracy is 46.9%, an aggregate value very steady since the end of 2006. With respect to the gradual decline during 2003 through 2006:

  • Grading judgment, as well as number of gurus and number of forecasts, may evolve with experience (becoming a little more strict).
  • Individual guru samples may tend to have a lucky start, thereby attracting media attention or engendering self-confidence for publishing forecasts.

If we average by guru rather than across all forecasts, terminal accuracy is 47.4%.

[Chart: cumulative forecasting accuracy]
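The two terminal figures (46.9% and 47.4%) reflect two ways of aggregating the same grades. The sketch below, using made-up gurus and grades, shows the difference between averaging across all forecasts and averaging by guru:

```python
from collections import defaultdict

# Hypothetical graded forecasts: (guru, True if the forecast graded correct).
graded = [
    ("Guru A", True), ("Guru A", False), ("Guru A", True),
    ("Guru B", False), ("Guru B", False),
]

# Accuracy across all forecasts (the 46.9% figure aggregates this way).
overall = sum(correct for _, correct in graded) / len(graded)

# Accuracy averaged by guru (the 47.4% figure aggregates this way); this
# weights every guru equally, regardless of how many forecasts each made.
by_guru = defaultdict(list)
for guru, correct in graded:
    by_guru[guru].append(correct)
by_guru_average = sum(sum(g) / len(g) for g in by_guru.values()) / len(by_guru)

print(f"overall: {overall:.1%}, by-guru average: {by_guru_average:.1%}")
```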

The next chart shows the distribution of individual guru accuracies for the entire sample. The distribution is roughly symmetric about the mean and may be normal, but 68 gurus (some with meager samples) is a small sample for such analysis.

[Chart: distribution of individual guru accuracies]
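One way to probe whether the accuracies are consistent with a normal distribution is a Shapiro-Wilk test; this check (and the truncated list of values) is our illustration, and with only 68 gurus it has limited power:

```python
import numpy as np
from scipy import stats

# Individual guru accuracies as fractions (truncated here for brevity; the
# full list of 68 values appears in the table below).
accuracies = np.array([0.682, 0.664, 0.656, 0.644, 0.629, 0.237, 0.208])

w_stat, p_value = stats.shapiro(accuracies)
print(f"Shapiro-Wilk W = {w_stat:.3f}, p = {p_value:.3f}")
# A large p-value fails to reject normality, but a sample this small cannot
# reliably detect departures from it.
```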

Why is the aggregate accuracy below 50%? Potential explanations include:

  • There may be grading bias derived from some animus toward gurus.
  • Sampled gurus may have motives other than accuracy in publishing forecasts. For example, gurus who make frequent public pronouncements may be those most prone to extreme forecasts to attract attention (and customers) by stimulating greed and fear. See “Why Gurus Go to Extremes” and “A Sign of All Times…”.
  • There may be, on average across sampled gurus, a net trend following aspect that does not adequately account for mean reversion in stock market returns.
  • The overall sample period may be sufficiently different from prior history that gurus anchored in older data may have persistently misjudged stock market behavior (see “Basic Equity Return Statistics”).

Individual Grading Results

The following table lists the gurus graded, along with associated number of forecasts graded and accuracy. Names link to individual guru descriptions and forecast records. Further links to the source forecast archives embedded in these records are in some cases defunct. It appears that a forecasting accuracy as high as 70% is quite rare.

Cautions regarding interpretation of accuracies include:

  • Forecast samples for some gurus are small (especially in terms of forecasts formed on completely new information), limiting confidence in their estimated accuracies.
  • Differences in forecast horizon may affect grades, with a long-range forecaster naturally tending to beat a short-range forecaster (see “Notes on Variability of Stock Market Returns”).
  • Accuracies of different experts often cover different time frames according to the data available. An expert who is stuck on bullish (bearish) would tend to outperform in a rising (declining) stock market. This effect tends to cancel in aggregate.
  • The private (for example, paid subscription) forecasts of gurus may be timelier and more accurate than the forecasts they are willing to offer publicly.
Guru | Forecasts | Accuracy
David Nassar | 44 | 68.2%
Ken Fisher | 120 | 66.4%
Jack Schannep | 63 | 65.6%
David Dreman | 45 | 64.4%
James Oberweis | 35 | 62.9%
Steve Sjuggerud | 54 | 62.1%
Cabot Market Letter | 50 | 60.4%
Louis Navellier | 152 | 60.0%
Jason Kelly | 126 | 59.7%
Dan Sullivan | 115 | 59.1%
John Buckingham | 17 | 58.8%
Richard Moroney | 56 | 57.1%
Aden Sisters | 40 | 55.8%
Jon Markman | 36 | 55.3%
Carl Swenlin | 128 | 54.9%
Bob Doll | 161 | 54.7%
Paul Tracy | 52 | 53.8%
Bob Brinker | 44 | 53.3%
Mark Arbeter | 230 | 53.2%
Gary Kaltbaum | 144 | 53.1%
Robert Drach | 19 | 52.6%
Don Luskin | 201 | 52.0%
Laszlo Birinyi | 27 | 51.9%
Tobin Smith | 281 | 50.2%
James Dines | 39 | 50.0%
Ben Zacks | 32 | 50.0%
Doug Kass | 186 | 49.2%
Richard Rhodes | 42 | 48.8%
Bernie Schaeffer | 81 | 48.8%
Clif Droke | 100 | 48.6%
Stephen Leeb | 27 | 48.3%
S&P Outlook | 145 | 48.3%
Carl Futia | 98 | 48.2%
Charles Biderman | 48 | 47.9%
Trading Wire | 69 | 47.8%
Don Hays | 85 | 47.1%
James Stewart | 115 | 47.0%
Richard Band | 31 | 46.9%
Jim Cramer | 62 | 46.8%
Gary D. Halbert | 93 | 46.4%
Dennis Slothower | 145 | 45.6%
Bill Cara | 208 | 45.6%
Gary Savage | 134 | 45.0%
Marc Faber | 164 | 44.6%
Jeremy Grantham | 40 | 44.2%
Tim Wood | 182 | 43.8%
Jim Jubak | 144 | 43.4%
Martin Goldberg | 109 | 43.1%
Price Headley | 352 | 42.0%
Linda Schurman | 57 | 41.4%
Donald Rowe | 69 | 40.6%
Igor Greenwald | 37 | 40.5%
Nadeem Walayat | 67 | 40.5%
Bob Hoye | 57 | 40.0%
John Mauldin | 211 | 39.9%
Jim Puplava | 43 | 39.5%
Comstock Partners | 224 | 37.9%
Bill Fleckenstein | 148 | 37.3%
Gary Shilling | 41 | 36.6%
Richard Russell | 168 | 36.5%
Mike Paulenoff | 12 | 35.7%
Abby Joseph Cohen | 56 | 35.1%
Peter Eliades | 29 | 34.5%
Steven Jon Kaplan | 104 | 32.1%
Curt Hesler | 97 | 32.1%
Robert McHugh | 132 | 28.6%
Steve Saville | 35 | 23.7%
Robert Prechter | 24 | 20.8%

For a compilation of general objections and defenses made by gurus under scrutiny, see “The Demon’s Defense”. For examples of specific objections and defenses, see:

Related Studies

For other analyses of Guru Grades project data, see:

Both “The Annual Business Week Stock Market Forecasts” and “Blogger Sentiment Analysis” relate closely to the Guru Grades project. See also the “Investing Expertise” category for a stream of formal research on the forecasting and investing performance of experts, usually in aggregate. Some items in the “Individual Investing” category are also relevant.

Early in the Guru Grades project, we solicited explanations from three gurus with relatively high accuracies regarding how they approach stock market timing:

Other Guru Reviews

Sometimes, it is possible to measure actual (or hypothetical) investment performance, such as when a guru runs a publicly traded fund, clearly specifies an investing/trading methodology or publishes a meaningfully long archive of recommended trades. See, for example (embedded links to source data may be defunct):

Sometimes there is not enough information to conduct any thorough review, with analyses limited to media searches and general questions such as those outlined in “Pandemonium?”. For example, see:

There are also a few items on gurus who do not espouse market timing, as follows:
