
Chapter 8: Two Analysis Regimes

This chapter steps through two analysis regimes via examples to illustrate avoidance and mitigation of the issues covered in Chapters 1 through 6. The first example involves a widely used technical indicator, the 10-month simple moving average, but with an investigation of whether there is more information in the average than conventionally extracted. The second example constructs in detail a portfolio-level view of a short-term trading strategy offered in the quasi-advisory (“educational”) marketplace. The purpose of the examples is to illustrate different ways that most investors can use to analyze investment strategies.

The analysis tool is Microsoft Excel. Some or all of the steps in the examples may be useful in analyzing other potentially useful asset return indicators.

8.1 Timing the U.S. Stock Market with an SMA10 Refinement

The 10-month simple moving average (SMA10) is a technical indicator widely used to define operationally the bull and bear states of a financial market. When a broad market index is above (below) its SMA10, the market state is bullish (bearish). Suppose an investor hypothesizes that the magnitude of the gap between an index and its SMA10, calculated as the difference between index level and SMA10 divided by index level, may be informative about the degree of bullishness or bearishness.
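The gap calculation can be sketched in a few lines of Python as an alternative to a spreadsheet. This is a minimal sketch under stated assumptions: the function names and the sample index levels are hypothetical, and the gap uses the definition above (index level minus SMA10, divided by index level).

```python
def sma(levels, window=10):
    """Trailing simple moving average; None until `window` values exist."""
    out = []
    for i in range(len(levels)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(levels[i + 1 - window:i + 1]) / window)
    return out


def index_sma_gap(levels, window=10):
    """Gap = (index level - SMA) / index level, per the definition above."""
    averages = sma(levels, window)
    return [None if a is None else (x - a) / x
            for x, a in zip(levels, averages)]


# Hypothetical month-end index levels, for illustration only.
levels = [100, 102, 101, 105, 107, 110, 108, 111, 115, 118, 120]
gaps = index_sma_gap(levels)
print(gaps[9], gaps[10])
```

The first nine months have no gap because the SMA10 needs 10 monthly observations, which is consistent with a series that begins in January and a gap that begins in October.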

Figure 8-1a is a visualization of the S&P 500 Index (as proxy for the U.S. stock market, on a logarithmic scale) and the gap between the S&P 500 Index and its SMA10 from January 1950 through September 2013 (the index-SMA10 graph starts in October 1950). S&P 500 Index levels are monthly (month ends) from Yahoo!Finance. Visual inspection is inadequate to infer any predictive relationship.

Figure 8-1a: S&P 500 Index and the Index-SMA10 Gap


A reasonable first step in quantitative analysis is to see whether there is a useful linear relationship between the index-SMA10 gap at the end of a month and the index return the next month. A way to both visualize and quantify a potential linear relationship is a scatter plot with trend line and coefficient of determination (the square of the Pearson correlation). Figure 8-1b is a scatter plot relating monthly return for the S&P 500 Index (vertical axis) to prior-month index-SMA10 gap (horizontal axis) over the entire sample period. Inserting the best-fit linear trend line is a chart option, as is display of the coefficient of determination (R2). Though the trend line tilts slightly upward from left to right (a larger index-SMA10 gap last month relates to higher S&P 500 Index return this month), the value of R2 is very small. An R2 of 0.001 indicates that variation in the prior-month index-SMA10 gap explains only 0.1% of the variation in monthly S&P 500 Index return. There is no useful linear relationship at a one-month horizon.
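The trend line and R2 that the chart option produces can also be computed directly. The following is a sketch only: the function names and sample data are hypothetical, and it assumes the gap and return series are aligned month by month so that the gap at the end of month t pairs with the return during month t+1.

```python
def pair_gap_with_next_return(gaps, returns):
    """Pair the gap at the end of month t with the return during month t+1."""
    return [(gaps[t], returns[t + 1])
            for t in range(len(gaps) - 1) if gaps[t] is not None]


def linear_fit(pairs):
    """Least-squares slope, intercept and R-squared for y regressed on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy * sxy / (sxx * syy)  # square of the Pearson correlation
    return slope, intercept, r_squared


# Hypothetical aligned monthly series, for illustration only.
gaps = [None, 0.02, -0.01, 0.04, -0.06, 0.03]
returns = [0.01, 0.02, -0.01, 0.03, 0.00, -0.02]
pairs = pair_gap_with_next_return(gaps, returns)
slope, intercept, r_squared = linear_fit(pairs)
print(round(slope, 3), round(r_squared, 3))
```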

It is possible that an indicator has delayed or spread out, rather than immediate, predictive power. One way to check for that possibility in this case is to relate prior-month index-SMA10 gap to S&P 500 Index returns in months beyond the next month.

Figure 8-1b: Scatter Plot of S&P 500 Index Monthly Return vs. Prior-Month Index-SMA10 Gap


Figure 8-1c summarizes Pearson correlations (which can have values from -1 to +1) between S&P 500 Index monthly return and index-SMA10 gap for lead-lag relationships ranging from index return leads index-SMA10 gap by 10 months (-10) to index-SMA10 gap leads index return by 10 months (10). For example, the value of the graph at “-3” is the Pearson correlation between monthly S&P 500 Index returns and the values of the index-SMA10 gap three months later. The value of the graph at “3” is the Pearson correlation between monthly values of the index-SMA10 gap and the S&P 500 Index return three months later.
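A lead-lag correlation of this kind might be computed as follows. This is an illustrative sketch, not the procedure used for the figure: the function names are hypothetical, and it assumes two aligned monthly lists with no missing values. A positive lag means the gap leads the return; a negative lag means the return leads the gap.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def lead_lag_correlation(gaps, returns, lag):
    """Correlate the gap in month t with the return in month t + lag."""
    pairs = [(gaps[t], returns[t + lag])
             for t in range(len(gaps)) if 0 <= t + lag < len(returns)]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Hypothetical series built so that the gap leads the return by two months.
gaps = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
returns = [9.0, 7.0, 1.0, 2.0, 3.0, 4.0]
profile = {lag: lead_lag_correlation(gaps, returns, lag) for lag in range(-2, 3)}
print(profile[2])
```

For the chart described above, the lag would range from -10 to 10 rather than -2 to 2.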

Figure 8-1c shows that S&P 500 Index return relates positively to the value of the index-SMA10 gap over the next nine or 10 months. This relationship derives from the definition of SMA10, but is not useful for an investor seeking to predict exploitable returns.

The figure also shows that the value of the index-SMA10 gap has little or no relationship with S&P 500 Index return for any of the next 10 months. Correlations are all positive for months one through four (0.01 to 0.03) and all negative for months five through 10 (-0.01 to -0.05), so there may be some small cumulative effects over these intervals. But the generally poor predictive power might be reason to abandon the index-SMA10 gap as a potential indicator at this point.

Figure 8-1c: Lead-lag Correlations for S&P 500 Index Monthly Returns and Monthly Index-SMA10 Gap


However, there may be some exploitable non-linearity in the relationship between indicator and future market return. In other words, a best-fit line, correlation and R2 may not be the right tools to discover a trading strategy.

Figure 8-1d summarizes average monthly S&P 500 Index returns by ordered fifth (quintile) of prior-month index-SMA10 gaps over the entire sample period, with a one standard deviation variability range for each quintile. Constructing this chart involves matching monthly returns with prior-month gaps, sorting this set by the size of the prior-month gaps and dividing the sorted set into five equal subsets. The average monthly returns and standard deviations are for these five subsets. There are about 150 observations per quintile. The figure shows that the lowest (Most Negative) quintile behaves badly, with relatively low average return and high volatility. The preceding linear analysis does not expose this detail.
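The match, sort and divide steps described above can be sketched as follows. The function name and the sample pairs are hypothetical; the sketch assumes each input pair already matches a monthly return with its prior-month gap, and it uses the sample standard deviation.

```python
from statistics import mean, stdev


def quintile_stats(pairs, buckets=5):
    """pairs: (prior_month_gap, monthly_return) tuples.

    Returns one (average return, standard deviation) tuple per bucket,
    ordered from the most negative gaps to the most positive gaps.
    """
    ordered = sorted(pairs, key=lambda p: p[0])
    n = len(ordered)
    stats = []
    for b in range(buckets):
        chunk = ordered[b * n // buckets:(b + 1) * n // buckets]
        rets = [r for _, r in chunk]
        stats.append((mean(rets), stdev(rets)))
    return stats


# Hypothetical (gap, next-month return) pairs, for illustration only.
pairs = [(-0.09, -0.020), (-0.07, 0.030), (-0.03, 0.010), (-0.02, 0.000),
         (0.00, 0.012), (0.01, 0.006), (0.03, 0.011), (0.04, 0.009),
         (0.06, 0.010), (0.08, 0.014)]
for avg, sd in quintile_stats(pairs):
    print(round(avg, 4), round(sd, 4))
```

Each bucket needs at least two observations for the standard deviation to be defined, which is not a concern with about 150 observations per quintile.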

One way to assess exploitability of this bad quintile is to introduce the return on cash.

Figure 8-1d: Average S&P 500 Index Returns by Quintile of Prior-month Index-SMA10 Gaps


One way to incorporate return on cash into the above (non-linear) quintile analysis is by using S&P 500 Index excess, rather than raw, returns. Excess return is the return earned in addition to the return on cash, estimated from the yield on 13-week U.S. Treasury bills (T-bills). The excess index return each month is the raw index return minus one twelfth the T-bill yield for the same month. T-bill yields are monthly (month ends) from Yahoo!Finance.
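As a worked illustration of the excess return arithmetic, here is a minimal sketch. The function name and the sample numbers are hypothetical; yields are assumed to be annualized and expressed as decimals.

```python
def excess_returns(raw_returns, annual_tbill_yields):
    """Monthly excess return = raw return minus one twelfth of the T-bill yield."""
    return [r - y / 12 for r, y in zip(raw_returns, annual_tbill_yields)]


# Hypothetical monthly index returns and same-month annualized T-bill yields.
raw = [0.010, -0.020, 0.015]
tbill = [0.060, 0.060, 0.030]
print(excess_returns(raw, tbill))
```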

Figure 8-1e summarizes the effect of incorporating the return on cash on the average S&P 500 Index monthly returns by quintile of index-SMA10 gaps shown above. Again, there are about 150 observations per quintile. Average raw returns are the same as those in Figure 8-1d, while average excess returns result from debiting contemporaneous T-bill yields from index returns. Results suggest that an investor would be better off in cash than in the index when prior-month index-SMA10 gaps are most negative.

One way to judge reliability of this finding across market environments is to repeat the test on subperiods.

Figure 8-1e: Average S&P 500 Index Excess Returns by Quintile of Prior-month Index-SMA10 Gaps


Figure 8-1f summarizes average S&P 500 Index monthly excess returns by quintile of index-SMA10 gaps during two equal subperiods. The process of generating quintile results is exactly as above, except for treating the first and second halves of the sample period as separate samples. The number of observations per quintile is therefore only about 75, so subperiod results are less reliable than those above for the full sample. The breakpoint between sample halves is in early 1982. Results are similar for the top three quintiles, but markedly different for quintile 2 and somewhat different for quintile 1 (Most Negative). These differences undermine belief in reliability of the anomaly, but quintile 1 (Most Negative) is still weakest.

The S&P 500 Index series does not account for dividends. One way to assess the impact of dividends on findings is to substitute the dividend-adjusted index-tracking SPDR S&P 500 (SPY) exchange-traded fund for the S&P 500 Index since its inception in January 1993.

Figure 8-1f: Average S&P 500 Index Excess Returns by Quintile of Prior-month Index-SMA10 Gaps for Two Equal Subperiods


Figure 8-1g summarizes results for repeating the process used to generate Figure 8-1d but using dividend-adjusted SPY returns over the available sample period. Dividend-adjusted SPY data are monthly (month ends) from Yahoo!Finance. To clarify, this process still uses the S&P 500 Index to generate SMA10 and calculate the index-SMA10 gap, but this time calculates raw monthly returns from dividend-adjusted SPY rather than the index. These raw monthly returns implicitly assume that the investor reinvests SPY dividends immediately into more shares of SPY. Results are similar to those in Figure 8-1d, boosting confidence in reliability of any anomaly. In fact, the performance of the lowest (Most Negative) quintile is worse here than in Figure 8-1d, despite inclusion of dividends. However, the January 1993 starting point yields only about 50 observations per quintile, so results are less reliable than those above for the S&P 500 Index full sample and subsamples.

Figure 8-1g: Average SPY Returns by Quintile of Prior-month S&P 500 Index-SMA10 Gaps


Figure 8-1h summarizes results for repeating the process used to incorporate return on cash and generate Figure 8-1e, but using dividend-adjusted SPY returns over the available sample period. Again, this process still uses the S&P 500 Index to generate SMA10 and calculate the index-SMA10 gap, but this time calculates monthly returns from dividend-adjusted SPY rather than the index. Average raw returns are the same as those in Figure 8-1g, while average excess returns result from debiting contemporaneous T-bill yields from SPY returns. Results suggest that an investor would be better off in cash than in SPY when prior-month index-SMA10 gaps are most negative. Again, there are only about 50 observations per quintile.

Findings so far are fairly consistent and robust to including dividends, but limited to average results. One way to check time series effects is to model market timing strategies, including an estimate of investment frictions.

Figure 8-1h: Average SPY Excess Returns by Quintile of Prior-month S&P 500 Index-SMA10 Gaps


Figure 8-1i depicts basic performance statistics (average monthly returns with one standard deviation variability ranges) for four scenarios:

  1. Gap Strategy Gross: This scenario is in SPY (cash) whenever the S&P 500 Index-SMA10 gap at the prior-month close is greater than (not greater than) -5%. The -5% threshold is guesstimated from data in the above quintile sorts (this is snooping, to be addressed below). There is no delay between signal and execution, so the investor must slightly anticipate the signal. Switching between SPY and cash is frictionless.
  2. Gap Strategy Net: This scenario is the same as Gap Strategy Gross, except the investment bears an investment friction (transaction fee plus part of the bid-ask spread for SPY) of 0.25% whenever there is a switch between SPY and cash. This level of investment friction is conservative (high) for SPY for most investors.
  3. Buy and Hold: This benchmark scenario buys and holds SPY over the entire sample period.
  4. SMA10 Strategy Net: This benchmark scenario applies the conventional timing rule of being in SPY (cash) whenever the S&P 500 Index is above (below) its SMA10 at the prior-month close. Again, there is no delay between signal and execution, so the investor must slightly anticipate the signal. The investment bears an investment friction of 0.25% whenever there is a switch between SPY and cash.

Return-to-risk ratios (average return divided by standard deviation) for the four strategies, in order, are: 0.29, 0.28, 0.18 and 0.30. On this basis, Gap Strategy Net is superior to Buy and Hold and competitive with the conventional SMA10 Strategy Net.

Figure 8-1i: Performance Statistics for a Simple SPY Timing Strategy Based on the “Gap”


Terminal value is also an important strategy performance metric. Figure 8-1j compares the cumulative/terminal values of $1.00 initial investments at the end of January 1993 in the above four scenarios. Terminal values for Gap Strategy Net, Buy and Hold and SMA10 Strategy Net are $7.76, $5.61 and $8.10, respectively. It appears that the Gap Strategy and the SMA10 strategy outperform Buy and Hold by switching between SPY and cash to avoid bear markets, but not at exactly the same times. Since there are not many bear markets during the sample period, discrimination between the Gap Strategy and the conventional SMA10 strategy is not very reliable. The bear markets in the sample may not be typical of future bear markets.
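The timing scenarios and the two metrics discussed above (return-to-risk ratio and terminal value) can be sketched as follows. This is a simplified illustration, not the calculation behind the figures: function names and sample data are hypothetical, the series are assumed aligned by month, the strategy is assumed to start in the asset at no cost, and friction is charged as a fraction of value on each switch. Setting the threshold to zero gives the conventional SMA10 rule, and setting friction to zero gives the gross scenario.

```python
from statistics import mean, stdev


def timing_returns(gaps, asset_returns, cash_returns,
                   threshold=-0.05, friction=0.0025):
    """Monthly net returns: hold the asset in month t when the gap at the
    end of month t-1 exceeds the threshold, otherwise hold cash."""
    out = []
    in_asset = True  # assumption: start invested, with no entry cost
    for t in range(1, len(gaps)):
        want_asset = gaps[t - 1] > threshold
        r = asset_returns[t] if want_asset else cash_returns[t]
        if want_asset != in_asset:
            r = (1 + r) * (1 - friction) - 1  # pay friction on the switch
            in_asset = want_asset
        out.append(r)
    return out


def return_to_risk(returns):
    """Average return divided by standard deviation of returns."""
    return mean(returns) / stdev(returns)


def terminal_value(returns, start=1.0):
    """Compound a sequence of periodic returns from an initial value."""
    value = start
    for r in returns:
        value *= 1 + r
    return value


# Hypothetical aligned monthly series, for illustration only.
gaps = [0.02, -0.08, -0.06, 0.01]
asset = [0.00, 0.03, -0.10, 0.02]
cash = [0.00, 0.001, 0.001, 0.001]
net = timing_returns(gaps, asset, cash)
print(return_to_risk(net), terminal_value(net))
```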

One way to test reliability of results for the Gap Strategy is to check sensitivity to the -5% threshold. Is this value lucky?

Figure 8-1j: Cumulative Performance for a Simple SPY Timing Strategy Based on the “Gap”


Figure 8-1k shows how the terminal value of Gap Strategy Net varies as the gap threshold value for switching between SPY and cash varies from -10% (S&P 500 Index is 10% below its SMA10) to +5% (S&P 500 Index is 5% above its SMA10). The assumed level of investment friction for switching between SPY and cash is 0.25%. The chart also shows the terminal value of Buy and Hold as a benchmark. Results suggest that:

  • 0% is optimal (and, in fact, is equivalent to the conventional SMA10 Strategy Net).
  • Thresholds in the range -5% to 0% (not just one lucky setting) work pretty well.
  • Setting the threshold too high does not work.

The middle of the outperforming range (-2%) may be least risky with respect to snooping bias. For investors concerned with taxes, a lower (more negative) threshold means modestly fewer trades and potentially diminished tax impacts. The number of trades ranges from 15 for a -10% threshold to 37 for a +1% threshold.

Another simple sensitivity test shows the effect of varying investment friction.

Figure 8-1k: Sensitivity of Gap Strategy to Gap Timing Threshold


Figure 8-1l shows how the terminal value of Gap Strategy Net varies as the assumed level of friction (again, essentially transaction fee plus part of the bid-ask spread) for switching between SPY and cash varies from 0% to 2.5%. The gap threshold for switching between SPY and cash is -5%. The chart also shows the terminal value of Buy and Hold as a benchmark and the effect of varying friction on the terminal value of the conventional SMA10 Strategy Net.

Based on net terminal value, the breakeven investment friction relative to Buy and Hold is between 2.00% and 2.25% for Gap Strategy Net, offering a large margin of safety for most investors. The conventional SMA10 strategy varies more steeply and has a slightly lower breakeven friction than Gap Strategy Net because the former switches more times.
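Both sensitivity tests above amount to rerunning one strategy function over a grid of parameter values. The sketch below is illustrative only: the function name and sample data are hypothetical, the strategy is assumed to start in the asset at no cost, and friction is charged as a fraction of value on each switch.

```python
def gap_strategy(gaps, asset_returns, cash_returns, threshold, friction):
    """Return (terminal value of $1, number of switches) for one setting."""
    value, switches, in_asset = 1.0, 0, True
    for t in range(1, len(gaps)):
        want_asset = gaps[t - 1] > threshold
        if want_asset != in_asset:
            value *= 1 - friction
            switches += 1
            in_asset = want_asset
        value *= 1 + (asset_returns[t] if want_asset else cash_returns[t])
    return value, switches


# Hypothetical aligned monthly series, for illustration only.
gaps = [0.02, -0.08, -0.06, 0.01]
asset = [0.00, 0.03, -0.10, 0.02]
cash = [0.00, 0.001, 0.001, 0.001]

thresholds = [x / 100 for x in range(-10, 6)]  # -10% to +5% in 1% steps
frictions = [x / 400 for x in range(0, 11)]    # 0% to 2.5% in 0.25% steps

# Threshold sweep at the baseline 0.25% friction.
by_threshold = {th: gap_strategy(gaps, asset, cash, th, 0.0025)
                for th in thresholds}
# Friction sweep at the baseline -5% threshold.
by_friction = {f: gap_strategy(gaps, asset, cash, -0.05, f)[0]
               for f in frictions}

for th, (value, switches) in by_threshold.items():
    print(th, round(value, 4), switches)
```

Comparing each swept terminal value with the Buy and Hold terminal value locates the range of thresholds that outperform and the breakeven friction.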

Figure 8-1l: Sensitivity of Gap Strategy to Level of Investment Friction


The principal purpose of this section is to illustrate some investment strategy analysis techniques. An ancillary finding is that the conventional SMA10 strategy is close to optimal for exploiting the non-linear power of the gap between the S&P 500 Index and its SMA10 to predict future index returns (in historical data).

8.2 Trading ETFs Based on Complex, Short-term Technical Signals

Suppose an online “educational” source offers a subscription to a series of trading opportunities (entry and exit signals) based on an unspecified combination of technical indicators purported to identify short-term mispricings within a set of fairly liquid exchange-traded funds (ETFs). Supporting public material includes a list of 165 trades over about 31 months. Trade data include the ETF traded, whether the trade is long or short, opening date, entry price, closing date, exit price and trade profit. The disclaimer accompanying the trade data states that: (1) reported trade profits are gross of investment frictions; and, (2) the trades are from a backtest and not real trading. The headline statistic promoting the subscription service is aggregate win rate, which at 82% looks very attractive.

A spot check of some trade entry and exit prices verifies that the ETFs involved actually traded at the listed prices on the specified dates. Concerns regarding evaluation of the subscription service include:

  • The reported returns are gross, not net. Debiting reasonable investment frictions (including shorting costs, since many trades are short) may make trades unattractive.
  • Trade data are from a backtest, enabling snooping of technical rules to discover those luckiest for the backtest sample period. Since the technical rules used are not known, parameter sensitivity testing to assess robustness of the rules (as in Figure 8-1k) is not possible.
  • Backtest results are trade-level, not portfolio-level. Imposing capital constraints and accounting for return on idle cash may produce unattractive portfolio-level performance.

A simple initial verification test is to use the backtest data to confirm the gross aggregate win rate. Checking based on the publicly available trade data indicates that the gross win rate is 72%, not 82%. Checking marketing representations can be informative. Average gross profit per trade is 0.94%, with standard deviation 4.18%.

A reasonable next step is to estimate round-trip trading frictions and net trade-level profitability. For example, assume that:

  • Position size for each trade is $10,000.
  • One-way broker transaction fee is $5 per trade.
  • The round-trip effective ETF bid-ask spread is 0.1% of trade value. There is no impact of trading on ETF price.
  • The amortized subscription fee is $36 per round-trip trade ($195 per month subscription fee divided by an average 5.4 trades per month).
  • Ignore costs of shorting and tax implications of trading.

These assumptions produce a per-trade investment friction of 0.56%. Applying this friction to each trade reduces the win rate to 62% and the average profit per trade to 0.38%.
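The friction arithmetic above works out to $10 in broker fees plus $10 of bid-ask spread plus $36 of amortized subscription fee, or $56 on a $10,000 position. A minimal sketch follows; the function names and the sample gross returns are hypothetical, and a trade counts as a win only when its net return is positive.

```python
def round_trip_friction(position_size, one_way_fee=5.0,
                        spread=0.001, subscription_per_trade=36.0):
    """Round-trip cost as a fraction of position size."""
    cost = 2 * one_way_fee + spread * position_size + subscription_per_trade
    return cost / position_size


def net_trade_stats(gross_returns, friction):
    """Win rate and average return after debiting friction from each trade."""
    net = [g - friction for g in gross_returns]
    win_rate = sum(1 for r in net if r > 0) / len(net)
    return win_rate, sum(net) / len(net)


friction = round_trip_friction(10_000)
print(friction)  # 56 / 10,000, about 0.56%

# Hypothetical gross trade returns, for illustration only.
gross = [0.010, 0.003, -0.020, 0.050]
print(net_trade_stats(gross, friction))
```

The second sample trade shows how a small gross winner becomes a net loser, which is how friction lowers the win rate as well as the average profit.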

Dividing the sample of trades into equal halves is a quick way to check for performance persistence. Figure 8-2a summarizes simple gross and net per-trade performance statistics overall and for the first and second halves of the sample. On a per-trade basis, the second half of the sample is much less profitable, but also much less variable, than the first half. The second half of trades is unprofitable on an average net basis (with win rate 55%). The weak performance of the second half of the sample suggests that the first half entails a lucky start for the strategy.
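A split-half check of this kind might look like the following sketch. The function names and sample returns are hypothetical, and the sketch assumes trades are listed in chronological order. With an odd number of trades, such as 165, the halves differ in size by one.

```python
from statistics import mean, stdev


def trade_stats(returns):
    """Average, standard deviation and win rate for a list of trade returns."""
    return {
        "mean": mean(returns),
        "stdev": stdev(returns),
        "win_rate": sum(1 for r in returns if r > 0) / len(returns),
    }


def split_halves(returns):
    """Split a chronological list of trade returns into two halves."""
    mid = len(returns) // 2
    return returns[:mid], returns[mid:]


# Hypothetical net trade returns in chronological order.
net = [0.030, -0.010, 0.045, 0.020, -0.004, 0.002, -0.006, 0.001]
first, second = split_halves(net)
print(trade_stats(first))
print(trade_stats(second))
```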

Figure 8-2a: Gross and Net Trade Performance Overall and by Subsample


A more granular way to assess strategy performance over time is to plot gross or net profit per trade in sequence, as in Figure 8-2b. The best-fit linear trend line (dark dashed line) has a pronounced downward slope from left to right, dipping below the axis. This trend supports belief that the starting point of the sample period is lucky for the strategy.

Figure 8-2b: Net Profitability by Trade in Sequence


A potentially informative robustness test is to compare the performance of long trades and short trades. Since the above analysis ignores any costs of shorting, better performance by the long trades would be encouraging, and better performance by the short trades would be discouraging. Figure 8-2c summarizes simple gross and net per-trade performance statistics for the 90 long and 75 short trades in the sample. The short trades are on average much more profitable than the long trades, but outcomes of short trades are highly volatile. Any broker charges for maintaining short positions would reduce net profitability of the short trades.

Figure 8-2c: Gross and Net Trade Performance of Long and Short Trades


Implementing this strategy means constructing a portfolio to exploit some or all of the signaled trades, as constrained by available resources. Figure 8-2d shows the number of the 165 trades open on each trading day during the 31-month sample period (this figure is the same as Figure 6-1). Trading opportunities cluster. The average number of active positions is 1.1 per trading day (a day when the market is open), but there are no open positions 55% of the time.

Suppose an investor implements a portfolio at the beginning of the sample period with a plan of risking about one third of capital on each position entered (such that short positions are initially fully covered by cash against a risk of margin calls if the ETF price goes up). With this constraint of no more than three open positions, the average number of active positions is 0.93 per day. The portfolio is therefore on average in active positions about 31% of the time (0.93 divided by three) and in cash (awaiting new active opportunities) about 69% of the time.

The portfolio misses 37 out of 165 trading opportunities because no capital is available when the service issues signals for 37 trades. This reduction of trades increases the subscription fee per trade, such that round-trip investment friction escalates from 0.56% to 0.67% per trade. The increase in friction reduces average net profit per trade from 0.38% to 0.28% and win rate from 62% to 61%.
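The three-position constraint can be sketched as a filter over the trade list. This is a simplified illustration: the function name and sample trades are hypothetical, trades are assumed sorted by opening day, and capital from a position that closes on or before a new signal's opening day is assumed to be available for that signal.

```python
def select_trades(trades, max_open=3):
    """trades: (open_day, close_day) tuples sorted by open_day.

    Returns the trades a portfolio limited to `max_open` concurrent
    positions can take; signals arriving while full are skipped.
    """
    open_closes = []  # closing days of currently open positions
    taken = []
    for open_day, close_day in trades:
        # Release positions that have closed by the new signal's open day.
        open_closes = [c for c in open_closes if c > open_day]
        if len(open_closes) < max_open:
            open_closes.append(close_day)
            taken.append((open_day, close_day))
    return taken


# Hypothetical trades as (open_day, close_day), for illustration only.
trades = [(1, 5), (2, 6), (3, 7), (4, 8), (5, 9), (10, 12)]
taken = select_trades(trades)
print(len(trades) - len(taken), "missed")
```

Dividing the same total subscription cost by the smaller number of trades taken is what raises the per-trade friction.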

Figure 8-2d: Availability of Trade Opportunities over Time


Figure 8-2e shows the number of the trades open on each trading day during the 31-month sample period for the constrained portfolio as specified above. The portfolio is fully invested (three positions) about 18% of the time, two-thirds invested about 12% of the time, one-third invested about 15% of the time and fully in cash about 55% of the time. The sample is recent and return on cash is close to zero over the 31 months.

Incorporating time series effects requires stepping through and compounding the sequence of net trade returns for this deployment of capital (assuming initial position sizes of $10,000 per the investment friction estimate above).

Figure 8-2e: Constrained Availability of Trade Opportunities over Time


Figure 8-2f plots the net cumulative performance of the constrained portfolio depicted in Figure 8-2e. Baseline assumptions underlying the graph are:

  • Initial capital is $30,000.
  • If all in cash, allocate one-third of cash to a new position (including short positions to mitigate the risk of margin calls).
  • If one-third invested, allocate half of cash to the next position.
  • If two-thirds invested, allocate the cash balance to the next position.
  • Skip any recommendations made while fully invested.
  • Round-trip trading friction is 0.67% (based on amortization of the fixed subscription fee over 128 trades, plus transaction fee and effective bid-ask spread). This assumption becomes increasingly pessimistic (optimistic) as portfolio value rises (falls).
  • Daily return on cash is about two-thirds the daily yield on 13-week U.S. Treasury bills (T-bill), based on the average percentage of the portfolio that is in cash (for this recent sample, the T-bill yield is near zero so return on cash is negligible). Daily T-bill yields are from Yahoo!Finance.

The graph is “steppy” because calculations do not mark positions to market prices daily while open. Under these assumptions, the combination of investment frictions and return volatility more than offsets average gross return per trade. The terminal value of the portfolio is $25,676, a loss of 14.4% over 31 months.
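The allocation rules above (one third of cash when idle, half of cash with one position open, all cash with two open, skip when full) can be stepped through in code. The sketch below is a simplification under stated assumptions: the function name and sample trades are hypothetical, return on cash is ignored (the text notes it is negligible for this sample), positions are not marked to market while open, friction is debited as a fixed fraction of each position, and capital from a position closing on or before a new signal's opening day is available for that signal.

```python
def simulate_portfolio(trades, capital=30_000.0, friction=0.0067, max_open=3):
    """trades: (open_day, close_day, gross_return) sorted by open_day.

    Returns terminal portfolio value after all positions are closed.
    """
    cash = capital
    open_positions = []  # (close_day, amount, gross_return)
    for open_day, close_day, gross in trades:
        # Close positions that have exited by this signal's open day.
        still_open = []
        for c, amount, g in open_positions:
            if c <= open_day:
                cash += amount * (1 + g - friction)
            else:
                still_open.append((c, amount, g))
        open_positions = still_open
        n = len(open_positions)
        if n >= max_open:
            continue  # fully invested: skip this recommendation
        amount = cash / (max_open - n)  # 1/3, then 1/2, then all of cash
        cash -= amount
        open_positions.append((close_day, amount, gross))
    for _, amount, g in open_positions:
        cash += amount * (1 + g - friction)
    return cash


# Hypothetical trades, for illustration only.
trades = [(1, 5, 0.10), (2, 6, -0.05), (3, 7, 0.02),
          (4, 8, 0.50), (10, 12, 0.00)]
print(simulate_portfolio(trades, friction=0.0))
print(simulate_portfolio(trades))
```

In the sample, the fourth trade is the most profitable but arrives while the portfolio is fully invested, so it contributes nothing to terminal value.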

Figure 8-2f: Baseline Three-position Portfolio Net Performance


The assumed level of investment friction depends on transaction fee, effective bid-ask spread and, especially, position size (for amortization of the fixed transaction fee and subscription service fee). For example, lowering the effective bid-ask spread from 0.10% to 0.02% and increasing the initial position size from $10,000 to $100,000 reduces the three-position portfolio investment friction from 0.67% to 0.08% per trade. Lower investment friction means better net performance. However, setting trading friction to zero increases terminal value only from $25,676 to just $34,060, a gain of 13.5% over 31 months.

Conversely, keeping the transaction fee at $5 and the effective bid-ask spread at 0.10% while lowering initial position size to $5,000 increases investment friction per trade to 1.23% (driven by subscription service fee amortization) and decreases terminal value to $10,085, a loss of 66% over 31 months.
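The dependence of friction on position size can be checked with a short calculation. This sketch assumes, as one plausible reading of the text, that the subscription fee per trade is $195 divided by 5.4 trades per month, scaled up by 165/128 because the constrained portfolio takes only 128 of the 165 trades. The function name is hypothetical.

```python
# Assumption: per-trade subscription cost for the three-position portfolio.
SUBSCRIPTION_PER_TRADE = 195 / 5.4 * 165 / 128


def friction_per_trade(position_size, spread, one_way_fee=5.0,
                       subscription_per_trade=SUBSCRIPTION_PER_TRADE):
    """Round-trip friction as a fraction of position size."""
    cost = 2 * one_way_fee + spread * position_size + subscription_per_trade
    return cost / position_size


for size, spread in [(10_000, 0.0010), (100_000, 0.0002), (5_000, 0.0010)]:
    pct = round(100 * friction_per_trade(size, spread), 2)
    print(size, spread, pct)
```

Under this reading, the three cases reproduce the 0.67%, 0.08% and 1.23% friction levels cited above, with the fixed subscription fee dominating at small position sizes.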

As a benchmark, buying and holding SPDR S&P 500 (SPY) over the sample period provides a total (dividend-reinvested) return of about 40%.

The above analyses, while requiring substantial effort, are still imprecise regarding exact capital requirements for shorting and variation of investment friction with position size over time. The stopping point for due diligence as “conclusive enough” involves judgment.

As illustrated in this example, estimating portfolio-level performance involves many assumptions, and calculations are fairly complex. Changing the assumptions alters the outcome. An investor considering a trade-oriented service should model a personalized portfolio to estimate portfolio-level profitability. Many such services ignore trading frictions and portfolio-level capital management.

8.3 Summary

Key messages from this chapter are:

  • Due diligence even on straightforward investment strategies to address issues of snooping bias, implementation frictions, market adaptation and portfolio-level returns involves a fair amount of work.
  • Investors should approach due diligence as a set of stress tests to break the strategy, in order to judge just how breakable it is.
  • Charlatans selling trading strategies should focus exclusively on what can possibly (maybe even incredibly) go right with a strategy and ignore what can go wrong.

Chapter 9, for investors seeking a due diligence shortcut, considers the alternative of delegating strategy development to experts.
