Objective research to aid investing decisions

Investing Expertise

Can analysts, experts and gurus really give you an investing/trading edge? Should you track the advice of as many as possible? Are there ways to tell good ones from bad ones? Recent research indicates that the average “expert” has little to offer individual investors/traders. Finding exceptional advisers is no easier than identifying outperforming stocks. Indiscriminately seeking the output of as many experts as possible is a waste of time. Learning what makes a good expert accurate is worthwhile.

Making AI Do Numbers to Predict Stock Returns

Can large language models like ChatGPT work with numbers to support technical analysis of stock returns rather than just words to support sentiment analysis? In his April 2024 paper entitled “StockGPT: A GenAI Model for Stock Prediction and Trading”, Dat Mai introduces StockGPT, an autoregressive model trained and tested on stock returns rather than firm news. He segments 1926 through 2000 daily U.S. stock returns into intervals (tokens) and then trains StockGPT to identify predictive return patterns. He then tests the model, using daily returns over the past 256 trading days to predict next-day returns during 2001 through 2023. He assesses accuracy of return predictions by:

  1. Running cross sectional regressions of actual versus predicted daily returns.
  2. Each day at the market close: (1) removing stocks in the lowest tenth of market values; and, (2) reforming an equal-weighted or value-weighted hedge portfolio that is long (short) stocks within the highest (lowest) tenth, or decile, of predicted returns.
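The two accuracy checks above can be sketched for a single day's cross section as follows. This is a minimal illustration, not the paper's code: the column names `predicted`, `realized` and `market_cap` are hypothetical, and the regression is reduced to a simple one-variable least-squares slope.

```python
import numpy as np
import pandas as pd


def cross_sectional_slope(day: pd.DataFrame) -> float:
    """Slope from regressing realized on predicted returns for one day (check 1)."""
    slope, _intercept = np.polyfit(day["predicted"], day["realized"], 1)
    return float(slope)


def decile_hedge_return(day: pd.DataFrame, value_weighted: bool = False) -> float:
    """Return of a long-short decile portfolio for one day (check 2).

    Assumed (hypothetical) columns: 'predicted', 'realized', 'market_cap'.
    """
    # Step (1): remove stocks in the lowest tenth of market values.
    cutoff = day["market_cap"].quantile(0.10)
    kept = day[day["market_cap"] > cutoff]

    # Step (2): long the highest decile of predicted returns, short the lowest.
    low, high = kept["predicted"].quantile([0.10, 0.90])
    long_leg = kept[kept["predicted"] >= high]
    short_leg = kept[kept["predicted"] <= low]

    def leg_return(leg: pd.DataFrame) -> float:
        # Equal weights by default; market-value weights if requested.
        if value_weighted:
            weights = leg["market_cap"]
        else:
            weights = pd.Series(1.0, index=leg.index)
        return float((leg["realized"] * weights).sum() / weights.sum())

    return leg_return(long_leg) - leg_return(short_leg)
```

Applying `decile_hedge_return` to each day's data and chaining the results would give a daily hedge portfolio return series, before any of the cost, delay or price filters considered in sensitivity tests.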

He performs sensitivity tests that account for portfolio costs, portfolio construction delay and elimination of low-priced stocks. Using daily returns for all common stocks traded on NYSE, AMEX or NASDAQ during 1926 through 2023, he finds that:

Failure of Non-causal Factor Strategies

Do widely used associational (rather than causal) methods used by researchers to specify factor models of asset returns work? In their March 2024 paper entitled “The Case for Causal Factor Investing”, Marcos Lopez de Prado, Alex Lipton and Vincent Zoonekynd describe the shortcomings of associational methods of factor model development. They address p-hacking (data snooping), with focus on interferences from variables called colliders (causally influenced by two or more variables) and confounders (influencing both dependent and independent variables). They further describe what can be done to correct these shortcomings. Based on logical/mathematical analysis and the body of financial markets research, they conclude that:

ChatGPT-generated Financial News Sentiment and NASDAQ Returns

Can ChatGPT extract market sentiment from financial news that is useful for timing equity markets? In their April 2024 paper entitled “Sentiment Analysis of Bloomberg Markets Wrap Using ChatGPT: Application to the NASDAQ”, Baptiste Lefort, Eric Benhamou, Jean-Jacques Ohana, David Saltiel, Beatrice Guez and Thomas Jacquot use ChatGPT to assess whether daily Bloomberg Global Markets Wrap, Market Talks and Morning Reports anticipate NASDAQ returns. Specifically, they each day:

  • Ask ChatGPT to identify important news themes and characterize them as headlines.
  • Ask ChatGPT to assess whether each headline is positive, negative or neutral for future stock prices.
  • Compute a sentiment score that combines sentiments for all daily headlines.
  • Compute a daily cumulative sentiment score (C) for the last 20 trading days.
  • Compute a daily detrended cumulative sentiment score (DC) by comparing C to its value over the last 20 trading days (extending the overall lookback interval to 40 days).
  • If C is positive (negative), take a long (short) position in the NASDAQ index with a 2-day lag to ensure executability and a 0.2% debit for trading frictions on each position change. Repeat this procedure for DC.
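The scoring and timing steps above might look roughly like the following sketch. It rests on assumptions not stated in the summary: DC is computed here as C minus its own 20-day average (one possible reading of "comparing C to its value over the last 20 trading days"), and the 0.2% friction is charged as a flat debit on any day the position changes. Function and parameter names are hypothetical.

```python
import numpy as np
import pandas as pd


def sentiment_positions(daily_score: pd.Series, window: int = 20,
                        lag: int = 2, detrend: bool = False) -> pd.Series:
    """Map daily sentiment scores to +1 (long), -1 (short) or 0 (no position)."""
    # C: cumulative sentiment over the last `window` trading days.
    signal = daily_score.rolling(window).sum()
    if detrend:
        # DC (assumed form): C minus its average over the last `window` days.
        signal = signal - signal.rolling(window).mean()
    # Sign of the signal, delayed by `lag` days; no position until data exist.
    return np.sign(signal).shift(lag).fillna(0.0)


def strategy_returns(index_returns: pd.Series, positions: pd.Series,
                     cost: float = 0.002) -> pd.Series:
    """Daily strategy returns net of an assumed flat cost per position change."""
    changed = positions.diff().fillna(positions).ne(0)
    return positions * index_returns - cost * changed
```

Sharpe, Sortino and Calmar ratios would then be computed from the output of `strategy_returns` and compared with those of the index return series itself.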

They separately examine cumulative performances of the long, short and overall returns of the C and DC variations of this strategy, focusing on Sharpe, Sortino and Calmar ratios as key performance metrics. Their benchmark is buying and holding the NASDAQ index. Using the specified daily financial news sources and daily NASDAQ index returns during 2010 through 2023, they find that:

Coordinated Retail Traders Won the War with Short Sellers?

Do short-selling hedge funds consistently extract alpha from exuberant retail traders? In their March 2024 paper entitled “Short-Selling Hedge Funds”, Jialin Qian, Zhen Shi and Baozhong Yang examine the performance of hedge funds engaged in short-selling, as follows:

  1. Which hedge funds are likely short-sellers, and how do they compare with other hedge funds?
  2. What factors contribute to the performance of short-selling hedge funds?
  3. How has the 2021 Meme stock phenomenon affected short-selling hedge funds?

Each month, they identify short-selling hedge funds as those with positive return betas over the past 24 months versus a monthly rebalanced portfolio of short stock positions with weights proportional to their respective short interests. They relate behaviors of short-selling funds to those of other hedge funds and to those of retail traders. Using monthly data for 11,054 U.S. hedge funds, returns and short interests for a broad sample of U.S. stocks and data to measure retail stock trading/sentiment during 2010 through 2022, they find that:

Lookahead Bias in Large Language Model Training Data

Can large language models (LLMs) inject lookahead bias into backtests when rigor is lacking in generation of LLM training samples? In their preliminary and incomplete March 2024 paper entitled “Lookahead Bias in Pretrained Language Models”, Suproteem Sarkar and Keyon Vafa examine the potential for lookahead bias in backtests using the Llama-2 LLM to identify future firm risks based on content of earnings calls. They consider cases for which: (1) the backtest falls within the LLM training sample, but the researcher tells the LLM to consider only information before the test period; and, (2) the researcher specifies a training sample that ends before the backtest but generates it long after the end of the training sample. Using Llama-2 to interpret transcripts of selected firm earnings calls from 2018, they find that:

Informativeness of Seeking Alpha Articles for Stock Returns

Are sentiments conveyed in Seeking Alpha articles useful for stock picking? In their January 2023 paper entitled “Seeking Alpha: More Sophisticated Than Meets the Eye”, Duo Selina Pei, Abhinav Anand and Xing Huan apply two-pass natural language processing to test the informativeness of articles from Seeking Alpha incremental to publicly available earnings data. Specifically, they each month:

  • Associate articles with one or more specific stocks.
  • Extract positive and negative sentiment at both phrase and aggregate levels for each article/stock.
  • Calculate a standardized net sentiment for each article/stock based on the difference between positive and negative mentions, emphasizing event sentiment over general sentiment.
  • Rank articles/stocks based on standardized net sentiment over the last month. Reform equal-weighted portfolios of articles/stocks by ranked tenths (deciles). Calculate both immediate [-1,+1] and 90-day future [+2,+90] average gross raw returns and average gross abnormal returns adjusted for size, book-to-market and momentum.
  • Sort stocks into 20 groups based on monthly standardized net sentiments up to two days before portfolio selection, excluding stocks with few articles or neutral sentiment. Reform an equal-weighted hedge portfolio that is long stocks with the highest sentiments and short stocks with the lowest (on average, 105 long and 86 short positions).
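A simplified sketch of the scoring and sorting steps above follows. The summary does not give the standardization formula, so the ratio used here, (positive - negative) / (positive + negative), is an assumption, as are the column names `sentiment` and `n_articles` and the `min_articles` threshold.

```python
import pandas as pd


def net_sentiment(positive: int, negative: int) -> float:
    """Assumed standardization: net mentions scaled by total mentions."""
    total = positive + negative
    return 0.0 if total == 0 else (positive - negative) / total


def long_short_groups(stocks: pd.DataFrame, n_groups: int = 20,
                      min_articles: int = 2):
    """Return (long tickers, short tickers) from the extreme sentiment groups.

    `stocks` is indexed by ticker with hypothetical columns
    'sentiment' and 'n_articles'.
    """
    # Exclude stocks with few articles or neutral sentiment.
    eligible = stocks[(stocks["n_articles"] >= min_articles)
                      & (stocks["sentiment"] != 0)]
    # Rank first so that ties do not produce duplicate bin edges.
    ranks = eligible["sentiment"].rank(method="first")
    groups = pd.qcut(ranks, n_groups, labels=False)
    longs = groups[groups == n_groups - 1].index.tolist()
    shorts = groups[groups == 0].index.tolist()
    return longs, shorts
```

An equal-weighted hedge portfolio would hold the `longs` and short the `shorts` with equal weights within each side.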

Using 350,095 articles published on Seeking Alpha from its inception in 2004 through the beginning of October 2018, daily returns of matched stocks and their options and associated earnings surprise data as available, they find that:

Day Trading Stocks with ChatGPT

Can artificial intelligence platforms such as ChatGPT be good stock day traders? In his March 2024 paper entitled “Can ChatGPT Generate Stock Tickers to Buy and Sell for Day Trading?”, Sangheum Cho tests whether ChatGPT 3.5 turbo supports profitable day trading. He instructs ChatGPT to pretend to be a professional day trader who picks from among U.S. listed stocks 100 to buy and 100 to sell for short-term returns based on daily Bloomberg and the Wall Street Journal news blurbs on Twitter. Each day, prior to the market open, he:

  • Uses the Refinitiv Eikon News Monitor to collect the selected tweets from the past 24 hours. He removes hyperlinks and duplicate tweets.
  • Segments the tweets into batches to accommodate ChatGPT processing limitations.
  • For each batch, asks ChatGPT to generate 100 BUY and 100 SELL signals, with 30 iterations for each batch to amplify signals by suppressing spurious selections. He then constructs equal-weighted long and short portfolios of stocks with signals.
  • For each stock with signals:
    • Sums BUY and SELL signals across batches/iterations to calculate SUM_BUY and SUM_SELL signals. He constructs signal count-weighted long and short portfolios from these summed signals.
    • Subtracts SUM_SELL from SUM_BUY to calculate NET_BUY and NET_SELL signals. He constructs signal magnitude-weighted long and short portfolios from these netted signals.
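The signal aggregation steps above can be illustrated with a short sketch. It assumes each ChatGPT response has already been parsed into a pair of ticker lists (BUY, SELL). The function names are hypothetical, and treating positive net counts as NET_BUY and negative net counts as NET_SELL is one reading of the description.

```python
from collections import Counter


def aggregate_signals(responses):
    """Sum BUY and SELL signals per ticker across batches/iterations.

    `responses` is an iterable of (buy_tickers, sell_tickers) pairs.
    """
    sum_buy, sum_sell = Counter(), Counter()
    for buys, sells in responses:
        sum_buy.update(buys)
        sum_sell.update(sells)
    return sum_buy, sum_sell


def proportional_weights(counts):
    """Portfolio weights proportional to signal counts (sum to 1)."""
    total = sum(counts.values())
    return {ticker: n / total for ticker, n in counts.items()}


def net_signal_weights(sum_buy, sum_sell):
    """Signal magnitude-weighted long and short weights from netted signals."""
    tickers = set(sum_buy) | set(sum_sell)
    net = {t: sum_buy[t] - sum_sell[t] for t in tickers}
    net_buy = {t: n for t, n in net.items() if n > 0}
    net_sell = {t: -n for t, n in net.items() if n < 0}
    return proportional_weights(net_buy), proportional_weights(net_sell)
```

Here `proportional_weights(sum_buy)` gives signal count-weighted long weights, while `net_signal_weights` gives the netted variant. Equal weights across all tickers with any signal correspond to the first portfolio type.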

For each portfolio, he excludes stocks with zero daily volume, missing daily prices or incomplete trading histories for the previous five trading days. He measures returns from the market open to the market close. Using 222,659 tweets (only 16,359 of which are firm-specific) and daily opening and closing prices for U.S. listed common stocks during December 2022 through December 2023 (271 trading days), he finds that:

A Professor’s Stock Picks

Does finance professor David Kass, who presents annual lists of stock picks on Seeking Alpha, make good selections? To investigate, we consider his picks of:

We compare the average return for stock picks each year with that for SPDR S&P 500 ETF Trust (SPY) for the same year as a benchmark. Using dividend-adjusted returns from Yahoo!Finance for SPY and most stock picks and returns from Barchart.com and Investing.com for three picks during their selection years, we find that:

Compendium of Live ETF Factor/Niche Premium Capture Tests

Some exchange-traded funds (ETF) focus on capturing potentially attractive factor premiums or thematic niches. Their histories offer a way to test these concepts live. We have conducted many such tests, listed here to offer a global view.

  1. “U.S. Equity Premium?” – evidence from simple tests on about 21 years of data suggests that stock market leadership shifts between the U.S. and other developed markets over time, but the U.S. may be better overall.
  2. “Tech Equity Premium?” – evidence from simple tests on 24 years of data suggests long boom, short bust for a tech/innovation-concentrated portfolio. It does not support belief in risk-adjusted outperformance.
  3. “Measuring the Size Effect with Capitalization-based ETFs” – evidence from simple tests of capitalization-based ETFs with nearly 22 years of data offers little support for belief in a long-term, reliably exploitable size effect among U.S. stocks.
  4. “Do Equal Weight ETFs Beat Cap Weight Counterparts?” – evidence from simple tests on some equal-weight U.S. equity ETFs offers little support for belief that equal weighting substantially and reliably beats capitalization weighting on a net basis.
  5. “Measuring the Value Premium with Value and Growth ETFs” – evidence from simple tests with 21.6 years of available data does not support belief that investors reliably capture a value premium via popular value-growth ETFs.
  6. “Are Equity Momentum ETFs Working?” – available evidence on attractiveness of momentum-oriented U.S. stock and sector ETFs is less than compelling.
  7. “Are Stock Quality ETFs Working?” – available evidence offers little support for belief that quality ETFs reliably beat respective benchmarks.
  8. “Are Low Volatility Stock ETFs Working?” – available evidence on attractiveness of low volatility stock ETFs is mixed, with recent data undermining belief in reliability of low volatility outperformance.
  9. “Are Equity Multifactor ETFs Working?” – available evidence offers very little support for belief that equity multifactor ETFs beat their benchmarks, or that they offer material diversification with comparable performance.
  10. “Are Hedge Fund ETFs Working?” – evidence on attractiveness of hedge fund-oriented ETFs is mostly negative.
  11. “Are Managed Futures ETFs Working?” – available evidence on attractiveness of managed futures ETFs in aggregate (but with recent short-sample exceptions) suggests that any benefits from diversification of equities and fixed income are unlikely to compensate for poor absolute returns.
  12. “Best Safe Haven ETF?” – evidence from simple tests over available and common sample periods suggests that silver, gold, longer-term U.S. Treasuries and investment grade corporate bonds are safe havens, while crude oil is clearly not.
  13. “Do High-dividend Stock ETFs Beat the Market?” – evidence from data for high-dividend U.S. stock ETFs does not support belief that high-dividend stocks reliably outperform the broad U.S. stock market.
  14. “Are ESG ETFs Attractive?” – available evidence suggests that ESG ETFs do not perform much differently from selected benchmarks.
  15. “How Are Renewable Energy ETFs Doing?” – available evidence on attractiveness of renewable energy ETFs is adverse overall, but with bursts of market outperformance perhaps due to novelty.
  16. “How Are Robotics-AI ETFs Doing?” – available evidence is that robotics-AI ETFs are less attractive than the broader technology exposure offered by QQQ.
  17. “How Are AI-powered ETFs Doing?” – available evidence does not support belief that ETFs using AI to select and weight assets are particularly attractive.
  18. “Are iShares Core Allocation ETFs Attractive?” – available evidence regarding attractiveness of iShares Core Asset Allocation ETFs is mixed to negative.
  19. “Are Target Retirement Date Funds Attractive?” – evidence offers little support for belief that target retirement date mutual funds are preferable to simple stocks-bonds diversification.
  20. “How Are TIPS ETFs Doing?” – available evidence on attractiveness of TIPS ETFs is mostly favorable after the recent inflation burst, with shorter duration funds offering more reliable inflation protection.
  21. “Are Equity Index Covered Call ETFs Working?” – available evidence on attractiveness of equity index covered call ETFs as either substitutes for or diversifiers of underlying stock indexes is generally adverse.
  22. “Are Equity Put-Write ETFs Working?” – available evidence on attractiveness of equity put-write ETFs is adverse.
  23. “Are IPO ETFs Working?” – available evidence on attractiveness of IPO ETFs is mixed, requiring very high risk tolerance of interested investors.
  24. “Are Preferred Stock ETFs Working?” – available evidence on attractiveness of preferred stock ETFs relative to a 60-40 stocks-bonds portfolio is largely negative.
  25. “Do Convertible Bond ETFs Attractively Meld Stocks and Bonds?” – available evidence suggests that convertible bond ETFs sometimes outperform and sometimes underperform a conventional 60-40 stocks-bonds portfolio.
  26. “Do ETFs Following Gurus/Insiders Work?” – available evidence on attractiveness of guru/insider-following stock ETFs is mostly adverse.
  27. “Congressional Trade Tracking ETFs” – limited available evidence suggests that investors should choose a fund mimicking holdings of Democrat rather than Republican members of Congress.
  28. “The Long and Short of Jim” – available evidence does not support belief that funds based on Jim Cramer’s stock/market recommendations reliably produce attractive short-term returns.
  29. “Live Test of the Stock Market Overnight Move Effect” – early evidence does not support belief in exploitability of the overnight move effect.

The upshot of the above items is that academic factor research and thematic speculations rarely translate to outperformance when implemented with ETFs.

A global caution is that the period since 2009 is strong for broad equity indexes, driven by a few large-capitalization firms. This trend may not persist.

How Are AI-powered ETFs Doing?

How do exchange-traded funds (ETFs) that employ artificial intelligence (AI) to pick assets perform? To investigate, we consider six such ETFs, all currently available, as follows:

We use SPDR S&P 500 ETF Trust (SPY) for comparison, though it is not conceptually matched to some of the ETFs. We focus on monthly return statistics, along with compound annual growth rates (CAGR) and maximum drawdowns (MaxDD). Using monthly total returns for the six AI-powered ETFs and SPY as available through January 2024, we find that:
