Objective research to aid investing decisions

Investing Expertise

Can analysts, experts and gurus really give you an investing/trading edge? Should you track the advice of as many as possible? Are there ways to tell good ones from bad ones? Recent research indicates that the average “expert” has little to offer individual investors/traders. Finding exceptional advisers is no easier than identifying outperforming stocks. Indiscriminately seeking the output of as many experts as possible is a waste of time. Learning what makes a good expert accurate is worthwhile.

Inherent Misspecification of Factor Models?

Do linear factor model specification choices inherently produce out-of-sample underperformance of investment strategies seeking to exploit factor premiums? In their January 2024 paper entitled “Why Has Factor Investing Failed?: The Role of Specification Errors”, Marcos Lopez de Prado and Vincent Zoonekynd examine whether standard practices induce factor specification errors and how such errors might explain actual underperformance of popular factor investing strategies. They consider potential effects of confounding variables and colliding variables on factor model out-of-sample performance. Based on logical derivations, they conclude that: Keep Reading

The State of LLM Use in Accounting and Finance

How might Large Language Models (LLM), trained to understand, generate and interact with human language via billions or trillions of tuned parameters, impact accounting and finance? In their December 2023 paper entitled “A Scoping Review of ChatGPT Research in Accounting and Finance”, Mengming Dong, Theophanis Stratopoulos and Victor Wang synthesize recent publications and working papers on ChatGPT and related LLMs to inform practitioners and researchers of the latest developments and uses. They also provide a brief history of LLMs. Based on review of about 200 papers released during January 2022 through October 2023, they conclude that: Keep Reading

Performance of Barron’s Annual Top 10 Stocks

Each year in December, Barron’s publishes its list of the best 10 stocks for the next year. Do these picks on average beat the market? To investigate, we scrape the web to find these lists for years 2011 through 2023, calculate the associated calendar year total return for each stock and calculate the average return for the 10 stocks for each year. We use SPDR S&P 500 ETF Trust (SPY) as a benchmark for these averages. We source most stock prices from Yahoo!Finance, but also use Historical Stock Price.com for a few stocks no longer tracked by Yahoo!Finance. Using year-end dividend-adjusted stock prices for the specified stock-years during 2010 through 2023, we find that: Keep Reading
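The return arithmetic described above can be sketched as follows. This is a minimal illustration, not the actual study code: the tickers and prices are hypothetical placeholders, and the function names are assumptions introduced here.

```python
from statistics import mean


def calendar_year_return(prior_year_end_price, year_end_price):
    """Total return for one calendar year from dividend-adjusted year-end prices."""
    return year_end_price / prior_year_end_price - 1.0


def average_pick_return(prices_by_ticker):
    """Equal-weighted average return across the picks for one year.

    prices_by_ticker maps ticker -> (prior year-end price, year-end price).
    """
    return mean(calendar_year_return(p0, p1) for p0, p1 in prices_by_ticker.values())


# Hypothetical dividend-adjusted prices (placeholders, not actual data).
picks = {"AAA": (100.0, 110.0), "BBB": (50.0, 45.0), "CCC": (20.0, 25.0)}
benchmark_prices = (400.0, 440.0)  # stands in for the SPY benchmark

avg_picks = average_pick_return(picks)
benchmark_return = calendar_year_return(*benchmark_prices)
excess = avg_picks - benchmark_return
print(f"picks {avg_picks:.2%}, benchmark {benchmark_return:.2%}, excess {excess:.2%}")
```

Because dividend-adjusted prices already embed dividends, the simple price ratio serves as a total return here.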

Which Predictors Make Machine Learning Work?

With stock portfolio construction increasingly based on “black box” machine learning models with very large numbers of inputs, how can investors decide whether portfolio recommendations make sense? In their November 2023 paper entitled “The Anatomy of Machine Learning-Based Portfolio Performance”, Philippe Coulombe, David Rapach, Christian Montes Schütte and Sander Schwenk-Nebbe describe a way to use Shapley values to estimate contributions of groups of related inputs to machine learning-based portfolio performance. Their approach applies to any fitted prediction model (or ensemble of models) used to forecast asset returns and construct a portfolio based on the forecasts. They illustrate their approach on an XGBoost machine learning model that each month:

  • Uses 207 firm characteristics to forecast next-month returns of associated stocks.
  • Excludes stocks in the bottom 20% of NYSE market capitalizations.
  • Sorts surviving stocks into fifths, or quintiles, based on forecasted returns.
  • Re-forms a hedge portfolio that is long (short) the value-weighted top (bottom) quintile.

They then assign each of the 207 inputs to one of 20 groups based on similarities and estimate the contribution of each input group to portfolio performance. Using 207 monthly firm/stock characteristics for all listed U.S. firms and the monthly risk-free rate during January 1960 through December 2021, with portfolio testing commencing January 1973, they find that:

Keep Reading
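The quintile sort and value-weighted hedge portfolio in the steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the stock records are hypothetical, the forecasts are taken as given rather than produced by XGBoost, and the NYSE size filter is omitted.

```python
def value_weighted_return(stocks):
    """Market-capitalization-weighted average return of a group of stocks."""
    total_cap = sum(s["cap"] for s in stocks)
    return sum(s["cap"] * s["ret"] for s in stocks) / total_cap


def hedge_portfolio_return(stocks, n_groups=5):
    """Return of a portfolio long the top forecast group and short the bottom group."""
    ranked = sorted(stocks, key=lambda s: s["forecast"])
    size = len(ranked) // n_groups
    if size == 0:
        raise ValueError("need at least n_groups stocks")
    bottom, top = ranked[:size], ranked[-size:]
    return value_weighted_return(top) - value_weighted_return(bottom)


# Hypothetical universe of 10 stocks: forecast, market cap and realized next-month return.
universe = [
    {"forecast": 0.01 * i, "cap": 100.0 * (i + 1), "ret": 0.002 * i - 0.005}
    for i in range(10)
]
hedge_return = hedge_portfolio_return(universe)
print(f"hedge portfolio return: {hedge_return:.4%}")
```

Each month, the sort would be repeated with fresh forecasts and the two quintiles re-formed.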

GPT-4 as Stock Ranker

Can the large language model GPT-4 help investors make investment decisions? In their October 2023 paper entitled “Can ChatGPT Assist in Picking Stocks?”, Matthias Pelster and Joel Val conduct a live test during the 2023 second quarter earnings announcements of the value and timeliness of investment advice from GPT-4 augmented with WebChatGPT for internet access. They ask GPT-4 for two separate series of ratings for each S&P 500 firm over approximately two months:

  1. Considering all available information from news outlets and social media discussions, provide on a scale from -5 to +5 a forecast for the next earnings announcement.
  2. Rate on a scale from -5 to +5 the attractiveness of the stock of each firm over the next month.

They apply these two series to assess the accuracy of GPT-4 earnings forecasts and the response of its stock attractiveness ratings to news. They also measure 30-day future returns of equal-weighted portfolios based on GPT-4 attractiveness ratings, re-formed with each ratings update. Using the two series of GPT-4 ratings during July 5, 2023 through September 8, 2023, they find that: Keep Reading


Can large language models (LLM) such as ChatGPT and GPT-4 pass the Chartered Financial Analyst (CFA) exam, which covers fundamentals of investment tools, asset valuation, portfolio management and wealth planning? In their October 2023 paper entitled “Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on Mock CFA Exams”, Ethan Callanan, Amarachi Mbakwe, Antony Papadimitriou, Yulong Pei, Mathieu Sibue, Xiaodan Zhu, Zhiqiang Ma, Xiaomo Liu and Sameena Shah investigate whether ChatGPT and GPT-4 could pass the CFA exam. They ask the models to respond to mock exam questions from the first two of the three levels on the exam:

  • Level I – 180 standalone multiple choice questions (using questions from five mock exams).
  • Level II – 22 vignettes and 88 accompanying multiple choice questions, with a higher proportion requiring interpretation of numerical data and calculations than found in Level I (using questions from two mock exams).
  • Level III – a mix of vignette-related essay questions and vignette-related multiple choice questions (untested due to the difficulty of assessing essay responses).

They assess responses to the Level I and II mock exam questions via three approaches:

  1. Gauging inherent model reasoning abilities without providing any correct examples.
  2. Facilitating model acquisition of new knowledge by providing examples of good responses for either (a) a random sample of questions within level or (b) one question from each exam topic.
  3. Prompting the models to address each question step-by-step and to show their work for calculations.

They then compare responses of the two models to approved answers and estimate whether either could pass based on proficiency thresholds reported by CFA exam takers on Reddit. Using mock CFA Level I and II exam questions and the three test approaches as described above, they find that: Keep Reading

Predicting Short-term Market Returns with LLM-generated Market Sentiment

Does financial news sentiment as interpreted by large language models (LLM) such as ChatGPT and BARD predict short-term stock market returns? In their September 2023 paper entitled “Large Language Models and Financial Market Sentiment”, Shaun Bond, Hayden Klok and Min Zhu separately test the abilities of ChatGPT and BARD to predict daily, weekly and monthly S&P 500 Index returns based on sentiments they extract from daily financial news summaries. ChatGPT is trained on information available on the web through September 2021. In contrast, BARD is connected to the web and updates itself on live information. The authors:

  1. Ask each of ChatGPT and BARD to summarize the most important news from the Thomson Reuters News Archives for each trading day starting in January 2000.
  2. Consolidate each set of daily summaries.
  3. Ask each of ChatGPT and BARD to use their respective set of summaries to quantify market sentiment each day on a scale from 1 (weakest) to 100 (strongest) and separately evaluate the sentiment as positive, neutral or negative.
  4. Relate via regressions each set of daily sentiment measurements to next-day, next-week and next-month S&P 500 Index returns. These regressions control for same-day index return, VIX, short-term credit risk and the term spread (plus additional variables when predicting monthly returns). 

For ChatGPT, analysis extends through September 2021 (the end of its training period). For BARD, analysis continues through July 2023. As benchmarks, they consider sentiment measurements from two traditional dictionary methods and two simple transformer classifiers. To estimate economic value of predictions, they compute certainty equivalent returns (CER) for a mean-variance investor who allocates between the S&P 500 Index and a risk-free asset each day according to out-of-sample sentiment measurements starting in 2006. Using Thomson Reuters News Archives and daily, weekly and monthly S&P 500 Index returns since January 2000, they find that: Keep Reading
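For readers unfamiliar with certainty equivalent return, one common formulation is mean portfolio return minus half the risk aversion coefficient times portfolio return variance. The sketch below uses that formulation. The risk aversion value, the weight bounds and the sample returns are assumptions for illustration only; the summary above does not give the authors' exact specification.

```python
from statistics import mean, pvariance


def mean_variance_weight(expected_excess_return, variance, risk_aversion=3.0,
                         lower=0.0, upper=1.5):
    """Risky-asset weight for a mean-variance investor, clipped to [lower, upper].

    The bounds and the default risk aversion are illustrative assumptions.
    """
    raw = expected_excess_return / (risk_aversion * variance)
    return max(lower, min(upper, raw))


def certainty_equivalent_return(portfolio_returns, risk_aversion=3.0):
    """CER = mean return - (risk_aversion / 2) * population variance of returns."""
    return mean(portfolio_returns) - 0.5 * risk_aversion * pvariance(portfolio_returns)


# Hypothetical per-period portfolio returns (placeholders, not actual data).
sample_returns = [0.01, -0.02, 0.03, 0.00]
cer = certainty_equivalent_return(sample_returns)
weight = mean_variance_weight(expected_excess_return=0.0004, variance=0.0001)
print(f"CER {cer:.6f}, weight {weight:.4f}")
```

Comparing the CER of a sentiment-driven allocation to that of a benchmark allocation gives an estimate of the return the investor would give up to keep access to the sentiment signal.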

Using ChatGPT to Assess Soft Firm-level Risks

Can artificial intelligence (AI) models help investors quantify vague firm risks through textual analysis? In their October 2023 paper entitled “From Transcripts to Insights: Uncovering Corporate Risks Using Generative AI”, Alex Kim, Maximilian Muhn and Valeri Nikolaev explore the value of generative AI tool ChatGPT 3.5 in quantifying firm risks based on politics, climate change and AI as conveyed in earnings conference call transcripts. For each of the three risks, they generate: (1) risk summaries based solely on the transcripts, and (2) risk assessments in full context based on the transcripts plus all ChatGPT training data. They consider risk analysis both within (before September 2021) and outside (January 2022 through March 2023) ChatGPT’s training period. They test the import of ChatGPT-based risk assessments via 5-factor (accounting for market, size, book-to-market, profitability and investment effects) alphas of hedge portfolios that are long the fifth (quintile) of stocks with the highest assessed risks and short the quintile with the lowest. Using earnings transcripts and monthly returns for a broad sample of U.S. stocks during January 2018 through March 2023, they find that: Keep Reading
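A factor alpha is the intercept from regressing portfolio returns on factor returns. The sketch below estimates it by ordinary least squares using only the standard library. It is an illustration, not the authors' code: the data are synthetic, only two factors are used for brevity (the paper uses five), and the helper names are assumptions introduced here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def factor_alpha(portfolio_returns, factor_returns):
    """Return (alpha, betas) from OLS of portfolio returns on factor returns.

    factor_returns holds one row of factor returns per period.
    """
    rows = [[1.0] + list(f) for f in factor_returns]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, portfolio_returns)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    return coefs[0], coefs[1:]


# Synthetic monthly data: hedge return = 0.002 + 1.1 * f1 - 0.5 * f2 exactly.
factors = [(0.01, 0.02), (-0.02, 0.01), (0.03, -0.01),
           (0.00, 0.005), (0.015, -0.02), (-0.01, 0.0)]
hedge_returns = [0.002 + 1.1 * f1 - 0.5 * f2 for f1, f2 in factors]
alpha, betas = factor_alpha(hedge_returns, factors)
print(f"alpha {alpha:.6f}, betas {[round(b, 4) for b in betas]}")
```

With real data the fit is not exact, so the alpha estimate would be reported with a standard error or t-statistic, which this sketch does not compute.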

Do ETFs Following Gurus/Insiders Work?

Do exchange-traded funds (ETF) that attempt to mimic holdings of hedge fund gurus and/or firm insiders offer attractive performance? To investigate, we consider seven ETFs, three live and four dead, in order of introduction:

  • Invesco Insider Sentiment (NFO) – focuses on stocks attracting interest of insiders such as company executives, fund managers and sell side analysts. This fund is dead as of February 2020.
  • Invesco BuyBack Achievers (PKW) – tracks the Nasdaq US BuyBack Achievers Index, comprised of stocks of U.S. firms with a net decline in shares outstanding of 5% or more in the last 12 months.
  • Direxion All Cap Insider Sentiment (KNOW) – tracks the S&P Composite 1500 Executive Activity & Analyst Estimate Index, comprised of U.S. stocks that have favorable analyst ratings and are being acquired by firm insiders (top management, directors and large institutions). This fund is dead as of October 2020.
  • AlphaClone Alternative Alpha (ALFA) – tracks the proprietary AlphaClone Hedge Fund Masters Index, comprised of U.S. securities held by the highest ranked managers of hedge funds and institutions. This fund is dead as of August 2022.
  • Global X Guru Index (GURU) – tracks the Solactive Guru Index, comprised of the highest conviction ideas from a select pool of hedge funds.
  • Direxion iBillionaire (IBLN) – tracks the proprietary iBillionaire Index, comprised of 30 U.S. mid and large cap securities. This fund is dead as of April 2018.
  • Goldman Sachs Hedge Industry VIP (GVIP) – tracks the proprietary GS Hedge Fund VIP Index, comprised of stocks appearing most frequently among the top 10 equity holdings of fundamentally driven hedge fund managers.

We use SPDR S&P 500 (SPY) as a simple benchmark for all these ETFs. We focus on monthly return statistics, along with compound annual growth rates (CAGR) and maximum drawdowns (MaxDD). Using monthly returns for the above guru/insider-following ETFs and SPY as available through September 2023, we find that: Keep Reading
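For reference, CAGR and MaxDD can be computed from a monthly return series as in the sketch below. The sample return series are hypothetical, and measuring drawdown at monthly frequency is an assumption that matches the monthly data described above.

```python
def cagr(monthly_returns):
    """Compound annual growth rate implied by a series of monthly returns."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    years = len(monthly_returns) / 12.0
    return growth ** (1.0 / years) - 1.0


def max_drawdown(monthly_returns):
    """Deepest peak-to-trough decline in cumulative value (a non-positive number)."""
    value = peak = 1.0
    worst = 0.0
    for r in monthly_returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# Hypothetical monthly returns (placeholders, not actual ETF data).
steady_year = [0.01] * 12
choppy_quarter = [0.10, -0.20, 0.05]
print(f"CAGR {cagr(steady_year):.2%}, MaxDD {max_drawdown(choppy_quarter):.2%}")
```

Applying both functions to each ETF and to SPY over matched sample periods keeps the comparison like-for-like.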

Deep Reinforcement Learning Versus MPT

Does machine learning reliably offer better risk-adjusted portfolio performance than traditional modern portfolio theory (MPT)? In their August 2023 paper entitled “Comparing Deep RL and Traditional Financial Portfolio Methods”, Eric Benhamou, Jean-Jacques Ohana, Beatrice Guez, David Saltiel, Rida Laraki and Jamal Atif compare principles, methodologies and risk-adjusted performances of dynamic deep reinforcement learning (DRL) and MPT. The DRL approach seeks long-only allocations that maximize Sharpe ratio (calculated assuming a zero risk-free rate). DRL training data includes individual asset returns, portfolio drawdown and contextual variables including U.S. and European interest rates, the CBOE volatility index (VIX), credit default swap prices, currency rates (U.S. dollar index), GDP and CPI forecasts, crude oil/gold/copper inventories and global, U.S., European, Japanese and emerging markets economic surprise indexes. DRL training employs an expanding window, each year training on available historical data and testing on the next year. They consider three MPT portfolios, also using an expanding window of historical data to estimate inputs: (1) full MPT (Markowitz); (2) minimum variance; and (3) risk parity. Their global test data consists of daily returns of 11 futures contract series for four major equity indexes, four major bond indexes and three major commodity indexes. They assume trading frictions of 0.02% of value traded. Using the specified (groomed) data during 2000 through mid-2023, they find that: Keep Reading
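Three of the mechanics mentioned above (a zero-risk-free-rate Sharpe ratio, a risk parity allocation and proportional trading frictions) can be sketched as follows. This is a rough illustration under stated assumptions: inverse-volatility weighting is a naive stand-in for risk parity that ignores correlations, the 252-day annualization is assumed, and the return series are hypothetical. It is not the authors' implementation.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio assuming a zero risk-free rate."""
    return mean(daily_returns) / stdev(daily_returns) * math.sqrt(periods_per_year)


def inverse_volatility_weights(returns_by_asset):
    """Naive risk parity: weight each asset in proportion to 1 / volatility."""
    inverse_vol = {a: 1.0 / stdev(r) for a, r in returns_by_asset.items()}
    total = sum(inverse_vol.values())
    return {a: v / total for a, v in inverse_vol.items()}


def trading_cost(old_weights, new_weights, friction=0.0002):
    """Cost as a fraction of portfolio value, at 0.02% of value traded."""
    assets = set(old_weights) | set(new_weights)
    traded = sum(abs(new_weights.get(a, 0.0) - old_weights.get(a, 0.0)) for a in assets)
    return friction * traded


# Hypothetical daily returns (placeholders, not actual futures data).
history = {
    "equity": [0.02, -0.02, 0.02, -0.02],
    "bond": [0.01, -0.01, 0.01, -0.01],
}
weights = inverse_volatility_weights(history)
cost = trading_cost({"equity": 0.5, "bond": 0.5}, weights)
print(weights, f"rebalancing cost {cost:.6%}")
```

In an expanding-window test, the weights would be re-estimated each year on all data available to that point, and the cost deducted from the next period's return.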
