Objective research to aid investing decisions

Investing Expertise

Can analysts, experts and gurus really give you an investing/trading edge? Should you track the advice of as many as possible? Are there ways to tell good ones from bad ones? Recent research indicates that the average “expert” has little to offer individual investors/traders. Finding exceptional advisers is no easier than identifying outperforming stocks. Indiscriminately seeking the output of as many experts as possible is a waste of time. Learning what makes a good expert accurate is worthwhile.

Which Predictors Make Machine Learning Work?

With stock portfolio construction increasingly based on “black box” machine learning models with very large numbers of inputs, how can investors decide whether portfolio recommendations make sense? In their November 2023 paper entitled “The Anatomy of Machine Learning-Based Portfolio Performance”, Philippe Coulombe, David Rapach, Christian Montes Schütte and Sander Schwenk-Nebbe describe a way to use Shapley values to estimate contributions of groups of related inputs to machine learning-based portfolio performance. Their approach applies to any fitted prediction model (or ensemble of models) used to forecast asset returns and construct a portfolio based on the forecasts. They illustrate their approach on an XGBoost machine learning model that each month:

  • Uses 207 firm characteristics to forecast next-month returns of associated stocks.
  • Excludes stocks in the bottom 20% of NYSE market capitalizations.
  • Sorts surviving stocks into fifths, or quintiles, based on forecasted returns.
  • Reforms a hedge portfolio that is long (short) the value-weighted top (bottom) quintile.
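The monthly steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Stock` record, the `nyse_cap_cutoff` parameter (standing in for the NYSE 20th-percentile market capitalization breakpoint) and the function names are hypothetical, and the return forecasts are taken as given rather than produced by XGBoost.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stock:
    ticker: str
    market_cap: float       # used for the size screen and value weights
    forecast_return: float  # model forecast of next-month return (assumed given)
    realized_return: float  # actual next-month return


def value_weighted_return(stocks: List[Stock]) -> float:
    """Market-cap-weighted average of realized returns."""
    total_cap = sum(s.market_cap for s in stocks)
    return sum(s.market_cap / total_cap * s.realized_return for s in stocks)


def long_short_quintile_return(stocks: List[Stock], nyse_cap_cutoff: float) -> float:
    """Return of a portfolio long the top and short the bottom forecast quintile."""
    # Drop stocks below the (hypothetical) NYSE size breakpoint.
    survivors = [s for s in stocks if s.market_cap >= nyse_cap_cutoff]
    # Rank survivors from lowest to highest forecasted return.
    ranked = sorted(survivors, key=lambda s: s.forecast_return)
    quintile_size = len(ranked) // 5
    if quintile_size == 0:
        raise ValueError("need at least five surviving stocks")
    bottom = ranked[:quintile_size]
    top = ranked[-quintile_size:]
    return value_weighted_return(top) - value_weighted_return(bottom)
```

Repeating this calculation each month with fresh forecasts would give a hedge portfolio return series; attributing that series to groups of inputs via Shapley values is a separate step not shown here.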

They then assign each of the 207 inputs to one of 20 groups based on similarities and estimate the contribution of each input group to portfolio performance. Using 207 monthly firm/stock characteristics for all listed U.S. firms and the monthly risk-free rate during January 1960 through December 2021, with portfolio testing commencing January 1973, they find that:

Keep Reading

GPT-4 as Stock Ranker

Can the large language model GPT-4 help investors make investment decisions? In their October 2023 paper entitled “Can ChatGPT Assist in Picking Stocks?”, Matthias Pelster and Joel Val conduct a live test during the 2023 second quarter earnings announcements of the value and timeliness of investment advice from GPT-4 augmented with WebChatGPT for internet access. They ask GPT-4 for two separate series of ratings for each S&P 500 firm over approximately two months:

  1. Considering all available information from news outlets and social media discussions, provide on a scale from -5 to +5 a forecast for the next earnings announcement.
  2. Rate on a scale from -5 to +5 the attractiveness of the stock of each firm over the next month.

They apply these two series to assess the accuracy of GPT-4 earnings forecasts and the response of its stock attractiveness ratings to news. They also measure 30-day future returns of equal-weighted portfolios based on GPT-4 attractiveness ratings, reformed with each ratings update. Using the two series of GPT-4 ratings during July 5, 2023 through September 8, 2023, they find that: Keep Reading

AI CFAs?

Can large language models (LLM) such as ChatGPT and GPT-4 pass the Chartered Financial Analyst (CFA) exam, which covers fundamentals of investment tools, asset valuation, portfolio management and wealth planning? In their October 2023 paper entitled “Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on Mock CFA Exams”, Ethan Callanan, Amarachi Mbakwe, Antony Papadimitriou, Yulong Pei, Mathieu Sibue, Xiaodan Zhu, Zhiqiang Ma, Xiaomo Liu and Sameena Shah investigate whether ChatGPT and GPT-4 could pass the CFA exam. They ask the models to respond to mock exam questions from the first two of the three levels on the exam:

  • Level I – 180 standalone multiple choice questions (using questions from five mock exams).
  • Level II – 22 vignettes and 88 accompanying multiple choice questions, with a higher proportion requiring interpretation of numerical data and calculations than found in Level I (using questions from two mock exams).
  • Level III – a mix of vignette-related essay questions and vignette-related multiple choice questions (untested due to the difficulty of assessing essay responses).

They assess responses to the Level I and II mock exam questions via three approaches:

  1. Gauging inherent model reasoning abilities without providing any correct examples.
  2. Facilitating model acquisition of new knowledge by providing examples of good responses for either (a) a random sample of questions within level or (b) one question from each exam topic.
  3. Prompting the models to address each question step-by-step and to show their work for calculations.

They then compare responses of the two models to approved answers and estimate whether either could pass based on proficiency thresholds reported by CFA exam takers on Reddit. Using mock CFA Level I and II exam questions and the three test approaches as described above, they find that: Keep Reading

Predicting Short-term Market Returns with LLM-generated Market Sentiment

Does financial news sentiment as interpreted by large language models (LLM) such as ChatGPT and BARD predict short-term stock market returns? In their September 2023 paper entitled “Large Language Models and Financial Market Sentiment”, Shaun Bond, Hayden Klok and Min Zhu separately test the abilities of ChatGPT and BARD to predict daily, weekly and monthly S&P 500 Index returns based on sentiments they extract from daily financial news summaries. ChatGPT is trained on information available on the web through September 2021. In contrast, BARD is connected to the web and updates itself on live information. The authors:

  1. Ask each of ChatGPT and BARD to summarize the most important news from the Thomson Reuters News Archives for each trading day starting in January 2000.
  2. Consolidate each set of daily summaries.
  3. Ask each of ChatGPT and BARD to use their respective set of summaries to quantify market sentiment each day on a scale from 1 (weakest) to 100 (strongest) and separately evaluate the sentiment as positive, neutral or negative.
  4. Relate via regressions each set of daily sentiment measurements to next-day, next-week and next-month S&P 500 Index returns. These regressions control for same-day index return, VIX, short-term credit risk and the term spread (plus additional variables when predicting monthly returns). 

For ChatGPT, analysis extends through September 2021 (the end of its training period). For BARD, analysis continues through July 2023. As benchmarks, they consider sentiment measurements from two traditional dictionary methods and two simple transformer classifiers. To estimate economic value of predictions, they compute certainty equivalent returns (CER) for a mean-variance investor who allocates between the S&P 500 Index and a risk-free asset each day according to out-of-sample sentiment measurements starting in 2006. Using Thomson Reuters News Archives and daily, weekly and monthly S&P 500 Index returns since January 2000, they find that: Keep Reading
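As a rough illustration of the certainty equivalent return calculation mentioned above, the sketch below uses the common textbook form for a mean-variance investor: CER equals mean portfolio return minus half of risk aversion times return variance. The risk aversion of 3, the weight limits of 0 and 1, and the function names are assumptions for illustration; the summary above does not specify the authors' settings.

```python
from statistics import mean, pvariance
from typing import Sequence


def mean_variance_weight(expected_excess_return: float,
                         return_variance: float,
                         risk_aversion: float = 3.0,
                         min_weight: float = 0.0,
                         max_weight: float = 1.0) -> float:
    """Stock index weight for a mean-variance investor, clipped to assumed limits."""
    raw = expected_excess_return / (risk_aversion * return_variance)
    return max(min_weight, min(max_weight, raw))


def certainty_equivalent_return(portfolio_returns: Sequence[float],
                                risk_aversion: float = 3.0) -> float:
    """Textbook CER: mean return minus (risk aversion / 2) times variance."""
    return mean(portfolio_returns) - 0.5 * risk_aversion * pvariance(portfolio_returns)
```

Comparing the CER of a sentiment-timed allocation with that of a benchmark allocation gives a per-period estimate of the economic value of the sentiment signal.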

Using ChatGPT to Assess Soft Firm-level Risks

Can artificial intelligence (AI) models help investors quantify vague firm risks through textual analysis? In their October 2023 paper entitled “From Transcripts to Insights: Uncovering Corporate Risks Using Generative AI”, Alex Kim, Maximilian Muhn and Valeri Nikolaev explore the value of the generative AI tool ChatGPT 3.5 in quantifying firm risks based on politics, climate change and AI as conveyed in earnings conference call transcripts. For each of the three risks, they generate: (1) risk summaries based solely on the transcripts, and (2) risk assessments in full context based on the transcripts plus all ChatGPT training data. They consider risk analysis both within (before September 2021) and outside (January 2022 through March 2023) ChatGPT’s training period. They test the import of ChatGPT-based risk assessments via 5-factor (accounting for market, size, book-to-market, profitability and investment effects) alphas of hedge portfolios that are long the fifth (quintile) of stocks with the highest assessed risks and short the quintile with the lowest. Using earnings transcripts and monthly returns for a broad sample of U.S. stocks during January 2018 through March 2023, they find that: Keep Reading

Do ETFs Following Gurus/Insiders Work?

Do exchange-traded funds (ETF) that attempt to mimic holdings of hedge fund gurus and/or firm insiders offer attractive performance? To investigate, we consider seven ETFs, three live and four dead, in order of introduction:

    • Invesco Insider Sentiment (NFO) – focuses on stocks attracting interest of insiders such as company executives, fund managers and sell side analysts. This fund is dead as of February 2020.
    • Invesco BuyBack Achievers (PKW) – tracks the Nasdaq US BuyBack Achievers Index, comprised of stocks of U.S. firms with a net decline in shares outstanding of 5% or more in the last 12 months.
    • Direxion All Cap Insider Sentiment (KNOW) – tracks the S&P Composite 1500 Executive Activity & Analyst Estimate Index, comprised of U.S. stocks that have favorable analyst ratings and are being acquired by firm insiders (top management, directors and large institutions). This fund is dead as of October 2020.
    • AlphaClone Alternative Alpha (ALFA) – tracks the proprietary AlphaClone Hedge Fund Masters Index, comprised of U.S. securities held by the highest ranked managers of hedge funds and institutions. This fund is dead as of August 2022.
    • Global X Guru Index (GURU) – tracks the Solactive Guru Index, comprised of the highest conviction ideas from a select pool of hedge funds.
    • Direxion iBillionaire (IBLN) – tracks the proprietary iBillionaire Index, comprised of 30 U.S. mid and large cap securities. This fund is dead as of April 2018.
    • Goldman Sachs Hedge Industry VIP (GVIP) – tracks the proprietary GS Hedge Fund VIP Index, comprised of stocks appearing most frequently among the top 10 equity holdings of fundamentally driven hedge fund managers.

We use SPDR S&P 500 (SPY) as a simple benchmark for all these ETFs. We focus on monthly return statistics, along with compound annual growth rates (CAGR) and maximum drawdowns (MaxDD). Using monthly returns for the above guru/insider-following ETFs and SPY as available through September 2023, we find that: Keep Reading
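For readers who want to reproduce the two summary statistics named above, the following sketch computes CAGR and MaxDD from a series of monthly returns expressed as decimals. The function names are illustrative, and measuring drawdown at monthly frequency (which ignores intra-month lows) is an assumption consistent with the focus on monthly data.

```python
from typing import Sequence


def cagr(monthly_returns: Sequence[float]) -> float:
    """Compound annual growth rate implied by a series of monthly returns."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    years = len(monthly_returns) / 12.0
    return growth ** (1.0 / years) - 1.0


def max_drawdown(monthly_returns: Sequence[float]) -> float:
    """Deepest peak-to-trough decline in cumulative value, as a negative fraction."""
    wealth = 1.0
    peak = 1.0
    worst = 0.0
    for r in monthly_returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst
```

Applying both functions to an ETF and to SPY over the same months keeps the comparison on equal footing, since both statistics depend on the sample period.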

Deep Reinforcement Learning Versus MPT

Does machine learning reliably offer better risk-adjusted portfolio performance than traditional modern portfolio theory (MPT)? In their August 2023 paper entitled “Comparing Deep RL and Traditional Financial Portfolio Methods”, Eric Benhamou, Jean-Jacques Ohana, Beatrice Guez, David Saltiel, Rida Laraki and Jamal Atif compare principles, methodologies and risk-adjusted performances of dynamic deep reinforcement learning (DRL) and MPT. The DRL approach seeks long-only allocations that maximize Sharpe ratio (calculated assuming a zero risk-free rate). DRL training data includes individual asset returns, portfolio drawdown and contextual variables including U.S. and European interest rates, the CBOE volatility index (VIX), credit default swap prices, currency rates (U.S. dollar index), GDP and CPI forecasts, crude oil/gold/copper inventories and global, U.S., European, Japanese and emerging markets economic surprise indexes. DRL training employs an expanding window, each year training on available historical data and testing on the next year. They consider three MPT portfolios also using an expanding window of historical data to estimate inputs: (1) full MPT (Markowitz); (2) minimum variance; and, (3) risk parity. Their global test data consists of daily returns of 11 futures contract series for four major equity indexes, four major bond indexes and three major commodity indexes. They assume trading frictions of 0.02% of value traded. Using the specified (groomed) data during 2000 through mid-2023, they find that: Keep Reading
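Two of the mechanics described above, a Sharpe ratio with a zero risk-free rate and trading frictions of 0.02% of value traded, can be sketched as follows. This is an illustration only: the 252-day annualization factor, the use of sample standard deviation and the function names are assumptions, not details taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def sharpe_ratio_zero_rf(daily_returns: Sequence[float],
                         periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio, treating the risk-free rate as zero."""
    return mean(daily_returns) / stdev(daily_returns) * sqrt(periods_per_year)


def trading_cost(old_weights: Sequence[float],
                 new_weights: Sequence[float],
                 friction: float = 0.0002) -> float:
    """Cost of rebalancing as a fraction of portfolio value (0.02% of value traded)."""
    turnover = sum(abs(new - old) for old, new in zip(old_weights, new_weights))
    return friction * turnover
```

Subtracting `trading_cost` from the portfolio return on each rebalancing day before computing the Sharpe ratio gives a net-of-friction comparison across allocation methods.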

Should the “Anxious Index” Make Investors Anxious?

Since 1990, the Federal Reserve Bank of Philadelphia has conducted a quarterly Survey of Professional Forecasters. The American Statistical Association and the National Bureau of Economic Research conducted the survey from 1968 through 1989. Among other things, the survey solicits from experts probabilities of U.S. economic recession (negative GDP growth) during each of the next four quarters. The survey report release schedule is mid-quarter. For example, the release date of the third quarter 2023 report is August 11, 2023, with forecasts through the third quarter of 2024. The “Anxious Index” is the probability of recession during the next quarter. Are these forecasts meaningful for future U.S. stock market returns? Rather than relate the probability of recession to stock market returns, we instead relate one minus the probability of recession (the probability of good times). If forecasts are accurate, a relatively high (low) forecasted probability of good times should indicate a relatively strong (weak) stock market. Using survey results and quarterly S&P 500 Index levels (on survey release dates as available, and mid-quarter before availability of release dates) from the fourth quarter of 1968 through the third quarter of 2023 (220 surveys), we find that:

Keep Reading

AI and Asset Management

Will emerging artificial intelligence (AI) tools such as the generative large language model ChatGPT have important roles in the economy, including asset management? In his September 2023 paper entitled “Generative AI: Overview, Economic Impact, and Applications in Asset Management”, Martin Luk reviews the evolution of generative AI models, their economic impact and their applications in asset management. Specifically, he covers:

  • Key innovations and methodologies in large language models such as ChatGPT and in image-based, multimodal and tool-using generative AI models.
  • Impacts of generative AI on jobs and productivity in various industries, with focus on uses in investment management.
  • Dangers and risks associated with the use of generative AI, including the issue of hallucinations.

Based on review of nearly 200 source papers, he concludes that: Keep Reading

Online, Real-time Test of AI Stock Picking

Will equity funds “managed” by artificial intelligence (AI) outperform human investors? To investigate, we consider the performance of AI Powered Equity ETF (AIEQ). Per the offeror, the EquBot model supporting AIEQ: “…leverages IBM’s Watson AI to conduct an objective, fundamental analysis of U.S. domiciled common stocks, including Special Purpose Acquisitions Corporations (“SPAC”), and real estate investment trusts (“REITs”) based on up to ten years of historical data and apply that analysis to recent economic and news data… Each day, the EquBot Model…identifies approximately 30 to 200 companies with the greatest potential over the next twelve months for appreciation and their corresponding weights, targeting a maximum risk adjusted return versus the broader U.S. equity market. …The EquBot model limits the weight of any individual company to 10%. At times, a significant portion of the Fund’s assets may consist of cash and cash equivalents.” We use SPDR S&P 500 (SPY) as a simple benchmark for AIEQ performance. Using daily and monthly dividend-adjusted closes of AIEQ and SPY from AIEQ inception (October 18, 2017) through September 2023, we find that: Keep Reading
