Lookahead Bias in Large Language Model Training Data

April 26, 2024 • Posted in Investing Expertise

Can large language models (LLMs) inject lookahead bias into backtests when researchers are not rigorous about how LLM training samples are generated? In their preliminary and incomplete March 2024 paper entitled “Lookahead Bias in Pretrained Language Models”, Suproteem Sarkar and Keyon Vafa examine the potential for lookahead bias in backtests that use the Llama-2 LLM to identify future firm risks from the content of earnings calls. They consider two cases: (1) the backtest period falls within the LLM training sample, but the researcher instructs the LLM to consider only information available before the test period; and (2) the researcher specifies a training sample that ends before the backtest, but that sample is generated long after the end of the training period. Using Llama-2 to interpret transcripts of selected firm earnings calls from 2018, they find that:
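Case (1) above can be sketched as a simple overlap check between an LLM's training-data cutoff and the backtest window. This is a minimal illustration, not the paper's method; the `lookahead_risk` function and the cutoff date used in the example are illustrative assumptions.

```python
from datetime import date

def lookahead_risk(training_data_end: date,
                   backtest_start: date,
                   backtest_end: date) -> str:
    """Classify a backtest's exposure to LLM lookahead bias.

    Compares the end of the model's training data to the backtest window.
    Note: even a "clean" result does not rule out case (2), where the
    training sample itself is generated after the training period ends.
    """
    if training_data_end >= backtest_end:
        return "full-overlap"    # case (1): entire backtest lies inside the training sample
    if training_data_end > backtest_start:
        return "partial-overlap" # backtest straddles the training cutoff
    return "clean"               # training data ends before the backtest begins

# A 2018 backtest against a model whose training data ends in 2023
# (illustrative cutoff, not Llama-2's documented one):
print(lookahead_risk(date(2023, 7, 1), date(2018, 1, 1), date(2018, 12, 31)))
# prints "full-overlap"
```

As the comment notes, this check addresses only case (1); guarding against case (2) requires knowing when the training samples were generated, not just what period they cover.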

