Is the conventional linear factor model, built from a few presumably independent predictors, the best, or even a good, way to model differences in returns across assets? In the December 2019 update of their paper entitled “The Cross-Section of Returns: A Non-Parametric Approach”, Enoch Cheng and Clemens Struck compare the predictive power of conventional linear models with that of less presumptive tree-based methods. The latter accommodate multivariate interactions and non-linearities across all predictors. They consider two linear and two tree-based methods with parameter settings commonly used in other studies:

1a. Logit – a linear regression model including all factors.

1b. LASSO – a linear regression model with a shrinkage term that sets betas to zero for (discards) predictors that do not add information, and thereby acts as a variable selection tool.

2a. Bagged regression trees – bootstrapping to create different samples from the original data, growing an individual tree on each and combining predictions of individual trees by a simple majority vote.

2b. Boosted regression trees – a modification of bagging whereby bagging and growing trees take place sequentially, with the bootstrapping adjusted after each new tree to improve prediction accuracy for the forest.
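
The bagging idea in 2a (bootstrap samples, one tree per sample, simple majority vote) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation or parameter settings: each "tree" is a single-split stump on one hypothetical predictor, the target is the sign of next-period return, the data are invented, and the names `fit_stump`, `predict_stump`, `fit_bagged_stumps` and `bagged_predict` are made up for this example.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-split tree: pick the threshold and orientation with the
    fewest misclassifications. Returns (threshold, label_above)."""
    best = None
    for threshold in sorted(set(xs)):
        for label_above in (1, -1):
            errors = sum(
                1
                for x, y in zip(xs, ys)
                if (label_above if x > threshold else -label_above) != y
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, label_above)
    return best[1], best[2]


def predict_stump(stump, x):
    """Predict +1 or -1 (sign of next-period return) from one stump."""
    threshold, label_above = stump
    return label_above if x > threshold else -label_above


def fit_bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Grow one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def bagged_predict(stumps, x):
    """Combine the individual stump predictions by simple majority vote."""
    votes = Counter(predict_stump(s, x) for s in stumps)
    return votes.most_common(1)[0][0]


# Invented toy data: one hypothetical signal and the sign of the next return.
signal = [-0.9, -0.7, -0.5, -0.3, -0.1, 0.1, 0.3, 0.5, 0.7, 0.9]
next_return_sign = [-1, -1, -1, -1, -1, 1, 1, 1, 1, 1]

stumps = fit_bagged_stumps(signal, next_return_sign)
prediction_high = bagged_predict(stumps, 0.8)
prediction_low = bagged_predict(stumps, -0.8)
```

An odd number of stumps avoids tied votes between the two classes. Boosting, as described in 2b, would differ in that each new tree's sample depends on the errors of the trees grown before it, rather than being drawn independently.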

Specifically, they measure relationships between 59 predictor variables and next-month (4-week) return for a universe of 28 liquid commodity futures series. This asset universe has low trading costs and avoids survivorship bias. They use nearest-, second- and third-month contracts, trading only the nearest and using the other two solely to construct signals. They generally roll contracts 10 days before the last trade date. The 59 predictors include time series (intrinsic or absolute) momentum variants, moving average variants, volatility variants, value metrics, miscellaneous variables, dummies for calendar months and dummies for each of the 28 commodity contract series. They consider long-short portfolios that are long the top and short the bottom assets ranked by expected return: top half versus bottom half, top five versus bottom five, and top three versus bottom three. Their break point between in-sample and out-of-sample testing is the end of 2013. Using monthly data for the 28 commodity contract series and the 59 predictors during January 1987 through October 2019, *they find that:*