Targeting Predictors in Random Forest Regression

Daniel Borup Andersen*, Bent Jesper Christensen, Nicolaj Adam Søndergaard Mühlbach, Mikkel Slot Nielsen

*Corresponding author for this work

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer-review

Abstract

Random forest regression (RF) is an extremely popular tool for the analysis of high-dimensional data. Nonetheless, its benefits may be lessened in sparse settings due to weak predictors, and a pre-estimation dimension reduction (targeting) step is required. We show that proper targeting controls the probability of placing splits along strong predictors, thus providing an important complement to RF’s feature sampling. This is supported by simulations using representative finite samples. Moreover, we quantify the immediate gain from targeting in terms of increased strength of individual trees. Macroeconomic and financial applications show that the bias-variance trade-off implied by targeting, due to increased correlation among trees in the forest, is balanced at a medium degree of targeting, selecting the best 5–30% of commonly applied predictors. Improvements in predictive accuracy of targeted RF relative to ordinary RF are considerable, up to 21%, occurring both in recessions and expansions, particularly at long horizons.
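
The interaction between targeting and RF's feature sampling can be illustrated with a simple counting argument. The sketch below is not the paper's derivation. It only computes the chance that at least one strong predictor appears among the candidate features drawn at a single split, under assumptions that are purely illustrative: 120 predictors of which 6 are strong, candidates drawn without replacement, a candidate set of one third of the available predictors, and a targeting step that never discards a strong predictor. The function name `prob_strong_candidate` and all numbers are hypothetical.

```python
from math import comb


def prob_strong_candidate(p: int, s: int, mtry: int) -> float:
    """Probability that at least one of s strong predictors is among
    mtry candidates drawn without replacement from p predictors."""
    if not (0 <= s <= p and 1 <= mtry <= p):
        raise ValueError("need 0 <= s <= p and 1 <= mtry <= p")
    # Complement of drawing all mtry candidates from the p - s weak predictors.
    return 1.0 - comb(p - s, mtry) / comb(p, mtry)


# Hypothetical setting (not taken from the paper).
P_TOTAL, S_STRONG = 120, 6

for keep in (1.0, 0.30, 0.10, 0.05):
    # Assumption: targeting keeps the given share and retains every strong predictor.
    p_kept = max(S_STRONG, round(P_TOTAL * keep))
    # Assumption: candidate set is one third of the retained predictors.
    mtry = max(1, round(p_kept / 3))
    prob = prob_strong_candidate(p_kept, S_STRONG, mtry)
    print(f"keep={keep:>4.0%}  p={p_kept:>3}  mtry={mtry:>2}  P(strong candidate)={prob:.3f}")
```

Under these assumptions the probability rises as fewer predictors are retained, because the strong predictors make up a larger share of the pool. In practice a targeting step can also drop strong predictors and raises the correlation among trees, which is the trade-off the abstract describes.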
Original language: English
Journal: International Journal of Forecasting
Volume: 39
Issue: 2
Pages (from-to): 841-868
Number of pages: 28
ISSN: 0169-2070
DOIs
Publication status: Published - Apr 2023

Keywords

  • Random forests
  • Targeted predictors
  • High-dimensional forecasting
  • Weak predictors
  • Variable selection
