Random Forest
An ensemble of decision trees trained on bootstrapped data samples, combining predictions to reduce overfitting and improve generalization.
Overview
Random Forest is an ensemble machine learning algorithm that builds multiple decision trees on random subsets of the training data and random subsets of features, then aggregates their predictions. In trading, it can capture complex non-linear relationships between technical indicators, fundamental data, and future returns that linear models like logistic regression cannot detect. Its built-in feature importance scores are also valuable for understanding which signals drive predictions.
How it looks on a chart
Illustration only — synthetic data generated for visual reference.
A single decision tree learns rules like "if RSI < 30 AND price is above the 200-day MA AND MACD histogram is positive, then the stock will likely go up." But a single tree can memorize the training data and fail on new data (overfitting). Random Forest solves this by building hundreds of these trees, each trained on a slightly different random sample of the historical data and using a random subset of the available indicators. Each tree votes on the outcome, and the majority wins. Because no single tree sees all the data or all the indicators, the ensemble is much more robust.

One of the most useful by-products is feature importance: the forest automatically reports which indicators were most useful for making predictions. This helps you understand whether, for example, RSI, MACD, or volume was more predictive for a particular asset during a particular period.
Key Random Forest parameters: n_estimators (number of trees, typically 100–500), max_features (number of features considered at each split; √n_features is the usual default for classification), max_depth (tree depth; shallower trees reduce overfitting), and min_samples_leaf (minimum samples per leaf node, which acts as regularization).

For time-series financial data, randomly shuffling observations into train/test sets is incorrect: data must be split temporally (train on years 1–5, test on year 6) to avoid look-ahead bias. Walk-forward cross-validation with expanding or sliding training windows is the correct methodology.

Feature engineering is critical. Raw price levels are non-stationary and therefore not informative inputs. Instead, use returns over multiple horizons, technical indicator values (RSI, MACD, ADX), volatility measures (ATR percentile), volume ratios, and calendar features (day of week, month). Feature scaling is not strictly necessary for tree models but is good practice.
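The temporal-split rule can be sketched with scikit-learn (assumed available). The data, window sizes, and feature construction below are synthetic placeholders for illustration, not a recommended configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic daily data (hypothetical): rows in time order, columns stand in
# for engineered features such as multi-period returns, RSI, ATR percentile.
rng = np.random.default_rng(0)
n_days, n_features = 756, 10
X = rng.normal(size=(n_days, n_features))
# Illustrative label: sign of a noisy combination of two features.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_days) > 0).astype(int)

def walk_forward_predictions(X, y, initial_window=252, step=63):
    """Expanding-window walk-forward: fit on all bars before t, predict the next `step` bars."""
    preds, truth = [], []
    for start in range(initial_window, len(X) - step + 1, step):
        model = RandomForestClassifier(
            n_estimators=100,
            max_depth=5,              # shallow trees as regularization
            max_features="sqrt",      # sqrt(n_features) candidates per split
            min_samples_leaf=10,
            random_state=0,
        )
        model.fit(X[:start], y[:start])                      # past data only
        preds.extend(model.predict(X[start:start + step]))   # strictly future bars
        truth.extend(y[start:start + step])
    return np.array(preds), np.array(truth)

preds, truth = walk_forward_predictions(X, y)
accuracy = (preds == truth).mean()
```

Because every fit uses only bars before the prediction window, the out-of-sample accuracy here is free of look-ahead bias by construction.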
Random Forest's theoretical advantage over linear models in trading comes from its ability to capture interaction effects: situations where the predictive value of RSI depends on the level of ADX (RSI signals work differently in trending vs. ranging markets). Linear models cannot capture this without explicitly engineered interaction terms. A well-documented weakness of Random Forest in financial applications is temporal covariate shift: the relationship between features and returns changes across market regimes. A forest trained on 2010–2020 data may perform poorly in 2022 if the macro environment has changed fundamentally. Adaptive approaches that weight recent data more heavily, or that retrain more frequently, partially address this.

In academic finance, machine learning models including Random Forest have been shown to significantly outperform linear models for cross-sectional equity return prediction (Gu, Kelly, and Xiu 2020). Using 94 predictive signals, their gradient-boosted trees and neural networks achieve an out-of-sample R² of roughly 0.4% for monthly returns; small, but highly significant given the sample size, and sufficient for profitable long-short strategies.
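A minimal synthetic illustration of such an interaction effect, assuming scikit-learn is available: the "RSI" feature below is constructed so that its predictive direction flips with the "ADX" level, a conditional rule a logistic regression cannot represent without an engineered interaction term:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: "RSI" predicts up moves only when "ADX" is low
# (ranging market) and predicts the opposite when ADX is high (trending).
rng = np.random.default_rng(1)
n = 4000
rsi = rng.uniform(0, 100, n)
adx = rng.uniform(0, 50, n)
signal = np.where(adx < 25, rsi < 30, rsi > 70)     # XOR-like interaction
y = (signal ^ (rng.random(n) < 0.05)).astype(int)   # 5% label noise
X = np.column_stack([rsi, adx])

linear = LogisticRegression(max_iter=1000).fit(X[:3000], y[:3000])
forest = RandomForestClassifier(n_estimators=200, random_state=1).fit(X[:3000], y[:3000])

lin_acc = linear.score(X[3000:], y[3000:])   # linear boundary cannot fit both corners
rf_acc = forest.score(X[3000:], y[3000:])    # trees learn the conditional rule
```

On this toy data the forest recovers the regime-dependent rule while the logistic model hovers near the base rate, which is the mechanism behind the interaction-effect advantage described above.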
Formula
Prediction = majority_vote(Tree₁(x), Tree₂(x), ..., Treeₙ(x))

Each tree: trained on a bootstrap sample, with a random feature subset at each split.

Feature importance: Gini impurity reduction averaged across all trees.
1. Prepare features (technical indicators as inputs) and labels (future return sign or magnitude).
2. Bootstrap n training samples with replacement; select a random √p features at each split.
3. Grow each decision tree to max_depth using the selected features and bootstrapped data.
4. Aggregate predictions: majority vote for classification, mean for regression.
5. Extract feature importances (mean Gini decrease); validate out-of-sample with walk-forward testing.
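The steps above can be sketched from scratch with individual scikit-learn decision trees on synthetic data (all names and data are illustrative; the walk-forward validation from step 5 is omitted for brevity):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Step 1: hypothetical feature matrix and binary label (future return sign).
rng = np.random.default_rng(2)
n, p = 1000, 8
X = rng.normal(size=(n, p))
y = (X[:, 0] - X[:, 3] + rng.normal(scale=0.3, size=n) > 0).astype(int)

n_trees, max_features = 50, int(np.sqrt(p))
trees = []
for i in range(n_trees):
    idx = rng.integers(0, n, n)          # step 2: bootstrap sample with replacement
    tree = DecisionTreeClassifier(
        max_depth=5,
        max_features=max_features,       # step 2: random sqrt(p) features per split
        random_state=i,
    )
    trees.append(tree.fit(X[idx], y[idx]))   # step 3: grow the tree

votes = np.stack([t.predict(X) for t in trees])           # step 4: collect votes
ensemble_pred = (votes.mean(axis=0) > 0.5).astype(int)    # majority vote

# Step 5: mean Gini-decrease importance averaged across trees.
importances = np.mean([t.feature_importances_ for t in trees], axis=0)
top_feature = int(np.argmax(importances))
```

In practice `sklearn.ensemble.RandomForestClassifier` wraps this whole loop, but the manual version makes the bootstrap/vote mechanism explicit.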
Parameters
| Parameter | Default | Range | Description |
|---|---|---|---|
| Number of Trees | 100 | 10–500 | Number of decision trees in the ensemble. |
| Max Depth | 5 | 2–20 | Maximum depth of each decision tree (lower = more regularization). |
| Training Window | 252 | 60–1000 | Number of bars used for model training. |
Trading signals
bullish: Random Forest probability of up move > 0.60
Ensemble of trees predicts positive return — high-confidence long signal.
bearish: Random Forest probability of up move < 0.40
Ensemble predicts negative return — high-confidence short or exit signal.
neutral: Top feature importance shifts to volatility features
Model detecting regime change — reduce position size and monitor closely.
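A sketch of the probability-threshold signals above, assuming a scikit-learn classifier (the neutral row in the table is importance-based; here, probabilities between the two thresholds are simply treated as no-signal):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Map predicted up-move probability to a signal using the 0.60 / 0.40
# thresholds from the table above.
def classify_signal(p_up: float) -> str:
    if p_up > 0.60:
        return "bullish"
    if p_up < 0.40:
        return "bearish"
    return "neutral"

# Hypothetical synthetic features and labels, for illustration only.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 6))
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)
model = RandomForestClassifier(n_estimators=100, random_state=3).fit(X[:400], y[:400])

p_up = model.predict_proba(X[400:])[:, 1]   # probability of the "up" class
signals = [classify_signal(p) for p in p_up]
```

The probability itself doubles as a position-sizing input: the further it sits from 0.5, the higher the ensemble's agreement.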
Limitations
- Computationally expensive to train on large datasets; requires periodic retraining as market conditions evolve.
- Not inherently suited to sequential (temporal) data without careful feature engineering to capture time dependencies.
- Black-box nature: while feature importances help, individual predictions are harder to explain than those of linear models.
- Overfits easily if max_depth is too high or the training window is too short relative to the feature dimensionality.
Gilito trains rolling Random Forest classifiers on each asset with a 1-year expanding window, using the full suite of 50+ technical indicator features as inputs. Feature importance rankings from these models inform which indicators are most predictive per asset, and the model outputs are combined with LSTM forecasts in a meta-ensemble that drives strategy selection.