Random Forest
An ensemble of decision trees trained on bootstrapped data samples, combining predictions to reduce overfitting and improve generalization.
Overview
Random Forest is an ensemble machine learning algorithm that builds multiple decision trees on random subsets of the training data and random subsets of features, then aggregates their predictions. In trading, it can capture complex non-linear relationships between technical indicators, fundamental data, and future returns that linear models like logistic regression cannot detect. Its built-in feature importance scores are also valuable for understanding which signals drive predictions.
How it looks on a chart
Illustration only — synthetic data generated for visual reference.
A single decision tree learns rules like "if RSI < 30 AND price is above the 200-day MA AND MACD histogram is positive, then the stock will likely go up." But a single tree can memorize the training data and fail on new data (overfitting). Random Forest solves this by building hundreds of these trees, each trained on a slightly different random sample of the historical data and using a random subset of the available indicators. Each tree votes on the outcome, and the majority wins. Because no single tree sees all the data or all the indicators, the ensemble is much more robust.

One of the most useful by-products is feature importance: the forest automatically reports which indicators were most useful for making predictions. This helps you understand whether, for example, RSI, MACD, or volume was more predictive for a particular asset during a particular period.
Key Random Forest parameters: n_estimators (number of trees, typically 100–500), max_features (number of features considered at each split; √n_features is the usual default for classification), max_depth (tree depth; shallower trees reduce overfitting), and min_samples_leaf (minimum samples per leaf node, which acts as regularization).

For time-series financial data, randomly shuffling observations into train/test sets is incorrect: data must be split temporally (train on years 1–5, test on year 6) to avoid look-ahead bias. Walk-forward cross-validation with expanding or sliding training windows is the correct methodology.

Feature engineering is critical. Raw price levels are non-stationary and therefore not informative inputs. Instead, use returns over multiple horizons, technical indicator values (RSI, MACD, ADX), volatility measures (ATR percentile), volume ratios, and calendar features (day of week, month). Feature scaling is not strictly necessary for tree models but is good practice.
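The temporal-split rule can be sketched with scikit-learn (assumed available). The data, window sizes, and feature construction below are synthetic placeholders for illustration, not a recommended configuration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic daily data (hypothetical): rows in time order, columns stand in
# for engineered features such as multi-period returns, RSI, ATR percentile.
rng = np.random.default_rng(0)
n_days, n_features = 756, 10
X = rng.normal(size=(n_days, n_features))
# Illustrative label: sign of a noisy combination of two features.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_days) > 0).astype(int)

def walk_forward_predictions(X, y, initial_window=252, step=63):
    """Expanding-window walk-forward: fit on all bars before t, predict the next `step` bars."""
    preds, truth = [], []
    for start in range(initial_window, len(X) - step + 1, step):
        model = RandomForestClassifier(
            n_estimators=100,
            max_depth=5,              # shallow trees as regularization
            max_features="sqrt",      # sqrt(n_features) candidates per split
            min_samples_leaf=10,
            random_state=0,
        )
        model.fit(X[:start], y[:start])                      # past data only
        preds.extend(model.predict(X[start:start + step]))   # strictly future bars
        truth.extend(y[start:start + step])
    return np.array(preds), np.array(truth)

preds, truth = walk_forward_predictions(X, y)
accuracy = (preds == truth).mean()
```

Because every fit uses only bars before the prediction window, the out-of-sample accuracy here is free of look-ahead bias by construction.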
Random Forest's theoretical advantage over linear models in trading comes from its ability to capture interaction effects: situations where the predictive value of RSI depends on the level of ADX (RSI signals work differently in trending vs. ranging markets). Linear models cannot capture this without explicitly engineered interaction terms. A well-documented weakness of Random Forest in financial applications is temporal covariate shift: the relationship between features and returns changes across market regimes. A forest trained on 2010–2020 data may perform poorly in 2022 if the macro environment has changed fundamentally. Adaptive approaches that weight recent data more heavily, or that retrain more frequently, partially address this.

In academic finance, machine learning models including Random Forest have been shown to significantly outperform linear models for cross-sectional equity return prediction (Gu, Kelly, and Xiu 2020). Using 94 predictive signals, their gradient-boosted trees and neural networks achieve an out-of-sample R² of roughly 0.4% for monthly returns; small, but highly significant given the sample size, and sufficient for profitable long-short strategies.
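A minimal synthetic illustration of such an interaction effect, assuming scikit-learn is available: the "RSI" feature below is constructed so that its predictive direction flips with the "ADX" level, a conditional rule a logistic regression cannot represent without an engineered interaction term:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: "RSI" predicts up moves only when "ADX" is low
# (ranging market) and predicts the opposite when ADX is high (trending).
rng = np.random.default_rng(1)
n = 4000
rsi = rng.uniform(0, 100, n)
adx = rng.uniform(0, 50, n)
signal = np.where(adx < 25, rsi < 30, rsi > 70)     # XOR-like interaction
y = (signal ^ (rng.random(n) < 0.05)).astype(int)   # 5% label noise
X = np.column_stack([rsi, adx])

linear = LogisticRegression(max_iter=1000).fit(X[:3000], y[:3000])
forest = RandomForestClassifier(n_estimators=200, random_state=1).fit(X[:3000], y[:3000])

lin_acc = linear.score(X[3000:], y[3000:])   # linear boundary cannot fit both corners
rf_acc = forest.score(X[3000:], y[3000:])    # trees learn the conditional rule
```

On this toy data the forest recovers the regime-dependent rule while the logistic model hovers near the base rate, which is the mechanism behind the interaction-effect advantage described above.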
Formula
Prediction = majority_vote(Tree₁(x), Tree₂(x), ..., Treeₙ(x))

Each tree: trained on a bootstrap sample, with a random feature subset at each split.

Feature importance: Gini impurity reduction averaged across all trees.
1. Prepare features (technical indicators as inputs) and labels (future return sign or magnitude).
2. Bootstrap n training samples with replacement; select a random √p features at each split.
3. Grow each decision tree to max_depth using the selected features and bootstrapped data.
4. Aggregate predictions: majority vote for classification, mean for regression.
5. Extract feature importances (mean Gini decrease); validate out-of-sample with walk-forward testing.
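The steps above can be sketched from scratch with individual scikit-learn decision trees on synthetic data (all names and data are illustrative; the walk-forward validation from step 5 is omitted for brevity):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Step 1: hypothetical feature matrix and binary label (future return sign).
rng = np.random.default_rng(2)
n, p = 1000, 8
X = rng.normal(size=(n, p))
y = (X[:, 0] - X[:, 3] + rng.normal(scale=0.3, size=n) > 0).astype(int)

n_trees, max_features = 50, int(np.sqrt(p))
trees = []
for i in range(n_trees):
    idx = rng.integers(0, n, n)          # step 2: bootstrap sample with replacement
    tree = DecisionTreeClassifier(
        max_depth=5,
        max_features=max_features,       # step 2: random sqrt(p) features per split
        random_state=i,
    )
    trees.append(tree.fit(X[idx], y[idx]))   # step 3: grow the tree

votes = np.stack([t.predict(X) for t in trees])           # step 4: collect votes
ensemble_pred = (votes.mean(axis=0) > 0.5).astype(int)    # majority vote

# Step 5: mean Gini-decrease importance averaged across trees.
importances = np.mean([t.feature_importances_ for t in trees], axis=0)
top_feature = int(np.argmax(importances))
```

In practice `sklearn.ensemble.RandomForestClassifier` wraps this whole loop, but the manual version makes the bootstrap/vote mechanism explicit.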
Parameters
| Parameter | Default | Range | Description |
|---|---|---|---|
| Number of Trees | 100 | 10–500 | Number of decision trees in the ensemble. |
| Max Depth | 5 | 2–20 | Maximum depth of each decision tree (lower = more regularization). |
| Training Window | 252 | 60–1000 | Number of bars used for model training. |
Trading signals
bullish: Random Forest probability of up move > 0.60
Ensemble of trees predicts positive return — high-confidence long signal.
bearish: Random Forest probability of up move < 0.40
Ensemble predicts negative return — high-confidence short or exit signal.
neutral: Top feature importance shifts to volatility features
Model detecting regime change — reduce position size and monitor closely.
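A sketch of the probability-threshold signals above, assuming a scikit-learn classifier (the neutral row in the table is importance-based; here, probabilities between the two thresholds are simply treated as no-signal):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Map predicted up-move probability to a signal using the 0.60 / 0.40
# thresholds from the table above.
def classify_signal(p_up: float) -> str:
    if p_up > 0.60:
        return "bullish"
    if p_up < 0.40:
        return "bearish"
    return "neutral"

# Hypothetical synthetic features and labels, for illustration only.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 6))
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)
model = RandomForestClassifier(n_estimators=100, random_state=3).fit(X[:400], y[:400])

p_up = model.predict_proba(X[400:])[:, 1]   # probability of the "up" class
signals = [classify_signal(p) for p in p_up]
```

The probability itself doubles as a position-sizing input: the further it sits from 0.5, the higher the ensemble's agreement.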
Limitations
- Computationally expensive to train on large datasets; requires periodic retraining as market conditions evolve.
- Not inherently suited to sequential (temporal) data without careful feature engineering to capture time dependencies.
- Black-box nature: while feature importances help, individual predictions are harder to explain than those of linear models.
- Overfits easily if max_depth is too high or the training window is too short relative to the feature dimensionality.
Gilito trains rolling Random Forest classifiers on each asset with a 1-year expanding window, using the full suite of 50+ technical indicator features as inputs. Feature importance rankings from these models inform which indicators are most predictive per asset, and the model outputs are combined with LSTM forecasts in a meta-ensemble that drives strategy selection.