ARIMA
An autoregressive integrated moving average model that forecasts future values using patterns in the time series' own past values and errors.
Overview
ARIMA (AutoRegressive Integrated Moving Average) is a classical time series forecasting model formalized and popularized by Box and Jenkins in the 1970s. It combines three components: autoregression (using past values of the series to predict future values), differencing (removing trends so that the series becomes stationary), and a moving average of error terms (correcting for recent forecast errors). ARIMA is a foundational statistical model for financial time series forecasting and the usual baseline against which other models are compared.
How it looks on a chart
Illustration only — synthetic data generated for visual reference.
ARIMA is a forecasting model that learns patterns in a time series and uses them to predict future values. It assumes that the best predictor of tomorrow's price is some combination of today's price, yesterday's price, recent forecast errors, and the underlying trend. Think of it as a sophisticated version of "if it went up the last 3 days, it'll probably keep going up slightly, adjusted for how wrong we've been recently." ARIMA quantifies and optimizes these relationships automatically from historical data. The model is usually applied to returns (percent changes) rather than raw prices, because prices are non-stationary (they trend) while returns are more stable. A fitted ARIMA model generates a point forecast (e.g., "expected return of +0.3% tomorrow") and a prediction interval around that forecast.
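As a minimal sketch of the price-to-return transformation described above, the following uses only the Python standard library. The price list is made up for illustration, and the helper names `simple_returns` and `difference` are hypothetical, not part of any library.

```python
def simple_returns(prices):
    """Percent change between consecutive prices: r_t = P_t / P_(t-1) - 1."""
    return [prices[t] / prices[t - 1] - 1.0 for t in range(1, len(prices))]


def difference(series, d=1):
    """Apply first differencing d times (the 'I' in ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


prices = [100.0, 102.0, 101.0, 103.0, 106.0]  # hypothetical closing prices
returns = simple_returns(prices)
first_diff = difference(prices, d=1)

print([round(r, 4) for r in returns])
print(first_diff)
```

Both transformations shorten the series by one observation per pass, which matters when the sample is small.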
ARIMA(p, d, q): p = number of autoregressive lags (past values of the series), d = degree of differencing needed to achieve stationarity, q = number of moving average error terms. For daily financial returns, ARIMA(1,0,1) or ARIMA(2,0,1) are common starting points; d=0 because returns are typically already stationary, whereas prices usually need d=1. Model selection uses information criteria: AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) both penalize model complexity to limit overfitting, with BIC applying the heavier penalty. The Box-Jenkins methodology involves: (1) stationarity test (ADF test), (2) ACF/PACF analysis to identify p and q, (3) model fitting, (4) residual diagnostics (Ljung-Box test for remaining autocorrelation). In practice, automated ARIMA (auto.arima in R's forecast package, auto_arima in Python's pmdarima) chooses d with unit-root tests, searches over candidate (p, q) orders, and selects the model with the lowest information criterion (AIC or its small-sample correction, AICc). However, auto-selection is best validated with out-of-sample testing: AIC-optimal models frequently fail to beat a random walk on financial returns.
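To make the information-criterion comparison concrete, here is a small sketch using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The log-likelihood values are invented for illustration, and counting k as p + q plus a constant and the innovation variance is one common convention, assumed here.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood


def n_params(order):
    """Assumed parameter count: AR and MA coefficients, constant, variance."""
    p, _, q = order
    return p + q + 2


# Hypothetical fit results: (p, d, q) -> maximized log-likelihood
candidates = {
    (0, 0, 0): 3120.4,
    (1, 0, 1): 3122.9,
    (2, 0, 1): 3123.1,
}
n_obs = 1000

scores = {
    order: (aic(ll, n_params(order)), bic(ll, n_params(order), n_obs))
    for order, ll in candidates.items()
}
best_aic = min(scores, key=lambda order: scores[order][0])
best_bic = min(scores, key=lambda order: scores[order][1])

print("lowest AIC:", best_aic)
print("lowest BIC:", best_bic)
```

With these made-up numbers the two criteria disagree: AIC prefers ARIMA(1,0,1), while BIC's heavier penalty favors the white noise model. That kind of disagreement is a reason to check the choice out of sample.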
ARIMA's fundamental limitation in financial markets follows from the weak form of the Efficient Market Hypothesis: if price changes followed a predictable ARIMA structure, arbitrageurs would trade the pattern away. Empirically, financial return series often cannot be distinguished from ARIMA(0,0,0), a white noise process, at conventional significance levels, meaning ARIMA adds little to a naive random walk forecast of prices. Where ARIMA has genuine value is in: (1) volatility forecasting, because the squared residuals of an ARIMA model often show significant ARCH effects, leading naturally to GARCH models for variance forecasting; (2) high-frequency data, where microstructure effects (bid-ask bounce, inventory effects) create genuine short-horizon predictability; and (3) ensemble forecast combinations. Forecast combination (averaging multiple model forecasts) often outperforms the individual models in the time series forecasting literature (Timmermann 2006). Combining ARIMA, exponential smoothing, and a simple mean typically gives a more robust forecast than relying on any single model. In Gilito's engine, ARIMA forecasts are combined with GARCH volatility estimates and regression-based trend signals to form composite price forecasts.
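The forecast-combination idea can be sketched as a weighted average that defaults to equal weights. The three forecast values and the function name `combine_forecasts` are hypothetical; this is not a description of how Gilito's engine weights its components.

```python
def combine_forecasts(forecasts, weights=None):
    """Weighted average of model forecasts; equal weights when none are given."""
    names = list(forecasts)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    return sum(weights[name] * forecasts[name] for name in names) / total


# Hypothetical one-day-ahead return forecasts from three models
forecasts = {"arima": 0.0030, "exp_smoothing": 0.0010, "historical_mean": 0.0005}

combined = combine_forecasts(forecasts)
tilted = combine_forecasts(
    forecasts, weights={"arima": 2.0, "exp_smoothing": 1.0, "historical_mean": 1.0}
)

print(round(combined, 6), round(tilted, 6))
```

Equal weights are a common default because estimated weights add their own estimation error.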
Formula
ARIMA(p,d,q): ΔᵈXₜ = c + Σᵢ φᵢ·ΔᵈXₜ₋ᵢ + εₜ + Σⱼ θⱼ·εₜ₋ⱼ, with i = 1..p and j = 1..q, where Δᵈ denotes differencing d times, c is a constant, φᵢ are the AR coefficients, θⱼ are the MA coefficients, and εₜ is white noise
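A rough sketch of how the formula produces a one-step-ahead forecast when the coefficients are already known. Past residuals are rebuilt recursively with pre-sample values treated as zero, which is a simplifying assumption rather than exact maximum likelihood. The function names and the numbers in the example are hypothetical.

```python
def arma_one_step(w, c, phi, theta):
    """One-step forecast of a stationary series w from known ARMA coefficients."""
    eps = []

    def predict(t):
        # AR part uses past values, MA part uses past residuals;
        # terms that would reach before the sample start are dropped.
        ar = sum(phi[i] * w[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        return c + ar + ma

    for t in range(len(w)):
        eps.append(w[t] - predict(t))  # in-sample one-step residual
    return predict(len(w))


def arima_one_step(x, d, c, phi, theta):
    """Difference x d times, forecast the differenced series, then undo the differencing."""
    levels = [list(x)]
    for _ in range(d):
        prev = levels[-1]
        levels.append([prev[t] - prev[t - 1] for t in range(1, len(prev))])
    forecast = arma_one_step(levels[-1], c, phi, theta)
    for level in reversed(levels[:-1]):
        forecast += level[-1]  # add back the last observed value at each level
    return forecast


prices = [100.0, 102.0, 101.0, 103.0]  # hypothetical prices
next_price = arima_one_step(prices, d=1, c=0.0, phi=[0.5], theta=[0.2])
print(round(next_price, 3))
```

In a real fit the coefficients c, φ, and θ would be estimated from data rather than supplied by hand.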
1. Test for stationarity (ADF test); difference the series d times until stationary.
2. Plot ACF and PACF of the stationary series to identify candidate p and q values.
3. Fit ARIMA(p, d, q) via maximum likelihood; compare models using AIC/BIC.
4. Validate residuals: check for remaining autocorrelation (Ljung-Box test) and normality.
5. Generate h-step-ahead forecasts with confidence intervals from the fitted model.
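The ACF and Ljung-Box pieces of the steps above can be sketched with the standard library alone. The Ljung-Box statistic is Q = n(n+2) Σₖ ρₖ²/(n−k), compared against a chi-square distribution. The series below are simulated, and the 5% critical value of 18.307 applies to 10 degrees of freedom. When the test is run on residuals of a fitted ARMA(p, q), the degrees of freedom are usually reduced by p + q.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(e * e for e in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(x)
    rho = sample_acf(x, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]  # simulated white noise

# Simulated AR(1) series with coefficient 0.8, driven by the same shocks
ar1 = [noise[0]]
for shock in noise[1:]:
    ar1.append(0.8 * ar1[-1] + shock)

CHI2_CRIT_10_DF_5PCT = 18.307  # chi-square 95th percentile, 10 degrees of freedom
q_noise = ljung_box_q(noise, 10)
q_ar1 = ljung_box_q(ar1, 10)

print(f"white noise Q = {q_noise:.1f}, AR(1) Q = {q_ar1:.1f}")
print("AR(1) autocorrelation detected:", q_ar1 > CHI2_CRIT_10_DF_5PCT)
```

A large Q on model residuals suggests the chosen p and q have left structure unexplained.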
Parameters
| Parameter | Default | Range | Description |
|---|---|---|---|
| AR Order (p) | 1 | 0–5 | Number of autoregressive lag terms. |
| Differencing (d) | 1 | 0–2 | Number of times the series is differenced for stationarity. |
| MA Order (q) | 1 | 0–5 | Number of moving average error terms. |
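One way to carry these parameters in code is a small validated container that mirrors the defaults and ranges in the table. The class name `ArimaOrder` is hypothetical and the sketch is not tied to any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArimaOrder:
    """ARIMA order with the defaults and ranges listed in the parameter table."""

    p: int = 1  # AR order, allowed 0 to 5
    d: int = 1  # differencing, allowed 0 to 2
    q: int = 1  # MA order, allowed 0 to 5

    def __post_init__(self):
        limits = (("p", self.p, 5), ("d", self.d, 2), ("q", self.q, 5))
        for name, value, upper in limits:
            if not 0 <= value <= upper:
                raise ValueError(f"{name}={value} is outside the allowed range 0 to {upper}")


default_order = ArimaOrder()
returns_order = ArimaOrder(p=1, d=0, q=1)  # d=0 when modelling returns directly
print(default_order, returns_order)
```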
Trading signals
bullish: ARIMA forecast significantly positive (above confidence threshold)
Model predicts above-average return — consider long entry with probability weighting.
bearish: ARIMA forecast significantly negative
Model predicts below-average return — consider short entry or exit.
neutral: ARIMA forecast within noise confidence interval
Model prediction not statistically significant — no trade signal.
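The three signal states can be sketched as a check on whether the forecast interval excludes zero. This reading of "significantly positive" is an assumption, as are the 1.96 multiplier (an approximate 95% interval under normality) and the function name `classify_signal`.

```python
def classify_signal(forecast, std_error, z=1.96):
    """Map a point forecast and its standard error to a signal label."""
    lower = forecast - z * std_error
    upper = forecast + z * std_error
    if lower > 0:
        return "bullish"  # whole interval above zero
    if upper < 0:
        return "bearish"  # whole interval below zero
    return "neutral"  # interval straddles zero, so no trade signal


# Hypothetical forecasts of next-day return with their standard errors
print(classify_signal(0.003, 0.001))
print(classify_signal(-0.003, 0.001))
print(classify_signal(0.003, 0.010))
```

Because forecast intervals for daily returns are wide, most days would be expected to land in the neutral state under this rule.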
Limitations
- Asset prices are close to a random walk, so returns are close to white noise; ARIMA rarely beats a naive no-change forecast by a significant margin.
- Linear model: it cannot capture non-linear patterns or regime changes.
- Requires a stationarity assumption; parameters must be re-estimated as market dynamics change.
- Point forecasts are often misleading. Always report them with their intervals, which for financial series are very wide.
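A walk-forward comparison against the naive forecast is one way to check the first limitation on a given series. The sketch below uses an AR(1) fitted by ordinary least squares as a stand-in for a full ARIMA fit, so that it needs only the standard library. The return series is simulated white noise, and the function names are hypothetical.

```python
import math
import random


def fit_ar1(x):
    """OLS estimate of x_t = c + phi * x_(t-1) + e_t; returns (c, phi)."""
    lagged, current = x[:-1], x[1:]
    n = len(current)
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    var_lag = sum((v - mean_lag) ** 2 for v in lagged)
    if var_lag == 0:
        return mean_cur, 0.0
    cov = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    phi = cov / var_lag
    return mean_cur - phi * mean_lag, phi


def walk_forward_rmse(returns, window):
    """Refit on each rolling window and score one-step forecasts out of sample."""
    ar_sq = naive_sq = 0.0
    count = 0
    for t in range(window, len(returns)):
        c, phi = fit_ar1(returns[t - window:t])
        ar_forecast = c + phi * returns[t - 1]
        naive_forecast = 0.0  # random-walk prices imply a zero expected return
        ar_sq += (returns[t] - ar_forecast) ** 2
        naive_sq += (returns[t] - naive_forecast) ** 2
        count += 1
    return math.sqrt(ar_sq / count), math.sqrt(naive_sq / count)


rng = random.Random(42)
simulated_returns = [rng.gauss(0.0, 0.01) for _ in range(600)]  # synthetic white noise
ar_rmse, naive_rmse = walk_forward_rmse(simulated_returns, window=250)
print(f"AR(1) RMSE: {ar_rmse:.5f}  naive RMSE: {naive_rmse:.5f}")
```

If the fitted model's out-of-sample error is not clearly below the naive error, its forecasts are adding little beyond noise for that series.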
Gilito fits rolling ARIMA models (with automatic order selection via AIC) to each asset on weekly retraining cycles, using ARIMA forecasts as one component in its ensemble forecasting system. ARIMA residual diagnostics also feed into its GARCH volatility model — the two are always estimated as a system in ARMA-GARCH form.