DATA & STATISTICS

Time series

A family of statistical methods for time-ordered data that model trend, seasonality, autocorrelation, and noise. The classical additive decomposition is X = T + S + R; the canonical parametric model family is ARIMA (Box and Jenkins, 1976). Forecasting is the central objective.

Extended definition

Time series analysis is the family of statistical methods for analyzing time-ordered data $\{X_t\}$, where temporal dependence is structural: observations close in time are typically correlated. The classical additive decomposition separates the series into three components:

$$X_t = T_t + S_t + R_t$$

where $T_t$ is the trend (long-term component), $S_t$ the seasonality (periodic component), and $R_t$ the residual (random variation). Modern methods such as STL (Seasonal-Trend decomposition using LOESS) make the decomposition more flexible. For parametric modeling, ARIMA (Box and Jenkins 1976/2015) is canonical: it combines autoregressive (AR), integrated (I, differencing to reach stationarity), and moving average (MA) components. Hyndman and Athanasopoulos (2021, Forecasting: Principles and Practice, 3rd ed., online at otexts.com/fpp3) offer the modern practical reference. Advanced methods include state-space models (Kalman filter), GARCH for conditional volatility in financial series, VAR for multivariate systems, and ML approaches such as Prophet (Facebook, 2017), recurrent networks (LSTM), and temporal Transformers. Forecasting (predicting future values) is the central objective; other objectives include anomaly detection, identification of structural change, and analysis of temporal causality (Granger causality).
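The additive decomposition above can be illustrated with a minimal sketch in plain Python. It assumes a known, fixed seasonal period and estimates the trend with a centered moving average, which is the textbook classical approach and not STL. The function names (`centered_moving_average`, `decompose_additive`) and the synthetic series are hypothetical and chosen only for illustration.

```python
from statistics import mean

def centered_moving_average(x, period):
    """Trend estimate T_t; None where the window does not fit at the edges."""
    n = len(x)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = x[t - half : t + half + 1]
        if period % 2 == 0:
            # Even period: give the two end points half weight (2 x period MA).
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        trend[t] = total / period
    return trend

def decompose_additive(x, period):
    """Split x into trend, seasonal and residual so that x = T + S + R."""
    n = len(x)
    trend = centered_moving_average(x, period)
    # Average the detrended values at each position within the cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(x[t] - trend[t])
    raw = [mean(b) for b in buckets]
    offset = mean(raw)
    pattern = [s - offset for s in raw]  # centre the pattern so it averages to zero
    seasonal = [pattern[t % period] for t in range(n)]
    residual = [
        None if trend[t] is None else x[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual

# Synthetic example (assumed data): linear trend plus a period-4 seasonal pattern.
period = 4
pattern = [2.0, -1.0, -3.0, 2.0]
series = [10 + 0.5 * t + pattern[t % period] for t in range(24)]
trend, seasonal, residual = decompose_additive(series, period)
```

On this noise-free series the residual is zero wherever the trend is defined. On real data the residual carries whatever the trend and the fixed seasonal pattern do not explain, and the series must span at least two full periods for every seasonal position to be estimated.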

When it applies

Time series analysis applies to any data collected sequentially over time: economic indicators (GDP, inflation, exchange rates), financial data (stock prices, returns), epidemiological data (daily disease cases, mortality), climate records (temperature, precipitation), industrial data (energy consumption, production quality), web traffic (sessions, conversions), and social media (mention volume). It is standard in demand forecasting, financial risk modeling, real-time epidemiological surveillance, and environmental monitoring. It also applies in longitudinal scientific research with frequent observations of the same subject (physiological series, wearables data) and in ecology (multidecadal population and climate series).

When it does not apply

It does not apply to independent data, for which classical regression is appropriate. It does not apply to series with unmodeled structural change: a break caused by a crisis or a change of regulatory regime invalidates models fit on the earlier period, so change-point detection is a necessary preliminary step. It does not support reliable long-horizon forecasting when the series is highly variable: forecast uncertainty accumulates with each step ahead and confidence intervals widen quickly, so the realistic horizon depends on the structure of the series. It does not replace causal analysis: temporal correlation does not imply causation, and Granger causality is a statistical test of predictive precedence, not of causation in the manipulative sense. It does not apply directly to irregularly spaced series, which require methods designed for irregular sampling (Gaussian processes, irregular ARIMA).
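The point about long horizons can be made concrete with a small sketch. It assumes the simplest possible setting, an AR(1) process X_t = phi * X_{t-1} + e_t with known coefficient phi and known noise standard deviation sigma, and ignores parameter-estimation error. Under those assumptions the h-step forecast error variance is sigma^2 * (1 + phi^2 + ... + phi^(2(h-1))). The function names are hypothetical.

```python
import math

def forecast_std(phi, sigma, horizon):
    """Std. dev. of the h-step-ahead forecast error for an AR(1) with known parameters."""
    variance = sigma ** 2 * sum(phi ** (2 * j) for j in range(horizon))
    return math.sqrt(variance)

def interval_half_width(phi, sigma, horizon, z=1.96):
    """Half-width of an approximate 95% interval, assuming Gaussian errors."""
    return z * forecast_std(phi, sigma, horizon)

# Compare a stationary series (phi = 0.5) with a random walk (phi = 1.0).
for h in (1, 4, 16, 64):
    stationary = interval_half_width(0.5, 1.0, h)
    random_walk = interval_half_width(1.0, 1.0, h)
    print(f"h={h:>2}  stationary={stationary:.2f}  random walk={random_walk:.2f}")
```

For the random walk the half-width grows like the square root of the horizon without bound, while for the stationary case it levels off at 1.96 * sigma / sqrt(1 - phi^2). This is one way to read the statement that the realistic horizon depends on the structure of the series.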

Applications by field

Economics and finance: GDP, inflation, exchange rates; financial series with GARCH; algorithmic trading.

Epidemiology: disease surveillance with SIR models extended with temporal components; outbreak forecasting.

Environmental science: climate modeling with multidecadal series; population ecology.

Operations and logistics: supply-chain demand forecasting; capacity planning.

Common pitfalls

The first pitfall is failing to test stationarity before modeling: ARIMA requires the series to be stationary after differencing, and the ADF (Augmented Dickey-Fuller) and KPSS tests are the standard checks. The second is overfitting with excessive parameters: AIC and BIC are selection criteria, and temporal cross-validation (rolling-origin) is good practice. The third is ignoring structural change: a model of an economic series fit on pre-COVID data fails on the post-COVID period, so change-point detection is necessary. The fourth is trusting parametric confidence intervals in non-normal or heteroscedastic series; the block bootstrap is a robust alternative. The fifth is treating Granger causality as strong causal evidence: it tests only predictive precedence, and real causation requires experimental design or explicit causal inference.
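Rolling-origin evaluation, mentioned under the second pitfall, can be sketched as follows. This is a minimal illustration with an expanding training window and two baseline forecasters, not a full model-selection procedure. All names (`rolling_origin_mae`, `naive`, `make_seasonal_naive`) and the toy series are hypothetical, and the seasonal baseline assumes the horizon does not exceed the period.

```python
def rolling_origin_mae(series, forecaster, initial_train, horizon=1):
    """Mean absolute error of h-step forecasts over successive forecast origins."""
    errors = []
    for origin in range(initial_train, len(series) - horizon + 1):
        train = series[:origin]                  # only data before the origin
        actual = series[origin + horizon - 1]    # value being forecast
        predicted = forecaster(train, horizon)
        errors.append(abs(actual - predicted))
    return sum(errors) / len(errors)

def naive(train, horizon):
    """Forecast the last observed value."""
    return train[-1]

def make_seasonal_naive(period):
    """Forecast the value seen one full season before the target time step."""
    def forecast(train, horizon):
        # Assumes horizon <= period and len(train) >= period.
        return train[len(train) + horizon - 1 - period]
    return forecast

# Toy series (assumed data): a pure period-4 seasonal pattern.
series = [3.0, 1.0, 4.0, 1.0] * 6
mae_naive = rolling_origin_mae(series, naive, initial_train=8)
mae_seasonal = rolling_origin_mae(series, make_seasonal_naive(4), initial_train=8)
```

Because each forecast uses only observations before its origin, the comparison respects time order, unlike a random train/test split. The same loop can wrap any fitted model in place of the baselines.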
