An in-depth exploration of multivariate time-series forecasting methods, focusing on VARIMA, factor models, and the interplay between leading, coincident, and lagging indicators.
You know, I used to think we could just take one tidy time series—like GDP—and predict the future solely based on its past values. But as I went deeper into macroeconomic analysis, I realized that economic variables almost never move in isolation. GDP interacts with inflation, interest rates, unemployment, stock market performance, consumer sentiment—sometimes all at once. Predicting one without paying attention to the others often leads to oversimplified or misleading forecasts.
That’s why multivariate forecasting has become the go-to tool for analysts, policymakers, and portfolio managers who need a more holistic view. Because many variables are interrelated, capturing those relationships can shed light on feedback loops and reveal insights that a single series alone might hide.
From a CFA Level II standpoint, this approach isn’t just an academic exercise. It’s critical for:
• Asset allocation (e.g., understanding how GDP and inflation might jointly affect interest rates and, therefore, bond prices).
• Equity research (e.g., forecasting both overall consumer spending and interest rates that may impact a company’s capital structure decisions).
• Risk management (e.g., capturing the interplay between credit spreads, benchmark rates, and default rates).
A classic approach to handle multiple time series is the Vector Autoregressive (VAR) framework, a multivariate extension of the popular autoregressive integrated moving average (ARIMA) model. In a VAR model, each variable in the system is regressed on its own lagged values plus the lagged values of all other variables in the system. This structure can reveal rich feedback loops.
If the economic indicators are nonstationary (integrated of order one), we might use Vector Error Correction Models (VECMs) if the data are cointegrated. Alternatively, we can add the “I” for integrated series to form a VARIMA structure.
Sometimes you need to impose a bit of economic theory on top of your time-series data. This is where structural models come in. For instance, a simultaneous equation model for supply and demand might specify that the quantity demanded depends on price, income, and consumer preferences, while quantity supplied depends on price, cost of production, and technological factors. Structural models are particularly useful when you have strong reasons to believe in certain cause-and-effect relationships—or if you’ve read enough policy papers to suspect that your forecast needs more than just “the data will speak for itself.”
When dealing with a large number of correlated variables—say, 50 or 100 times series at once—it’s often helpful to reduce dimensionality using factor models. Techniques like Principal Components Analysis (PCA) or dynamic factor models look for a handful of common underlying factors that explain most of the variance across all series. In macroeconomics, these factors might capture broad economic cycles, monetary policy conditions, or common shocks (e.g., oil prices).
The neat trick here is: rather than forecasting each series separately, you model (and forecast) the small set of factors. Then you use those factor forecasts to infer the path for each individual economic indicator. Factor models can lead to more stable forecasting performance—especially if data sets are huge.
Macroeconomists often classify indicators based on their timing relative to the overall economic cycle:
• Leading indicators: These include items like new manufacturing orders, consumer expectations, or the slope of the yield curve. They tend to shift before the economy turns.
• Coincident indicators: Such as industrial production or personal income, which mirror current economic conditions.
• Lagging indicators: Things like CPI-based inflation and unemployment that confirm the trend but usually move after the economy has turned.
A robust forecasting framework may combine all three. Leading indicators help you see the horizon earlier; lagging indices give confirmation signals, and coincident measures help anchor your model to current reality.
Below is a simple diagram illustrating the interplay of these indicators in a forecasting framework:
graph LR A["Leading Indicators"] --> B["Coincident Indicators"] B["Coincident Indicators"] --> C["Lagging Indicators"] C["Lagging Indicators"] --> D["Forecasting Model Output"]
Choosing the “best” model isn’t just about the smallest forecast error on your training data. You also need to think about:
• Forecast Accuracy: Metrics like RMSE (Root Mean Squared Error) or MAE (Mean Absolute Error) give you a numerical grip on how well your model is performing.
• Real-Time Data Availability: If you rely on an indicator that’s released with a three-month lag, that might not be so useful for immediate decisions or short-term forecasting.
• Interpretability: Sometimes you want a “black box” approach that prioritizes predictive power. Other times, you want to pin down cause-and-effect channels (especially if you need to communicate to stakeholders or regulators).
• Structural Stability: Macroeconomic relationships can change over time, especially if there’s a policy shift, technological innovation, or a major global event. If a model that used to be accurate is now failing, there may be a structural break at play.
When you’re dealing with real-world macroeconomic forecasting, you sometimes face events or policy changes that your model didn’t see coming—like a sudden trade embargo, a global pandemic, or an unconventional monetary policy tool. That’s where expert judgment steps in.
And yes, there’s always a risk of allowing biases or wishful thinking to creep in. In practice, you try to ground your judgments in objective information, scenario analysis, or partial equilibrium thinking—where you say, “If this policy is enacted, then we expect X, which would feed into Y,” and so on. The recommendation is to blend your domain expertise with model-driven forecasts carefully.
There’s also an ethical dimension here, per the CFA Institute Code of Ethics and Standards of Professional Conduct. Overly optimistic or pessimistic forecasts can have real consequences for investor decisions, so you must be diligent, thorough, and transparent about the assumptions you inject.
• Overfitting: It’s tempting to add every variable you can find, hoping to eke out more accuracy. But the more parameters you have, the higher the risk that your model is just learning noise rather than signal.
• Data Revisions: Economic data aren’t always final. For instance, initial GDP estimates can be revised months (or years!) later. If you build a model using data that had the benefit of hindsight, you may overstate your real-time forecasting performance.
• Structural Breaks or Regime Shifts: If a new technology disrupts the labor market or a government changes its monetary policy framework, the relationships in your historical data might not hold going forward.
• Lag Selection: In a VAR model, you must choose how many lags to include. Too few and you miss important dynamics; too many and you might waste valuable degrees of freedom or introduce noise.
• Always Test for Stationarity and Cointegration: Before you jump into a VAR, confirm whether your data are stationary. If your series are I(1) and cointegrated, consider a Vector Error Correction Model (VECM).
• Keep an Eye on Model Diagnostics: Residual plots, autocorrelation checks, and tests like the Ljung-Box (for residual autocorrelation) or the Jarque-Bera (for normality) can help you see whether your model is missing something.
• Use Real-Time Datasets Where Possible: Try to simulate the actual environment in which you’d be making decisions.
• Document (and Justify) Your Assumptions: This is crucial both for exam settings—where partial credit may be awarded for correct processes—and for professional practice, where you may need to defend your forecasts.
If you’re prepping for the CFA Level II exam, remember that you might see a vignette describing a multivariate time-series setup with multiple macro variables. The item set could ask you to identify the right model (VAR vs. factor model vs. structural equation) or interpret the results (e.g., explaining why a certain coefficient might be negative or checking for evidence of a structural break). Time management is key—so practice reading vignettes quickly, identifying crucial data points, and scanning for potential red flags (like nonstationary data or suspiciously high R-squared values).
• “Multivariate ARIMA (VARIMA)”: An ARIMA extension that handles more than one interdependent time series in a vector framework.
• “Factor Models”: Models that explain correlations among multiple observed variables with a small set of unobserved common factors.
• “RMSE (Root Mean Squared Error)”: A measure of accuracy computed by taking the square root of the average squared forecast errors.
• “MAE (Mean Absolute Error)”: The average absolute difference between forecasts and the actual observed values.
• Stock, J. H., and M. W. Watson. “Forecasting Using Principal Components from a Large Number of Predictors.” Journal of the American Statistical Association (2002).
• Diebold, F. X. Elements of Forecasting. Thomson South-Western, 2007.
• CFA Institute. “Code of Ethics and Standards of Professional Conduct.”
• CFA Level II Readings on Macroeconomic Forecasting for Equity and Fixed Income Valuation.
Important Notice: FinancialAnalystGuide.com provides supplemental CFA study materials, including mock exams, sample exam questions, and other practice resources to aid your exam preparation. These resources are not affiliated with or endorsed by the CFA Institute. CFA® and Chartered Financial Analyst® are registered trademarks owned exclusively by CFA Institute. Our content is independent, and we do not guarantee exam success. CFA Institute does not endorse, promote, or warrant the accuracy or quality of our products.