Explore Vector Autoregression, impulse response functions, variance decomposition, and their applications in finance—complete with examples, best practices, and exam tips.
Let’s be honest. Many of us have wrestled with single‑equation models—like straightforward autoregressions—only to realize, “Hey, real markets have multiple variables interacting all at once!” That’s where Vector Autoregression (VAR) steps in. Instead of looking at just one variable in isolation, VAR treats multiple variables together, acknowledging that they can and do affect each other over time. This approach is incredibly useful for financial analysts and economists who want to understand, for example, how interest rates, inflation, and economic growth evolve together in a complex dance.
In what follows, we’ll explore what VAR models are all about: how you set them up, how you decide how many lags to include, and how to interpret them using tools like impulse response functions and variance decompositions. And, because we all know exam day can be stressful, we’ll highlight some best practices and typical pitfalls. Let’s get to it.
In a single‑variable autoregressive model (AR), we let the current value of a variable (say, Y) depend on its own lagged values—like Y at time t‑1, t‑2, etc. But what if Y and X are both relevant? Maybe Y is the inflation rate and X is the interest rate set by the central bank. They probably influence each other: interest rates generally affect inflation, but the inflation picture also affects future interest rate decisions.
A VAR extends this idea by incorporating multiple variables—like Y and X—each regressed on the past values of itself and of the other variables. When you combine them, you get a system of linear equations, each capturing a slightly different piece of the puzzle. This is often a more realistic model of how financial and economic variables co‑move and respond.
Let’s consider a simple two‑equation VAR(1) to ease into it. Suppose we have two variables: Y and X. A VAR(1) means each variable depends on one lag of Y and one lag of X. Formally, we have:

$$Y_t = a_1 + b_{11}\,Y_{t-1} + b_{12}\,X_{t-1} + \varepsilon_{y,t}$$
$$X_t = a_2 + b_{21}\,Y_{t-1} + b_{22}\,X_{t-1} + \varepsilon_{x,t}$$
Here, \(\varepsilon_{y,t}\) and \(\varepsilon_{x,t}\) are error terms (innovations) in the system, typically assumed to have zero mean and some covariance structure. These two equations are estimated jointly. For a VAR(p), you just keep adding more lags:

$$Y_t = a_1 + \sum_{j=1}^{p}\big(b_{11,j}\,Y_{t-j} + b_{12,j}\,X_{t-j}\big) + \varepsilon_{y,t}$$
$$X_t = a_2 + \sum_{j=1}^{p}\big(b_{21,j}\,Y_{t-j} + b_{22,j}\,X_{t-j}\big) + \varepsilon_{x,t}$$
In matrix form (for those who like a more compact representation), you can stack Y and X into a vector, and stack coefficients into matrices. But the main idea is the same: each variable is influenced by both its own history and the histories of the others.
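To make the compact form concrete, here is one common way to stack the two‑equation VAR(1) from above (the labels \(Z_t\), \(A\), and \(B\) are just notation introduced here for illustration):

$$Z_t = A + B\,Z_{t-1} + \varepsilon_t, \qquad Z_t = \begin{pmatrix} Y_t \\ X_t \end{pmatrix},\; A = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix},\; B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix},\; \varepsilon_t = \begin{pmatrix} \varepsilon_{y,t} \\ \varepsilon_{x,t} \end{pmatrix}$$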
Deciding how many lags (p) to include can be tricky. Too few lags, and you might miss important dynamics. Too many, and your model might become inflated with parameters, leading to overfitting (plus you risk chewing up valuable degrees of freedom). You’ll typically use model selection criteria such as:
• Akaike Information Criterion (AIC)
• Schwarz/Bayesian Information Criterion (BIC)
• Hannan-Quinn (less common, but still used in some circles)
Your data sample size is also critical. If you have a small dataset—maybe monthly data over a few years—don’t go overboard with large p. Because every additional lag means more parameters, your estimation can quickly become unwieldy.
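As a rough sketch of how lag selection looks in practice with statsmodels (here df is assumed to be a pandas DataFrame of stationary series; the column contents are hypothetical):

```python
from statsmodels.tsa.api import VAR

# df is assumed to hold the stationary series you want in the system,
# e.g. monthly inflation and interest rate changes.
model = VAR(df.dropna())

# Tabulate AIC, BIC, HQIC, and FPE for every lag length up to 8
order = model.select_order(maxlags=8)
print(order.summary())
print(order.selected_orders)  # e.g. {'aic': 3, 'bic': 1, 'hqic': 2, 'fpe': 3}
```

With small samples, BIC often picks a shorter lag length than AIC, which is usually the safer choice when degrees of freedom are scarce.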
The rules for stationarity in a VAR are not so different from single‑variable AR models. Stationarity roughly means that your data’s underlying statistical properties (mean, variance, autocorrelations) do not change over time. If the variables are trending or have unit roots, you generally need to difference them or transform them in some way. Another scenario is cointegration: if your variables share a common stochastic trend, you might use a Vector Error Correction Model (VECM) structure instead of a VAR in pure levels. VECM basically imposes cointegration constraints but still captures the short‑run dynamics.
Key takeaway: If your variables are not stationary, your model might produce spurious results or blow up in terms of predicted values. So always do the usual checks (like the augmented Dickey‑Fuller test) on each variable to ensure they’re stationary—or treat them in a way consistent with cointegration theory.
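A minimal pre‑flight check in Python might look like the sketch below (again assuming df is your DataFrame of candidate series; the loop and formatting are illustrative, not a prescribed workflow):

```python
from statsmodels.tsa.stattools import adfuller

# Augmented Dickey-Fuller test on each candidate series: a small p-value
# rejects the null of a unit root, i.e. the series looks stationary.
for col in df.columns:
    adf_stat, p_value, *_ = adfuller(df[col].dropna(), autolag='AIC')
    print(f"{col}: ADF statistic = {adf_stat:.2f}, p-value = {p_value:.3f}")

# If the levels are non-stationary but appear to move together, a Johansen test
# (statsmodels.tsa.vector_ar.vecm.coint_johansen) can help decide between a
# VAR in differences and a VECM.
```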
Let’s say you have your shiny new VAR. Great. Now how do we interpret it? That’s where impulse response functions (IRFs) come in. An IRF shows how a shock (or an “innovation”) to one variable—like a one‑time unexpected 1% jump in interest rates—will ripple through the entire system over future periods. This is one of the sweetest aspects of VAR: you can see how each variable “responds” (on average) over time to an unanticipated movement in another variable.
Often, we depict IRFs in a plot where the x‑axis is the time horizon (e.g., from t up to t+10) and the y‑axis is the response (like ∆ inflation or ∆ stock returns). You get a set of lines, one for each target variable, showing how each reacts following the shock. If the line for “inflation” first rises for a couple of periods and then dips below zero, it might suggest that after an initial burst, inflation eventually falls below its pre‑shock path. Economists and finance professionals rely heavily on these impulse response plots to assess the dynamic interplay in the system.
One subtlety is that the order in which you place variables can matter if you use a “recursive” identification approach. The typical procedure is called a Cholesky decomposition, where you assume that the first variable is contemporaneously exogenous (i.e., it is not affected by the second variable in the same time period), the second variable is impacted by the first (but not vice versa, in the same period), and so on. If that assumption is not suitable, you might need alternative identification schemes (Structural VAR, or SVAR) to properly interpret the shocks.
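In statsmodels, the recursive (Cholesky) ordering simply follows the column order of the data you pass in, so one quick robustness sketch is to re‑estimate with the columns rearranged and compare the orthogonalized responses (the column names below are hypothetical):

```python
from statsmodels.tsa.api import VAR

# Ordering 1: the interest rate is treated as contemporaneously exogenous
res_a = VAR(df[['interest_rate', 'inflation']]).fit(1)
irf_a = res_a.irf(10)

# Ordering 2: inflation is placed first instead
res_b = VAR(df[['inflation', 'interest_rate']]).fit(1)
irf_b = res_b.irf(10)

# Large differences between the orthogonalized IRFs suggest the recursive
# identification assumption is doing a lot of the work.
print(irf_a.orth_irfs[:3])
print(irf_b.orth_irfs[:3])
```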
Impulse response functions are one half of the interpretive power of VAR models. The other half is forecasting (and decomposing forecast errors). Variance decomposition (or forecast error variance decomposition, FEVD) slices up the forecast variance of each variable into parts attributable to the different shocks in the system. For instance, if you’re forecasting inflation, you might say: “In a 10‑step‑ahead forecast, 60% of the uncertainty in inflation is due to its own shocks, 20% is from interest rate shocks, and 20% is from GDP shocks.” That can give you valuable insights into which variable is the main driver of fluctuations in your system.
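Pulling a FEVD out of a fitted statsmodels VAR is straightforward; the sketch below assumes results is the fitted VARResults object produced by the code example later in this section:

```python
# Decompose the forecast error variance of each variable, horizon by
# horizon, into the share attributable to each shock in the system.
fevd = results.fevd(10)
print(fevd.summary())  # one table per variable, shares by horizon
fevd.plot()            # stacked view of the same decomposition
```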
Imagine you want to study the interplay among three key variables:
• R: stock returns
• i: a short‑term interest rate
• g: GDP growth
A three‑variable VAR(1) might look like:

$$R_t = c_R + \gamma_{RR}\,R_{t-1} + \gamma_{Ri}\,i_{t-1} + \gamma_{Rg}\,g_{t-1} + \varepsilon_{R,t}$$
$$i_t = c_i + \gamma_{iR}\,R_{t-1} + \gamma_{ii}\,i_{t-1} + \gamma_{ig}\,g_{t-1} + \varepsilon_{i,t}$$
$$g_t = c_g + \gamma_{gR}\,R_{t-1} + \gamma_{gi}\,i_{t-1} + \gamma_{gg}\,g_{t-1} + \varepsilon_{g,t}$$
Here, \(\gamma_{RR}\) is the coefficient describing the effect of past stock returns on current returns, \(\gamma_{Ri}\) is the effect of past interest rates on current returns, and so on. You would estimate this system with your historical data.
Once estimated, you can do:
• Impulse Response: Suppose \(\varepsilon_{i,t}\) jumps by 1% in period t. You see how R (stock returns) and g (GDP) respond in period t+1, t+2, etc.
• Variance Decomposition: Assess how much variance in R is explained by R’s own shocks, by i’s shocks, and by g’s shocks, at different forecast horizons.
For instance, you might find that stock returns are largely driven by their own past (momentum or mean reversion patterns), but at longer horizons, interest rate surprises have a bigger effect—maybe 30% or 40% of the forecast error variance for returns is from interest rate moves. That’s really valuable insight for your portfolio management decisions and risk assessments.
Below is a conceptual snippet in Python that outlines how you might estimate a VAR with a typical library such as “statsmodels.” (Don’t worry if you don’t code. This is just to show how straightforward it can be.)
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.api import VAR

# df is assumed to be a pandas DataFrame whose columns are the (stationary)
# series in your system, e.g. returns, interest rate changes, GDP growth.
df_lagged = df.dropna()

model = VAR(df_lagged)
results = model.fit(maxlags=2, ic='bic')  # let BIC choose an optimal lag up to 2
print(results.summary())

irf = results.irf(10)    # how does a shock affect the system over 10 periods?
irf.plot(orth=True)      # orth=True uses Cholesky factorization
plt.show()
This snippet chooses the lag order based on the BIC, estimates the VAR, and then gives an impulse response plot for each variable. In an exam or real‑world setting, you’d interpret these plots carefully and relate them back to your economic or financial theory.
Below is a simple Mermaid diagram illustrating the flow of how a shock (innovation) in one variable can propagate through a VAR system over subsequent periods:
flowchart LR
    A["Shock to Variable A"] --> B["Responses in Variables A, B, & C (time t+1)"]
    B --> C["Feedback into the System (time t+2, t+3, ...)"]
    C --> D["Full System Equilibrium or Ongoing Response"]
Essentially, a shock in one part of the system influences multiple variables, which, when lagged, come back to affect the system again, possibly generating complex dynamics.
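A tiny numerical sketch makes the propagation mechanics concrete: in a stationary VAR(1) with coefficient matrix B, the (non‑orthogonalized) response of the system h periods after a one‑time shock e is simply \(B^h e\). The coefficient matrix below is made up purely for illustration:

```python
import numpy as np

# Hypothetical, stable VAR(1) coefficient matrix for three variables (A, B, C)
B = np.array([[0.5, 0.1, 0.0],
              [0.2, 0.4, 0.1],
              [0.0, 0.3, 0.6]])

shock = np.array([1.0, 0.0, 0.0])  # one-unit shock to variable A at time t

response = shock
for h in range(1, 6):
    response = B @ response       # response h periods after the shock
    print(f"t+{h}: {np.round(response, 3)}")
# The responses shrink toward zero because B's eigenvalues lie inside the unit circle.
```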
• Overfitting: It’s super easy to keep piling on variables or lags. Resist that temptation unless you have robust statistical justification. More parameters often mean more potential for spurious relationships and less interpretability.
• Economic Theory Guidance: Let your domain knowledge guide variable inclusion. Sure, data‑driven approaches can be fun, but sometimes you end up with variables that are correlated but not causally related—leading to questionable policy or investment decisions.
• Stationarity and Cointegration: If you suspect the presence of unit roots and cointegration, you may need to switch to a Vector Error Correction Model framework. Don’t ignore the structure of your data.
• Identification Concerns: If you need to interpret structural shocks (e.g., “monetary policy shocks”), read up on structural VAR (SVAR) methods. Basic Cholesky doesn’t always solve identification in real‑life data.
• Large Cross‑Sections: For more than three or four variables, your parameter space might blow up. Use dimension‑reduction techniques or consider factor models if you have, say, 15 or 20 variables.
• Model Stability: Keep an eye on parameter stability over time. Financial relationships can shift after major policy changes or crises, so consider tests for structural breaks; a rough rolling‑window sketch follows this list.
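Here is the rolling‑window sketch mentioned above; the window length, step size, and the particular coefficient tracked are arbitrary choices for illustration, not a prescribed procedure:

```python
from statsmodels.tsa.api import VAR

window = 120  # e.g. ten years of monthly observations, purely illustrative
drift = []
for start in range(0, len(df) - window + 1, 12):
    res = VAR(df.iloc[start:start + window]).fit(1)
    # Track one coefficient over time: the first variable's own first lag
    # in its own equation (row 0 of params is the constant, row 1 the lag).
    drift.append(res.params.iloc[1, 0])

print(drift)  # large swings across windows hint at structural instability
```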
At the end of the day, a Vector Autoregressive model is like orchestrating multiple instruments in an ensemble. Each variable stands on its own, but together, they create something richer—a dynamic interplay that can’t be captured by analyzing them in isolation.
For the Level II exam, here are a few quick pointers:
• Equations and Concepts: Know the basic structure of VAR(p). Understand how to read off coefficients in the system.
• Impulse Response Functions: Be prepared to interpret an IRF plot. The exam might give you a graph and ask how a shock to one variable influences others over time.
• Stationarity: Don’t forget to check for stationarity. If variables are non‑stationary, you might see “differencing” or “cointegration” queries.
• Practical Interpretation: The vignette may describe a finance scenario with stock returns, interest rates, and inflation. You might be asked to interpret which variable seems to drive the system.
• Don’t Overcomplicate: Sometimes students get stuck in the technical weeds. Keep an eye on the core interpretation and exam question demands.
Manage your time by quickly identifying the relevant pieces of data in the vignette, the type of model used, and precisely what they’re asking (coefficient interpretation? impulse response effect?). Then structure your answer clearly.
Good luck with your time‑series journey—it’s one of the most exciting aspects of quantitative finance, especially when multiple variables collide in the real world.