Explore comprehensive exam-style scenarios where multiple forms of misspecification—like heteroskedasticity, serial correlation, and multicollinearity—are present. Learn diagnostic testing, step-by-step fixes, and exam-focused best practices for tackling tricky regression models.
Sometimes, we finance folks (myself included, by the way) forget that the beloved multiple regression model marching across our spreadsheets might be quietly hiding all sorts of problems in its residuals. You know, the usual suspects: heteroskedasticity, serial correlation, omitted variables, or even a suspiciously high correlation among independent factors. These issues can come back to bite us when we’re trying to interpret results in a high-stakes setting, like the CFA® exam or a real-world client presentation.
One of the best ways to get a handle on these hidden traps is to see how they play out in item set vignettes. These have become standard in Level II, where you’re given a dense scenario (or short case) describing a regression context, data peculiarities, and some resulting diagnostic stats. Then, the questions probe your ability to identify and correct potential model faults. This approach reveals whether you can connect the story behind the data to the appropriate test or corrective measure.
Below, we’ll walk through a typical exam-like scenario (a portfolio manager regressing stock returns on macroeconomic indicators) and show how we can navigate the detection of heteroskedasticity, multicollinearity, or other issues. We’ll also talk about how to interpret test results, pick the right remedy, and propose a final, improved model—one that’s more likely to generate reliable forecasts and pass the scrutiny of a CFA exam grader.
Before we jump into the actual vignettes, let’s do a quick refresher on the main types of misspecification we’re likely to see. Don’t worry, this is a short list (sort of).
• Heteroskedasticity: Variance of the residuals isn’t constant. Might see a “fan shape” in the residual plot.
• Serial Correlation (Autocorrelation): Errors are correlated over time. Typically uncovered through the Durbin–Watson test or the Breusch–Godfrey test.
• Multicollinearity: Two or more independent variables are highly correlated, creating inflated standard errors and less stable coefficient estimates.
• Omitted Variable Bias: You left out something crucial, so your error term is “picking up” the effect, producing biased and inconsistent estimates.
• Incorrect Functional Form: The true relationship is nonlinear or includes interaction effects, but your model pretends it’s purely linear.
In the upcoming scenarios, we’ll focus on how to systematically detect and fix them.
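To make the first of those concrete, here is a minimal Python sketch (using numpy and statsmodels on entirely made-up data) that simulates errors whose spread grows with the regressor, producing the "fan shape" mentioned above. Nothing here is exam-required; it's just a way to see the pattern numerically.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
x = rng.uniform(1, 10, size=200)
errors = rng.normal(0, 0.5 * x)          # residual spread widens as x grows
y = 2.0 + 0.8 * x + errors

X = sm.add_constant(x)                    # add the intercept column
results = sm.OLS(y, X).fit()

# Quick numeric check of the fan shape: the average absolute residual in the upper
# half of the fitted values should be noticeably larger than in the lower half.
fitted, resid = results.fittedvalues, results.resid
low, high = fitted < np.median(fitted), fitted >= np.median(fitted)
print(np.abs(resid[low]).mean(), np.abs(resid[high]).mean())
```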
When you open your exam, you’ll often see a story about an ambitious portfolio manager, or maybe an economist at a rating agency, who has built a regression model to forecast returns, GDP, or some other financial metric. The story might mention suspicious patterns, a short test output table, or conflicting analysis. The subsequent item set questions prompt you to:
• Identify the problem with the model (and the specific type of misspecification).
• Recommend an appropriate test or interpret test statistics that are given.
• Suggest a remedy (Robust Standard Errors, Weighted Least Squares, adding or removing variables, etc.).
You’ll probably need to do a tiny numeric calculation—like computing a test statistic or analyzing partial correlations—but also interpret the bigger picture. This is what the Level II exam is all about: bridging the technical numeric aspects (like an F-statistic) with real-world logic.
Let’s jump into a plausible item-set scenario. We’ll keep it a little informal (I’m a big fan of referencing real experiences that highlight mistakes I’ve made firsthand). Suppose we have the following situation:
A portfolio manager oversees a large-cap equity fund and uses a quarterly dataset of 40 observations of stock returns (the dependent variable) and three macroeconomic factors (the independent variables): the change in GDP (ΔGDP), the change in inflation (ΔInflation), and the change in interest rates (ΔInterest).
Early eyeballing of the data suggests that ΔInflation and ΔInterest rates are highly correlated (makes sense in many economies, as interest policies often influence inflation).
The portfolio manager runs an OLS regression:
Rᵢ = β₀ + β₁(ΔGDPᵢ) + β₂(ΔInflationᵢ) + β₃(ΔInterestᵢ) + εᵢ.
Preliminary residual plots show that smaller predicted returns come with smaller absolute residuals, while larger predicted returns come with larger ones. This points to heteroskedasticity (the variance of the residuals increasing with the level of the predicted value).
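For readers who like to tinker, below is a hedged sketch of the manager's regression in Python with statsmodels. The data are simulated (the vignette's actual 40 observations are not given), and the column names ret, d_gdp, d_infl, and d_int are placeholders of my own choosing.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 40  # quarterly observations, as in the vignette

# Simulated factors; d_int is deliberately built to be highly correlated with d_infl.
d_gdp = rng.normal(0.5, 1.0, n)
d_infl = rng.normal(0.0, 0.6, n)
d_int = 0.8 * d_infl + rng.normal(0.0, 0.2, n)
ret = 1.0 + 0.6 * d_gdp - 0.4 * d_infl - 0.3 * d_int + rng.normal(0.0, 1.0, n)

df = pd.DataFrame({'ret': ret, 'd_gdp': d_gdp, 'd_infl': d_infl, 'd_int': d_int})
X = sm.add_constant(df[['d_gdp', 'd_infl', 'd_int']])
ols_results = sm.OLS(df['ret'], X).fit()
print(ols_results.summary())

# Plotting ols_results.resid against ols_results.fittedvalues is where the
# funnel-shaped spread, if present, would show up.
```

The later sketches in this section reuse the objects df, X, and ols_results defined here.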
The manager also notices that the p-values on ΔInflation and ΔInterest are both suspiciously high, even though the overall R² is quite large. Does that ring a bell? It’s a classic symptom of multicollinearity.
The manager runs the following tests:
• Breusch–Pagan (BP) test for heteroskedasticity.
• Variance Inflation Factors (VIFs) for each variable.
• Durbin–Watson for autocorrelation (just to be sure).
The BP test yields a test statistic that’s significantly above the critical chi-square value. In other words, we can reject the null hypothesis of homoskedastic errors. So, yep, we have heteroskedasticity.
VIFs for ΔInflation and ΔInterest are both around 9.0 or higher, which is typically a sign (some folks set a threshold of 5, others at 10) of strong multicollinearity. ΔGDP, on the other hand, has a VIF of ~2.2, which is acceptable.
The DW statistic is around 2.10, which suggests no strong positive or negative autocorrelation. Good news—no immediate sign of serial correlation there.
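Continuing from the regression sketched earlier (objects ols_results and X), here is one way to run the same three diagnostics in statsmodels. The exact numbers will differ from the vignette because the data are simulated.

```python
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson

# Breusch–Pagan: the null hypothesis is homoskedastic errors.
bp_lm, bp_pvalue, _, _ = het_breuschpagan(ols_results.resid, X)
print(f"BP statistic = {bp_lm:.2f}, p-value = {bp_pvalue:.4f}")

# Variance Inflation Factors: skip the constant column.
for i, name in enumerate(X.columns):
    if name != 'const':
        print(name, variance_inflation_factor(X.values, i))

# Durbin–Watson: values near 2 suggest little first-order autocorrelation.
print("DW =", durbin_watson(ols_results.resid))
```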
Someone might say: “Okay, so we have heteroskedasticity. Let’s do Weighted Least Squares (WLS) or at least use some robust standard errors, right?” Precisely. In an exam setting, that’s typically the correct approach. You might also consider White’s robust (heteroskedasticity-consistent) standard errors, which adjust the standard errors for heteroskedasticity of unknown form without changing the coefficient estimates.
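Here is a sketch of both remedies, continuing from the earlier objects. The WLS weighting scheme below (inverse squared fitted values from an auxiliary regression of the absolute residuals) is an illustrative assumption on my part, not a prescribed rule.

```python
import numpy as np
import statsmodels.api as sm

# White-corrected (heteroskedasticity-consistent) standard errors:
# same coefficients, adjusted standard errors.
robust_results = sm.OLS(df['ret'], X).fit(cov_type='HC1')
print(robust_results.summary())

# WLS sketch: model the error spread with an auxiliary regression of |residuals| on
# the fitted values, then weight observations by the inverse of the squared fit.
abs_resid = np.abs(ols_results.resid)
aux = sm.OLS(abs_resid, sm.add_constant(ols_results.fittedvalues)).fit()
weights = 1.0 / (aux.fittedvalues ** 2)
wls_results = sm.WLS(df['ret'], X, weights=weights).fit()
print(wls_results.params)
```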
As for the multicollinearity, one immediate fix is removing or combining the variables that are correlated—but only if you can theoretically justify it. Another approach is principal component analysis or some other dimension-reduction method if you have more variables to manage (though that might be extreme for a simple model). In a typical exam scenario, you might see an explanation that the central bank sets short-term interest rates primarily in response to inflation, so a candidate might keep only the inflation variable if that’s the primary driver. Alternatively, the manager might consider analyzing the monetary policy variable separately.
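A minimal sketch of the “drop one of the correlated regressors” remedy, keeping ΔInflation and dropping ΔInterest, then re-checking the VIFs (again reusing the simulated df from above). Whether dropping a variable is appropriate is an economic judgment, not a mechanical rule.

```python
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

X_reduced = sm.add_constant(df[['d_gdp', 'd_infl']])
reduced_results = sm.OLS(df['ret'], X_reduced).fit()

# The VIFs should now sit comfortably below the usual 5-or-10 thresholds.
for i, name in enumerate(X_reduced.columns):
    if name != 'const':
        print(name, variance_inflation_factor(X_reduced.values, i))
```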
The story might unfold like this:
Reading the Residual Plots
The item set provides a graph of residuals vs. fitted values. You notice it looks like a funnel shape (small on the left, wide on the right). You suspect heteroskedasticity.
Running the BP Test
The data snippet includes something like “Breusch–Pagan statistic: 14.28, p-value < 0.01.” Instantly, you conclude heteroskedasticity.
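A quick sanity check on that conclusion: with three regressors in the auxiliary regression, the BP statistic is compared against a chi-square distribution with 3 degrees of freedom, and 14.28 clears the 1% critical value comfortably.

```python
from scipy.stats import chi2

critical_1pct = chi2.ppf(0.99, df=3)   # roughly 11.34
print(critical_1pct)
print(14.28 > critical_1pct)           # True: reject the null of homoskedastic errors
```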
Addressing the Issue
The question might ask: “Which technique is most appropriate for mitigating the identified problem?”
Among the likely answer choices, the correct pick is White-corrected (robust) standard errors.
Checking for Multicollinearity
Another part of the vignette might provide the VIFs (here, roughly 2.2 for ΔGDP and 9.0 or higher for both ΔInflation and ΔInterest), flagging the latter two as collinear.
Remedy
The next question might ask: “In the presence of high multicollinearity, which of the following is most likely to improve model reliability?”
Typical solutions include dropping one of the correlated variables (with a sound economic rationale), combining them into a single factor, or applying a dimension-reduction technique; the exam answer usually favors the theoretically justified removal of one variable.
Below is a small flowchart that might help visualize how we typically proceed when we suspect heteroskedasticity or related misspecifications:
```mermaid
flowchart LR
    A["Check Residual Plots"] --> B["Perform Diagnostics <br/>(BP test, etc.)"]
    B --> C["Interpret Significance <br/>of Results"]
    C --> D["Apply Remedies <br/>(Robust SE, WLS, etc.)"]
    D --> E["Refit Model & <br/>Re-check Diagnostics"]
```
It’s a cyclical set of steps: spot a problem → test for it → confirm it → fix it → re-test. This approach might feel repetitive, but it’s exactly what strong data scientists and advanced investment analysts do.
In the real exam, you’ll face questions that blend numeric tasks (maybe computing a test statistic for the Breusch–Pagan test or adjusting standard errors) with interpretive tasks (like explaining why you’d pick White’s robust errors over other techniques). Some questions also want you to propose multiple possibilities. Weighted Least Squares or Generalized Least Squares might be equally viable, but the exam might prefer the simpler robust standard errors if you’re in a straightforward scenario.
As we know from Chapter 6 (Time-Series Analysis), heteroskedasticity can show up in time-series models as well, often in the form of ARCH effects if you’re analyzing volatility in asset returns. When you get to advanced machine learning (Chapter 7), you might find that algorithms such as tree-based models or regularized regressions can inadvertently mask or handle issues like high collinearity. But you should still interpret results carefully—“black box” algorithms aren’t automatically free from suspicious data patterns.
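If you want to see what an ARCH check looks like in practice, here is a hedged sketch using Engle’s ARCH-LM test from statsmodels. The returns series below is a random placeholder; with real data you would pass in your actual return (or residual) series, and the choice of four lags is just an assumption for illustration.

```python
import numpy as np
from statsmodels.stats.diagnostic import het_arch

rng = np.random.default_rng(0)
returns = rng.normal(0, 1, 250)   # placeholder series; real returns would go here

# Null hypothesis: no ARCH effects up to the chosen number of lags.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_arch(returns, nlags=4)
print(f"ARCH-LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.3f}")
```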
Model misspecification is one of those (slightly nerve-wracking) realities of applying regression in finance. The best approach is to be systematic:
• Study your residual plots.
• Run the standard battery of tests (Breusch–Pagan, Durbin–Watson, VIF); a compact wrapper bundling these checks is sketched just after this list.
• Apply the correct fix.
• Always re-test and examine the new residual patterns.
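As a convenience, here is a small wrapper (a sketch, with conventional rather than mandated thresholds) that bundles that battery of tests, so the “fix, then re-test” cycle becomes a single call after each refit.

```python
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson

def run_diagnostics(results, X, vif_threshold=10.0):
    """Return headline diagnostics for a fitted OLS model and its regressor matrix X."""
    bp_lm, bp_pvalue, _, _ = het_breuschpagan(results.resid, X)
    vifs = {name: variance_inflation_factor(X.values, i)
            for i, name in enumerate(X.columns) if name != 'const'}
    return {
        'bp_pvalue': bp_pvalue,
        'heteroskedastic_at_5pct': bp_pvalue < 0.05,
        'durbin_watson': durbin_watson(results.resid),
        'high_vif': [name for name, v in vifs.items() if v > vif_threshold],
    }

# Example with the objects from the earlier sketches:
# print(run_diagnostics(ols_results, X))
```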
It’s kind of like a recurring cycle: things change once you fix the model, so you must ensure you haven’t introduced new problems or overlooked something else. By practicing these item set vignettes, you’ll learn to quickly spot trouble and propose solutions—skills that are absolutely vital not only for passing the exam but also for thriving in actual investment and risk management roles.
• CFA Institute Assigned Readings: Visit the official Level II readings on multiple regression for deeper coverage of these misspecifications.
• “Journal of Portfolio Management” for real-world articles analyzing regression challenges.
• Online Learning Platforms like Kaplan Schweser or Wiley for additional item set question banks and mock exams.
• Chapters 6 (Time‑Series Analysis) and 7 (Machine Learning) of this book for how these issues overlap with more specialized methods.
Important Notice: FinancialAnalystGuide.com provides supplemental CFA study materials, including mock exams, sample exam questions, and other practice resources to aid your exam preparation. These resources are not affiliated with or endorsed by the CFA Institute. CFA® and Chartered Financial Analyst® are registered trademarks owned exclusively by CFA Institute. Our content is independent, and we do not guarantee exam success. CFA Institute does not endorse, promote, or warrant the accuracy or quality of our products.