Stochastic Volatility: When Volatility Itself Is Random
Volatility is not constant. We knew that already. The deterministic volatility surface tries to fix this by making volatility a function of stock price and time. But the surface changes every time you recalibrate. The model is fundamentally misspecified.
What if volatility itself is random? Not deterministic, not a function you can write down, but genuinely unpredictable. This is Chapter 51, and it is one of the most important chapters in the entire book. Stochastic volatility models are the workhorses of modern derivatives pricing.
What Stochastic Volatility Is (and Is Not)
Before diving in, Wilmott makes an important distinction. Random volatility is not the same as stochastic volatility. Here is the difference.
Suppose you flip a coin at the start of each day: heads means 10% volatility today, tails means 30%. The resulting return distribution has a higher peak and fatter tails than a normal distribution. But as long as the standard deviation of returns exists and scales with the square root of time, you can still use Black-Scholes with the overall standard deviation. This is random volatility but not stochastic in the sense that matters.
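The coin-flip story is easy to verify numerically: the mixed returns have the blended standard deviation, but clearly positive excess kurtosis (a normal distribution has zero). A small sketch of the 10%/30% daily-coin setup, with an illustrative 100,000-day sample:

```python
import random
import statistics

# Each day, volatility is 10% or 30% with equal probability; conditional on
# the coin, the daily return is normal. All numbers are illustrative.
random.seed(42)
DT = 1 / 252  # one trading day in years

returns = []
for _ in range(100_000):
    vol = 0.10 if random.random() < 0.5 else 0.30
    returns.append(random.gauss(0.0, vol * DT ** 0.5))

# The blended variance is 0.5*0.10^2 + 0.5*0.30^2 = 0.05 per year.
mean = statistics.fmean(returns)
sd = statistics.pstdev(returns)
excess_kurtosis = statistics.fmean(((r - mean) / sd) ** 4 for r in returns) - 3.0
print(f"std dev: {sd:.5f}  (blended theory: {(0.05 * DT) ** 0.5:.5f})")
print(f"excess kurtosis: {excess_kurtosis:.2f}  (normal: 0)")
```

The standard deviation matches the blended value, so Black-Scholes with that number still works; the fat tails alone are not the problem.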
Stochastic volatility becomes a real problem when the timescale of volatility changes is the same as the timescale of stock price changes. When volatility evolves continuously alongside the stock price, following its own stochastic differential equation, we have moved beyond the Black-Scholes world.
The Model
The setup is natural. The stock price follows:
dS = mu * S * dt + sigma * S * dX1
And now volatility follows its own process:
d(sigma) = p(S, sigma, t) * dt + q(S, sigma, t) * dX2
The two Brownian motions dX1 and dX2 can be correlated with correlation rho. The function p controls the drift of volatility (is it mean-reverting? trending?) and q controls the volatility of volatility (how random is the randomness?).
The option value now depends on three variables: V(S, sigma, t). One extra dimension compared to Black-Scholes.
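The two-SDE system can be sketched with a simple Euler-Maruyama scheme and correlated Gaussian draws. The particular choices p = a*(b - sigma) (mean reversion) and q = c*sigma, and all parameter values, are illustrative assumptions, not forms singled out here:

```python
import math
import random

# Euler-Maruyama sketch of:
#   dS     = mu*S dt + sigma*S dX1
#   dsigma = p dt + q dX2,  with p = a*(b - sigma), q = c*sigma (assumed forms)
# dX1 and dX2 have correlation rho.
random.seed(0)

def simulate_path(s0=100.0, sig0=0.2, mu=0.05, a=2.0, b=0.2, c=0.3,
                  rho=-0.7, T=1.0, steps=252):
    dt = T / steps
    s, sig = s0, sig0
    for _ in range(steps):
        z1 = random.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * random.gauss(0.0, 1.0)
        s += mu * s * dt + sig * s * math.sqrt(dt) * z1
        sig += a * (b - sig) * dt + c * sig * math.sqrt(dt) * z2
        sig = max(sig, 1e-8)  # crude floor: keep volatility positive
    return s, sig

s_T, sig_T = simulate_path()
print(f"S(T) = {s_T:.2f}, sigma(T) = {sig_T:.4f}")
```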
The Hedging Problem
Here is where things get fundamentally different. In Black-Scholes, you have one source of randomness (the stock) and one instrument to hedge with (the stock). One equation, one unknown. Perfect hedge.
With stochastic volatility, you have two sources of randomness (stock and volatility). You still have the stock to hedge the first source. But volatility is not traded. You cannot buy or sell “volatility” in the stock market to hedge the second source of randomness.
The way out: hedge with another option. Set up a portfolio with the option you want to price, a position in the stock, and a position in some other traded option. Two instruments for two sources of risk.
But now you have one equation with two unknowns: the value of your option V and the value of the hedging option V1. By the usual separation-of-variables argument (the left side depends on V but not V1, the right side on V1 but not V), both sides must equal the same function of the common variables S, sigma, and t. This function is called lambda.
The Market Price of Volatility Risk
The pricing equation becomes:
dV/dt + 0.5 * sigma^2 * S^2 * d^2V/dS^2 + rho * sigma * q * S * d^2V/(dS d(sigma)) + 0.5 * q^2 * d^2V/d(sigma)^2 + r * S * dV/dS + (p - lambda*q) * dV/d(sigma) - r*V = 0
The function lambda(S, sigma, t) is the market price of volatility risk. It is the extra return per unit of volatility risk that the market demands.
This lambda is a problem. Unlike the market price of stock risk (which cancels out in Black-Scholes because the stock is traded), the market price of volatility risk sticks around in the equation because volatility is not traded. You cannot observe lambda directly. You cannot derive it from first principles. It is a modeling choice, and different choices give different option values.
In practice, people either:
- Specify lambda based on assumptions or estimation
- Choose lambda implicitly by calibrating the model to market prices
- Try to avoid the whole issue (see Chapter 54 for this approach)
Risk-Neutral Drift
There is a useful interpretation. The quantity p - lambda*q is called the risk-neutral drift of volatility. Remember how in Black-Scholes, the real drift mu of the stock gets replaced by the risk-free rate r for pricing? The same thing happens here:
| Variable | Real drift | Risk-neutral drift |
|---|---|---|
| Stock S | mu*S | r*S |
| Volatility sigma | p | p - lambda*q |
For pricing, you work with risk-neutral drifts. For Monte Carlo simulations, you simulate the risk-neutral random walks. The real-world dynamics are interesting for risk management but not for pricing.
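As a pricing sketch along these lines, here is a Monte Carlo valuation of a European call in which the stock drifts at r and volatility drifts at p - lambda*q. The functional forms p = a*(b - sigma) and q = c*sigma, and every parameter value, are illustrative assumptions:

```python
import math
import random

# Monte Carlo pricing under *risk-neutral* dynamics: drift r for the stock,
# p - lambda*q for volatility. Forms and parameters are illustrative.
random.seed(1)

def mc_call_price(s0=100.0, k=100.0, sig0=0.2, r=0.05, a=2.0, b=0.2, c=0.3,
                  lam=0.1, rho=-0.7, T=1.0, steps=100, paths=5_000):
    dt = T / steps
    total = 0.0
    for _ in range(paths):
        s, sig = s0, sig0
        for _ in range(steps):
            z1 = random.gauss(0.0, 1.0)
            z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * random.gauss(0.0, 1.0)
            s += r * s * dt + sig * s * math.sqrt(dt) * z1     # drift r, not mu
            p, q = a * (b - sig), c * sig
            sig += (p - lam * q) * dt + q * math.sqrt(dt) * z2  # risk-adjusted drift
            sig = max(sig, 1e-8)
        total += max(s - k, 0.0)
    return math.exp(-r * T) * total / paths

print(f"call price: {mc_call_price():.2f}")
```

Note that mu never appears: the real-world stock drift is irrelevant to the price, exactly as in Black-Scholes.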
What Delta Hedging Alone Gets You
Suppose you only delta hedge with the stock and do not hedge volatility risk. Your portfolio earns:
d(portfolio) = r * portfolio * dt + (dV/d(sigma)) * q * (lambda * dt + dX2)
(The p terms cancel when you substitute the pricing equation.) For every unit of volatility risk you carry, q times the vega dV/d(sigma), you earn lambda units of excess return per unit of time. This is literally what “market price of risk” means: the extra compensation for bearing volatility risk.
If lambda is positive, you earn more than the risk-free rate by being long vega (long volatility exposure). If lambda is negative, short vega earns more. The sign and magnitude of lambda determine whether it pays to be long or short volatility.
Named Models
The chapter surveys the famous stochastic volatility models. Each makes different choices for p and q.
Hull and White (1987)
Hull and White looked at the case where stock and volatility are uncorrelated and the volatility dynamics do not depend on the stock price. Their key result: the option value is the average of Black-Scholes values, weighted by the distribution of average variance over the option’s life. Beautiful and intuitive.
One specific model they studied:
d(sigma^2) = a * sigma^2 * dt + b * sigma^2 * dX2
Values usually need to be found numerically, but there are Taylor series approximations.
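With zero correlation, the Hull-White result can be sketched numerically: simulate the variance process above, record the average variance along each path, and average the corresponding Black-Scholes prices. All parameter values below are illustrative:

```python
import math
import random

random.seed(2)

def bs_call(s, k, r, T, var):
    """Black-Scholes call with annualized variance var (total variance var*T)."""
    sd = math.sqrt(var * T)
    d1 = (math.log(s / k) + (r + 0.5 * var) * T) / sd
    d2 = d1 - sd
    n = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s * n(d1) - k * math.exp(-r * T) * n(d2)

def hull_white_call(s=100.0, k=100.0, r=0.05, T=1.0, v0=0.04,
                    a=-0.5, b=0.5, steps=100, paths=5_000):
    # Simulate d(v) = a*v dt + b*v dX2 (lognormal variance, as in the text),
    # then average Black-Scholes over the time-average variance on each path.
    dt = T / steps
    total = 0.0
    for _ in range(paths):
        v, vsum = v0, 0.0
        for _ in range(steps):
            v += a * v * dt + b * v * math.sqrt(dt) * random.gauss(0.0, 1.0)
            v = max(v, 1e-10)
            vsum += v
        total += bs_call(s, k, r, T, vsum / steps)
    return total / paths

print(f"Hull-White mixing price: {hull_white_call():.2f}")
```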
Heston (1993)
This is the most popular stochastic volatility model in practice:
d(v) = kappa * (theta - v) * dt + xi * sqrt(v) * dX2
Here v = sigma^2 is the variance (not volatility). It mean-reverts to a long-term level theta at speed kappa, with a volatility of volatility xi. The correlation between the stock and the variance is left free; the model places no restriction on it.
Why is Heston so popular? Because there are “closed-form” solutions for European options. The word “closed-form” deserves quotes because the solution involves a Fourier transform inversion that must be done numerically. But it is fast, and fast matters.
The four parameters (speed of mean reversion, level of mean reversion, vol of vol, correlation) can be fitted to market data. But experience shows calibrated parameters are often unstable and sometimes unreasonable. The fitted speed of mean reversion might flip sign between calibrations, or the vol of vol might hit unrealistic values.
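A minimal Monte Carlo sketch of Heston, using the common “full truncation” Euler scheme, in which the variance is floored at zero wherever it appears under a square root or in the drift. Parameters are illustrative, not calibrated to anything:

```python
import math
import random

random.seed(3)

def heston_call(s0=100.0, k=100.0, r=0.05, T=1.0, v0=0.04,
                kappa=2.0, theta=0.04, xi=0.3, rho=-0.7,
                steps=100, paths=5_000):
    dt = T / steps
    total = 0.0
    for _ in range(paths):
        x, v = math.log(s0), v0  # work in log-price for stability
        for _ in range(steps):
            z1 = random.gauss(0.0, 1.0)
            z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * random.gauss(0.0, 1.0)
            vp = max(v, 0.0)  # full truncation: v can dip below zero, vp cannot
            x += (r - 0.5 * vp) * dt + math.sqrt(vp * dt) * z1
            v += kappa * (theta - vp) * dt + xi * math.sqrt(vp * dt) * z2
        total += max(math.exp(x) - k, 0.0)
    return math.exp(-r * T) * total / paths

print(f"Heston MC call: {heston_call():.2f}")
```

In production the semi-closed-form Fourier solution is what makes Heston fast; Monte Carlo like this is what you fall back on for path-dependent payoffs.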
The 3/2 Model
d(v) = v * (a - b*v) * dt + c * v^(3/2) * dX2
Popular because it also has a closed-form solution.
GARCH Diffusion
GARCH models from econometrics have continuous-time equivalents:
d(v) = a * (b - v) * dt + c * v * dX2
This is the diffusion limit of the GARCH(1,1) process. The parameters are related to the original GARCH specification.
Ornstein-Uhlenbeck for Log-Variance
Model the log of variance instead:
d(log v) = a * (b - log v) * dt + c * dX2
Wilmott notes this matches data well. In the long run, volatility is lognormally distributed, which makes sense: volatility is positive and has a right-skewed distribution in practice. He adds a characteristic footnote: “I lied, again, this is not popular at all, even though it is a very good model.”
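The lognormal long-run claim is easy to check: the stationary distribution of this Ornstein-Uhlenbeck process for log v is normal with mean b and variance c^2/(2a), so v itself is lognormal. A simulation sketch, with illustrative parameters:

```python
import math
import random
import statistics

# Simulate d(log v) = a*(b - log v) dt + c dX2 for a long time and compare
# the empirical moments of log v with the OU stationary theory N(b, c^2/(2a)).
random.seed(4)
a, b, c = 2.0, math.log(0.04), 0.5
dt, steps, burn = 0.01, 200_000, 10_000

log_v, samples = b, []
for i in range(steps):
    log_v += a * (b - log_v) * dt + c * math.sqrt(dt) * random.gauss(0.0, 1.0)
    if i >= burn:  # discard the burn-in before the process settles down
        samples.append(log_v)

print(f"mean of log v: {statistics.fmean(samples):.3f}  (theory {b:.3f})")
print(f"std  of log v: {statistics.pstdev(samples):.3f}  "
      f"(theory {c / math.sqrt(2 * a):.3f})")
```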
REGARCH
The Range-based Exponential GARCH model produces a three-factor system with two volatilities. One volatility (sigma_1) represents the actual asset volatility; a second (sigma_2) is the level to which sigma_1 reverts, and sigma_2 is itself stochastic.
Typical parameters show that sigma_1 mean-reverts to sigma_2 over about one week, while sigma_2 mean-reverts to the long-term level over about four months. Fast volatility fluctuations and slow regime changes. This matches what traders observe: volatility spikes quickly and subsides over weeks, but the general “regime” (high-vol or low-vol environment) changes slowly.
Heston with Jumps
Add Poisson jumps to the Heston model. More parameters allow better fitting. The jump component has most impact at short timescales (sudden moves), while stochastic volatility matters more at longer horizons. So you can use long-dated options to fit the stochastic vol parameters and short-dated options to fit the jump component.
Downside: the math is more complex, and hedging becomes even harder when the stock can jump.
The Convexity Bias
An important technical detail that catches beginners. We care about the expected variance over the life of the option, not the expected volatility. These are different because of Jensen’s inequality:
E[sigma^2] > E[sigma]^2
If you model sigma and naively estimate E[sigma^2] as E[sigma]^2, you underestimate the expected variance and underprice options. This is the convexity adjustment.
More generally, if you model some transformation f(sigma) (like log sigma), you need to account for this bias when converting back to E[sigma^2]. The adjustment involves the second derivative of the inverse function and the variance of f(sigma). Small but not negligible.
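A quick numeric illustration of the gap, drawing sigma from an (assumed, purely illustrative) lognormal distribution:

```python
import random
import statistics

# Jensen's inequality in action: E[sigma^2] exceeds (E[sigma])^2 by exactly
# the variance of sigma. Parameters of the lognormal are illustrative.
random.seed(5)
sigmas = [random.lognormvariate(-1.7, 0.4) for _ in range(200_000)]

e_sig = statistics.fmean(sigmas)
e_sig2 = statistics.fmean(s * s for s in sigmas)

print(f"(E[sigma])^2 = {e_sig ** 2:.5f}")
print(f"E[sigma^2]   = {e_sig2:.5f}")
print(f"naive estimate understates expected variance by "
      f"{(1 - e_sig ** 2 / e_sig2) * 100:.1f}%")
```

With these (made-up) parameters the naive squared-mean understates the expected variance by roughly 15%: small in absolute terms, but a systematic mispricing.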
Modeling Implied Volatility Directly
One interesting alternative: instead of modeling actual volatility, model the implied volatility of traded options directly. This approach, due to Schonbucher (1999), bypasses the calibration problem entirely because the model is built to match market prices by construction.
The implied volatility satisfies its own stochastic differential equation with a drift, a component correlated with the stock, and an independent component. The no-arbitrage condition creates a relationship between the drift of implied vol, the implied vol itself, the actual vol, and the volatility of implied vol. You specify the volatility process for implied vol (observable from time series data) and deduce either the actual vol or the risk-neutral drift from the constraint.
This is a neat idea but comes with its own set of complications. The details are involved and Wilmott leaves them to the references.
The Correlation Effect
In examples, Wilmott shows how correlation between stock and volatility dramatically affects option values. With negative correlation (stock drops, volatility rises), out-of-the-money puts become more expensive, creating a skew. With positive correlation, the skew goes the other way. With zero correlation, you get a slight smile.
Negative correlation is empirically the norm for equities: markets crash fast (volatility spikes when stocks drop). This is why equity volatility smiles are skewed downward. The correlation parameter is one of the most important inputs in any stochastic volatility model.
Key Takeaways
Stochastic volatility is the most theoretically sound approach to modeling real-world volatility dynamics. Volatility genuinely is unpredictable, and treating it as random is the honest thing to do.
The price you pay: an extra dimension in your PDE, an unobservable market price of risk, and the need for more instruments to hedge. Models proliferate because the functional forms for drift and diffusion of volatility are hard to pin down from data.
The Heston model dominates practice because of its closed-form solutions, not because it is the most realistic model. Speed wins over accuracy in a business where traders need prices in real time.
And always remember: the market price of volatility risk lambda is not observable, not derivable from first principles, and not stable over time. It is the weakest link in the stochastic volatility chain. The models that avoid lambda entirely (uncertain parameters, mean-variance approach) may be theoretically cleaner, even if the market has not fully embraced them yet.
Previous post: Volatility Surfaces: Smiles, Skews, and Local Vol
Next post: Uncertain Parameters: When You Don’t Even Know the Distribution