Short-Rate Models: Vasicek & CIR for Interest Rate Dynamics

Michael BrenndoerferDecember 6, 202553 min read

Learn Vasicek and CIR short-rate models for interest rate dynamics. Master mean reversion, bond pricing formulas, and derivative valuation techniques.

Reading Level

Choose your expertise level to adjust how many terms are explained. Beginners see more tooltips, experts see fewer to maintain reading flow. Hover over underlined terms for instant definitions.

Short-Rate Interest Rate Models

In Part II, we studied the term structure of interest rates and learned how to extract discount factors and forward rates from observed bond prices. That analysis treated the yield curve as a static object, a snapshot of rates at a single point in time. But interest rates change constantly. The 10-year Treasury yield that was 4.2% yesterday might be 4.3% today and 3.9% next month. To price interest rate derivatives, manage bond portfolio risk, or forecast future rate scenarios, we need models that capture how interest rates evolve through time.

Short-rate models address this need by specifying a stochastic process for the instantaneous interest rate. Building on the Brownian motion and stochastic calculus framework from earlier chapters, these models describe rate dynamics with a combination of deterministic drift and random shocks. The key insight of short-rate models is that once you specify how the instantaneous rate evolves, the entire yield curve at any future time follows from no-arbitrage arguments.

This chapter introduces the two foundational short-rate models: the Vasicek model and the Cox-Ingersoll-Ross (CIR) model. Both incorporate mean reversion, the empirically observed tendency of interest rates to gravitate toward a long-run average level. We'll derive their dynamics, understand their parameters, simulate rate paths, and see how they produce closed-form bond prices. These models laid the groundwork for all subsequent interest rate modeling and remain essential tools for understanding rate dynamics.

The Short Rate and Why We Model It

Short Rate

The short rate r(t)r(t) is the instantaneous interest rate at time tt, the rate earned over an infinitesimally small time interval [t,t+dt][t, t+dt]. It represents the continuously compounded rate for borrowing or lending over the next instant.

The short rate is a theoretical construct that serves as the fundamental building block of interest rate modeling. You cannot observe it directly in markets because no instrument has an instantaneous maturity. The closest observable proxies are overnight rates like the federal funds rate or overnight repo rates, but even these apply to discrete time periods rather than true instants. Despite its abstract nature, the short rate serves as a powerful modeling device because it provides a parsimonious way to describe the entire yield curve with a single state variable.

Short rates are fundamental because of their connection to bond prices. If we know how r(t)r(t) evolves from now until some future time TT, we can compute the discount factor from the fundamental relationship:

P(t,T)=EQ[exp(tTr(s)ds)]P(t,T) = \mathbb{E}^Q\left[\exp\left(-\int_t^T r(s)\,ds\right)\right]

This equation encapsulates the pricing framework. Let's examine each component:

  • P(t,T)P(t,T) price at time tt of a zero-coupon bond paying a face value of 1 at time TT
  • EQ\mathbb{E}^Q: expectation taken under the risk-neutral measure QQ
  • r(s)r(s): instantaneous short rate at time ss

The integral tTr(s)ds\int_t^T r(s)\,ds accumulates the short rate over the entire period from tt to TT. This accumulated rate, when exponentiated with a negative sign, produces the discount factor. The expectation operator is necessary because r(s)r(s) is random: we do not know today what the short rate will be at future times ss. By taking the risk-neutral expectation, we obtain a price consistent with no-arbitrage principles.

This formula, which follows from the risk-neutral valuation framework we developed in Chapter 4 of this Part, tells us that the bond price equals the expected value of the discount factor, where discounting occurs at the random path of short rates. This approach generalizes: once we specify a stochastic process for r(t)r(t), we can price any interest rate derivative by computing appropriate expectations of payoffs discounted at this random rate.

A single stochastic differential equation (SDE) for r(t)r(t) determines the term structure. Different short-rate models make different assumptions about this SDE, leading to different yield curve shapes and dynamics. The choice of SDE reflects your beliefs about how interest rates behave: whether they exhibit mean reversion, how their volatility changes with the rate level, and whether they can become negative.

The Vasicek Model

Model Specification

Oldřich Vašíček's 1977 model established the foundation for modern interest rate modeling. Before this, there was no standard framework for relating interest rate dynamics to bond prices. The Vasicek model assumes the short rate follows an Ornstein-Uhlenbeck process, a continuous-time stochastic process originally developed by physicists to model the velocity of particles subject to friction:

dr(t)=κ(θr(t))dt+σdW(t)dr(t) = \kappa(\theta - r(t))\,dt + \sigma\,dW(t)

This equation captures key economic behaviors through terms that govern rate evolution:

  • κ\kappa speed of mean reversion, controlling how quickly rates return to the long-run mean
  • θ\theta: long-run mean level toward which the process gravitates
  • r(t)r(t): instantaneous short rate at time tt
  • σ\sigma: volatility of the short rate
  • W(t)W(t): standard Brownian motion under the risk-neutral measure

The drift term κ(θr(t))\kappa(\theta - r(t)) provides mean reversion, the model's defining characteristic. To understand this mechanism, consider what happens in different scenarios. When r(t)>θr(t) > \theta, the current rate exceeds the long-run mean, making the term (θr(t))(\theta - r(t)) negative. Multiplied by the positive constant κ\kappa, this produces a negative drift, pushing rates down toward θ\theta. Conversely, when r(t)<θr(t) < \theta, the drift becomes positive, pulling rates up. This self-correcting behavior ensures that rates oscillate around θ\theta rather than wandering off to arbitrarily high or low values.

Out[3]:
Visualization
Line plot showing the drift term as a linear function of the current rate, crossing zero at theta.
The relationship between the instantaneous short rate and the expected drift in the Vasicek model. The linear plot illustrates the mean-reversion mechanism, where drift magnitude increases as the rate deviates from the long-run mean θ, pushing the process back toward equilibrium.

The parameter κ\kappa controls the strength of this gravitational pull. A large κ\kappa means rates snap back quickly after any displacement from θ\theta, while a small κ\kappa allows rates to deviate from the long-run mean for extended periods. Empirically, interest rates do exhibit mean reversion: extremely high rates tend to fall as central banks loosen policy to stimulate growth, while very low rates eventually rise as economies recover. The Vasicek model captures this economic reality through its mathematical structure.

Solving the SDE

The Vasicek SDE is linear and has an explicit solution, making it useful for practical applications. We solve this using the integrating factor method, a technique familiar from ordinary differential equations. The key insight is to transform the equation into one that can be integrated directly.

Consider the transformation f(t)=r(t)eκtf(t) = r(t)e^{\kappa t}. This choice is motivated by the observation that if we can eliminate the r(t)r(t) term from the right-hand side, integration becomes straightforward. Applying Itô's Lemma to this transformation, we obtain:

d(r(t)eκt)=eκtdr(t)+κeκtr(t)dt(product rule)=eκt[κ(θr(t))dt+σdW(t)]+κeκtr(t)dt(substitute dr(t))=(κθeκtκr(t)eκt)dt+σeκtdW(t)+κr(t)eκtdt(expand drift)=κθeκtdt+σeκtdW(t)(simplify)\begin{aligned} d(r(t)e^{\kappa t}) &= e^{\kappa t}dr(t) + \kappa e^{\kappa t}r(t)dt && \text{(product rule)} \\ &= e^{\kappa t}[\kappa(\theta - r(t))dt + \sigma dW(t)] + \kappa e^{\kappa t}r(t)dt && \text{(substitute } dr(t)\text{)} \\ &= (\kappa\theta e^{\kappa t} - \kappa r(t)e^{\kappa t})dt + \sigma e^{\kappa t}dW(t) + \kappa r(t)e^{\kappa t}dt && \text{(expand drift)} \\ &= \kappa\theta e^{\kappa t}dt + \sigma e^{\kappa t}dW(t) && \text{(simplify)} \end{aligned}

The κr(t)eκt-\kappa r(t)e^{\kappa t} term from the original drift cancels with the term arising from differentiating the exponential, which is the purpose of this transformation. The resulting expression contains only the deterministic term κθeκtdt\kappa\theta e^{\kappa t}dt and the stochastic term σeκtdW(t)\sigma e^{\kappa t}dW(t), both of which can be integrated directly.

Integrating from tt to TT:

r(T)eκTr(t)eκt=tTκθeκsds+tTσeκsdW(s)r(T)e^{\kappa T} - r(t)e^{\kappa t} = \int_t^T \kappa\theta e^{\kappa s}ds + \int_t^T \sigma e^{\kappa s}dW(s)

The first integral on the right-hand side is a standard calculus integral that evaluates to θ(eκTeκt)\theta(e^{\kappa T} - e^{\kappa t}). The second integral is a stochastic integral against Brownian motion. Multiplying both sides by eκTe^{-\kappa T} and rearranging yields the explicit solution:

r(T)=r(t)eκ(Tt)+θ(1eκ(Tt))+σtTeκ(Ts)dW(s)r(T) = r(t)e^{-\kappa(T-t)} + \theta(1 - e^{-\kappa(T-t)}) + \sigma\int_t^T e^{-\kappa(T-s)}\,dW(s)

Each component of this solution has an intuitive interpretation:

  • r(T)r(T) short rate at future time TT
  • r(t)r(t): current short rate at time tt
  • κ\kappa: speed of mean reversion
  • θ\theta: long-run mean level
  • σ\sigma: volatility of the short rate
  • W(s)W(s): standard Brownian motion

The first two terms are deterministic, showing how the initial rate r(t)r(t) decays exponentially toward θ\theta. The term r(t)eκ(Tt)r(t)e^{-\kappa(T-t)} represents the "memory" of the initial condition, which fades exponentially over time. The term θ(1eκ(Tt))\theta(1 - e^{-\kappa(T-t)}) is the complementary contribution from the long-run mean, which grows as the initial condition fades. As TtT - t becomes large, the first term approaches zero and the second approaches θ\theta, so the expected future rate converges to the long-run mean regardless of where it started.

The stochastic integral σtTeκ(Ts)dW(s)\sigma\int_t^T e^{-\kappa(T-s)}\,dW(s) represents accumulated random shocks, with exponential weighting that gives more influence to recent shocks. Shocks that occurred long ago have been "forgotten" due to the mean-reverting drift, while recent shocks still influence the current rate. This weighting function eκ(Ts)e^{-\kappa(T-s)} is largest when s=Ts = T (equal to 1) and decays for earlier times.

Since the stochastic integral of a deterministic function against Brownian motion is normally distributed (a fundamental property of Itô integrals), r(T)r(T) conditional on r(t)r(t) follows a normal distribution:

r(T)r(t)N(μ(t,T),v2(t,T))r(T) | r(t) \sim \mathcal{N}\left(\mu(t,T), v^2(t,T)\right)

The conditional mean and variance can be computed explicitly:

μ(t,T)=r(t)eκ(Tt)+θ(1eκ(Tt))v2(t,T)=σ22κ(1e2κ(Tt))\begin{aligned} \mu(t,T) &= r(t)e^{-\kappa(T-t)} + \theta(1 - e^{-\kappa(T-t)}) \\ v^2(t,T) &= \frac{\sigma^2}{2\kappa}\left(1 - e^{-2\kappa(T-t)}\right) \end{aligned}

These expressions reveal the long-run behavior of the process:

  • μ(t,T)\mu(t,T) conditional mean of r(T)r(T) given r(t)r(t)
  • v2(t,T)v^2(t,T): conditional variance of r(T)r(T) given r(t)r(t)
  • σ22κ\frac{\sigma^2}{2\kappa}: long-run variance of the short rate
  • r(t)r(t): current short rate
  • κ\kappa: mean reversion speed
  • θ\theta: long-run mean
  • σ\sigma: volatility
  • TtT-t: time horizon

As the horizon TtT - t grows large, the conditional mean approaches θ\theta regardless of the starting point r(t)r(t). Similarly, the conditional variance approaches the limiting value σ22κ\frac{\sigma^2}{2\kappa}, which represents the steady-state variance of the rate process. This limiting variance decreases with stronger mean reversion (larger κ\kappa) because the process is pulled back toward θ\theta more frequently, reducing dispersion.

Out[4]:
Visualization
Multiple probability density curves showing how the rate distribution evolves from a narrow peak at 2% to a wider distribution centered at 5%.
Evolution of the conditional distribution of future rates in the Vasicek model. Starting from r(0) = 2%, the distribution shifts toward θ = 5% and widens over time, eventually converging to a stationary distribution centered at θ with variance σ²/(2κ).

Bond Pricing

The Vasicek model yields analytical formulas for bond prices. The Vasicek model belongs to the class of affine term structure models. This means bond prices have the exponential-affine form

P(t,T)=A(t,T)eB(t,T)r(t)P(t,T) = A(t,T)e^{-B(t,T)r(t)}

This structure is powerful because the bond price depends on the current short rate r(t)r(t) in a very specific way: exponentially, with a coefficient B(t,T)B(t,T) that depends only on the time to maturity and model parameters. The components of this expression are:

  • P(t,T)P(t,T) price of a zero-coupon bond maturing at TT
  • A(t,T),B(t,T)A(t,T), B(t,T): deterministic functions of time depending on model parameters
  • r(t)r(t): current short rate

To derive the specific forms of AA and BB, we substitute the assumed form into the fundamental partial differential equation governing bond prices and match coefficients. For the Vasicek model, this procedure yields:

B(t,T)=1eκ(Tt)κA(t,T)=exp[(θσ22κ2)(B(t,T)(Tt))σ24κB(t,T)2]\begin{aligned} B(t,T) &= \frac{1 - e^{-\kappa(T-t)}}{\kappa} \\ A(t,T) &= \exp\left[\left(\theta - \frac{\sigma^2}{2\kappa^2}\right)(B(t,T) - (T-t)) - \frac{\sigma^2}{4\kappa}B(t,T)^2\right] \end{aligned}

Understanding what these functions represent helps build intuition for bond pricing under mean reversion:

  • A(t,T),B(t,T)A(t,T), B(t,T) deterministic functions determining the bond price structure
  • κ,θ,σ\kappa, \theta, \sigma: Vasicek model parameters
  • TtT-t: time to maturity

The function B(t,T)B(t,T) measures the sensitivity of the bond price to the short rate, which is closely related to the bond's duration. When κ\kappa is small, B(t,T)B(t,T) approaches TtT - t, and bond price sensitivity grows linearly with maturity. When κ\kappa is large, the mean reversion prevents long-term rate changes from persisting, so sensitivity saturates at 1/κ1/\kappa for very long maturities.

Out[5]:
Visualization
Line plot showing B(t,T) curves that start linear and saturate at different levels depending on kappa.
The B(t,T) function for different mean reversion speeds. With no mean reversion, bond sensitivity would grow linearly with maturity (dashed line). Mean reversion causes B(t,T) to saturate at 1/κ, as long-term rate changes are not expected to persist.

The function A(t,T)A(t,T) captures two effects. First, it incorporates the effect of the mean-reverting drift toward θ\theta. Second, it includes a convexity adjustment arising from volatility. This adjustment reflects the fact that bondholders benefit from interest rate volatility due to the convex relationship between rates and prices: when rates fall, prices rise more than they fall when rates rise by the same amount.

These closed-form expressions make the Vasicek model computationally tractable. Given parameters (κ,θ,σ)(\kappa, \theta, \sigma) and the current short rate r(t)r(t), you can instantly compute the price of any zero-coupon bond and therefore construct the entire yield curve. This stands in contrast to models without analytical solutions, which require time-consuming numerical methods.

The continuously compounded yield for maturity τ=Tt\tau = T - t follows directly from the bond price:

y(t,T)=lnP(t,T)τ=B(t,T)r(t)lnA(t,T)τy(t,T) = -\frac{\ln P(t,T)}{\tau} = \frac{B(t,T)r(t) - \ln A(t,T)}{\tau}

This expression shows that the yield is an affine (linear plus constant) function of the short rate:

  • y(t,T)y(t,T) continuously compounded yield for maturity TT
  • P(t,T)P(t,T): zero-coupon bond price
  • A(t,T),B(t,T)A(t,T), B(t,T): Vasicek model functions defined above
  • τ=Tt\tau = T-t: time to maturity
  • r(t)r(t): current short rate

The affine structure extends to yields, which is why this class of models is called "affine term structure models." The linearity of yields in the short rate has important implications: parallel shifts in the short rate produce parallel shifts in the yield curve, scaled by the maturity-dependent factor B(t,T)/τB(t,T)/\tau.

The Negative Rate Problem

A critical feature of the Vasicek model is that r(t)r(t) is normally distributed, which means it can take negative values with positive probability. For decades, this was considered a significant drawback of the model. How could interest rates be negative? The reasoning went that if deposit rates were negative, banks would pay you to borrow money, which seemed economically absurd. Rational agents would simply hold cash rather than accept negative returns.

Then came the post-2008 financial crisis era. Central banks in Europe, Japan, and Switzerland pushed policy rates below zero in an effort to stimulate lending and economic activity. Suddenly, negative rates were not just theoretically possible but empirically observed. Banks charged depositors for holding their money, and some government bonds traded at negative yields.

The Vasicek model's ability to generate negative rates transformed from a bug to a feature for modeling modern interest rate environments. While the model was not designed with negative rates in mind, its mathematical structure accommodates this reality naturally. This historical reversal illustrates an important principle: model properties that seem problematic under one economic regime may become valuable under different circumstances.

The Cox-Ingersoll-Ross Model

Model Specification

John Cox, Jonathan Ingersoll, and Stephen Ross developed their model in 1985 as part of a general equilibrium framework for asset pricing. Unlike Vasicek, who specified the rate dynamics directly, Cox, Ingersoll, and Ross derived their model from underlying assumptions about technology, preferences, and market equilibrium. The CIR model modifies the Vasicek specification by making volatility proportional to the square root of the rate:

dr(t)=κ(θr(t))dt+σr(t)dW(t)dr(t) = \kappa(\theta - r(t))\,dt + \sigma\sqrt{r(t)}\,dW(t)

The drift term is identical to Vasicek, preserving the mean-reverting structure. The key innovation lies in the diffusion term:

  • κ,θ\kappa, \theta mean reversion speed and long-run mean
  • σ\sigma: volatility coefficient
  • r(t)\sqrt{r(t)}: state-dependent volatility scaling factor
  • dW(t)dW(t): Brownian increment

The parameters κ\kappa, θ\theta, and σ\sigma have the same interpretations as in Vasicek: mean-reversion speed, long-run mean, and volatility coefficient respectively. The crucial difference is the r(t)\sqrt{r(t)} term multiplying the Brownian shock.

This square-root diffusion has profound implications for the behavior of the process. When r(t)r(t) is close to zero, the volatility term σr(t)\sigma\sqrt{r(t)} becomes small, reducing the chance of large negative shocks that could push rates below zero. The process essentially "quiets down" as it approaches zero. When r(t)r(t) is larger, volatility increases proportionally, matching the empirical observation that rate volatility tends to be higher during high-rate environments. Historical data shows that volatility depends on the rate level.

The economic intuition behind the square-root specification connects to the general equilibrium derivation. In the Cox-Ingersoll-Ross framework, the interest rate is linked to the marginal product of capital in the economy. When rates are low, the economy is in a depressed state with limited investment opportunities, and uncertainty about future rates is also low. When rates are high, the economy is more active, and there is greater uncertainty about the path of future rates.

The Feller Condition

The CIR model keeps rates non-negative under certain parameter restrictions. The Feller condition states:

2κθσ22\kappa\theta \geq \sigma^2

This inequality places a constraint on the relationship between model parameters:

  • κ\kappa mean reversion speed
  • θ\theta: long-run mean
  • σ\sigma: volatility parameter

When this condition holds, the rate r(t)r(t) will never hit zero. If it is violated, rates can touch zero but will immediately bounce back; they cannot become negative. This distinction matters for certain theoretical analyses but has limited practical impact, as rates near zero behave similarly in both cases.

Feller Condition

The Feller condition 2κθσ22\kappa\theta \geq \sigma^2 ensures the short rate in the CIR model remains strictly positive. It requires that the mean-reversion drift toward θ\theta is strong enough relative to volatility to prevent rates from reaching zero.

The intuition behind the Feller condition is straightforward when viewed as a balance of forces. The condition compares two quantities: the deterministic pull toward θ\theta (captured by κθ\kappa\theta, which represents how fast the drift pushes rates up when they are near zero) and the random shocks (captured by σ2\sigma^2, which measures the magnitude of random fluctuations). If volatility is too high relative to the mean-reversion strength, the random shocks can overwhelm the upward drift near zero, potentially pushing the process to zero despite the drift trying to keep it positive.

The factor of 2 in the condition arises from the mathematical properties of the square-root process. When rr is small, the drift is approximately κθ\kappa\theta (since κr\kappa r is negligible), while the variance of shocks is approximately σ2rdt\sigma^2 r \cdot dt. For the drift to dominate near zero, we need κθdt\kappa\theta \cdot dt to exceed the standard deviation of shocks, which leads to the Feller condition after careful analysis.

Out[6]:
Visualization
Contour plot showing the Feller condition boundary in kappa-sigma space for a fixed theta.
Parameter space for the Feller condition 2κθ ≥ σ². The green region indicates where the condition is satisfied and rates stay strictly positive, while the red region shows parameters where rates can touch zero.

Distribution of Future Rates

Unlike the Vasicek model, the CIR process does not yield normally distributed rates. The square-root diffusion produces a fundamentally different distribution. Specifically, r(T)r(T) conditional on r(t)r(t) follows a non-central chi-squared distribution, a probability distribution that arises naturally in statistics when summing squares of normal random variables:

r(T)r(t)=σ2(1eκ(Tt))4κχd2(λ)r(T) | r(t) = \frac{\sigma^2(1 - e^{-\kappa(T-t)})}{4\kappa} \cdot \chi^2_d(\lambda)

The non-central chi-squared distribution is characterized by two parameters:

  • χd2(λ)\chi^2_d(\lambda) non-central chi-squared random variable
  • dd: degrees of freedom
  • λ\lambda: non-centrality parameter

For the CIR model, these parameters are:

d=4κθσ2λ=4κeκ(Tt)σ2(1eκ(Tt))r(t)\begin{aligned} d &= \frac{4\kappa\theta}{\sigma^2} \\ \lambda &= \frac{4\kappa e^{-\kappa(T-t)}}{\sigma^2(1 - e^{-\kappa(T-t)})} r(t) \end{aligned}

The degrees of freedom dd depends only on the model parameters and is constant over time. Notice that when the Feller condition holds exactly (with equality), d=2d = 2. The non-centrality parameter λ\lambda depends on the current rate r(t)r(t) and decreases as the horizon increases, reflecting the fading influence of the initial condition. The remaining notation:

  • κ,θ,σ\kappa, \theta, \sigma CIR model parameters
  • r(t)r(t): current short rate
  • TtT-t: time horizon

The conditional mean and variance can be computed from the properties of the non-central chi-squared distribution:

E[r(T)r(t)]=r(t)eκ(Tt)+θ(1eκ(Tt))Var[r(T)r(t)]=r(t)σ2κ(eκ(Tt)e2κ(Tt))+θσ22κ(1eκ(Tt))2\begin{aligned} \mathbb{E}[r(T)|r(t)] &= r(t)e^{-\kappa(T-t)} + \theta(1 - e^{-\kappa(T-t)}) \\ \text{Var}[r(T)|r(t)] &= r(t)\frac{\sigma^2}{\kappa}(e^{-\kappa(T-t)} - e^{-2\kappa(T-t)}) + \theta\frac{\sigma^2}{2\kappa}(1 - e^{-\kappa(T-t)})^2 \end{aligned}

These formulas reveal an important relationship between the Vasicek and CIR models:

  • E[r(T)r(t)]\mathbb{E}[r(T)|r(t)] conditional expected value of the rate
  • Var[r(T)r(t)]\text{Var}[r(T)|r(t)]: conditional variance of the rate
  • r(t)r(t): current short rate

Notice that the conditional mean formula is identical to the Vasicek model; both processes have the same expected path. The mean reversion operates identically in both cases, pulling rates toward θ\theta at speed κ\kappa. The difference lies in the variance structure and the shape of the distribution. The CIR variance has two terms: one proportional to r(t)r(t) (reflecting the state-dependent volatility) and one proportional to θ\theta (the contribution from shocks accumulated as the process approaches its steady state).

The non-central chi-squared distribution is positively skewed, unlike the symmetric normal distribution of Vasicek. This skewness becomes more pronounced when rates are low, as the distribution is bounded below by zero but can extend to high values. When rates are high, the distribution becomes more symmetric and approaches normality, consistent with the central limit theorem as the non-centrality parameter grows.

Bond Pricing

The CIR model is also affine, with bond prices taking the form P(t,T)=A(t,T)eB(t,T)r(t)P(t,T) = A(t,T)e^{-B(t,T)r(t)}. The derivation follows the same approach as Vasicek: substitute the assumed form into the bond pricing PDE and solve for the coefficients. The resulting functions are:

B(t,T)=2(eγ(Tt)1)(γ+κ)(eγ(Tt)1)+2γA(t,T)=[2γe(κ+γ)(Tt)/2(γ+κ)(eγ(Tt)1)+2γ]2κθ/σ2\begin{aligned} B(t,T) &= \frac{2(e^{\gamma(T-t)} - 1)}{(\gamma + \kappa)(e^{\gamma(T-t)} - 1) + 2\gamma} \\ A(t,T) &= \left[\frac{2\gamma e^{(\kappa + \gamma)(T-t)/2}}{(\gamma + \kappa)(e^{\gamma(T-t)} - 1) + 2\gamma}\right]^{2\kappa\theta/\sigma^2} \end{aligned}

The auxiliary parameter γ\gamma that appears in these formulas combines the model parameters in a specific way:

  • A(t,T),B(t,T)A(t,T), B(t,T) deterministic functions for the CIR bond price
  • γ=κ2+2σ2\gamma = \sqrt{\kappa^2 + 2\sigma^2}: auxiliary parameter
  • κ,θ,σ\kappa, \theta, \sigma: CIR model parameters

The parameter γ\gamma can be interpreted as an "adjusted" mean-reversion speed that accounts for the risk premium embedded in bond prices. It is always larger than κ\kappa, reflecting that bonds are priced as if rates reverted faster than they actually do under the physical measure.

While more complex than the Vasicek formulas, these expressions remain analytically tractable and enable rapid bond pricing without numerical integration. As in the Vasicek model, B(t,T)B(t,T) measures the bond's sensitivity to the short rate (duration), while A(t,T)A(t,T) captures the effects of mean reversion and the convexity adjustment. The CIR formulas reduce to the Vasicek formulas in the limit as the square-root term becomes approximately constant (which occurs when rate levels are high relative to their volatility).

Comparing Vasicek and CIR

Out[7]:
Visualization
Line plot comparing constant Vasicek volatility to the square-root CIR volatility function.
Instantaneous volatility comparison between Vasicek and CIR models. Vasicek has constant volatility regardless of rate level (horizontal line). CIR volatility scales with the square root of the rate, producing lower volatility when rates are low and higher volatility when rates are high.

The table below summarizes the key differences between the two models:

Comparison of key features between Vasicek and CIR short-rate models.
FeatureVasicekCIR
DiffusionConstant (σ\sigma)State-dependent (σr\sigma\sqrt{r})
Rate distributionNormalNon-central chi-squared
Negative ratesPossibleNot possible (with Feller condition)
Volatility behaviorConstantHigher when rates are high
Analytical tractabilitySimpler formulasMore complex but still closed-form

Both models share the same mean-reversion structure, producing similar expected rate paths. The primary differences concern:

  1. Distributional assumptions Vasicek assumes normal shocks regardless of rate level; CIR assumes volatility scales with rate level.

  2. Positivity: CIR guarantees non-negative rates when the Feller condition holds; Vasicek does not.

  3. Empirical fit: CIR's state-dependent volatility often better matches observed rate dynamics, where volatility tends to be higher during high-rate periods.

Simulation of Short-Rate Paths

Let's implement both models and visualize their behavior. We'll start by simulating multiple paths to understand how mean reversion shapes rate dynamics.

In[8]:
Code
import numpy as np

# Set random seed for reproducibility
np.random.seed(42)

# Model parameters
kappa = 0.5  # Speed of mean reversion
theta = 0.05  # Long-run mean (5%)
sigma = 0.02  # Volatility
r0 = 0.02  # Initial rate (2%)

# Simulation parameters
T = 10  # Time horizon (years)
n_steps = 2500  # Number of time steps
n_paths = 1000  # Number of simulation paths
dt = T / n_steps

# Time grid
t = np.linspace(0, T, n_steps + 1)

Vasicek Simulation

The Vasicek model admits two approaches to simulation. The simpler approach uses Euler-Maruyama discretization which approximates the continuous-time SDE with discrete time steps. The discretization follows the general pattern of replacing drdr with Δr\Delta r, dtdt with Δt\Delta t, and dWdW with ΔtZ\sqrt{\Delta t} Z where ZZ is standard normal:

rt+dt=rt+κ(θrt)dt+σdtZr_{t+dt} = r_t + \kappa(\theta - r_t)dt + \sigma\sqrt{dt}\,Z

The components of this discrete recursion are:

  • rt+dtr_{t+dt} rate at the next time step
  • dtdt: time step size
  • ZZ: standard normal random variable, ZN(0,1)Z \sim \mathcal{N}(0,1)
  • rtr_t: current rate
  • κ,θ,σ\kappa, \theta, \sigma: Vasicek parameters

Since the Vasicek model has an exact solution, we can also simulate using the exact transition density. This approach is superior because it produces samples from exactly the correct distribution regardless of time step size, eliminating discretization error. The implementation below uses this exact simulation method.

In[9]:
Code
def simulate_vasicek(r0, kappa, theta, sigma, T, n_steps, n_paths, seed=None):
    """Simulate Vasicek model paths using exact transition density."""
    if seed is not None:
        np.random.seed(seed)

    dt = T / n_steps
    rates = np.zeros((n_paths, n_steps + 1))
    rates[:, 0] = r0

    for i in range(n_steps):
        # Exact mean and variance of r(t+dt) given r(t)
        exp_decay = np.exp(-kappa * dt)
        mean = rates[:, i] * exp_decay + theta * (1 - exp_decay)
        variance = (sigma**2 / (2 * kappa)) * (1 - np.exp(-2 * kappa * dt))
        std = np.sqrt(variance)

        # Sample from normal distribution
        rates[:, i + 1] = mean + std * np.random.randn(n_paths)

    return rates


# Simulate Vasicek paths
vasicek_paths = simulate_vasicek(
    r0, kappa, theta, sigma, T, n_steps, n_paths, seed=42
)

CIR Simulation

For the CIR model, exact simulation is more complex due to the non-central chi-squared transition distribution. While exact methods exist (involving transformations to chi-squared random variables), practitioners often use the Euler-Maruyama scheme with a reflection or truncation to handle potential negative values:

rt+dt=rt+κ(θrt)dt+σrtdtZr_{t+dt} = r_t + \kappa(\theta - r_t)dt + \sigma\sqrt{r_t}\sqrt{dt}\,Z

Each term in this discretization corresponds to its continuous-time counterpart:

  • ZZ standard normal random variable
  • rt\sqrt{r_t}: diffusion term dependent on current rate level
  • rt+dtr_{t+dt}: rate at next step
  • rtr_t: current rate
  • κ,θ,σ\kappa, \theta, \sigma: CIR parameters
  • dtdt: time step size

The challenge with this discretization is that the right-hand side can become negative even though the continuous-time process stays positive. This occurs because the discrete approximation does not perfectly capture the behavior of the square-root diffusion near zero. We apply a full truncation scheme where we take max(rt,0)\max(r_t, 0) before computing the square root, and again after adding the shock. This is one of several approaches in the literature; alternatives include reflection schemes and implicit methods.

In[10]:
Code
def simulate_cir(r0, kappa, theta, sigma, T, n_steps, n_paths, seed=None):
    """Simulate CIR model paths using Euler-Maruyama with truncation."""
    if seed is not None:
        np.random.seed(seed)

    dt = T / n_steps
    rates = np.zeros((n_paths, n_steps + 1))
    rates[:, 0] = r0

    for i in range(n_steps):
        r_current = np.maximum(
            rates[:, i], 0
        )  # Truncation for numerical stability
        drift = kappa * (theta - r_current) * dt
        diffusion = sigma * np.sqrt(r_current * dt) * np.random.randn(n_paths)
        rates[:, i + 1] = r_current + drift + diffusion
        rates[:, i + 1] = np.maximum(rates[:, i + 1], 0)  # Ensure non-negative

    return rates


# Check Feller condition parameters
feller_lhs = 2 * kappa * theta
feller_rhs = sigma**2
condition_met = feller_lhs >= feller_rhs

# Simulate CIR paths
cir_paths = simulate_cir(r0, kappa, theta, sigma, T, n_steps, n_paths, seed=42)
Out[11]:
Console
Feller condition: 2κθ = 0.0500 >= σ² = 0.0004
Condition satisfied: True

The Feller condition is satisfied with these parameters, so the CIR model should keep rates strictly positive.

Visualizing Path Behavior

Let's plot sample paths from both models to see mean reversion in action.

Out[12]:
Visualization
Line plot showing 50 Vasicek interest rate paths reverting to 5% over 10 years.
Vasicek model simulation with 50 sample paths showing mean reversion toward θ = 5%. The normal distribution of shocks allows rates to become negative, as observed in several paths crossing below zero.
Line plot showing 50 CIR interest rate paths reverting to 5% over 10 years.
CIR model simulation with 50 sample paths showing mean reversion toward θ = 5%. The square-root diffusion process ensures rates remain non-negative, with volatility decreasing as rates approach zero.

The models show clear mean reversion. Paths starting at 2% tend to drift upward toward the 5% long-run mean. The key visual difference is that some Vasicek paths dip below zero, while CIR paths remain positive throughout.

Distribution Analysis

Let's examine the distribution of rates at the 10-year horizon.

Out[13]:
Visualization
Histogram comparing Vasicek and CIR rate distributions at 10-year horizon.
Terminal rate distributions at T=10 years. Vasicek produces a symmetric normal distribution centered near θ, while CIR produces a right-skewed distribution bounded at zero.
In[14]:
Code
# Calculate statistics
vasicek_terminal = vasicek_paths[:, -1]
cir_terminal = cir_paths[:, -1]

# Theoretical mean for both models
exp_decay = np.exp(-kappa * T)
theoretical_mean = r0 * exp_decay + theta * (1 - exp_decay)

# Vasicek statistics
v_mean = vasicek_terminal.mean()
v_std = vasicek_terminal.std()
v_min = vasicek_terminal.min()
v_neg_count = (vasicek_terminal < 0).sum()
v_neg_pct = (vasicek_terminal < 0).mean() * 100

# CIR statistics
c_mean = cir_terminal.mean()
c_std = cir_terminal.std()
c_min = cir_terminal.min()
Out[15]:
Console
Terminal Rate Statistics (T = 10 years):
--------------------------------------------------
Theoretical mean: 4.9798%

Vasicek:
  Simulated mean: 5.0281%
  Std deviation:  2.0835%
  Minimum rate:   -0.5242%
  Negative paths: 5 (0.5%)

CIR:
  Simulated mean: 4.9922%
  Std deviation:  0.4670%
  Minimum rate:   3.8295%

Both models converge toward the theoretical mean of approximately 5%, as expected from the mean-reversion dynamics. The Vasicek distribution is symmetric and normal, while the CIR distribution shows slight right-skewness and is bounded below by zero.

Bond Pricing with Short-Rate Models

Vasicek and CIR models provide closed-form bond pricing formulas. Let's implement these and construct yield curves.

In[16]:
Code
def vasicek_bond_price(r, t, T, kappa, theta, sigma):
    """
    Calculate zero-coupon bond price under Vasicek model.

    Parameters:
    -----------
    r : float or array
        Current short rate
    t : float
        Current time
    T : float
        Maturity time
    kappa, theta, sigma : float
        Vasicek model parameters

    Returns:
    --------
    float or array : Bond price(s)
    """
    tau = T - t
    if tau <= 0:
        return 1.0

    B = (1 - np.exp(-kappa * tau)) / kappa
    A = np.exp(
        (theta - sigma**2 / (2 * kappa**2)) * (B - tau)
        - sigma**2 * B**2 / (4 * kappa)
    )

    return A * np.exp(-B * r)


def cir_bond_price(r, t, T, kappa, theta, sigma):
    """
    Calculate zero-coupon bond price under CIR model.

    Parameters:
    -----------
    r : float or array
        Current short rate
    t : float
        Current time
    T : float
        Maturity time
    kappa, theta, sigma : float
        CIR model parameters

    Returns:
    --------
    float or array : Bond price(s)
    """
    tau = T - t
    if tau <= 0:
        return 1.0

    gamma = np.sqrt(kappa**2 + 2 * sigma**2)
    exp_gamma = np.exp(gamma * tau)

    denominator = (gamma + kappa) * (exp_gamma - 1) + 2 * gamma

    B = 2 * (exp_gamma - 1) / denominator
    A = ((2 * gamma * np.exp((kappa + gamma) * tau / 2)) / denominator) ** (
        2 * kappa * theta / sigma**2
    )

    return A * np.exp(-B * r)

Yield Curve Construction

Let's generate yield curves for different initial short rates.

In[17]:
Code
# Maturities from 0.25 to 30 years
maturities = np.array([0.25, 0.5, 1, 2, 3, 5, 7, 10, 15, 20, 30])
r_scenarios = [0.02, 0.05, 0.08]  # Different starting rates


# Calculate yields for each scenario
def calculate_yields(r, maturities, kappa, theta, sigma, model="vasicek"):
    """Calculate continuously compounded yields for given maturities."""
    yields = np.zeros(len(maturities))
    for i, T in enumerate(maturities):
        if model == "vasicek":
            P = vasicek_bond_price(r, 0, T, kappa, theta, sigma)
        else:
            P = cir_bond_price(r, 0, T, kappa, theta, sigma)
        yields[i] = -np.log(P) / T
    return yields


# Compute yield curves for scenarios
vasicek_yields = np.array(
    [
        calculate_yields(r, maturities, kappa, theta, sigma, "vasicek")
        for r in r_scenarios
    ]
)
cir_yields = np.array(
    [
        calculate_yields(r, maturities, kappa, theta, sigma, "cir")
        for r in r_scenarios
    ]
)
Out[18]:
Visualization
Line plot showing three Vasicek yield curves starting at different rates converging at long maturities.
Vasicek yield curves for three initial rates (2%, 5%, 8%). The curves converge to the long-run mean θ=5% at long maturities, illustrating the effect of mean reversion.
Line plot showing three CIR yield curves starting at different rates converging at long maturities.
CIR yield curves for three initial rates. Like Vasicek, the curves converge to the long-run mean, but the specific shape is influenced by the state-dependent volatility.

The yield curves demonstrate several important features of mean-reverting short-rate models:

  1. Convergence: All curves converge toward a common long-term yield regardless of the initial rate. This reflects the mean-reversion property, where expected rates eventually gravitate toward θ\theta.

  2. Curve shape flexibility: When r(0)<θr(0) < \theta, the yield curve is upward-sloping (normal). When r(0)>θr(0) > \theta, the curve is downward-sloping (inverted). When r(0)=θr(0) = \theta, the curve is relatively flat but shows a slight convexity effect.

  3. Risk premium implicit in long yields: Long-term yields don't equal θ\theta exactly due to the interaction between mean reversion and interest rate volatility, which creates a convexity adjustment in bond prices.

Monte Carlo Bond Pricing

We can also price bonds using Monte Carlo simulation by averaging the discount factor across paths. This approach is more general than the closed-form solutions and extends to models without analytical formulas.

The Monte Carlo estimator for the zero-coupon bond price is:

P^(t,T)=1Ni=1Nexp(j=0M1rj(i)Δt)\hat{P}(t,T) = \frac{1}{N}\sum_{i=1}^N \exp\left(-\sum_{j=0}^{M-1} r_j^{(i)} \Delta t\right)

This formula averages the realized discount factors across all simulated paths:

  • NN number of simulation paths
  • MM: number of time steps
  • rj(i)r_j^{(i)}: short rate on path ii at step jj
  • Δt\Delta t: time step size

The inner sum j=0M1rj(i)Δt\sum_{j=0}^{M-1} r_j^{(i)} \Delta t approximates the integral tTr(s)ds\int_t^T r(s) ds for path ii using a Riemann sum. The exponential of the negative of this integral gives the discount factor for that path. Averaging across all paths produces an unbiased estimator of the bond price, with standard error decreasing as 1/N1/\sqrt{N}.

This approach generalizes to complex interest rate derivatives. For any derivative with payoff h(rT1,rT2,)h(r_{T_1}, r_{T_2}, \ldots) depending on rates at multiple times, we can estimate the price by simulating paths, computing the payoff on each path, discounting at the realized short-rate integral, and averaging.

In[19]:
Code
def monte_carlo_bond_price(rates, dt, T):
    """
    Calculate bond price using Monte Carlo simulation.

    Parameters:
    -----------
    rates : ndarray
        Simulated rate paths, shape (n_paths, n_steps+1)
    dt : float
        Time step size
    T : float
        Bond maturity

    Returns:
    --------
    float : Estimated bond price
    """
    # Number of steps to maturity
    steps_to_T = int(T / dt)

    # Calculate integral of rates using trapezoidal rule
    rate_integral = np.trapz(rates[:, : steps_to_T + 1], dx=dt, axis=1)

    # Discount factor for each path
    discount_factors = np.exp(-rate_integral)

    # Average across paths
    return discount_factors.mean(), discount_factors.std() / np.sqrt(
        len(discount_factors)
    )


# Compare analytical and Monte Carlo prices
test_maturities = [1, 2, 5, 10]
mc_results = []

for T_mat in test_maturities:
    # Analytical prices
    v_ana = vasicek_bond_price(r0, 0, T_mat, kappa, theta, sigma)
    c_ana = cir_bond_price(r0, 0, T_mat, kappa, theta, sigma)

    # Monte Carlo prices
    v_mc, v_se = monte_carlo_bond_price(vasicek_paths, dt, T_mat)
    c_mc, c_se = monte_carlo_bond_price(cir_paths, dt, T_mat)

    mc_results.append((T_mat, v_ana, v_mc, c_ana, c_mc))
Out[20]:
Console
Bond Price Comparison: Analytical vs Monte Carlo
=================================================================
Maturity   Vasicek      MC Vasicek     CIR          MC CIR      
-----------------------------------------------------------------
1          0.973999     0.973864       0.973954     0.973928    
2          0.940067     0.940146       0.939821     0.939827    
5          0.824426     0.826258       0.822950     0.823289    
10         0.647405     0.650255       0.643928     0.644419    

The Monte Carlo estimates closely match the analytical formulas, validating our simulation implementation. The small differences arise from simulation error, which decreases with more paths or finer time steps.

Out[21]:
Visualization
Line plot showing Monte Carlo bond price estimates converging to analytical values as path count increases.
Monte Carlo convergence for 5-year bond pricing. The shaded bands show 95% confidence intervals that narrow as the number of paths increases. Both Vasicek and CIR Monte Carlo estimates converge to their respective analytical values.

Pricing Interest Rate Derivatives

Short-rate models enable pricing of various interest rate derivatives. Let's implement a simple example: pricing a zero-coupon bond option.

European Bond Options

Under the Vasicek model, European options on zero-coupon bonds have closed-form solutions. These formulas resemble the famous Black-Scholes formula for stock options, but with modifications appropriate for the interest rate setting. A call option on a T2T_2-maturity bond, expiring at time T1<T2T_1 < T_2, gives the holder the right to purchase the bond at the strike price KK. The value of this option is:

C=P(0,T2)N(d1)KP(0,T1)N(d2)C = P(0, T_2)N(d_1) - K \cdot P(0, T_1)N(d_2)

Each term in this pricing formula has a clear interpretation:

  • CC value of the European call option
  • P(0,Ti)P(0, T_i): price of zero-coupon bond maturing at TiT_i
  • KK: strike price
  • N()N(\cdot): cumulative standard normal distribution function
  • d1,d2d_1, d_2: standardized limits for the integration
  • σP\sigma_P: volatility of the forward bond price

The first term P(0,T2)N(d1)P(0, T_2)N(d_1) represents the expected present value of receiving the bond, weighted by the probability (under an appropriate measure) that the option will be exercised. The second term KP(0,T1)N(d2)K \cdot P(0, T_1)N(d_2) represents the expected present value of paying the strike price, again weighted by the exercise probability under the risk-neutral measure.

The parameters d1d_1 and d2d_2 measure how far the current forward price is from the strike, scaled by volatility:

d1=ln(P(0,T2)/(KP(0,T1)))σP+σP2d2=d1σP\begin{aligned} d_1 &= \frac{\ln(P(0, T_2) / (K \cdot P(0, T_1)))}{\sigma_P} + \frac{\sigma_P}{2} \\ d_2 &= d_1 - \sigma_P \end{aligned}

The numerator of the first term in d1d_1 compares the forward price of the underlying bond (which equals P(0,T2)/P(0,T1)P(0, T_2)/P(0, T_1) in a no-arbitrage setting) to the strike price KK:

  • d1,d2d_1, d_2 standardized integration limits
  • P(0,Ti)P(0, T_i): zero-coupon bond prices
  • KK: strike price
  • σP\sigma_P: volatility of the forward bond price

The volatility term σP\sigma_P measures how much uncertainty there is about the bond price at the option's expiration. In the Vasicek model, this volatility can be computed analytically:

σP=σB(T1,T2)1e2κT12κ\sigma_P = \sigma B(T_1, T_2) \sqrt{\frac{1 - e^{-2\kappa T_1}}{2\kappa}}

This expression combines several factors that contribute to bond price uncertainty:

  • σ\sigma volatility of the short rate
  • B(T1,T2)=(1eκ(T2T1))/κB(T_1, T_2) = (1 - e^{-\kappa(T_2 - T_1)})/\kappa: Vasicek bond coefficient
  • T1T_1: option expiration time
  • T2T_2: bond maturity time
  • κ\kappa: mean reversion speed

The factor σ\sigma is the underlying volatility of the short rate. The factor B(T1,T2)B(T_1, T_2) translates short-rate volatility into bond price volatility for a bond maturing at T2T_2, as of the option expiration at T1T_1. The square-root term accounts for the accumulation of uncertainty from now until the option expiration, adjusted for mean reversion. Stronger mean reversion (larger κ\kappa) reduces uncertainty because the rate is pulled back toward θ\theta more quickly.

In[22]:
Code
from scipy.stats import norm


def vasicek_bond_option(r0, kappa, theta, sigma, T1, T2, K, option_type="call"):
    """
    Price a European option on a zero-coupon bond under Vasicek model.

    Parameters:
    -----------
    r0 : float
        Current short rate
    kappa, theta, sigma : float
        Vasicek parameters
    T1 : float
        Option expiration time
    T2 : float
        Bond maturity time (T2 > T1)
    K : float
        Strike price
    option_type : str
        'call' or 'put'

    Returns:
    --------
    float : Option price
    """
    # Bond prices
    P_T1 = vasicek_bond_price(r0, 0, T1, kappa, theta, sigma)
    P_T2 = vasicek_bond_price(r0, 0, T2, kappa, theta, sigma)

    # B function for forward bond
    B_T1_T2 = (1 - np.exp(-kappa * (T2 - T1))) / kappa

    # Bond price volatility
    sigma_P = (
        sigma * B_T1_T2 * np.sqrt((1 - np.exp(-2 * kappa * T1)) / (2 * kappa))
    )

    # d1 and d2
    d1 = (np.log(P_T2 / (K * P_T1)) / sigma_P) + 0.5 * sigma_P
    d2 = d1 - sigma_P

    if option_type == "call":
        price = P_T2 * norm.cdf(d1) - K * P_T1 * norm.cdf(d2)
    else:
        price = K * P_T1 * norm.cdf(-d2) - P_T2 * norm.cdf(-d1)

    return price
In[23]:
Code
# Example: Price a 1-year call option on a 5-year zero-coupon bond
T1 = 1.0  # Option expiration
T2 = 5.0  # Bond maturity

# The underlying bond price today
bond_price = vasicek_bond_price(r0, 0, T2, kappa, theta, sigma)

# Strike at-the-money forward
K_atm = vasicek_bond_price(r0, 0, T2, kappa, theta, sigma) / vasicek_bond_price(
    r0, 0, T1, kappa, theta, sigma
)

# Price options at different strikes
strikes = np.linspace(0.75, 0.95, 11)
call_prices = [
    vasicek_bond_option(r0, kappa, theta, sigma, T1, T2, K, "call")
    for K in strikes
]
put_prices = [
    vasicek_bond_option(r0, kappa, theta, sigma, T1, T2, K, "put")
    for K in strikes
]

# Calculate ATM prices for summary
atm_call = vasicek_bond_option(r0, kappa, theta, sigma, T1, T2, K_atm, "call")
atm_put = vasicek_bond_option(r0, kappa, theta, sigma, T1, T2, K_atm, "put")
current_bond_1y = vasicek_bond_price(r0, 0, T1, kappa, theta, sigma)
Out[24]:
Visualization
Line plot showing call and put option prices versus strike price for bond options.
European call and put option prices on a 5-year zero-coupon bond with 1-year expiration under the Vasicek model. Call values decrease and put values increase as the strike price rises, intersecting at the at-the-money forward strike marked by the dashed line.
Out[25]:
Console
Bond Option Pricing Summary:
----------------------------------------
Current 5-year bond price: 0.824426
Current 1-year bond price: 0.973999
ATM forward strike: 0.846434

ATM Call price: 0.009044
ATM Put price:  0.009044

The option prices follow familiar patterns: calls decrease as strike increases, puts increase as strike increases. The intersection near the ATM forward strike reflects put-call parity for bond options.

Key Parameters

The key parameters for the Vasicek and CIR models govern all aspects of rate dynamics and derivative pricing:

  • r(t)r(t) Instantaneous short rate. The fundamental state variable driving the term structure. All bond prices and yields are functions of this single variable, which is why these are called "one-factor" models.

  • κ\kappa: Speed of mean reversion. Controls how quickly the rate returns to the long-run mean θ\theta. Larger values of κ\kappa produce tighter distributions around θ\theta and faster convergence of yield curves to their long-term levels.

  • θ\theta: Long-run mean level. The equilibrium level toward which interest rates gravitate over time. This parameter reflects structural factors in the economy such as potential growth rates and inflation expectations.

  • σ\sigma: Volatility of the short rate. Determines the magnitude of random shocks affecting the rate path. In Vasicek, this is an absolute volatility; in CIR, it is scaled by the square root of the rate level.

  • TT: Time to maturity. Used in bond and option pricing formulas. The effect of maturity on prices depends on the interplay between mean reversion and volatility.

  • KK: Strike price. The exercise price for bond options. Together with the forward price of the underlying, the strike determines the moneyness of the option.

Parameter Impact Analysis

Exploring how parameters affect rate dynamics and bond prices helps build intuition for calibration.

Effect of Mean Reversion Speed

The parameter κ\kappa controls how quickly rates revert to the long-run mean. Higher κ\kappa means faster reversion and lower variance in the long run.

In[26]:
Code
# Different mean reversion speeds
kappa_values = [0.1, 0.5, 1.0, 2.0]

# Simulate paths for each kappa
paths_by_kappa = {}
for k in kappa_values:
    paths_by_kappa[k] = simulate_vasicek(
        r0, k, theta, sigma, T, n_steps, 500, seed=42
    )

# Calculate statistics for plotting
kappa_stats = {}
for k, paths in paths_by_kappa.items():
    kappa_stats[k] = {
        "mean": paths.mean(axis=0) * 100,
        "p5": np.percentile(paths, 5, axis=0) * 100,
        "p95": np.percentile(paths, 95, axis=0) * 100,
    }
Out[27]:
Visualization
Plot showing rate path distributions for different mean reversion speeds.
Impact of the mean reversion speed κ on simulated rate paths. Higher κ values produce faster reversion to the long-run mean θ and result in lower steady-state variance, illustrated by the narrowing of the 5th to 95th percentile bands.

With low κ\kappa, rates wander more freely and take longer to approach θ\theta. With high κ\kappa, rates snap back quickly after any shock, resulting in a tighter distribution around the long-run mean.

Effect of Volatility

The volatility parameter σ\sigma scales the random component. Higher volatility produces wider distributions and more variable rate paths.

In[28]:
Code
# Different volatility levels
sigma_values = [0.01, 0.02, 0.04, 0.06]

# Simulate paths for each sigma
paths_by_sigma = {}
for s in sigma_values:
    paths_by_sigma[s] = simulate_vasicek(
        r0, kappa, theta, s, T, n_steps, 500, seed=42
    )

# Calculate statistics for plotting
sigma_stats = {}
for s, paths in paths_by_sigma.items():
    sigma_stats[s] = {
        "mean": paths.mean(axis=0) * 100,
        "p5": np.percentile(paths, 5, axis=0) * 100,
        "p95": np.percentile(paths, 95, axis=0) * 100,
    }
Out[29]:
Visualization
Plot showing rate path distributions for different volatility levels.
Impact of the volatility parameter σ on interest rate distributions. Higher volatility produces wider 95% confidence bands around the mean path, increasing the probability of extreme rate scenarios.

The mean paths are nearly identical across volatility levels since σ\sigma doesn't affect the expected rate path, only its variance. Higher volatility dramatically widens the confidence bands, increasing the probability of extreme rate scenarios.

Yield Curve Shapes

The interplay between parameters determines what yield curve shapes the model can produce. Let's explore how the current rate relative to the long-run mean affects the curve.

In[30]:
Code
# Define range of initial rates around theta
r_range = np.linspace(0.01, 0.09, 9)  # 1% to 9%
maturities_fine = np.linspace(0.1, 30, 100)

# Calculate yield curves for shape analysis
yield_curves_shape = []
for r_start in r_range:
    curves = calculate_yields(
        r_start, maturities_fine, kappa, theta, sigma, "vasicek"
    )
    yield_curves_shape.append(curves)
Out[31]:
Visualization
Plot showing multiple yield curves ranging from inverted to normal shapes.
Variety of yield curve shapes produced by the Vasicek model for different starting rates relative to θ = 5%. The model generates upward-sloping, flat, or inverted curves based on whether the initial short rate r(0) is below, equal to, or above the long-run mean.

The Vasicek model (and CIR) can produce a range of yield curve shapes, but with important limitations. Both models belong to the class of single-factor models, meaning all yields are driven by a single source of randomness. This implies that yields at different maturities are perfectly correlated, which is not observed in practice where short and long rates sometimes move in opposite directions.

Limitations and Impact

What These Models Enabled

The Vasicek and CIR models represented a major advance in fixed-income modeling when introduced in the 1970s and 1980s. They provided the first rigorous, arbitrage-free frameworks for understanding how interest rates evolve and how this evolution affects derivative prices. Their key contributions include:

The concept of mean reversion in rates became standard. Before these models, practitioners often assumed rates followed random walks without drift, which implied rates could wander arbitrarily far from any long-run level. The mean-reverting specification captured the economic intuition that extremely high or low rates are unsustainable and markets tend to push rates back toward equilibrium levels. This insight continues to inform all modern interest rate modeling.

Closed-form bond pricing formulas enabled practical applications. The affine structure of both models means that bond prices, forward rates, and many derivative prices can be computed analytically without resorting to simulation or numerical methods. This computational efficiency was crucial in the pre-computer-intensive era and remains valuable today for calibration and real-time pricing.

The models also established the template for equilibrium-based interest rate modeling. Both Vasicek and CIR derived their dynamics from underlying assumptions about the economy and investor behavior, grounding the mathematical specification in economic theory rather than pure statistical convenience.

Key Limitations

Despite their foundational importance, these single-factor short-rate models have significant limitations that led to the development of more sophisticated approaches.

The most fundamental limitation is the single-factor structure. With only one source of randomness driving all rates, yields across the entire term structure are perfectly correlated. In reality, central bank actions might push short rates down while inflation concerns push long rates up, creating correlation structures that single-factor models cannot capture. Multi-factor extensions address this but sacrifice some analytical tractability.

Another critical issue is the inability to match the current yield curve exactly. Both models produce yield curves as functions of their parameters, but for any fixed parameter set, the model-implied curve will generally differ from the observed market curve. This means you cannot use these models to price derivatives consistently with observed bond prices unless you either accept some pricing errors or use time-varying parameters.

The Vasicek model's potential for negative rates was historically seen as a flaw, though the post-2008 era of negative policy rates made this feature more acceptable for certain applications. However, the CIR model's requirement that the Feller condition hold can make calibration to historical data challenging, as estimated parameters may violate this constraint.

Both models also assume constant parameters over time. Real interest rate dynamics exhibit regime changes: the high-volatility period of the early 1980s differs dramatically from the low-rate, low-volatility environment of the 2010s. Constant-parameter models cannot capture these shifts without re-estimation.

The Path Forward

These limitations motivated the development of more advanced interest rate models. The Heath-Jarrow-Morton (HJM) framework and LIBOR Market Model (LMM), which we'll explore in the next chapter, address the calibration issue by modeling the entire forward curve rather than just the short rate. Multi-factor models incorporate additional sources of randomness to capture realistic correlation structures. The calibration techniques covered in Chapter 21 provide methods for fitting model parameters to market data while respecting no-arbitrage constraints.

Despite their limitations, Vasicek and CIR remain important for building intuition about rate dynamics, for educational purposes, and as building blocks within more complex models. Understanding these foundational models is essential before tackling the more advanced frameworks used in modern practice.

Summary

This chapter introduced short-rate models, which describe the evolution of the instantaneous interest rate as a stochastic process with mean reversion. The key concepts covered include:

The short rate r(t)r(t) is the instantaneous interest rate driving all bond prices through the relationship P(t,T)=EQ[exp(tTr(s)ds)]P(t,T) = \mathbb{E}^Q[\exp(-\int_t^T r(s)\,ds)]. Specifying a stochastic process for r(t)r(t) determines the entire term structure dynamics.

The Vasicek model dr=κ(θr)dt+σdWdr = \kappa(\theta - r)dt + \sigma dW was the first mean-reverting short-rate model. It produces normally distributed rates, allowing for negative values, and has simple closed-form bond pricing formulas with exponential-affine structure.

The CIR model dr=κ(θr)dt+σrdWdr = \kappa(\theta - r)dt + \sigma\sqrt{r}dW modifies Vasicek by making volatility proportional to r\sqrt{r}. This keeps rates non-negative when the Feller condition 2κθσ22\kappa\theta \geq \sigma^2 holds, producing chi-squared distributed rates with state-dependent volatility.

Mean reversion is the central feature of both models. The parameter κ\kappa controls how fast rates return to the long-run mean θ\theta after a shock. Higher κ\kappa means faster reversion and lower long-run variance.

Yield curves under these models converge to a common long-term rate regardless of the initial short rate. The shape depends on whether r(0)r(0) is above or below θ\theta, producing inverted, flat, or normal curves.

Limitations include the single-factor structure (perfect correlation across maturities), inability to match observed yield curves exactly, and constant parameters that cannot capture regime changes.

These models laid the foundation for all modern interest rate modeling. The next chapter examines advanced frameworks like HJM and the LIBOR Market Model that address some limitations while preserving the arbitrage-free pricing foundation established here.

Quiz

Ready to test your understanding? Take this quick quiz to reinforce what you've learned about short-rate interest rate models.

Loading component...

Reference

BIBTEXAcademic
@misc{shortratemodelsvasicekcirforinterestratedynamics, author = {Michael Brenndoerfer}, title = {Short-Rate Models: Vasicek & CIR for Interest Rate Dynamics}, year = {2025}, url = {https://mbrenndoerfer.com/writing/short-rate-models-vasicek-cir-interest-rate-dynamics}, organization = {mbrenndoerfer.com}, note = {Accessed: 2025-01-01} }
APAAcademic
Michael Brenndoerfer (2025). Short-Rate Models: Vasicek & CIR for Interest Rate Dynamics. Retrieved from https://mbrenndoerfer.com/writing/short-rate-models-vasicek-cir-interest-rate-dynamics
MLAAcademic
Michael Brenndoerfer. "Short-Rate Models: Vasicek & CIR for Interest Rate Dynamics." 2026. Web. today. <https://mbrenndoerfer.com/writing/short-rate-models-vasicek-cir-interest-rate-dynamics>.
CHICAGOAcademic
Michael Brenndoerfer. "Short-Rate Models: Vasicek & CIR for Interest Rate Dynamics." Accessed today. https://mbrenndoerfer.com/writing/short-rate-models-vasicek-cir-interest-rate-dynamics.
HARVARDAcademic
Michael Brenndoerfer (2025) 'Short-Rate Models: Vasicek & CIR for Interest Rate Dynamics'. Available at: https://mbrenndoerfer.com/writing/short-rate-models-vasicek-cir-interest-rate-dynamics (Accessed: today).
SimpleBasic
Michael Brenndoerfer (2025). Short-Rate Models: Vasicek & CIR for Interest Rate Dynamics. https://mbrenndoerfer.com/writing/short-rate-models-vasicek-cir-interest-rate-dynamics