Black-Scholes PDE: Derivation & Delta Hedging Explained

Michael Brenndoerfer | November 26, 2025 | 46 min read

Derive the Black-Scholes-Merton PDE using Itô's lemma, delta hedging, and no-arbitrage principles. Complete step-by-step mathematical derivation.

Derivation of the Black-Scholes-Merton PDE

The Black-Scholes-Merton partial differential equation stands as one of the most important results in mathematical finance. Published in 1973 by Fischer Black, Myron Scholes, and Robert Merton, this equation provides a framework for pricing derivative securities by showing that the price of any derivative must satisfy a specific PDE under certain assumptions. What makes this result remarkable is that the equation contains no reference to investor preferences or expected returns: the derivative price depends only on observable quantities and the assumption of no arbitrage.

In this chapter, you will learn how to derive this PDE from first principles using the tools we've developed in previous chapters. The derivation follows a beautiful logical chain: we assume stock prices follow geometric Brownian motion, apply Itô's lemma to find the dynamics of a derivative's price, construct a portfolio that eliminates all randomness through dynamic hedging, and invoke the no-arbitrage principle to obtain the final equation. Understanding this derivation deeply prepares you for the practical applications covered in subsequent chapters, where we'll solve the PDE to obtain explicit pricing formulas.

The Model Assumptions

Before deriving the PDE, we must clearly state the assumptions underlying the Black-Scholes-Merton framework. These assumptions define an idealized market that, while not perfectly realistic, captures enough structure to produce useful results. Each assumption plays a specific role in the derivation, and understanding why each is needed helps you appreciate both the power and the limitations of the resulting framework.

Black-Scholes-Merton Assumptions

The complete set of assumptions required for the BSM framework:

  1. The stock price follows geometric Brownian motion with constant drift and volatility
  2. Trading occurs continuously with no transaction costs
  3. Short selling is permitted without restrictions
  4. There are no arbitrage opportunities
  5. Securities are infinitely divisible
  6. A risk-free asset exists with constant interest rate $r$
  7. The stock pays no dividends during the option's life
  8. Markets are frictionless with no taxes or bid-ask spreads

The most critical assumption is that the stock price $S_t$ follows geometric Brownian motion. This mathematical model captures the essential features we observe in real stock prices: they fluctuate randomly, they exhibit larger absolute movements when prices are higher, and they cannot become negative. The geometric Brownian motion specification takes the form:

$$dS_t = \mu S_t \, dt + \sigma S_t \, dW_t$$

where:

  • $S_t$: stock price at time $t$
  • $\mu$: expected return (drift) of the stock
  • $\sigma$: volatility of stock returns
  • $dW_t$: increment of a standard Wiener process (random shock)
  • $dt$: small time increment

To understand this equation intuitively, think of the stock price as experiencing two simultaneous influences over each infinitesimally small time interval. The first influence, captured by $\mu S_t \, dt$, represents the deterministic drift: on average, the stock price tends to grow at rate $\mu$. This drift is proportional to the current price, reflecting the multiplicative nature of returns. The second influence, captured by $\sigma S_t \, dW_t$, represents the random shocks that buffet the price. The Wiener increment $dW_t$ is the mathematical formalization of "white noise," introducing unpredictable fluctuations. The volatility parameter $\sigma$ scales these fluctuations, and multiplying by $S_t$ ensures that absolute price movements scale with the price level.

As we discussed in our chapter on Brownian Motion and Random Walk Models, this model ensures stock prices remain positive and that log returns are normally distributed. The positivity follows from the multiplicative structure: since every price change is proportional to $S_t$, the process never crosses zero.
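The positivity and level-dependent movements described above can be seen in a small simulation. The sketch below is illustrative only and is not the code behind the article's figure: the helper name `simulate_gbm_path` is hypothetical, it uses only the Python standard library, and it relies on the standard closed-form solution of the GBM equation over one step, $S_{t+\Delta t} = S_t \exp\left((\mu - \tfrac{1}{2}\sigma^2)\Delta t + \sigma\sqrt{\Delta t}\, Z\right)$ with $Z$ standard normal. The parameter values follow the figure caption ($S_0 = 100$, $\mu = 8\%$, $\sigma = 20\%$, one year); the 252 daily steps and the seed are assumptions.

```python
import math
import random


def simulate_gbm_path(s0, mu, sigma, horizon, n_steps, rng):
    """Simulate one GBM path using the exact log-price update (hypothetical helper)."""
    dt = horizon / n_steps
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # standard normal shock
        # Multiplying by an exponential keeps the price strictly positive.
        growth = math.exp((mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
        path.append(path[-1] * growth)
    return path


rng = random.Random(42)  # fixed seed so the run is reproducible
paths = [simulate_gbm_path(100.0, 0.08, 0.20, 1.0, 252, rng) for _ in range(5)]

for i, path in enumerate(paths):
    print(f"path {i}: final price = {path[-1]:.2f}, minimum = {min(path):.2f}")
```

Every simulated price is a product of positive factors, so the minimum of each path stays above zero regardless of how unfavorable the shocks are.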

Figure: Sample paths of geometric Brownian motion with initial price $S_0 = 100$, drift $\mu = 8\%$, and volatility $\sigma = 20\%$. Each path represents a possible realization of stock price evolution over one year. The paths exhibit the characteristic features of GBM: random fluctuations with larger absolute movements at higher price levels, while remaining strictly positive.

Let $V(S, t)$ denote the price of a derivative security that depends on the stock price $S$ and time $t$. Our goal is to determine what equation $V$ must satisfy. We make no assumptions about the specific form of $V$: it could represent a call option, a put option, or any other derivative whose payoff depends on the stock price at maturity. This generality is one of the beautiful features of the BSM framework: rather than deriving separate formulas for each type of derivative, we derive a single equation that all derivatives must satisfy.

Dynamics of the Derivative Price

To find how the derivative price changes over time, we must apply Itô's lemma. This is where the machinery of stochastic calculus becomes essential. In ordinary calculus, if you have a function $f(x)$ and $x$ changes by a small amount $dx$, then $f$ changes by approximately $f'(x) \cdot dx$. But in stochastic calculus, the presence of random fluctuations introduces additional terms that have no analog in the deterministic world.

Recall from our chapter on Itô's Lemma and Stochastic Calculus that for any sufficiently smooth function $V(S, t)$ where $S$ follows a diffusion process, the differential $dV$ has a specific form that accounts for the second-order effects of stochastic calculus. The key insight is that terms involving $(dW)^2$, which would vanish in ordinary calculus because they're second-order small, actually contribute to first order in stochastic calculus because $(dW)^2 = dt$.

Since $S_t$ follows geometric Brownian motion, Itô's lemma gives:

$$dV = \frac{\partial V}{\partial t} dt + \frac{\partial V}{\partial S} dS + \frac{1}{2} \frac{\partial^2 V}{\partial S^2} (dS)^2$$

where:

  • $V(S, t)$: price of the derivative security
  • $\frac{\partial V}{\partial t}$: sensitivity of option value to time (Theta)
  • $\frac{\partial V}{\partial S}$: sensitivity of option value to stock price (Delta)
  • $\frac{\partial^2 V}{\partial S^2}$: curvature of option value with respect to stock price (Gamma)
  • $dt$: small time increment
  • $dS$: change in stock price

Notice that this formula looks almost like the ordinary chain rule, with one crucial addition: the final term involving the second derivative. This term arises specifically because of the stochastic nature of the price process. In deterministic calculus, this term would be negligibly small, a second-order effect that vanishes in the limit. But in stochastic calculus, the roughness of Brownian motion paths means this term survives and contributes meaningfully to the dynamics.

We need to evaluate $(dS)^2$ by substituting the expression for $dS$ and applying Itô multiplication rules. This calculation reveals why stochastic calculus differs fundamentally from ordinary calculus:

$$
\begin{aligned}
(dS)^2 &= (\mu S \, dt + \sigma S \, dW)^2 \\
&= \mu^2 S^2 (dt)^2 + 2\mu\sigma S^2 \, dt \, dW + \sigma^2 S^2 (dW)^2 && \text{(expand square)} \\
&= \sigma^2 S^2 \, dt && \text{(apply } (dt)^2=0, \; dt \cdot dW=0, \; (dW)^2=dt \text{)}
\end{aligned}
$$

where:

  • $dS$: change in stock price
  • $\mu$: drift coefficient
  • $\sigma$: volatility coefficient
  • $S$: stock price
  • $dt$: small time increment
  • $dW$: Wiener process increment

The Itô multiplication rules deserve careful attention because they encode the essence of what makes stochastic calculus different. The rule $(dt)^2 = 0$ is intuitive: when we square an infinitesimally small quantity, we get something even more negligibly small. The rule $dt \cdot dW = 0$ follows from similar reasoning. But the rule $(dW)^2 = dt$ is remarkable: squaring a random increment of order $\sqrt{dt}$ produces something of order $dt$, which is not negligible. This is the mathematical expression of the fact that Brownian motion paths are extremely rough, changing direction infinitely often in any finite time interval.
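The rule $(dW)^2 = dt$ can be checked numerically: if Brownian increments over $[0, T]$ are squared and summed, the total should settle near $T$ as the time grid is refined. The following is a minimal sketch under stated assumptions (the function name `realized_quadratic_variation`, the seed, and the grid sizes are illustrative choices, and only the standard library is used).

```python
import math
import random


def realized_quadratic_variation(horizon, n_steps, rng):
    """Sum of squared Brownian increments over [0, horizon] on a uniform grid."""
    dt = horizon / n_steps
    total = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # increment dW ~ N(0, dt)
        total += dw * dw
    return total


rng = random.Random(0)
for n_steps in (10, 1_000, 100_000):
    qv = realized_quadratic_variation(1.0, n_steps, rng)
    print(f"n_steps = {n_steps:>7}: sum of (dW)^2 = {qv:.4f}  (horizon = 1.0)")
```

With few steps the sum is noisy, but its fluctuation around the horizon shrinks as the grid is refined, which is the sense in which $(dW)^2$ behaves like the deterministic quantity $dt$.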

Substituting $(dS)^2$ and $dS$ back into the expression for $dV$ and grouping terms, we arrive at the complete dynamics of the derivative price:

$$
\begin{aligned}
dV &= \frac{\partial V}{\partial t} dt + \frac{\partial V}{\partial S} (\mu S \, dt + \sigma S \, dW) + \frac{1}{2} \frac{\partial^2 V}{\partial S^2} (\sigma^2 S^2 \, dt) \\
&= \left( \frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S \frac{\partial V}{\partial S} dW
\end{aligned}
$$

where:

  • $dV$: change in the derivative price
  • $V(S, t)$: price of the derivative
  • $S$: stock price
  • $t$: time
  • $\mu$: expected return of the stock
  • $\sigma$: volatility of the stock
  • $dW$: standard Wiener process increment

This equation tells us that the derivative price has both a deterministic component (the $dt$ term) and a random component (the $dW$ term). The deterministic component captures the systematic evolution of the derivative price due to the passage of time, the drift in the stock price, and the convexity effect from the second derivative. The random component, proportional to $dW$, represents the uncertainty in the derivative price arising from the uncertainty in the stock price.

The presence of $dW$ means that holding the derivative alone exposes you to the same source of randomness as holding the stock. Both the stock and the derivative are driven by the same Brownian motion, though with different sensitivities. This observation is the crucial starting point for the hedging argument that follows.
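To make the two components concrete, the sketch below evaluates the $dt$ and $dW$ coefficients of $dV$ for a simple test function, $V(S, t) = S^2$, using central finite differences. The names `ito_coefficients` and `squared_price` and the parameter values are illustrative assumptions. For this function the formula above gives a drift of $(2\mu + \sigma^2) S^2$ and a diffusion coefficient of $2\sigma S^2$, which the code compares against.

```python
def ito_coefficients(v_func, s, t, mu, sigma, h_s=0.01, h_t=1e-4):
    """Return (drift, diffusion) such that dV = drift * dt + diffusion * dW."""
    v = v_func(s, t)
    # Central differences for the three partial derivatives in Ito's lemma.
    dv_dt = (v_func(s, t + h_t) - v_func(s, t - h_t)) / (2 * h_t)
    dv_ds = (v_func(s + h_s, t) - v_func(s - h_s, t)) / (2 * h_s)
    d2v_ds2 = (v_func(s + h_s, t) - 2 * v + v_func(s - h_s, t)) / h_s**2
    drift = dv_dt + mu * s * dv_ds + 0.5 * sigma**2 * s**2 * d2v_ds2
    diffusion = sigma * s * dv_ds
    return drift, diffusion


def squared_price(s, t):
    """Test function V(S, t) = S^2 (no explicit time dependence)."""
    return s**2


S, MU, SIGMA = 100.0, 0.08, 0.20
drift, diffusion = ito_coefficients(squared_price, S, 0.0, MU, SIGMA)

print(f"drift:     numerical = {drift:.4f}, formula = {(2 * MU + SIGMA**2) * S**2:.4f}")
print(f"diffusion: numerical = {diffusion:.4f}, formula = {2 * SIGMA * S**2:.4f}")
```

The $\sigma^2 S^2$ part of the drift comes entirely from the second-derivative term, so dropping that term (as the ordinary chain rule would) gives the wrong drift for any function with curvature.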

Figure: Call option value $V(S, t)$ as a function of stock price and time to maturity. The surface shows how option value increases with stock price (especially above the strike $K = 100$) and with time to maturity (more time means more opportunity for favorable price movements). At maturity (time to maturity equal to zero), the value collapses to the payoff function $\max(S - K, 0)$.

Constructing the Hedged Portfolio

The key insight of Black and Scholes was that you can construct a portfolio combining the derivative and the stock that eliminates all randomness. This is the essence of delta hedging: by holding the right proportion of stock against the derivative, you create a riskless portfolio. The possibility of such elimination arises because both the stock and the derivative are driven by the same source of randomness, namely the Wiener process $W_t$. If we can find the right mix, the random fluctuations in one position will exactly offset the random fluctuations in the other.

Consider a portfolio $\Pi$ consisting of:

  • Long one derivative (value $V$)
  • Short $\Delta$ shares of stock (value $-\Delta S$)

The word "short" here means we have sold shares we don't own, creating a negative position in the stock. This is a standard operation in financial markets, and one of our model assumptions permits it without restriction. The portfolio value is:

$$\Pi = V - \Delta S$$

where:

  • $\Pi$: value of the hedged portfolio
  • $V$: value of the derivative
  • $\Delta$: number of shares of stock held short
  • $S$: current stock price

The change in portfolio value over a small time interval captures how the portfolio responds to all the forces acting on its components:

$$d\Pi = dV - \Delta \, dS$$

where:

  • $d\Pi$: change in portfolio value
  • $dV$: change in derivative value
  • $\Delta$: number of shares held short
  • $dS$: change in stock price

This expression reflects the fact that when we're short $\Delta$ shares, we lose money when the stock price rises and gain money when it falls. The negative sign captures this inverse relationship.

Substituting our expressions for $dV$ and $dS$, we obtain a detailed view of all the terms affecting the portfolio:

$$d\Pi = \left( \frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S \frac{\partial V}{\partial S} dW - \Delta(\mu S \, dt + \sigma S \, dW)$$

where:

  • $d\Pi$: change in portfolio value
  • $V$: value of the derivative
  • $S$: stock price
  • $\Delta$: number of shares held short
  • $\mu$: expected return of the stock
  • $\sigma$: volatility of the stock
  • $dt$: small time increment
  • $dW$: Wiener process increment

Now we rearrange this expression by collecting the $dt$ and $dW$ terms separately. This organizational step reveals the structure of the portfolio's dynamics:

$$d\Pi = \left( \frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - \Delta \mu S \right) dt + \left( \sigma S \frac{\partial V}{\partial S} - \Delta \sigma S \right) dW$$

where:

  • $d\Pi$: change in portfolio value
  • $V$: value of the derivative
  • $S$: stock price
  • $\Delta$: number of shares held short
  • $\mu$: expected return of the stock
  • $\sigma$: volatility of the stock
  • $dt$: small time increment
  • $dW$: Wiener process increment

The coefficient of $dW$ represents the portfolio's exposure to randomness. If this coefficient is nonzero, the portfolio value fluctuates randomly. If we can make this coefficient vanish, the portfolio becomes deterministic over the next instant.

To eliminate the random component, we choose $\Delta$ such that the coefficient of $dW$ equals zero:

$$
\begin{aligned}
\sigma S \frac{\partial V}{\partial S} - \Delta \sigma S &= 0 \\
\Delta \sigma S &= \sigma S \frac{\partial V}{\partial S} \\
\Delta &= \frac{\partial V}{\partial S}
\end{aligned}
$$

where:

  • $\Delta$: number of shares to short
  • $\frac{\partial V}{\partial S}$: derivative's delta

The solution is elegantly simple: we should hold exactly $\frac{\partial V}{\partial S}$ shares short for each unit of derivative we hold long. This quantity, the partial derivative of the option price with respect to the stock price, measures how sensitive the option is to stock price movements. It tells us exactly how many shares we need to offset that sensitivity.

Delta Hedge Ratio

The quantity $\Delta = \frac{\partial V}{\partial S}$ is called the delta of the derivative. It represents the sensitivity of the derivative price to changes in the underlying stock price, and it equals the number of shares needed to hedge one unit of the derivative. For a call option, delta typically ranges between 0 and 1, indicating that the option price moves less than dollar-for-dollar with the stock. For a put option, delta ranges between -1 and 0.
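The ranges quoted for call and put deltas can be illustrated numerically. The sketch below assumes the closed-form Black-Scholes call price, which this chapter has not yet derived (it is the subject of a later chapter), and obtains the put price from put-call parity. Delta is then estimated as a central finite difference in $S$. The function names and the parameter values ($K = 100$, $r = 5\%$, $\sigma = 20\%$, one year to maturity) are illustrative assumptions.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, r, sigma, tau):
    """Closed-form European call price (assumed here, derived in a later chapter)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def bs_put(s, k, r, sigma, tau):
    """European put price from put-call parity: P = C - S + K * exp(-r * tau)."""
    return bs_call(s, k, r, sigma, tau) - s + k * math.exp(-r * tau)


def numerical_delta(price_func, s, k, r, sigma, tau, h=0.01):
    """Central-difference estimate of dV/dS."""
    up = price_func(s + h, k, r, sigma, tau)
    down = price_func(s - h, k, r, sigma, tau)
    return (up - down) / (2 * h)


K, R, SIGMA, TAU = 100.0, 0.05, 0.20, 1.0
for s in (60.0, 80.0, 100.0, 120.0, 140.0):
    call_delta = numerical_delta(bs_call, s, K, R, SIGMA, TAU)
    put_delta = numerical_delta(bs_put, s, K, R, SIGMA, TAU)
    print(f"S = {s:6.1f}: call delta = {call_delta:.4f}, put delta = {put_delta:.4f}")
```

Because the put is built from parity, its delta is always the call delta minus one, which is why the two ranges are shifted copies of each other.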

With this choice of $\Delta$, something remarkable happens. Not only does the random term vanish, but the terms involving $\mu$ also cancel out. Let us trace through this simplification carefully:

$$
\begin{aligned}
d\Pi &= \left( \frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - \frac{\partial V}{\partial S} \mu S \right) dt \\
&= \left( \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt
\end{aligned}
$$

where:

  • $d\Pi$: change in value of the hedged portfolio
  • $V$: price of the derivative
  • $S$: stock price
  • $\sigma$: volatility
  • $t$: time
  • $dt$: small time increment

This is a remarkable result: the portfolio's change has no random component. The portfolio is instantaneously riskless. Furthermore, the expected return $\mu$ of the stock has completely vanished from the expression. The portfolio change depends only on the passage of time, the volatility, and the Greeks of the derivative. This disappearance of $\mu$ is profound: it means the derivative price will not depend on how optimistic or pessimistic investors are about the stock's future direction.

Figure: Distribution of P&L for delta-hedged vs unhedged option positions over 1,000 simulations. The hedged portfolio (blue) has dramatically reduced variance compared to the unhedged position (red), demonstrating how delta hedging eliminates most of the randomness. The remaining variance in the hedged portfolio comes from discrete rebalancing and gamma effects.
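A comparison of this kind can be reproduced with a short simulation. The sketch below is not the code behind the figure; it is a minimal illustration under stated assumptions: the closed-form Black-Scholes call price and delta (derived in a later chapter) are taken as given, the hedge is rebalanced weekly, and the parameters ($K = 100$, $r = 5\%$, $\mu = 8\%$, $\sigma = 20\%$), the seed, and the function names are hypothetical choices. The position is long one call and short delta shares, with all cash flows accruing interest at the risk-free rate.

```python
import math
import random
import statistics


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price_and_delta(s, k, r, sigma, tau):
    """Closed-form call price and delta (assumed here, derived in a later chapter)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    price = s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)
    return price, norm_cdf(d1)


def simulate_pnl(n_paths=1000, n_steps=52, s0=100.0, k=100.0, r=0.05,
                 mu=0.08, sigma=0.20, horizon=1.0, seed=7):
    """Terminal P&L of a long call, with and without a short delta hedge."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    v0, delta0 = call_price_and_delta(s0, k, r, sigma, horizon)
    hedged, unhedged = [], []
    for _ in range(n_paths):
        s, delta = s0, delta0
        cash = -v0 + delta * s  # pay for the option, receive short-sale proceeds
        for step in range(1, n_steps + 1):
            z = rng.gauss(0.0, 1.0)
            s *= math.exp((mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
            cash *= math.exp(r * dt)  # cash earns (or pays) the risk-free rate
            if step < n_steps:
                _, new_delta = call_price_and_delta(s, k, r, sigma, horizon - step * dt)
                cash += (new_delta - delta) * s  # shorting more shares brings in cash
                delta = new_delta
        payoff = max(s - k, 0.0)
        hedged.append(payoff - delta * s + cash)  # buy back the short shares
        unhedged.append(payoff - v0 * math.exp(r * horizon))
    return hedged, unhedged


hedged, unhedged = simulate_pnl()
print(f"hedged P&L:   mean = {statistics.mean(hedged):7.3f}, std = {statistics.pstdev(hedged):7.3f}")
print(f"unhedged P&L: mean = {statistics.mean(unhedged):7.3f}, std = {statistics.pstdev(unhedged):7.3f}")
```

Increasing `n_steps` should shrink the hedged standard deviation further, consistent with the idea that the residual risk comes from rebalancing at discrete times rather than continuously.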

Applying the No-Arbitrage Principle

We now have a portfolio with no risk over the next instant. Building on the no-arbitrage principle from our previous chapter on risk-neutral valuation, a riskless portfolio must earn the risk-free rate. The logic is ironclad: if the portfolio earned more than the risk-free rate, you could borrow money at the risk-free rate, invest in the portfolio, and pocket the difference as guaranteed profit. If it earned less, you could short the portfolio, invest the proceeds in the risk-free asset, and again earn guaranteed profit. Either scenario represents arbitrage, a free lunch that cannot persist in a well-functioning market.

The no-arbitrage condition states:

$$d\Pi = r \Pi \, dt$$

where:

  • $r$: risk-free interest rate
  • $\Pi$: value of the riskless portfolio
  • $dt$: time increment

This equation simply says that the return on a riskless investment must equal the risk-free rate. The risk-free rate $r$ represents the return you could earn on a perfectly safe investment, such as a government bond or a bank deposit.

Now we substitute our expressions for $d\Pi$ and $\Pi$ and perform algebraic manipulations to extract the PDE:

$$
\begin{aligned}
\left( \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt &= r \left( V - \frac{\partial V}{\partial S} S \right) dt \\
\frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} &= rV - rS\frac{\partial V}{\partial S} && \text{(divide by } dt \text{)} \\
\frac{\partial V}{\partial t} + rS\frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - rV &= 0 && \text{(group terms)}
\end{aligned}
$$

where:

  • $V$: derivative price
  • $t$: time
  • $S$: stock price
  • $r$: risk-free rate
  • $\sigma$: volatility

This is the Black-Scholes-Merton partial differential equation. We have arrived at a fundamental constraint that any arbitrage-free derivative price must satisfy. The equation emerged from combining three key ingredients: the assumption that stock prices follow geometric Brownian motion, the application of Itô's lemma to characterize derivative dynamics, and the no-arbitrage condition that riskless portfolios must earn the risk-free rate.
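The intermediate step of the derivation, that the hedged portfolio's drift $\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}$ must equal $r\left(V - S\frac{\partial V}{\partial S}\right)$, can be checked numerically for a concrete derivative. The sketch below assumes the closed-form Black-Scholes call price from a later chapter and estimates the partial derivatives by central differences. The function names and parameter values are illustrative assumptions.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, t, k, r, sigma, maturity):
    """Closed-form European call price at time t (assumed, derived in a later chapter)."""
    tau = maturity - t
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def hedged_portfolio_check(s, t, k, r, sigma, maturity, h_s=0.01, h_t=1e-4):
    """Return (portfolio drift, r * portfolio value) for Pi = V - Delta * S."""
    v = bs_call(s, t, k, r, sigma, maturity)
    theta = (bs_call(s, t + h_t, k, r, sigma, maturity)
             - bs_call(s, t - h_t, k, r, sigma, maturity)) / (2 * h_t)
    delta = (bs_call(s + h_s, t, k, r, sigma, maturity)
             - bs_call(s - h_s, t, k, r, sigma, maturity)) / (2 * h_s)
    gamma = (bs_call(s + h_s, t, k, r, sigma, maturity) - 2 * v
             + bs_call(s - h_s, t, k, r, sigma, maturity)) / h_s**2
    portfolio_drift = theta + 0.5 * sigma**2 * s**2 * gamma  # coefficient of dt in dPi
    required_return = r * (v - delta * s)                    # r * Pi
    return portfolio_drift, required_return


drift, required = hedged_portfolio_check(100.0, 0.0, 100.0, 0.05, 0.20, 1.0)
print(f"drift of hedged portfolio: {drift:.6f}")
print(f"r * portfolio value:       {required:.6f}")
```

Both numbers are negative here because the hedged portfolio has negative value: the short stock position is worth more than the long call, so the portfolio behaves like a loan whose balance grows at the risk-free rate.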

The Black-Scholes-Merton PDE

Black-Scholes-Merton PDE

Any derivative security $V(S, t)$ whose value depends on a stock following geometric Brownian motion must satisfy:

$$\frac{\partial V}{\partial t} + rS\frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} = rV$$

where:

  • $V(S, t)$: price of the derivative security
  • $S$: current stock price
  • $r$: risk-free interest rate
  • $\sigma$: volatility of the underlying stock

This equation has several notable features worth examining carefully, as they reveal deep insights about derivative pricing.

The drift $\mu$ does not appear. The expected return on the stock has completely dropped out of the equation. This means the derivative price is the same whether the stock is expected to grow at 5% per year or 50% per year. At first glance, this seems counterintuitive: shouldn't a call option on a rapidly growing stock be worth more? The resolution lies in understanding that in a no-arbitrage framework, the expected return is already reflected in the current stock price. If investors expected the stock to grow rapidly, they would bid up its price today until the expected return was consistent with its risk. What matters for derivative pricing is volatility, not direction. The randomness, not the trend, determines the option's value.

The equation is a backward parabolic PDE. The equation is similar to the heat equation from physics, but with time running backwards. In physics, the heat equation describes how temperature diffuses forward in time from an initial condition. In finance, we typically know the derivative's value at maturity (the boundary condition) and need to work backwards to find its value today. This backward-in-time structure reflects the forward-looking nature of financial valuation: we discount future payoffs back to the present.

The equation is linear. If $V_1$ and $V_2$ are solutions, then $aV_1 + bV_2$ is also a solution for any constants $a$ and $b$. This linearity reflects the additivity of derivative portfolios: a portfolio of derivatives can be priced by summing the prices of its components. You can verify this linearity by substituting $aV_1 + bV_2$ into the PDE and using the linearity of partial differentiation.
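Linearity can also be checked numerically. The sketch below uses two simple solutions, the stock itself ($V_1 = S$) and a zero-coupon bond paying $K$ at maturity ($V_2 = Ke^{-r(T-t)}$), and confirms that an arbitrary combination of them leaves a PDE residual near zero. The helper names (`pde_residual`, `linear_combination`), the weights, and the parameter values are illustrative assumptions.

```python
import math

R, SIGMA, K, T = 0.05, 0.20, 100.0, 1.0  # illustrative parameter values


def pde_residual(v_func, s, t, h_s=0.01, h_t=1e-4):
    """LHS minus RHS of the BSM PDE, with derivatives from central differences."""
    v = v_func(s, t)
    dv_dt = (v_func(s, t + h_t) - v_func(s, t - h_t)) / (2 * h_t)
    dv_ds = (v_func(s + h_s, t) - v_func(s - h_s, t)) / (2 * h_s)
    d2v_ds2 = (v_func(s + h_s, t) - 2 * v + v_func(s - h_s, t)) / h_s**2
    return dv_dt + R * s * dv_ds + 0.5 * SIGMA**2 * s**2 * d2v_ds2 - R * v


def stock(s, t):
    """V1 = S: the stock itself."""
    return s


def bond(s, t):
    """V2 = K * exp(-r (T - t)): a zero-coupon bond paying K at maturity."""
    return K * math.exp(-R * (T - t))


def linear_combination(a, v1, b, v2):
    """Build the function a * V1 + b * V2."""
    return lambda s, t: a * v1(s, t) + b * v2(s, t)


forward = linear_combination(1.0, stock, -1.0, bond)  # S - K exp(-r (T - t))
mixed = linear_combination(2.0, stock, -3.0, bond)    # arbitrary weights

for name, func in [("stock", stock), ("bond", bond), ("forward", forward), ("mixed", mixed)]:
    print(f"{name:8s} residual = {pde_residual(func, 120.0, 0.25):.2e}")
```

The combination with weights $1$ and $-1$ is the value of a forward contract with delivery price $K$, so this check also shows how a familiar contract is assembled from two elementary solutions.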

All derivatives on the same underlying satisfy the same PDE. A call option, a put option, and an exotic option all satisfy the same equation. What distinguishes them is the boundary condition at maturity. The PDE captures the universal dynamics of derivative pricing, while the boundary condition encodes the specific payoff structure. This separation of dynamics from payoffs is what makes the BSM framework so powerful and general.

Boundary Conditions

The BSM PDE alone does not determine a unique solution. The equation describes how the derivative price must evolve, but it does not tell us where the price starts or ends. To price a specific derivative, we need boundary conditions that specify what happens at maturity and at extreme values of the stock price. These boundary conditions encode the contractual terms of the derivative, transforming an abstract PDE into a concrete pricing problem.

For a European call option with strike $K$ and maturity $T$:

$$
\begin{aligned}
V(S, T) &= \max(S - K, 0) && \text{(Terminal condition)} \\
V(0, t) &= 0 && \text{(Stock is worthless)} \\
V(S, t) &\approx S - Ke^{-r(T-t)} \text{ as } S \to \infty && \text{(Behaves like forward)}
\end{aligned}
$$

where:

  • $V(S, t)$: value of the call option
  • $S$: current stock price
  • $K$: strike price
  • $T$: maturity time
  • $t$: current time
  • $r$: risk-free interest rate

The terminal condition states that at maturity, the call option is worth $S - K$ if the stock price exceeds the strike, and zero otherwise. This is simply the contractual payoff of the option. The condition at $S = 0$ reflects that if the stock price falls to zero (and stays there, as geometric Brownian motion implies), the call option becomes worthless since there's no chance of the stock exceeding the strike. The condition for large $S$ captures the intuition that when the stock price is very high, the option is almost certain to be exercised, and its value approaches that of a forward contract with delivery price $K$.

For a European put option with strike $K$ and maturity $T$:

$$
\begin{aligned}
V(S, T) &= \max(K - S, 0) && \text{(Terminal condition)} \\
V(0, t) &= Ke^{-r(T-t)} && \text{(Worth discounted strike)} \\
V(S, t) &\to 0 \text{ as } S \to \infty && \text{(Put becomes worthless)}
\end{aligned}
$$

where:

  • $V(S, t)$: value of the put option
  • $S$: current stock price
  • $K$: strike price
  • $T$: maturity time
  • $t$: current time
  • $r$: risk-free interest rate

The terminal condition for the put is the mirror image of the call: the put pays off when the stock price is below the strike. When $S = 0$, the put is certain to pay $K$ at maturity, so its current value is the present value of that payment, which equals $Ke^{-r(T-t)}$. When $S$ is very large, the stock price is almost certain to remain above the strike, making the put worthless.

The combination of the PDE and these boundary conditions completely determines the option price. Together they form a well-posed mathematical problem: the PDE constrains how the solution must behave in the interior of the domain, while the boundary conditions pin down the solution at the edges. In the next chapter, we'll see how to solve this problem analytically to obtain the famous Black-Scholes formula.
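To see how the PDE and the boundary conditions together pin down a price, the following sketch steps the call's terminal payoff backward in time on a grid of stock prices, using an explicit finite-difference scheme. This is a swapped-in numerical technique for illustration, not the analytical method the text goes on to use. The grid sizes, the truncation of the domain at `s_max`, and the function name `explicit_fd_call` are assumptions, and the explicit scheme is only stable when the time step is small enough, which the code checks.

```python
import math


def explicit_fd_call(s0, k, r, sigma, maturity, s_max=300.0, n_s=150, n_t=2000):
    """Price a European call by stepping the BSM PDE backward from maturity."""
    ds = s_max / n_s
    dt = maturity / n_t
    # Stability of the explicit scheme requires a sufficiently small time step.
    if dt > 1.0 / (sigma**2 * n_s**2 + r):
        raise ValueError("time step too large for the explicit scheme")

    # Terminal condition: V(S, T) = max(S - K, 0) on grid points S_j = j * ds.
    values = [max(j * ds - k, 0.0) for j in range(n_s + 1)]

    for step in range(1, n_t + 1):
        tau = step * dt  # time remaining to maturity after this backward step
        new_values = [0.0] * (n_s + 1)
        for j in range(1, n_s):
            # Coefficients from central differences in S, with S = j * ds.
            a = 0.5 * dt * (sigma**2 * j**2 - r * j)
            b = 1.0 - dt * (sigma**2 * j**2 + r)
            c = 0.5 * dt * (sigma**2 * j**2 + r * j)
            new_values[j] = a * values[j - 1] + b * values[j] + c * values[j + 1]
        new_values[0] = 0.0                                   # V(0, t) = 0
        new_values[n_s] = s_max - k * math.exp(-r * tau)      # large-S behavior
        values = new_values

    # Linear interpolation between the two grid points around s0.
    j = min(int(s0 / ds), n_s - 1)
    w = (s0 - j * ds) / ds
    return (1.0 - w) * values[j] + w * values[j + 1]


price = explicit_fd_call(s0=100.0, k=100.0, r=0.05, sigma=0.20, maturity=1.0)
print(f"finite-difference call price: {price:.4f}")
```

For these inputs the closed-form Black-Scholes value is about 10.45, and the grid solution should land close to it. All three call boundary conditions are used: the terminal payoff initializes the grid, and the conditions at $S = 0$ and at large $S$ are imposed at the two edges on every step.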

Figure: European call option boundary conditions. The solid line shows the terminal payoff $\max(S-K, 0)$, while the dashed line shows the option value before maturity ($T=0.5$), smoothly approaching the payoff.

Figure: European put option boundary conditions. The solid line shows the terminal payoff $\max(K-S, 0)$, while the dashed line shows the option value before maturity ($T=0.5$), reflecting the time value.

Worked Example: Verifying the PDE

Let's verify that a proposed solution satisfies the Black-Scholes PDE. This exercise builds intuition for what it means to "satisfy" a partial differential equation and demonstrates how to check whether a candidate pricing formula is valid. Consider a simple derivative whose price is $V(S, t) = S$, representing the stock itself or, equivalently, a forward contract with zero delivery price.

We need to check whether $V = S$ satisfies:

$$\frac{\partial V}{\partial t} + rS\frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} = rV$$

where:

  • $V$: price of the derivative
  • $S$: stock price
  • $r$: risk-free interest rate
  • $\sigma$: volatility
  • $t$: time

Computing the partial derivatives is straightforward since $V = S$ is a simple function:

$$\frac{\partial V}{\partial t} = 0, \quad \frac{\partial V}{\partial S} = 1, \quad \frac{\partial^2 V}{\partial S^2} = 0$$

where:

  • $V$: derivative price, here $V = S$

The time derivative is zero because $V = S$ has no explicit dependence on time. The first derivative with respect to $S$ is one because $V$ increases linearly with $S$. The second derivative is zero because the relationship is perfectly linear with no curvature.

Substituting into the PDE:

$$0 + rS(1) + \frac{1}{2}\sigma^2 S^2(0) = rS$$

where:

  • $r$, $S$, $\sigma$: risk-free rate, stock price, and volatility

The left side equals $rS$, and the right side is $rV = rS$. The equation is satisfied, confirming that the stock price itself is a trivial solution to the BSM PDE. This makes economic sense: holding the stock is a valid "derivative strategy" that trivially satisfies no-arbitrage pricing.

Now consider a more interesting example: a derivative that pays $S^2$ at maturity $T$. Such a derivative might seem exotic, but it illustrates how the PDE handles nonlinear payoffs. We'll verify that a solution of the following form satisfies the PDE:

$$V(S, t) = S^2 e^{(r + \sigma^2)(T-t)}$$

where:

  • $V(S, t)$: price of the derivative
  • $S$: stock price
  • $r$: risk-free interest rate
  • $\sigma$: volatility
  • $T$: maturity time
  • $t$: current time

Notice that this formula has an interesting structure: it equals $S^2$ at maturity (when $T - t = 0$), satisfying the terminal condition, and it includes an exponential factor that accounts for the time value and volatility effects.

Computing the partial derivatives requires careful application of the chain rule:

$$
\begin{aligned}
\frac{\partial V}{\partial t} &= -S^2(r + \sigma^2)e^{(r + \sigma^2)(T-t)} \\
\frac{\partial V}{\partial S} &= 2Se^{(r + \sigma^2)(T-t)} \\
\frac{\partial^2 V}{\partial S^2} &= 2e^{(r + \sigma^2)(T-t)}
\end{aligned}
$$

where:

  • $V$: derivative price, $S^2 e^{(r + \sigma^2)(T-t)}$
  • $S, t, r, \sigma, T$: model parameters

The time derivative is negative because the exponential factor decreases as we approach maturity (when $T - t$ shrinks). The first spatial derivative reflects that the price increases with $S$, at a rate proportional to $S$ itself. The second spatial derivative captures the constant curvature of the quadratic relationship.

Substituting these derivatives into the left-hand side (LHS) of the PDE and simplifying:

$$
\begin{aligned}
\text{LHS} &= -S^2(r + \sigma^2)e^{(r+\sigma^2)(T-t)} + rS \cdot 2Se^{(r+\sigma^2)(T-t)} + \frac{1}{2}\sigma^2 S^2 \cdot 2e^{(r+\sigma^2)(T-t)} \\
&= S^2 e^{(r+\sigma^2)(T-t)} \left[ -(r + \sigma^2) + 2r + \sigma^2 \right] && \text{(factor out common term)} \\
&= S^2 e^{(r+\sigma^2)(T-t)} \left[ -r - \sigma^2 + 2r + \sigma^2 \right] && \text{(expand terms)} \\
&= S^2 e^{(r+\sigma^2)(T-t)} \cdot r && \text{(simplify)} \\
&= rV && \text{(definition of } V \text{)}
\end{aligned}
$$

where:

  • LHS: value of left-hand side of PDE
  • $V$: derivative price

The algebraic simplification is satisfying: the terms involving $\sigma^2$ cancel (one from the time derivative and one from the gamma term), and the terms involving $r$ combine to give exactly what we need.

The right side of the PDE is:

$$rV = r \cdot S^2 e^{(r+\sigma^2)(T-t)}$$

where:

  • $r$: risk-free rate
  • $V$: derivative price

Both sides are equal, confirming our solution. This example illustrates that derivatives with nonlinear payoffs require appropriate adjustments to their prices, captured by factors that depend on volatility as well as interest rates.
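The algebra above can be spot-checked by plugging numbers into the closed-form partial derivatives and comparing the two sides of the PDE directly. The sketch below is a minimal illustration; the function name `power_derivative_terms` and the parameter values are assumptions chosen for the example.

```python
import math


def power_derivative_terms(s, t, r, sigma, maturity):
    """Evaluate V = S^2 exp((r + sigma^2)(T - t)) and both sides of the BSM PDE."""
    growth = math.exp((r + sigma**2) * (maturity - t))
    v = s**2 * growth
    # Closed-form partial derivatives from the worked example.
    dv_dt = -(r + sigma**2) * v
    dv_ds = 2 * s * growth
    d2v_ds2 = 2 * growth
    lhs = dv_dt + r * s * dv_ds + 0.5 * sigma**2 * s**2 * d2v_ds2
    rhs = r * v
    return v, lhs, rhs


v, lhs, rhs = power_derivative_terms(s=100.0, t=0.0, r=0.05, sigma=0.20, maturity=1.0)
print(f"V   = {v:.4f}")
print(f"LHS = {lhs:.6f}")
print(f"RHS = {rhs:.6f}")

# At maturity the exponential factor equals one, so V reduces to the payoff S^2.
v_at_maturity, _, _ = power_derivative_terms(s=100.0, t=1.0, r=0.05, sigma=0.20, maturity=1.0)
print(f"V at maturity = {v_at_maturity:.4f}")
```

Because the derivatives here are exact rather than finite-difference estimates, any gap between the two sides is floating-point rounding only.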

Code Implementation

Let's implement a numerical verification that a proposed solution satisfies the Black-Scholes PDE. We'll compute the partial derivatives numerically and check that the PDE residual is close to zero.

In[9]:
Code
def bs_pde_residual(V_func, S, t, r, sigma, T, h_S=0.01, h_t=0.0001):
    """
    Compute the residual of the Black-Scholes PDE for a given pricing function.
    A valid solution should have residual close to zero.

    Parameters:
    - V_func: Function V(S, t) giving the derivative price
    - S: Stock price
    - t: Current time
    - r: Risk-free rate
    - sigma: Volatility
    - T: Maturity time
    - h_S, h_t: Step sizes for numerical differentiation
    """
    # Compute V at current point
    V = V_func(S, t, r, sigma, T)

    # Numerical partial derivatives using central differences
    dV_dt = (
        V_func(S, t + h_t, r, sigma, T) - V_func(S, t - h_t, r, sigma, T)
    ) / (2 * h_t)
    dV_dS = (
        V_func(S + h_S, t, r, sigma, T) - V_func(S - h_S, t, r, sigma, T)
    ) / (2 * h_S)
    d2V_dS2 = (
        V_func(S + h_S, t, r, sigma, T)
        - 2 * V
        + V_func(S - h_S, t, r, sigma, T)
    ) / (h_S**2)

    # Compute PDE residual: LHS - RHS should equal zero
    lhs = dV_dt + r * S * dV_dS + 0.5 * sigma**2 * S**2 * d2V_dS2
    rhs = r * V

    return lhs - rhs

Now let's define a test pricing function and verify it satisfies the PDE. We'll use the power payoff derivative from our worked example.

In[10]:
Code
def power_derivative_price(S, t, r, sigma, T):
    """
    Price of a derivative that pays S^2 at maturity T.
    """
    return S**2 * np.exp((r + sigma**2) * (T - t))
In[11]:
Code
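# Test parameters
S_test = 100.0
t_test = 0.0
r_test = 0.05
sigma_test = 0.20
T_test = 1.0

# Compute PDE residual
residual = bs_pde_residual(
    power_derivative_price, S_test, t_test, r_test, sigma_test, T_test
)

# Report the result
print(f"PDE residual for power derivative: {residual:.2e}")
print("This should be close to zero (numerical error only)")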
Out[12]:
Console
PDE residual for power derivative: -2.82e-06
This should be close to zero (numerical error only)

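The residual is on the order of $10^{-6}$, which is negligible next to a derivative price of roughly 10,942 at these parameters. This confirms our analytical solution is correct up to finite-difference and rounding error.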

Let's also test with a function that should not satisfy the PDE, demonstrating that our verification catches invalid solutions.

In[13]:
Code
def invalid_pricing_function(S, t, r, sigma, T):
    """
    An arbitrary function that does NOT satisfy the BS PDE.
    """
    return S * np.exp(-0.1 * (T - t))  # Wrong formula
In[14]:
Code
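residual_invalid = bs_pde_residual(
    invalid_pricing_function, S_test, t_test, r_test, sigma_test, T_test
)

# Report the result
print(f"PDE residual for invalid function: {residual_invalid:.4f}")
print("Non-zero residual indicates this is NOT a valid BS solution")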
Out[15]:
Console
PDE residual for invalid function: 9.0484
Non-zero residual indicates this is NOT a valid BS solution

The non-zero residual confirms that arbitrary functions generally do not satisfy the Black-Scholes PDE.
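As a further sanity check, two simple tradable assets should also pass the residual test: the stock itself, $V = S$, and a zero-coupon bond paying 1 at maturity, $V = e^{-r(T-t)}$. The sketch below is self-contained and uses only the standard library; the helper `pde_residual` is a hypothetical stand-in that mirrors `bs_pde_residual`, and the parameter values are the same illustrative ones used above.

```python
import math


def pde_residual(V_func, S, t, r, sigma, T, h_S=0.01, h_t=1e-4):
    """Central-difference residual of the Black-Scholes PDE (LHS minus rV)."""
    V = V_func(S, t, r, sigma, T)
    dV_dt = (
        V_func(S, t + h_t, r, sigma, T) - V_func(S, t - h_t, r, sigma, T)
    ) / (2 * h_t)
    dV_dS = (
        V_func(S + h_S, t, r, sigma, T) - V_func(S - h_S, t, r, sigma, T)
    ) / (2 * h_S)
    d2V_dS2 = (
        V_func(S + h_S, t, r, sigma, T) - 2 * V + V_func(S - h_S, t, r, sigma, T)
    ) / h_S**2
    return dV_dt + r * S * dV_dS + 0.5 * sigma**2 * S**2 * d2V_dS2 - r * V


def stock_price(S, t, r, sigma, T):
    """The stock itself: V = S."""
    return S


def bond_price(S, t, r, sigma, T):
    """Zero-coupon bond paying 1 at T: V = exp(-r (T - t))."""
    return math.exp(-r * (T - t))


stock_residual = pde_residual(stock_price, 100.0, 0.0, 0.05, 0.20, 1.0)
bond_residual = pde_residual(bond_price, 100.0, 0.0, 0.05, 0.20, 1.0)
print(f"Residual for V = S:           {stock_residual:.2e}")
print(f"Residual for V = e^(-r(T-t)): {bond_residual:.2e}")
```

Both residuals should sit at the level of floating-point noise. For the stock, the time and curvature terms vanish and $rS \cdot 1 = rV$; for the bond, the only non-zero term is $\partial V / \partial t = rV$.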

Visualizing the PDE Residual

Let's create a visualization showing the PDE residual across different stock prices for both valid and invalid pricing functions.

In[16]:
Code
S_range = np.linspace(50, 150, 100)
residuals_valid = [
    bs_pde_residual(
        power_derivative_price, S, t_test, r_test, sigma_test, T_test
    )
    for S in S_range
]
residuals_invalid = [
    bs_pde_residual(
        invalid_pricing_function, S, t_test, r_test, sigma_test, T_test
    )
    for S in S_range
]
Out[17]:
Visualization
Line plot comparing PDE residuals showing valid solution near zero and invalid solution with large non-zero values.
Black-Scholes PDE residual across stock prices. The valid solution (power derivative) has near-zero residual everywhere, while the invalid function shows substantial residual, confirming it does not satisfy the PDE.

Alternative Derivation: The Replicating Portfolio

There's an alternative way to derive the BSM PDE that provides additional insight and reinforces the economic intuition. Instead of constructing a hedged portfolio of derivative and stock, we can think about replicating the derivative using a portfolio of stock and risk-free bonds. The idea is that if we can perfectly replicate the derivative's payoff using traded securities, then the derivative must have the same price as the replicating portfolio. Otherwise, arbitrage opportunities would exist.

Suppose we want to replicate a derivative $V(S, t)$ using a portfolio of $\alpha_t$ shares of stock and $\beta_t$ units of risk-free bonds. The portfolio value is:

$$\Pi_t = \alpha_t S_t + \beta_t B_t$$

where:

  • $\Pi_t$: value of the replicating portfolio
  • $\alpha_t$: shares of stock held
  • $\beta_t$: units of risk-free bonds held
  • $S_t$: stock price
  • $B_t = e^{rt}$: value of a risk-free bond

The quantities $\alpha_t$ and $\beta_t$ can change over time as we continuously adjust the portfolio to maintain replication. This dynamic adjustment is the essence of delta hedging.

For the portfolio to replicate the derivative, we need $\Pi_t = V(S_t, t)$ at all times. The key constraint is that the portfolio must be self-financing: changes in portfolio value come only from price changes, not from adding or removing capital. This means when we rebalance the portfolio, selling some stock to buy bonds or vice versa, the total value must remain unchanged. We can only rearrange what we already own. The self-financing condition is:

$$d\Pi_t = \alpha_t \, dS_t + \beta_t \, dB_t$$

where:

  • $d\Pi_t$: change in portfolio value
  • $\alpha_t$: shares of stock held
  • $dS_t$: change in stock price
  • $\beta_t$: units of risk-free bonds held
  • $dB_t$: change in bond price

This equation says that the change in portfolio value equals the gains from holding stock plus the gains from holding bonds. There is no term for additional capital injection because the portfolio is self-financing.
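The self-financing idea can be illustrated with a single discrete step. The sketch below uses hypothetical numbers and a hypothetical helper, `rebalance`; none of them come from the text. It first lets prices move with holdings fixed, then switches to a new stock position funded entirely by trading bonds.

```python
import math


def rebalance(alpha_old, beta_old, S, B, alpha_new):
    """Switch to alpha_new shares, funding the trade entirely through bonds."""
    value = alpha_old * S + beta_old * B       # portfolio value before trading
    beta_new = (value - alpha_new * S) / B     # bonds absorb the difference
    return alpha_new, beta_new


# Hypothetical numbers, chosen only for illustration
r, dt = 0.05, 1 / 252
S0, B0 = 100.0, 1.0
alpha, beta = 0.60, 40.0                       # 60 in stock plus 40 in bonds

# Prices move over one step while holdings stay fixed
S1, B1 = 101.0, B0 * math.exp(r * dt)
value_before = alpha * S0 + beta * B0
value_after_move = alpha * S1 + beta * B1
gain = alpha * (S1 - S0) + beta * (B1 - B0)    # alpha dS + beta dB

# Rebalance to a new stock holding without adding or withdrawing cash
alpha_new, beta_new = rebalance(alpha, beta, S1, B1, alpha_new=0.65)
value_after_trade = alpha_new * S1 + beta_new * B1

print(f"Value before move:     {value_before:.4f}")
print(f"Value after move:      {value_after_move:.4f}")
print(f"Value after rebalance: {value_after_trade:.4f}")
```

The change in value over the step equals `gain`, the discrete analogue of $\alpha_t \, dS_t + \beta_t \, dB_t$, and the rebalancing trade itself leaves the value unchanged.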

Recall that $dB_t = r B_t \, dt$ represents the risk-free bond dynamics: the bond grows deterministically at the risk-free rate. Substituting the dynamics of $S_t$ and $B_t$:

$$
\begin{aligned}
d\Pi_t &= \alpha_t (\mu S_t \, dt + \sigma S_t \, dW_t) + \beta_t (r B_t \, dt) \\
&= (\alpha_t \mu S_t + \beta_t r B_t) \, dt + \alpha_t \sigma S_t \, dW_t
\end{aligned}
$$

where:

  • $d\Pi_t$: change in the replicating portfolio value
  • $\alpha_t$: number of shares of stock
  • $\beta_t$: number of units of risk-free bonds
  • $\mu$: stock drift
  • $r$: risk-free interest rate
  • $\sigma$: stock volatility
  • $dW_t$: Wiener process increment

For replication, we require $d\Pi_t = dV$. The portfolio must change in exactly the same way as the derivative, both in its deterministic component and its random component. Recall the dynamics of the derivative from Itô's lemma:

$$dV = (\dots) \, dt + \sigma S_t \frac{\partial V}{\partial S} \, dW_t$$

where:

  • $dV$: change in derivative price
  • $\sigma$: volatility
  • $S_t$: stock price
  • $dW_t$: Wiener process increment

Matching the coefficients of the random term $dW_t$ gives us the replication condition:

$$\alpha_t \sigma S_t = \sigma S_t \frac{\partial V}{\partial S}$$

where:

  • $\alpha_t$: shares of stock
  • $\sigma$: volatility
  • $S_t$: stock price
  • $V$: derivative price

Solving for $\alpha_t$:

$$\alpha_t = \frac{\partial V}{\partial S}$$

where:

  • $\alpha_t$: number of shares in the replicating portfolio
  • $V$: price of the derivative
  • $S$: stock price

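This confirms that the delta hedge ratio emerges naturally from replication. To replicate the derivative, we must hold exactly delta shares of stock. The remaining portfolio value goes into bonds: $\beta_t B_t = V - \alpha_t S_t$. Substituting both holdings into the portfolio drift $\alpha_t \mu S_t + \beta_t r B_t$ gives $\mu S_t \frac{\partial V}{\partial S} + r\left(V - S_t \frac{\partial V}{\partial S}\right)$, which must equal the drift of $dV$ from Itô's lemma, $\frac{\partial V}{\partial t} + \mu S_t \frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2}$. The $\mu$ terms cancel, and rearranging leads to the same PDE. The replicating portfolio approach provides a constructive interpretation: it tells us not only that the derivative price satisfies a PDE, but also how to actually create the derivative synthetically using stock and bonds.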
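To see replication at work, here is a minimal simulation sketch for the $S^2$ power derivative from the worked example. It assumes discrete rebalancing at `n_steps` equally spaced times (an approximation to continuous trading), an arbitrary real-world drift `mu = 0.12`, and a fixed random seed; the function names and all parameter values are illustrative choices, not part of the derivation.

```python
import math
import random


def power_price(S, t, r, sigma, T):
    """Price of the claim paying S_T^2, from the worked example."""
    return S**2 * math.exp((r + sigma**2) * (T - t))


def power_delta(S, t, r, sigma, T):
    """Analytic delta, dV/dS = 2 S exp((r + sigma^2)(T - t))."""
    return 2 * S * math.exp((r + sigma**2) * (T - t))


def replicate(S0, mu, r, sigma, T, n_steps, rng):
    """Run a self-financing stock-and-bond strategy along one GBM path."""
    dt = T / n_steps
    S = S0
    portfolio = power_price(S0, 0.0, r, sigma, T)  # start with capital V(S0, 0)
    for k in range(n_steps):
        t = k * dt
        alpha = power_delta(S, t, r, sigma, T)     # shares held over this step
        bonds = portfolio - alpha * S              # remainder sits in bonds
        # Exact GBM step under the real-world drift mu
        z = rng.gauss(0.0, 1.0)
        S_next = S * math.exp(
            (mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z
        )
        # Self-financing update: stock gains plus interest on bonds
        portfolio = alpha * S_next + bonds * math.exp(r * dt)
        S = S_next
    return portfolio, S**2


rng = random.Random(42)
portfolio_T, payoff_T = replicate(
    S0=100.0, mu=0.12, r=0.05, sigma=0.20, T=1.0, n_steps=2000, rng=rng
)
rel_error = abs(portfolio_T - payoff_T) / payoff_T
print(f"Replicating portfolio at T: {portfolio_T:.2f}")
print(f"Derivative payoff S_T^2:    {payoff_T:.2f}")
print(f"Relative error:             {rel_error:.4%}")
```

The terminal portfolio value should land close to the payoff even though the path is generated with drift `mu` rather than `r`, which is the simulation counterpart of the drift dropping out of the PDE. The small remaining gap comes from rebalancing at discrete times.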

Intuition Behind the PDE Terms

Each term in the Black-Scholes PDE has a financial interpretation that connects the mathematics to economic reasoning. Understanding these interpretations deepens your grasp of why the equation takes the form it does.

The time derivative $\frac{\partial V}{\partial t}$ represents time decay, often called theta. As time passes, the option's time value erodes, which this term captures. For a call or put option, theta is typically negative: all else equal, the option loses value as expiration approaches because there's less time for favorable stock price movements to occur. The theta term quantifies this inexorable passage of time.

The first-order spatial term $rS\frac{\partial V}{\partial S}$ appears because the stock price grows on average at the risk-free rate in the risk-neutral world. This term reflects the drift of the stock under the risk-neutral measure. Notice that the drift rate is $r$, not $\mu$: in the risk-neutral framework, all assets earn the risk-free rate on average. This term captures how the option value changes as the expected stock price evolves.

The second-order spatial term $\frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}$ captures the convexity effect. Options have curved payoff profiles, and the randomness in stock prices creates value through this curvature. This is related to gamma, which measures how delta changes with the stock price. To understand why convexity creates value, imagine a call option. When the stock price rises, the option becomes more sensitive to further increases (delta rises). When the stock price falls, the option becomes less sensitive to further decreases (delta falls). This asymmetric response means the option benefits more from upward movements than it suffers from downward movements, on average. The gamma term quantifies this benefit from volatility.

The discounting term $rV$ on the right side reflects that the derivative must be discounted at the risk-free rate. In the risk-neutral world, all assets earn the risk-free rate on average, so the appropriate discount rate is $r$. This term ensures that the present value calculation is consistent with no-arbitrage pricing.
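The four terms can be put side by side numerically. The sketch below takes the standard closed-form European call price and Greeks as given (they are not derived in this chapter, so treat them as an assumption here) and evaluates each PDE term at an illustrative at-the-money point; the function name `call_pde_terms` and the parameter values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def call_pde_terms(S, K, r, sigma, tau):
    """Return the four PDE terms for a European call with time to maturity tau."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    price = S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)
    delta = norm_cdf(d1)
    gamma = norm_pdf(d1) / (S * sigma * math.sqrt(tau))
    theta = (
        -S * norm_pdf(d1) * sigma / (2.0 * math.sqrt(tau))
        - r * K * math.exp(-r * tau) * norm_cdf(d2)
    )
    return {
        "theta": theta,                              # dV/dt
        "drift": r * S * delta,                      # r S dV/dS
        "convexity": 0.5 * sigma**2 * S**2 * gamma,  # (1/2) sigma^2 S^2 d2V/dS2
        "discount": r * price,                       # r V
    }


terms = call_pde_terms(S=100.0, K=100.0, r=0.05, sigma=0.20, tau=1.0)
lhs = terms["theta"] + terms["drift"] + terms["convexity"]
for name, value in terms.items():
    print(f"{name:>10}: {value: .4f}")
print(f"  LHS - rV: {lhs - terms['discount']: .2e}")
```

Theta comes out negative while the drift and convexity terms are positive, and the three left-hand terms sum to the discounting term $rV$ up to rounding. Time decay is therefore the price paid for carrying the position's drift and convexity.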

Out[19]:
Visualization
Option Delta as a function of stock price. The delta increases from 0 to 1 as the call option moves from out-of-the-money to in-the-money, indicating the increasing number of shares needed to hedge.
Option Gamma ($\Gamma$) peaking near the strike price ($K=100$). The high curvature at-the-money indicates where the option's delta is most sensitive to price changes and where rebalancing needs are greatest.
Option Theta ($\Theta$) showing daily time decay. Theta is most negative for at-the-money options, reflecting the rapid erosion of time value as expiration approaches.

Extending to Dividend-Paying Stocks

The basic BSM PDE assumes the stock pays no dividends. This assumption simplifies the analysis but excludes many important applications, including options on dividend-paying stocks, equity indices, and currencies. For a stock paying continuous dividends at rate $q$, the stock price dynamics become:

$$dS_t = (r - q) S_t \, dt + \sigma S_t \, dW_t^Q$$

where:

  • $q$: continuous dividend yield
  • $\sigma$: volatility of the underlying stock
  • $r$: risk-free interest rate
  • $S_t$: stock price
  • $dW_t^Q$: increment of Wiener process under risk-neutral measure
  • $dt$: small time increment

The key change is that the drift rate under the risk-neutral measure is now $r - q$ rather than $r$. The intuition is that a stock paying dividends at rate $q$ provides part of its return through dividend payments rather than price appreciation. In the risk-neutral world, the total expected return (price appreciation plus dividends) must equal $r$, so the expected price appreciation is only $r - q$.

The modified Black-Scholes PDE is:

$$\frac{\partial V}{\partial t} + (r-q)S\frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} = rV$$

where:

  • $V(S, t)$: price of the derivative
  • $S$: stock price
  • $r$: risk-free interest rate
  • $q$: continuous dividend yield
  • $\sigma$: volatility of the underlying stock

Compared with the basic PDE, only the coefficient of the first-order spatial derivative changes, from $r$ to $r - q$, because the price of a dividend-paying stock grows more slowly (by the dividend yield $q$) than that of a non-dividend-paying stock. The discounting term $rV$ is unchanged. This modification is important for pricing options on dividend-paying stocks, equity indices, and currencies (where the foreign interest rate plays the role of a dividend yield, as we discussed in our chapter on foreign exchange markets).
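A quick numerical check of the modified PDE uses the claim that delivers one share at $T$ without the interim dividends, whose price is $V = S e^{-q(T-t)}$. The sketch below is self-contained and uses only the standard library; the helper `pde_residual_with_dividends` and the parameter values are illustrative assumptions. It evaluates the same pricing function against the dividend-adjusted PDE and against the basic PDE (by passing `q = 0` to the residual).

```python
import math


def pde_residual_with_dividends(V_func, S, t, r, q, sigma, h_S=0.01, h_t=1e-4):
    """Residual of the dividend-adjusted PDE; q = 0 recovers the basic PDE."""
    V = V_func(S, t)
    dV_dt = (V_func(S, t + h_t) - V_func(S, t - h_t)) / (2 * h_t)
    dV_dS = (V_func(S + h_S, t) - V_func(S - h_S, t)) / (2 * h_S)
    d2V_dS2 = (V_func(S + h_S, t) - 2 * V + V_func(S - h_S, t)) / h_S**2
    lhs = dV_dt + (r - q) * S * dV_dS + 0.5 * sigma**2 * S**2 * d2V_dS2
    return lhs - r * V


# Illustrative parameters (assumed, not taken from the text)
r, q, sigma, T = 0.05, 0.03, 0.20, 1.0
S, t = 100.0, 0.0


def prepaid_forward(S, t):
    """Value today of receiving one share at T, without interim dividends."""
    return S * math.exp(-q * (T - t))


residual_adjusted = pde_residual_with_dividends(prepaid_forward, S, t, r, q, sigma)
residual_basic = pde_residual_with_dividends(prepaid_forward, S, t, r, 0.0, sigma)
print(f"Residual in dividend-adjusted PDE: {residual_adjusted:.2e}")
print(f"Residual in basic PDE (q = 0):     {residual_basic:.4f}")
```

The residual in the adjusted PDE should be numerically zero, while the basic PDE leaves a residual equal to $qV$, the dividend income that the unadjusted equation fails to account for.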

Out[21]:
Visualization
Effect of continuous dividend yield on call option prices as a function of time to maturity. Higher dividend yields reduce call option values because dividends reduce the expected stock price appreciation. The effect becomes more pronounced for longer-dated options, where more dividends are expected to be paid before expiration.

Limitations and Practical Implications

The Black-Scholes-Merton framework transformed derivatives pricing and risk management, but its assumptions impose significant limitations that you must understand. The assumption of constant volatility is perhaps the most problematic: in real markets, volatility changes over time, and the implied volatilities backed out of market option prices vary with strike price and maturity. This discrepancy manifests as the volatility smile and volatility surface, which we'll explore in detail in a later chapter. When market-implied volatilities differ systematically from a single constant value, the BSM model cannot perfectly match observed option prices across all strikes.

The continuous trading assumption is also unrealistic. Real markets have discrete trading opportunities, transaction costs, and bid-ask spreads. These frictions make perfect delta hedging impossible: you cannot adjust your hedge infinitely often, and each adjustment costs money. The gap between theoretical continuous hedging and practical discrete rebalancing creates residual risk that the basic BSM framework ignores. You can address this through discrete hedging strategies and by accounting for the costs of hedging in your pricing.
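The residual risk from discrete rebalancing can be made concrete with a small Monte Carlo sketch. It hedges the $S^2$ power derivative from the worked example at monthly, weekly, and daily frequencies and reports the root-mean-square replication error across simulated paths. Transaction costs are ignored, and the path count, seed, drift, and other parameters are illustrative assumptions.

```python
import math
import random


def hedging_error(n_steps, rng, S0=100.0, mu=0.12, r=0.05, sigma=0.20, T=1.0):
    """Terminal replication error for the S^2 claim on one simulated path."""
    dt = T / n_steps
    S = S0
    portfolio = S0**2 * math.exp((r + sigma**2) * T)    # initial capital V(S0, 0)
    for k in range(n_steps):
        tau = T - k * dt
        alpha = 2 * S * math.exp((r + sigma**2) * tau)  # delta of the S^2 claim
        bonds = portfolio - alpha * S
        z = rng.gauss(0.0, 1.0)
        S_next = S * math.exp(
            (mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z
        )
        portfolio = alpha * S_next + bonds * math.exp(r * dt)
        S = S_next
    return portfolio - S**2


def rms_error(n_steps, n_paths=200, seed=0):
    """Root-mean-square hedging error across independent paths."""
    rng = random.Random(seed)
    errors = [hedging_error(n_steps, rng) for _ in range(n_paths)]
    return math.sqrt(sum(e * e for e in errors) / n_paths)


rms_by_frequency = {n: rms_error(n) for n in (12, 52, 252)}
for n, rms in rms_by_frequency.items():
    print(f"{n:>4} rebalances per year: RMS hedging error = {rms:.2f}")
```

The error should shrink as rebalancing becomes more frequent, roughly in proportion to one over the square root of the number of rebalances, but it does not vanish at any finite frequency. Adding a per-trade cost would push in the opposite direction, which is the trade-off discrete hedging strategies have to balance.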

Despite these limitations, the BSM PDE remains the foundation of modern derivatives pricing. Its power lies not in its literal accuracy but in providing a systematic framework for thinking about derivative valuation. Extensions and modifications build on this foundation: stochastic volatility models modify the volatility dynamics, jump-diffusion models add discontinuous price movements, and local volatility models allow volatility to depend on price and time. All these approaches either modify the PDE or reframe the problem while retaining the no-arbitrage core insight.

The equation also established delta hedging as the standard approach to managing derivative risk. Even when the assumptions don't hold exactly, the hedge ratios derived from BSM-type models provide useful starting points. The Greeks, which we'll derive from the Black-Scholes formula in an upcoming chapter, give you a systematic way to decompose and manage the various risk exposures embedded in derivative positions.

Summary

This chapter derived the Black-Scholes-Merton partial differential equation, the fundamental equation governing derivative prices under certain idealized conditions. The key insights are:

  • Stock price dynamics: The underlying follows geometric Brownian motion with constant volatility, $dS = \mu S \, dt + \sigma S \, dW$.

  • Delta hedging eliminates risk: By holding $\Delta = \frac{\partial V}{\partial S}$ shares of stock against one derivative, we create a portfolio with no exposure to the random $dW$ term.

  • No-arbitrage implies the PDE: A riskless portfolio must earn the risk-free rate, leading directly to the Black-Scholes PDE.

  • The drift disappears: The expected return $\mu$ does not appear in the final equation; only the volatility $\sigma$ and risk-free rate $r$ matter for pricing.

  • Boundary conditions specify the derivative: The same PDE applies to all derivatives; what distinguishes calls from puts is the terminal payoff condition.

  • The PDE is linear and backward parabolic: This mathematical structure enables both analytical solutions and efficient numerical methods.

In the next chapter, we'll solve this PDE subject to call and put option boundary conditions, deriving the famous Black-Scholes formula that gives explicit closed-form prices for European options.


Reference

BIBTEXAcademic
@misc{blackscholespdederivationdeltahedgingexplained, author = {Michael Brenndoerfer}, title = {Black-Scholes PDE: Derivation & Delta Hedging Explained}, year = {2025}, url = {https://mbrenndoerfer.com/writing/black-scholes-merton-pde-derivation-delta-hedging-explained}, organization = {mbrenndoerfer.com}, note = {Accessed: 2025-01-01} }