Modern Portfolio Theory: Mean-Variance Optimization Guide

Michael Brenndoerfer, December 13, 2025, 51 min read

Learn Modern Portfolio Theory and mean-variance optimization. Master the efficient frontier, diversification mathematics, and optimal portfolio construction.

Modern Portfolio Theory and Mean-Variance Optimization

In 1952, Harry Markowitz published a paper titled "Portfolio Selection" that fundamentally changed how investors think about risk and return. Before Markowitz, investment analysis focused almost exclusively on picking individual securities with the best expected returns. Markowitz introduced a revolutionary insight: the risk of a portfolio depends not just on the risks of individual assets, but critically on how those assets move together. This observation gave birth to Modern Portfolio Theory (MPT) and provided the first rigorous mathematical framework for constructing optimal portfolios.

The core idea is straightforward. Investors care about two things: maximizing expected return and minimizing risk. Markowitz formalized risk as the variance (or standard deviation) of portfolio returns, creating what we now call the mean-variance framework. Within this framework, rational investors should only hold portfolios that offer the highest expected return for a given level of risk, or equivalently, the lowest risk for a given expected return. These optimal portfolios form the efficient frontier, a concept that remains central to investment management today.

This chapter develops the mathematical machinery of mean-variance optimization from first principles. We'll see how diversification arises naturally from the mathematics of combining assets with less-than-perfect correlation, and we'll implement practical algorithms for finding optimal portfolios. The framework we develop here provides the foundation for the Capital Asset Pricing Model and factor models that follow in subsequent chapters.

Portfolio Returns and Risk

Before we can optimize portfolios, we need precise definitions of portfolio return and risk. These measures are the foundation of Modern Portfolio Theory: they let us compare investment strategies and identify optimal allocations.

Suppose we allocate wealth across $n$ assets. The fundamental question is: how should we combine individual asset characteristics to describe the portfolio as a whole? Let $w_i$ denote the fraction of wealth invested in asset $i$, where the weights must sum to one:

$$\sum_{i=1}^{n} w_i = 1$$

where:

  • $w_i$: fraction of wealth invested in asset $i$
  • $n$: total number of assets in the portfolio

This constraint ensures we account for all invested capital. A weight of 0.3 means 30% of the portfolio is allocated to that asset. We allow short selling, meaning weights can be negative (representing borrowed positions). Shorting an asset involves selling borrowed shares with the obligation to return them later, creating a negative position that profits when the price falls.

Expected Portfolio Return

The expected return of a portfolio follows naturally from the linearity of expectation, a concept we explored in our probability foundations. If each asset $i$ has expected return $\mu_i = E[R_i]$, the portfolio's expected return is simply the weighted average:

$$\mu_p = E[R_p] = \sum_{i=1}^{n} w_i \mu_i = \mathbf{w}^\top \boldsymbol{\mu}$$

where:

  • $\mu_p$: expected return of the portfolio
  • $E[R_p]$: expected value of portfolio return
  • $w_i$: weight of asset $i$
  • $\mu_i$: expected return of asset $i$
  • $\mathbf{w}$: vector of portfolio weights, $(w_1, \ldots, w_n)^\top$
  • $\boldsymbol{\mu}$: vector of expected returns, $(\mu_1, \ldots, \mu_n)^\top$

This formula tells us something intuitive: a portfolio's expected return is determined entirely by how much we invest in each asset and what return we expect from that asset. There are no interaction effects or nonlinearities. If we double our allocation to a high-return asset, we increase the portfolio's expected return proportionally. Vector notation provides a compact way to represent many assets.
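
As a quick illustration, the weighted-average formula can be evaluated both as an explicit sum and as a dot product. The weights and expected returns below are hypothetical numbers chosen for this sketch only.

```python
import numpy as np

# Hypothetical three-asset example (illustrative numbers only)
w = np.array([0.5, 0.3, 0.2])        # portfolio weights, must sum to one
mu = np.array([0.08, 0.03, 0.05])    # expected returns of each asset

assert np.isclose(w.sum(), 1.0)      # budget constraint

# Explicit sum over assets: sum_i w_i * mu_i
mu_p_sum = sum(w_i * mu_i for w_i, mu_i in zip(w, mu))

# Vector form: w' mu
mu_p_vec = w @ mu

print(f"Expected portfolio return: {mu_p_vec:.4f}")
```

Both expressions give the same number because the expected return is linear in the weights.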

Portfolio Variance

The portfolio variance requires more care and reveals how diversification works mathematically. Unlike expected return, variance does not simply average across assets. The portfolio return is:

$$R_p = \sum_{i=1}^{n} w_i R_i$$

where:

  • $R_p$: return of the portfolio
  • $w_i$: weight of asset $i$
  • $R_i$: return of asset $i$
  • $n$: number of assets

To find the variance of this weighted sum, we must account for how each pair of assets moves together. Taking the variance and using the linearity properties we covered in Part I:

$$\begin{aligned}
\sigma_p^2 &= \text{Var}(R_p) \\
&= \text{Var}\left(\sum_{i=1}^{n} w_i R_i\right) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} w_i w_j \text{Cov}(R_i, R_j) && \text{(bilinearity of covariance)}
\end{aligned}$$

where:

  • $\sigma_p^2$: variance of the portfolio
  • $w_i, w_j$: weights allocated to assets $i$ and $j$
  • $\text{Cov}(R_i, R_j)$: covariance between returns of assets $i$ and $j$

The double summation captures all pairwise interactions between assets. When $i = j$, the covariance term becomes the variance of asset $i$, contributing $w_i^2 \sigma_i^2$ to portfolio variance. When $i \neq j$, the covariance measures how assets $i$ and $j$ move together, and this is where diversification benefits emerge.

Let $\Sigma$ denote the $n \times n$ covariance matrix, where $\Sigma_{ij} = \text{Cov}(R_i, R_j)$. Then the portfolio variance is:

$$\sigma_p^2 = \mathbf{w}^\top \Sigma \mathbf{w}$$

where:

  • $\sigma_p^2$: variance of the portfolio
  • $\mathbf{w}$: vector of portfolio weights
  • $\Sigma$: covariance matrix of asset returns

This compact matrix notation, which we developed in Part I's linear algebra chapter, makes optimization tractable. The quadratic form $\mathbf{w}^\top \Sigma \mathbf{w}$ appears throughout financial mathematics because it elegantly encapsulates how portfolio risk depends on both individual asset volatilities (the diagonal of $\Sigma$) and their correlations (the off-diagonal elements).
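
The equivalence between the double sum and the quadratic form can be checked numerically. The sketch below uses hypothetical volatilities, correlations, and weights; it also separates the own-variance (diagonal) contribution from the cross-covariance contribution.

```python
import numpy as np

# Hypothetical inputs (illustrative only)
sigma = np.array([0.18, 0.05, 0.20])          # volatilities
corr = np.array([
    [1.0, 0.1, 0.3],
    [0.1, 1.0, 0.0],
    [0.3, 0.0, 1.0],
])
cov = np.outer(sigma, sigma) * corr           # Sigma_ij = rho_ij * sigma_i * sigma_j
w = np.array([0.5, 0.3, 0.2])                 # weights summing to one

n = len(w)

# Double summation over all asset pairs (i, j)
var_double_sum = sum(
    w[i] * w[j] * cov[i, j] for i in range(n) for j in range(n)
)

# Matrix quadratic form w' Sigma w
var_quadratic = w @ cov @ w

# Split into diagonal (own variance) and off-diagonal (covariance) parts
own_part = np.sum(w**2 * np.diag(cov))
cross_part = var_quadratic - own_part

print(f"Portfolio variance:   {var_quadratic:.6f}")
print(f"  own-variance part:  {own_part:.6f}")
print(f"  covariance part:    {cross_part:.6f}")
print(f"Portfolio volatility: {np.sqrt(var_quadratic):.4f}")
```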

Covariance Matrix Properties

The covariance matrix $\Sigma$ is symmetric ($\Sigma_{ij} = \Sigma_{ji}$) and positive semi-definite, meaning $\mathbf{w}^\top \Sigma \mathbf{w} \geq 0$ for all weight vectors $\mathbf{w}$. This ensures portfolio variance is always non-negative. When $\Sigma$ is positive definite, the portfolio variance is strictly positive for any non-trivial portfolio.
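
In practice it can be worth checking these properties before optimizing, since a hand-assembled or poorly estimated matrix may violate them. The helper below is a minimal sketch; the function name `is_valid_covariance` and the tolerance are assumptions made for this example, not part of any library.

```python
import numpy as np


def is_valid_covariance(cov, tol=1e-12):
    """Return True if cov is symmetric and positive semi-definite (up to tol)."""
    cov = np.asarray(cov, dtype=float)
    if not np.allclose(cov, cov.T):
        return False
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues
    eigenvalues = np.linalg.eigvalsh(cov)
    return bool(eigenvalues.min() >= -tol)


# A valid 2x2 covariance matrix (hypothetical numbers)
valid = np.array([[0.04, 0.01], [0.01, 0.09]])

# Off-diagonal 0.07 implies a correlation of 0.07 / (0.2 * 0.3) > 1,
# so this matrix cannot be a covariance matrix
invalid = np.array([[0.04, 0.07], [0.07, 0.09]])

print(is_valid_covariance(valid))
print(is_valid_covariance(invalid))

# With the invalid matrix, some vector produces a negative "variance"
w = np.array([1.0, -1.0])
negative_variance = w @ invalid @ w   # 0.04 + 0.09 - 2 * 0.07
print(negative_variance)
```

The vector `w` here is only a test direction for the quadratic form; it is not meant to satisfy the budget constraint.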

The Two-Asset Case

The two-asset case provides crucial intuition before we tackle the general problem. By working through this simplified scenario, we can develop a geometric understanding of how risk and return interact, seeing clearly how correlation shapes the set of achievable portfolios. Consider a portfolio with weights $w$ in asset 1 and $(1-w)$ in asset 2.

The expected return is:

$$\mu_p = w\mu_1 + (1-w)\mu_2$$

where:

  • $\mu_p$: expected return of the portfolio
  • $w$: weight invested in asset 1
  • $\mu_1, \mu_2$: expected returns of assets 1 and 2

This formula shows that expected return varies linearly with the weight $w$. As we shift allocation from asset 2 toward asset 1, expected return moves along a straight line between $\mu_2$ (when $w=0$) and $\mu_1$ (when $w=1$). There are no surprises here: return blending is straightforward.

The variance, however, tells a richer story:

$$\sigma_p^2 = w^2\sigma_1^2 + (1-w)^2\sigma_2^2 + 2w(1-w)\rho_{12}\sigma_1\sigma_2$$

where:

  • $\sigma_p^2$: variance of the portfolio
  • $w$: weight invested in asset 1
  • $\sigma_1, \sigma_2$: standard deviations (volatility) of assets 1 and 2
  • $\rho_{12}$: correlation coefficient between returns of assets 1 and 2

This equation has three terms, each revealing a different aspect of portfolio risk. The first term, $w^2\sigma_1^2$, captures the contribution of asset 1's own volatility, scaled by the square of its weight. The second term, $(1-w)^2\sigma_2^2$, does the same for asset 2. The third term, the cross-term $2w(1-w)\rho_{12}\sigma_1\sigma_2$, is where the diversification benefit arises. This term depends on the correlation $\rho_{12}$, and its sign and magnitude determine whether combining the two assets reduces or amplifies total portfolio risk.

The Diversification Effect

The diversification effect emerges from this variance formula. Consider what happens when $\rho_{12} < 1$ and both weights are positive ($0 < w < 1$). The cross-term $2w(1-w)\rho_{12}\sigma_1\sigma_2$ is smaller than it would be under perfect correlation, reducing total portfolio variance. When correlation is negative, this term actually becomes negative, further reducing portfolio risk below what the individual variances would suggest.

To understand this intuitively, imagine two assets that tend to move in opposite directions. When one zigs, the other zags. In a portfolio containing both, these opposite movements partially cancel each other, smoothing the overall portfolio return and reducing its volatility. The lower the correlation, the more pronounced this cancellation effect.

Let's see this numerically:

In[2]:
Code
import numpy as np

# Two assets with different risk-return profiles
mu1, sigma1 = 0.10, 0.15  # Asset 1: 10% return, 15% volatility
mu2, sigma2 = 0.06, 0.08  # Asset 2: 6% return, 8% volatility

# Portfolio weights for asset 1
weights = np.linspace(-0.5, 1.5, 100)


def portfolio_stats(w, mu1, mu2, sigma1, sigma2, rho):
    """Calculate portfolio return and standard deviation."""
    mu_p = w * mu1 + (1 - w) * mu2
    var_p = (
        w**2 * sigma1**2
        + (1 - w) ** 2 * sigma2**2
        + 2 * w * (1 - w) * rho * sigma1 * sigma2
    )
    return mu_p, np.sqrt(var_p)
Out[3]:
Visualization
Line chart showing portfolio frontiers for correlation values of 1.0, 0.5, 0.0, -0.5, and -1.0.
Risk-return tradeoff for two-asset portfolios under different correlation assumptions ranging from 1.0 to -1.0. Lower correlation enables greater diversification benefits, curving the frontier leftward toward lower risk, with perfect negative correlation allowing for the theoretical existence of a risk-free portfolio.

The figure reveals several key insights about how correlation shapes investment opportunities:

  • Perfect positive correlation ($\rho = 1$): The frontier is a straight line. Diversification provides no risk reduction; for long-only weights, portfolio risk is simply the weighted average of individual risks. The two assets move in perfect lockstep, so combining them cannot smooth out volatility.
  • Imperfect correlation ($\rho < 1$): The frontier curves leftward, meaning some portfolios achieve lower risk than either asset alone. This is the diversification benefit. The curvature increases as correlation decreases, opening up more favorable risk-return combinations.
  • Perfect negative correlation ($\rho = -1$): A risk-free portfolio exists. When $w = \sigma_2 / (\sigma_1 + \sigma_2)$, the portfolio variance is exactly zero. This extreme case illustrates the theoretical maximum of diversification: if two assets move in perfectly opposite directions, we can combine them to eliminate all uncertainty.
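
The two limiting cases in the list above can be verified directly from the variance formula. This is a small sketch that reuses the volatilities from the example (15% and 8%); the helper name `two_asset_vol` and the test weight of 0.4 are choices made for this illustration.

```python
import numpy as np

sigma1, sigma2 = 0.15, 0.08   # volatilities of assets 1 and 2


def two_asset_vol(w, sigma1, sigma2, rho):
    """Standard deviation of a two-asset portfolio with weight w in asset 1."""
    var = (
        w**2 * sigma1**2
        + (1 - w) ** 2 * sigma2**2
        + 2 * w * (1 - w) * rho * sigma1 * sigma2
    )
    # Guard against tiny negative values caused by floating-point rounding
    return np.sqrt(max(var, 0.0))


w = 0.4  # an arbitrary long-only weight

# rho = +1: risk equals the weighted average of the two volatilities
vol_rho_pos = two_asset_vol(w, sigma1, sigma2, 1.0)
weighted_avg = w * sigma1 + (1 - w) * sigma2

# rho = 0: risk falls below the weighted average
vol_rho_zero = two_asset_vol(w, sigma1, sigma2, 0.0)

# rho = -1: the weight sigma2 / (sigma1 + sigma2) removes all risk
w_hedge = sigma2 / (sigma1 + sigma2)
vol_hedge = two_asset_vol(w_hedge, sigma1, sigma2, -1.0)

print(f"rho=+1: vol = {vol_rho_pos:.4f}, weighted average = {weighted_avg:.4f}")
print(f"rho= 0: vol = {vol_rho_zero:.4f}")
print(f"rho=-1: w = {w_hedge:.4f}, vol = {vol_hedge:.6f}")
```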

Minimum Variance Portfolio

For the two-asset case, we can find the minimum variance portfolio analytically. This portfolio represents the allocation that achieves the lowest possible risk, regardless of the expected return it delivers. To find the minimum variance portfolio, we differentiate the variance $\sigma_p^2$ with respect to $w$ and set the result to zero:

$$\begin{aligned}
\frac{\partial \sigma_p^2}{\partial w} &= \frac{\partial}{\partial w} \left( w^2\sigma_1^2 + (1-w)^2\sigma_2^2 + 2w(1-w)\rho_{12}\sigma_1\sigma_2 \right) \\
&= 2w\sigma_1^2 + 2(1-w)(-1)\sigma_2^2 + 2(1-2w)\rho_{12}\sigma_1\sigma_2 && \text{(differentiate)} \\
&= 2w\sigma_1^2 - 2(1-w)\sigma_2^2 + 2(1-2w)\rho_{12}\sigma_1\sigma_2 && \text{(simplify)} \\
&= 0 && \text{(set to zero)}
\end{aligned}$$

The derivative includes three components corresponding to the three terms in the variance formula. Setting this derivative to zero identifies the point where small changes in $w$ produce no change in variance, which is precisely the minimum. Rearranging the first-order condition to isolate $w$:

$$\begin{aligned}
2w\sigma_1^2 + 2w\sigma_2^2 - 4w\rho_{12}\sigma_1\sigma_2 &= 2\sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 && \text{(group terms with } w \text{)} \\
w(2\sigma_1^2 + 2\sigma_2^2 - 4\rho_{12}\sigma_1\sigma_2) &= 2(\sigma_2^2 - \rho_{12}\sigma_1\sigma_2) && \text{(factor out } w \text{)} \\
w(\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2) &= \sigma_2^2 - \rho_{12}\sigma_1\sigma_2 && \text{(divide by 2)}
\end{aligned}$$

Solving for $w$:

$$w^* = \frac{\sigma_2^2 - \rho_{12}\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}$$

where:

  • $w^*$: optimal weight for asset 1 to minimize portfolio variance
  • $\sigma_1, \sigma_2$: volatilities of assets 1 and 2
  • $\rho_{12}$: correlation between assets 1 and 2

This formula balances the volatilities and correlation to minimize risk. If the assets are uncorrelated ($\rho_{12}=0$), the weight reduces to $w^* = \sigma_2^2 / (\sigma_1^2 + \sigma_2^2)$, meaning the portfolio holds more of the asset with lower volatility. This makes intuitive sense: when assets don't interact, we should tilt toward the less risky one. As correlation decreases (becomes more negative), the optimal weight shifts to exploit the hedging benefit, potentially allocating more to the higher-volatility asset because its movements offset those of the other asset.

In[4]:
Code
def minimum_variance_weight(sigma1, sigma2, rho):
    """Calculate the weight in asset 1 for the minimum variance portfolio."""
    numerator = sigma2**2 - rho * sigma1 * sigma2
    denominator = sigma1**2 + sigma2**2 - 2 * rho * sigma1 * sigma2
    return numerator / denominator


# Example with zero correlation
rho = 0.0
w_mv = minimum_variance_weight(sigma1, sigma2, rho)
mu_mv, sigma_mv = portfolio_stats(w_mv, mu1, mu2, sigma1, sigma2, rho)
Out[5]:
Console
Minimum Variance Portfolio (ρ = 0.0):
  Weight in Asset 1: 0.2215
  Weight in Asset 2: 0.7785
  Expected Return: 0.0689 (6.89%)
  Standard Deviation: 0.0706 (7.06%)

Compare to individual assets:
  Asset 1: σ = 0.1500, Asset 2: σ = 0.0800

The minimum variance portfolio achieves lower risk than either individual asset. This is the fundamental benefit of diversification. Even though both assets carry uncertainty, combining them in the right proportions produces a portfolio with less uncertainty than holding either one alone. This result, sometimes called the "free lunch" of diversification, arises purely from the mathematics of combining imperfectly correlated random variables.
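
One way to sanity-check the closed-form weight is to compare it with a brute-force search over a fine grid of weights. The sketch below does this for the zero-correlation case; the grid range and resolution are arbitrary choices for the illustration.

```python
import numpy as np

sigma1, sigma2, rho = 0.15, 0.08, 0.0

# Closed-form minimum variance weight in asset 1
w_closed = (sigma2**2 - rho * sigma1 * sigma2) / (
    sigma1**2 + sigma2**2 - 2 * rho * sigma1 * sigma2
)

# Brute-force search: evaluate the variance on a fine grid of weights
grid = np.linspace(-0.5, 1.5, 200_001)          # spacing of 1e-5
var_grid = (
    grid**2 * sigma1**2
    + (1 - grid) ** 2 * sigma2**2
    + 2 * grid * (1 - grid) * rho * sigma1 * sigma2
)
w_grid = grid[np.argmin(var_grid)]

print(f"Closed form: {w_closed:.4f}")
print(f"Grid search: {w_grid:.4f}")
print(f"Minimum volatility: {np.sqrt(var_grid.min()):.4f}")
```

The two weights agree to within the grid spacing, and the minimum volatility lies below the 8% volatility of the safer asset.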

Out[6]:
Visualization
Line chart showing the portfolio frontier with the minimum variance portfolio marked as a star.
Location of the minimum variance portfolio on the two-asset frontier for a correlation of 0.3. The minimum variance portfolio (star) achieves lower risk than either individual asset by optimally combining them. With the asset parameters used above (volatilities of 15% and 8%), the closed-form weight gives an allocation of approximately 13% to Asset 1 and 87% to Asset 2.

Multi-Asset Portfolio Optimization

With more than two assets, we need the full matrix formulation. The principles remain the same, but the algebra becomes more complex, requiring linear algebra tools to express and solve the optimization problem efficiently. Let $\mathbf{w}$ be the $n \times 1$ vector of weights, $\boldsymbol{\mu}$ the $n \times 1$ vector of expected returns, and $\Sigma$ the $n \times n$ covariance matrix.

The optimization problem has two equivalent formulations, each emphasizing a different perspective on the objective:

Formulation 1: Minimize risk for a target return

This formulation starts with a desired level of expected return and asks: what is the least risky way to achieve it?

$$\begin{aligned}
\min_{\mathbf{w}} \quad & \frac{1}{2}\mathbf{w}^\top \Sigma \mathbf{w} \\
\text{subject to} \quad & \mathbf{w}^\top \boldsymbol{\mu} = \mu_{\text{target}} \\
& \mathbf{w}^\top \mathbf{1} = 1
\end{aligned}$$

where:

  • $\mathbf{w}$: vector of portfolio weights
  • $\Sigma$: covariance matrix of asset returns
  • $\boldsymbol{\mu}$: vector of expected returns
  • $\mu_{\text{target}}$: required level of expected return
  • $\mathbf{1}$: vector of ones (sum constraint)

The factor of $\frac{1}{2}$ in the objective is a mathematical convenience that simplifies the derivatives. It does not change the optimal solution because multiplying an objective by a positive constant does not alter which portfolio minimizes it.

Formulation 2: Maximize utility (mean-variance utility)

This formulation takes a different approach, directly modeling investor preferences through a utility function that trades off return against risk:

$$\begin{aligned}
\max_{\mathbf{w}} \quad & \mathbf{w}^\top \boldsymbol{\mu} - \frac{\gamma}{2}\mathbf{w}^\top \Sigma \mathbf{w} \\
\text{subject to} \quad & \mathbf{w}^\top \mathbf{1} = 1
\end{aligned}$$

where:

  • $\gamma$: risk aversion parameter ($\gamma > 0$)
  • $\mathbf{w}$: vector of portfolio weights
  • $\boldsymbol{\mu}$: vector of expected returns
  • $\Sigma$: covariance matrix of asset returns
  • $\mathbf{1}$: vector of ones

The parameter $\gamma$ captures how much you dislike risk. If you are highly risk-averse (large $\gamma$), you penalize variance heavily and will choose a conservative, low-volatility portfolio. If you are risk-tolerant (small $\gamma$), you care more about expected return and will accept higher volatility to chase higher gains. Both formulations trace out the same efficient frontier; they simply parameterize it differently.
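
To make the role of the risk aversion parameter concrete, the sketch below solves the utility formulation in closed form when there are no bounds on the weights and the covariance matrix is invertible. The closed form is not stated in the text above; it follows from the first-order condition $\boldsymbol{\mu} - \gamma \Sigma \mathbf{w} - \eta \mathbf{1} = 0$, with the multiplier $\eta$ chosen so the weights sum to one. The function name and the asset numbers are assumptions for this illustration.

```python
import numpy as np


def mean_variance_utility_weights(mu, cov, gamma):
    """Maximize w'mu - (gamma/2) w'Sigma w subject to sum(w) = 1 (no bounds)."""
    ones = np.ones(len(mu))
    inv_mu = np.linalg.solve(cov, mu)       # Sigma^{-1} mu
    inv_ones = np.linalg.solve(cov, ones)   # Sigma^{-1} 1
    B = ones @ inv_mu
    C = ones @ inv_ones
    eta = (B - gamma) / C                   # multiplier on the budget constraint
    return (inv_mu - eta * inv_ones) / gamma


# Hypothetical three-asset inputs (illustrative only)
mu = np.array([0.08, 0.03, 0.05])
sigma = np.array([0.18, 0.05, 0.20])
corr = np.array([
    [1.0, 0.1, 0.3],
    [0.1, 1.0, 0.0],
    [0.3, 0.0, 1.0],
])
cov = np.outer(sigma, sigma) * corr

w_tolerant = mean_variance_utility_weights(mu, cov, gamma=2.0)
w_averse = mean_variance_utility_weights(mu, cov, gamma=20.0)

for label, w in [("gamma=2 ", w_tolerant), ("gamma=20", w_averse)]:
    ret = w @ mu
    vol = np.sqrt(w @ cov @ w)
    print(f"{label}: weights={np.round(w, 3)}, return={ret:.4f}, vol={vol:.4f}")
```

Raising `gamma` moves the solution toward lower volatility and lower expected return, which is the behavior described above.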

Analytical Solution: Lagrangian Approach

For the first formulation, we form the Lagrangian using the techniques from Part I. The Lagrangian method converts a constrained optimization problem into an unconstrained one by attaching a multiplier to each constraint:

$$\mathcal{L} = \frac{1}{2}\mathbf{w}^\top \Sigma \mathbf{w} - \lambda_1(\mathbf{w}^\top \boldsymbol{\mu} - \mu_{\text{target}}) - \lambda_2(\mathbf{w}^\top \mathbf{1} - 1)$$

where:

  • $\mathcal{L}$: the Lagrangian function
  • $\mathbf{w}$: vector of portfolio weights
  • $\Sigma$: covariance matrix of asset returns
  • $\lambda_1, \lambda_2$: Lagrange multipliers enforcing the return and budget constraints
  • $\boldsymbol{\mu}$: vector of expected returns
  • $\mu_{\text{target}}$: target portfolio return
  • $\mathbf{1}$: vector of ones

The Lagrange multipliers $\lambda_1$ and $\lambda_2$ act as shadow prices, measuring how much the optimal objective would change if we relaxed the corresponding constraint slightly. Taking the derivative with respect to $\mathbf{w}$ and setting it to zero:

$$\frac{\partial \mathcal{L}}{\partial \mathbf{w}} = \Sigma \mathbf{w} - \lambda_1 \boldsymbol{\mu} - \lambda_2 \mathbf{1} = 0$$

where:

  • $\Sigma$: covariance matrix
  • $\mathbf{w}$: vector of portfolio weights
  • $\lambda_1, \lambda_2$: Lagrange multipliers
  • $\boldsymbol{\mu}$: vector of expected returns
  • $\mathbf{1}$: vector of ones

This first-order condition states that at the optimum, the marginal increase in risk from adjusting any weight must be exactly offset by the value of that weight in meeting the return and budget constraints. Rearranging terms to solve for $\mathbf{w}$ explicitly:

$$\begin{aligned}
\Sigma \mathbf{w} &= \lambda_1 \boldsymbol{\mu} + \lambda_2 \mathbf{1} && \text{(isolate } \Sigma \mathbf{w} \text{)} \\
\mathbf{w} &= \Sigma^{-1}(\lambda_1 \boldsymbol{\mu} + \lambda_2 \mathbf{1}) && \text{(pre-multiply by } \Sigma^{-1} \text{)} \\
&= \lambda_1 \Sigma^{-1} \boldsymbol{\mu} + \lambda_2 \Sigma^{-1} \mathbf{1} && \text{(distribute)}
\end{aligned}$$

where:

  • $\mathbf{w}$: optimal portfolio weight vector
  • $\Sigma^{-1}$: inverse of the covariance matrix
  • $\boldsymbol{\mu}$: vector of expected returns
  • $\mathbf{1}$: vector of ones
  • $\lambda_1, \lambda_2$: Lagrange multipliers

The formula decomposes the optimal portfolio into two parts: a term proportional to $\Sigma^{-1}\boldsymbol{\mu}$ (seeking high returns) and a term proportional to $\Sigma^{-1}\mathbf{1}$ (minimizing variance). The multipliers $\lambda_1$ and $\lambda_2$ weight these components to satisfy the target return and budget constraints. This decomposition reveals the fundamental tension in portfolio construction: the desire for return pulls toward $\Sigma^{-1}\boldsymbol{\mu}$, while the desire for safety pulls toward $\Sigma^{-1}\mathbf{1}$.

The Lagrange multipliers $\lambda_1$ and $\lambda_2$ are determined by the constraints. Substituting back:

$$\begin{aligned}
\mu_{\text{target}} &= \lambda_1 \boldsymbol{\mu}^\top \Sigma^{-1} \boldsymbol{\mu} + \lambda_2 \boldsymbol{\mu}^\top \Sigma^{-1} \mathbf{1} \\
1 &= \lambda_1 \mathbf{1}^\top \Sigma^{-1} \boldsymbol{\mu} + \lambda_2 \mathbf{1}^\top \Sigma^{-1} \mathbf{1}
\end{aligned}$$

where:

  • $\mu_{\text{target}}$: target return
  • $\lambda_1, \lambda_2$: Lagrange multipliers
  • $\boldsymbol{\mu}$: vector of expected returns
  • $\Sigma^{-1}$: inverse covariance matrix
  • $\mathbf{1}$: vector of ones

These two equations form a linear system in the two unknowns $\lambda_1$ and $\lambda_2$. We can solve for $\lambda_1$ and $\lambda_2$ in terms of scalar constants derived from the problem parameters:

$$\lambda_1 = \frac{C\mu_{\text{target}} - B}{AC - B^2}, \quad \lambda_2 = \frac{A - B\mu_{\text{target}}}{AC - B^2}$$

where:

  • $A = \boldsymbol{\mu}^\top \Sigma^{-1} \boldsymbol{\mu}$
  • $B = \boldsymbol{\mu}^\top \Sigma^{-1} \mathbf{1} = \mathbf{1}^\top \Sigma^{-1} \boldsymbol{\mu}$
  • $C = \mathbf{1}^\top \Sigma^{-1} \mathbf{1}$
  • $\mu_{\text{target}}$: target portfolio return

The constants $A$, $B$, and $C$ depend only on the assets' expected returns and covariance structure, not on the target return. This means we can precompute them once and then quickly find optimal portfolios for any target return, which makes tracing out the frontier computationally cheap.
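
The derivation above translates almost line by line into code. The following is a minimal sketch under the assumptions of the derivation (no bounds on weights, invertible covariance matrix, expected returns that are not all equal so that $AC - B^2 > 0$). The function name `frontier_weights` and the asset numbers are illustrative choices.

```python
import numpy as np


def frontier_weights(mu, cov, target_return):
    """Minimum-variance weights for a target return (unconstrained weights)."""
    ones = np.ones(len(mu))
    inv_mu = np.linalg.solve(cov, mu)       # Sigma^{-1} mu
    inv_ones = np.linalg.solve(cov, ones)   # Sigma^{-1} 1

    # Scalar constants from the derivation
    A = mu @ inv_mu
    B = mu @ inv_ones
    C = ones @ inv_ones
    D = A * C - B**2

    # Lagrange multipliers for the return and budget constraints
    lam1 = (C * target_return - B) / D
    lam2 = (A - B * target_return) / D

    return lam1 * inv_mu + lam2 * inv_ones


# Hypothetical three-asset inputs (illustrative only)
mu = np.array([0.08, 0.03, 0.05])
sigma = np.array([0.18, 0.05, 0.20])
corr = np.array([
    [1.0, 0.1, 0.3],
    [0.1, 1.0, 0.0],
    [0.3, 0.0, 1.0],
])
cov = np.outer(sigma, sigma) * corr

target = 0.06
w_opt = frontier_weights(mu, cov, target)

print("Weights:", np.round(w_opt, 4))
print(f"Sum of weights:  {w_opt.sum():.6f}")
print(f"Expected return: {w_opt @ mu:.6f}")
print(f"Volatility:      {np.sqrt(w_opt @ cov @ w_opt):.4f}")
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a common numerical habit; for a small, well-conditioned matrix either approach gives the same result.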

The Global Minimum Variance Portfolio

Setting $\lambda_1 = 0$ (ignoring the return target), we get the global minimum variance (GMV) portfolio:

$$\mathbf{w}_{\text{GMV}} = \frac{\Sigma^{-1} \mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}$$

where:

  • $\mathbf{w}_{\text{GMV}}$: weights of the global minimum variance portfolio
  • $\Sigma^{-1}$: inverse of the covariance matrix
  • $\mathbf{1}$: vector of ones

This formula represents risk minimization in its most basic form. The term $\Sigma^{-1}\mathbf{1}$ identifies the portfolio structure that minimizes variance without regard for expected returns, while the denominator normalizes the weights to sum to one. Assets with lower variance and lower correlation to other assets receive higher weights, as they contribute less to overall portfolio risk.

This portfolio has the lowest possible variance among all portfolios satisfying the budget constraint. It sits at the leftmost point of the efficient frontier, representing the safest allocation available without going to cash. If you care only about minimizing uncertainty (perhaps because you have very short horizons or extreme risk aversion), the GMV portfolio is optimal regardless of expected returns.

In[7]:
Code
import numpy as np


def global_minimum_variance_portfolio(cov_matrix):
    """
    Compute the global minimum variance portfolio weights.

    Parameters
    ----------
    cov_matrix : ndarray
        n x n covariance matrix of asset returns

    Returns
    -------
    weights : ndarray
        1-D array of length n with the optimal weights
    """
    n = cov_matrix.shape[0]
    ones = np.ones(n)

    # Compute inverse covariance matrix
    cov_inv = np.linalg.inv(cov_matrix)

    # GMV weights: Σ⁻¹1 / (1'Σ⁻¹1)
    weights = cov_inv @ ones / (ones @ cov_inv @ ones)

    return weights
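
A few properties of the GMV portfolio can be checked numerically. The sketch below is self-contained and uses hypothetical three-asset inputs; it confirms that the GMV variance equals $1 / (\mathbf{1}^\top \Sigma^{-1} \mathbf{1})$, that every asset has the same covariance with the GMV portfolio, and that random budget-preserving perturbations never reduce the variance. The perturbation size and random seed are arbitrary.

```python
import numpy as np

# Hypothetical three-asset inputs (illustrative only)
sigma = np.array([0.18, 0.05, 0.20])
corr = np.array([
    [1.0, 0.1, 0.3],
    [0.1, 1.0, 0.0],
    [0.3, 0.0, 1.0],
])
cov = np.outer(sigma, sigma) * corr

ones = np.ones(len(sigma))
inv_ones = np.linalg.solve(cov, ones)   # Sigma^{-1} 1 without forming the inverse
C = ones @ inv_ones
w_gmv = inv_ones / C

gmv_variance = w_gmv @ cov @ w_gmv
marginal = cov @ w_gmv                  # covariance of each asset with the GMV portfolio

# Perturb the weights in random directions that keep the sum equal to one
rng = np.random.default_rng(0)
alt_variances = []
for _ in range(1000):
    d = rng.normal(size=len(sigma))
    d -= d.mean()                       # direction sums to zero
    w_alt = w_gmv + 0.05 * d
    alt_variances.append(w_alt @ cov @ w_alt)
min_alt_variance = min(alt_variances)

print("GMV weights:", np.round(w_gmv, 4))
print(f"GMV variance: {gmv_variance:.6f}, 1/C: {1 / C:.6f}")
print("Sigma @ w_gmv:", np.round(marginal, 6))
print(f"Smallest perturbed variance: {min_alt_variance:.6f}")
```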

The Efficient Frontier

The efficient frontier is the set of all portfolios that offer the maximum expected return for each level of risk. Mathematically, it's the upper boundary of the feasible set in risk-return space. This concept is central to Modern Portfolio Theory because it delineates the best possible tradeoffs available to investors.

To understand the efficient frontier, imagine plotting every possible portfolio in risk-return space. The portfolios that satisfy the budget constraint and any position limits fill out a feasible region. Some portfolios lie on the boundary of this region, while others lie in the interior. Interior portfolios are dominated by boundary portfolios that offer either higher return for the same risk or lower risk for the same return. The efficient frontier consists of all portfolios that are not dominated by any other portfolio.

Efficient Frontier

A portfolio is efficient if no other portfolio offers higher expected return for the same risk, or lower risk for the same expected return. The collection of all efficient portfolios forms the efficient frontier.
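
The distinction between efficient and dominated frontier portfolios can be illustrated with the constants $A$, $B$, and $C$ defined earlier. Substituting the optimal weights into the variance gives the standard result $\sigma_p^2 = (C\mu_{\text{target}}^2 - 2B\mu_{\text{target}} + A) / (AC - B^2)$ for unconstrained weights, which is symmetric around the GMV return $B/C$. The sketch below uses hypothetical inputs and an arbitrary offset of 2 percentage points.

```python
import numpy as np

# Hypothetical three-asset inputs (illustrative only)
mu = np.array([0.08, 0.03, 0.05])
sigma = np.array([0.18, 0.05, 0.20])
corr = np.array([
    [1.0, 0.1, 0.3],
    [0.1, 1.0, 0.0],
    [0.3, 0.0, 1.0],
])
cov = np.outer(sigma, sigma) * corr

ones = np.ones(len(mu))
inv_mu = np.linalg.solve(cov, mu)
inv_ones = np.linalg.solve(cov, ones)
A = mu @ inv_mu
B = mu @ inv_ones
C = ones @ inv_ones
D = A * C - B**2


def frontier_variance(target_return):
    """Minimum achievable variance for a given target return (no bounds)."""
    return (C * target_return**2 - 2 * B * target_return + A) / D


mu_gmv = B / C                 # expected return of the GMV portfolio
upper = mu_gmv + 0.02          # target on the upper (efficient) branch
lower = mu_gmv - 0.02          # mirror-image target on the lower branch

var_upper = frontier_variance(upper)
var_lower = frontier_variance(lower)

print(f"GMV return: {mu_gmv:.4f}")
print(f"Upper branch: return={upper:.4f}, vol={np.sqrt(var_upper):.4f}")
print(f"Lower branch: return={lower:.4f}, vol={np.sqrt(var_lower):.4f}")
```

Both targets carry the same risk, but the upper-branch portfolio earns a higher expected return, so the lower-branch portfolio is dominated and therefore not efficient.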

Two-Fund Separation Theorem

A remarkable property of mean-variance optimization is the two-fund separation theorem: any efficient portfolio can be expressed as a linear combination of two distinct efficient portfolios. This means efficiency can be achieved regardless of risk preferences by holding different combinations of just two mutual funds.

The theorem has a direct practical implication for portfolio management. It suggests that an asset management firm need only offer two efficiently managed funds, and an investor can achieve an optimal portfolio by combining these funds in proportions determined by their risk tolerance. This separation between the "production" of efficient portfolios and the investor's "consumption" choices simplifies the investment process dramatically.

Mathematically, if $\mathbf{w}_1$ and $\mathbf{w}_2$ are any two distinct efficient portfolios, then for any scalar $\alpha$ the combination below lies on the minimum-variance frontier, and it is efficient whenever its expected return is at least that of the global minimum variance portfolio:

$$\mathbf{w} = \alpha \mathbf{w}_1 + (1 - \alpha) \mathbf{w}_2$$

where:

  • $\mathbf{w}$: the combined frontier portfolio
  • $\alpha$: weighting scalar
  • $\mathbf{w}_1, \mathbf{w}_2$: two distinct efficient portfolios

The parameter $\alpha$ determines where along the frontier the combined portfolio sits. When $\alpha = 1$, we hold only portfolio 1; when $\alpha = 0$, we hold only portfolio 2; values between 0 and 1 produce intermediate portfolios, all of which are efficient. Values outside this range (which require short selling one of the funds) reach frontier portfolios beyond the original two, including, for extreme enough values, points on the inefficient lower branch.
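
Two-fund separation can be demonstrated numerically: blending two frontier portfolios reproduces the minimum-variance portfolio at the blended target return. The sketch below assumes unconstrained weights and uses the closed-form solution with the constants $A$, $B$, and $C$; the function name, the two target returns, and the blending weight are illustrative choices.

```python
import numpy as np


def frontier_weights(mu, cov, target_return):
    """Closed-form minimum-variance weights for a target return (no bounds)."""
    ones = np.ones(len(mu))
    inv_mu = np.linalg.solve(cov, mu)
    inv_ones = np.linalg.solve(cov, ones)
    A, B, C = mu @ inv_mu, mu @ inv_ones, ones @ inv_ones
    D = A * C - B**2
    lam1 = (C * target_return - B) / D
    lam2 = (A - B * target_return) / D
    return lam1 * inv_mu + lam2 * inv_ones


# Hypothetical three-asset inputs (illustrative only)
mu = np.array([0.08, 0.03, 0.05])
sigma = np.array([0.18, 0.05, 0.20])
corr = np.array([
    [1.0, 0.1, 0.3],
    [0.1, 1.0, 0.0],
    [0.3, 0.0, 1.0],
])
cov = np.outer(sigma, sigma) * corr

# Two "funds": frontier portfolios at 4% and 7% target return
w1 = frontier_weights(mu, cov, 0.04)
w2 = frontier_weights(mu, cov, 0.07)

# Blend the two funds
alpha = 0.25
w_blend = alpha * w1 + (1 - alpha) * w2
blend_return = w_blend @ mu

# Solve directly for the frontier portfolio at the blended return
w_direct = frontier_weights(mu, cov, blend_return)

print("Blended weights:", np.round(w_blend, 4))
print("Direct weights: ", np.round(w_direct, 4))
print(f"Blended return:  {blend_return:.4f}")
```

The match is exact up to rounding because the closed-form frontier weights are an affine function of the target return.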

Computing the Frontier

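Let's implement efficient frontier computation numerically. Each frontier point solves a quadratic program (minimize variance subject to linear constraints), which we hand to SciPy's general-purpose SLSQP solver: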

In[8]:
Code
import numpy as np
from scipy.optimize import minimize


def efficient_frontier(mu, cov_matrix, n_points=50, allow_short=True):
    """
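    Trace the minimum-variance frontier by numerical optimization.

    The returned points cover both branches of the frontier; only those with
    expected return at or above the minimum-variance point are efficient.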

    Parameters
    ----------
    mu : ndarray
        Expected returns for each asset
    cov_matrix : ndarray
        Covariance matrix of returns
    n_points : int
        Number of points on the frontier
    allow_short : bool
        Whether to allow short selling

    Returns
    -------
    frontier_returns : ndarray
        Expected returns of frontier portfolios
    frontier_risks : ndarray
        Standard deviations of frontier portfolios
    frontier_weights : ndarray
        Weights of frontier portfolios (n_points x n_assets)
    """
    n_assets = len(mu)

    def portfolio_variance(w):
        return w @ cov_matrix @ w

    def portfolio_return(w):
        return w @ mu

    # Constraint: weights sum to 1
    constraints = [{"type": "eq", "fun": lambda w: np.sum(w) - 1}]

    # Bounds
    if allow_short:
        bounds = tuple((-1, 2) for _ in range(n_assets))
    else:
        bounds = tuple((0, 1) for _ in range(n_assets))

    # Find minimum and maximum feasible returns
    min_ret = minimize(
        lambda w: portfolio_return(w),
        x0=np.ones(n_assets) / n_assets,
        bounds=bounds,
        constraints=constraints,
    ).fun
    max_ret = (
        minimize(
            lambda w: -portfolio_return(w),
            x0=np.ones(n_assets) / n_assets,
            bounds=bounds,
            constraints=constraints,
        ).fun
        * -1
    )

    # Defensive check: both values are already sign-corrected above, so this
    # swap only triggers if one of the solver runs failed to converge
    if min_ret > max_ret:
        min_ret, max_ret = max_ret, min_ret

    target_returns = np.linspace(min_ret, max_ret, n_points)

    frontier_returns = []
    frontier_risks = []
    frontier_weights = []

    for target in target_returns:
        # Add return constraint
        cons = constraints + [
            {"type": "eq", "fun": lambda w, t=target: portfolio_return(w) - t}
        ]

        result = minimize(
            portfolio_variance,
            x0=np.ones(n_assets) / n_assets,
            bounds=bounds,
            constraints=cons,
            method="SLSQP",
        )

        if result.success:
            frontier_returns.append(portfolio_return(result.x))
            frontier_risks.append(np.sqrt(portfolio_variance(result.x)))
            frontier_weights.append(result.x)

    return (
        np.array(frontier_returns),
        np.array(frontier_risks),
        np.array(frontier_weights),
    )

Worked Example: Three-Asset Portfolio

Let's construct an efficient frontier for a portfolio of stocks, bonds, and commodities. This example illustrates how different asset classes with distinct risk-return profiles and correlations combine to form a diversified investment opportunity set:

In[9]:
Code
import numpy as np

# Asset parameters (annualized)
# Typical values for stocks, bonds, and commodities
asset_names = ["Stocks", "Bonds", "Commodities"]

# Expected returns
mu = np.array([0.08, 0.03, 0.05])  # 8%, 3%, 5%

# Volatilities
sigma = np.array([0.18, 0.05, 0.20])  # 18%, 5%, 20%

# Correlation matrix
corr = np.array(
    [
        [1.0, 0.1, 0.3],  # Stocks
        [0.1, 1.0, 0.0],  # Bonds
        [0.3, 0.0, 1.0],  # Commodities
    ]
)

# Convert to covariance matrix: Σ = diag(σ) × Corr × diag(σ)
cov_matrix = np.diag(sigma) @ corr @ np.diag(sigma)
Out[10]:
Console
Covariance Matrix:
[[0.0324 0.0009 0.0108]
 [0.0009 0.0025 0.    ]
 [0.0108 0.     0.04  ]]

Asset Statistics:
  Stocks: μ = 8.0%, σ = 18.0%
  Bonds: μ = 3.0%, σ = 5.0%
  Commodities: μ = 5.0%, σ = 20.0%

The covariance matrix confirms the relationships between our assets: stocks and commodities have higher volatility and a moderate positive correlation (0.3), while bonds offer stability with low volatility and low correlation to the other assets. The near-zero correlations between bonds and the other asset classes suggest that bonds will play a crucial role in diversifying portfolio risk.

Out[11]:
Visualization
Heatmap showing correlation values between stocks, bonds, and commodities.
Correlation matrix heatmap for the three-asset portfolio. Bonds show low or zero correlation with both stocks and commodities (0.1 and 0.0), making them valuable for diversification, in contrast to the moderate 0.3 correlation between stocks and commodities.

Now let's compute and visualize the efficient frontier:

In[12]:
Code
# Compute efficient frontier
frontier_returns, frontier_risks, frontier_weights = efficient_frontier(
    mu, cov_matrix, n_points=100, allow_short=True
)

# Find global minimum variance portfolio
gmv_weights = global_minimum_variance_portfolio(cov_matrix)
gmv_return = gmv_weights @ mu
gmv_risk = np.sqrt(gmv_weights @ cov_matrix @ gmv_weights)
Out[13]:
Visualization
Scatter plot with efficient frontier curve and individual asset positions marked.
Efficient frontier for a three-asset portfolio showing the optimal risk-return tradeoff. The upper branch represents efficient portfolios; portfolios below the minimum variance point are inefficient. Individual assets (Stocks, Bonds, and Commodities) lie inside the frontier, demonstrating how diversification creates superior risk-return combinations compared to holding single securities.

The efficient frontier curve illustrates the optimal risk-return combinations available to investors. The individual assets (red dots) lie inside the frontier, confirming that portfolios combining imperfectly correlated assets offer superior efficiency compared to holding single securities. Notice how the frontier bows outward to the left, demonstrating that diversification creates portfolios with better risk-return characteristics than any individual asset.

Out[14]:
Console
Global Minimum Variance Portfolio:
  Stocks: 2.95%
  Bonds: 92.03%
  Commodities: 5.02%

  Expected Return: 3.25%
  Standard Deviation: 4.82%

The GMV portfolio is heavily weighted toward bonds because of their low volatility. Notice that the GMV portfolio achieves lower risk than any individual asset, demonstrating the power of diversification. Even the lowest-risk asset (bonds at 5% volatility) has higher risk than the optimally diversified GMV portfolio.
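
The weights above can be cross-checked against the closed-form expression $\Sigma^{-1}\mathbf{1} / (\mathbf{1}^\top \Sigma^{-1}\mathbf{1})$. The following is a minimal standalone sketch that rebuilds the covariance matrix from this example's inputs; the helper name `gmv_closed_form` is illustrative and is not the `global_minimum_variance_portfolio` function used in the chapter.

```python
import numpy as np

sigma = np.array([0.18, 0.05, 0.20])
corr = np.array([[1.0, 0.1, 0.3], [0.1, 1.0, 0.0], [0.3, 0.0, 1.0]])
cov = np.diag(sigma) @ corr @ np.diag(sigma)


def gmv_closed_form(cov):
    """Hypothetical helper: unconstrained GMV weights (short selling allowed)."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # Sigma^{-1} 1 without forming the inverse
    return raw / raw.sum()            # rescale so the weights sum to one


w_gmv = gmv_closed_form(cov)
gmv_vol = np.sqrt(w_gmv @ cov @ w_gmv)

print(np.round(w_gmv * 100, 2))
print(f"GMV volatility: {gmv_vol:.2%}, lowest single-asset volatility: {sigma.min():.2%}")
```

Solving the linear system rather than inverting $\Sigma$ explicitly is a common numerical choice; both give the same weights for a well-conditioned matrix like this one.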

Portfolio Composition Along the Frontier

Understanding how portfolio weights change along the efficient frontier provides insight into the risk-return tradeoff. As you move up the frontier seeking higher returns, you must accept more risk, and the portfolio composition shifts accordingly:

Out[15]:
Visualization
Stacked area chart showing portfolio weights for stocks, bonds, and commodities.
Asset weight distribution across the efficient frontier for a three-asset portfolio. As the target return increases, the allocation shifts significantly from conservative bonds toward higher-return stocks and commodities, illustrating the risk-return tradeoff. The vertical dashed line marks the global minimum variance portfolio, which is heavily dominated by bond holdings due to their low volatility.

At low target returns (near the GMV portfolio), the allocation is dominated by bonds. As we move up the frontier toward higher returns, the portfolio shifts toward stocks, with commodities providing additional diversification benefits due to their low correlation with bonds. The smooth transition in weights illustrates how the efficient allocation changes continuously as your return target increases.

Quadratic Programming for Portfolio Optimization

For practical applications with many assets and constraints, we use quadratic programming (QP). Quadratic programming is a mathematical optimization technique designed specifically for problems where the objective function is quadratic (involving squared terms) and the constraints are linear. Mean-variance optimization fits this structure perfectly because portfolio variance is a quadratic function of the weights.

The standard QP form is:

$$
\begin{aligned}
\min_{\mathbf{x}} \quad & \frac{1}{2}\mathbf{x}^\top Q \mathbf{x} + \mathbf{c}^\top \mathbf{x} \\
\text{subject to} \quad & A_{\text{eq}}\mathbf{x} = \mathbf{b}_{\text{eq}} \\
& A_{\text{ineq}}\mathbf{x} \leq \mathbf{b}_{\text{ineq}}
\end{aligned}
$$

where:

  • $\mathbf{x}$: vector of decision variables
  • $Q$: symmetric matrix (quadratic term)
  • $\mathbf{c}$: vector (linear term)
  • $A_{\text{eq}}, \mathbf{b}_{\text{eq}}$: matrix and vector defining equality constraints
  • $A_{\text{ineq}}, \mathbf{b}_{\text{ineq}}$: matrix and vector defining inequality constraints

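Mean-variance optimization maps directly to this form with $Q = \Sigma$ and $\mathbf{c} = \mathbf{0}$. For the utility maximization variant, maximizing $\mathbf{w}^\top \boldsymbol{\mu} - \frac{\gamma}{2}\mathbf{w}^\top \Sigma \mathbf{w}$ is equivalent to minimizing $\frac{1}{2}\mathbf{w}^\top \Sigma \mathbf{w} - \frac{1}{\gamma}\mathbf{w}^\top \boldsymbol{\mu}$, so $Q = \Sigma$ and $\mathbf{c} = -\boldsymbol{\mu}/\gamma$. The equality constraints enforce the budget constraint and target return, while inequality constraints can encode position limits, sector constraints, and other practical requirements.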
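
When only equality constraints are present, the QP reduces to a single linear system given by its KKT (first-order optimality) conditions. The sketch below illustrates the mapping on this chapter's three-asset inputs; the function name `solve_equality_qp` and the 5% target return are assumptions made for illustration, and inequality constraints would require a proper QP solver instead.

```python
import numpy as np


def solve_equality_qp(Q, c, A_eq, b_eq):
    """Solve min 0.5 x'Qx + c'x  s.t.  A_eq x = b_eq via the KKT linear system.

    Illustrative helper; assumes Q is positive definite and A_eq has full row rank.
    """
    n, m = Q.shape[0], A_eq.shape[0]
    # Stationarity: Qx + c + A_eq' lam = 0 ; feasibility: A_eq x = b_eq
    kkt = np.block([[Q, A_eq.T], [A_eq, np.zeros((m, m))]])
    rhs = np.concatenate([-c, b_eq])
    solution = np.linalg.solve(kkt, rhs)
    return solution[:n]  # the remaining m entries are Lagrange multipliers


mu = np.array([0.08, 0.03, 0.05])
sigma = np.array([0.18, 0.05, 0.20])
corr = np.array([[1.0, 0.1, 0.3], [0.1, 1.0, 0.0], [0.3, 0.0, 1.0]])
cov = np.diag(sigma) @ corr @ np.diag(sigma)

target = 0.05                        # arbitrary target return for illustration
A_eq = np.vstack([np.ones(3), mu])   # rows: budget constraint, return constraint
b_eq = np.array([1.0, target])

w = solve_equality_qp(Q=cov, c=np.zeros(3), A_eq=A_eq, b_eq=b_eq)
print(np.round(w, 4), w.sum(), w @ mu)
```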

Adding Practical Constraints

Real portfolio optimization often includes additional constraints that reflect regulatory requirements, risk management policies, or investment mandates:

  • No short selling: $w_i \geq 0$ for all $i$. This constraint prevents the portfolio from taking negative positions, which an investor cannot or chooses not to take.
  • Maximum position size: $w_i \leq w_{\max}$ (e.g., 10% per asset). This ensures diversification by preventing excessive concentration in any single security.
  • Sector constraints: Total exposure to a sector $\leq$ some limit. This prevents the portfolio from becoming overly exposed to a single industry or market segment.
  • Turnover constraints: Changes from current portfolio are limited. This controls transaction costs and tax consequences by restricting how much trading occurs.
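
The first three constraints in the list are linear in the weights, so they can be stacked into the $A_{\text{ineq}}\mathbf{x} \leq \mathbf{b}_{\text{ineq}}$ form. The sketch below shows one way to do that for four hypothetical assets. The sector assignment, the 40% cap, the 60% sector limit, and the 20% turnover limit are all made-up values. The turnover constraint involves absolute values, so this sketch simply checks it separately rather than encoding it in the matrix.

```python
import numpy as np

n = 4
w_max = 0.40                          # assumed per-asset cap
sector_a = np.array([1, 1, 0, 0])     # hypothetical: assets 0 and 1 share a sector
sector_limit = 0.60                   # assumed sector cap
w_current = np.array([0.25, 0.25, 0.25, 0.25])
max_turnover = 0.20                   # assumed limit on sum of |w - w_current|

# Stack the linear constraints as A_ineq @ w <= b_ineq
A_ineq = np.vstack([
    -np.eye(n),        # -w_i <= 0      (no short selling)
    np.eye(n),         #  w_i <= w_max  (maximum position size)
    sector_a,          #  sum of sector weights <= sector_limit
])
b_ineq = np.concatenate([np.zeros(n), np.full(n, w_max), [sector_limit]])


def is_feasible(w, tol=1e-9):
    """Check budget, linear inequality, and turnover constraints for candidate w."""
    budget_ok = abs(w.sum() - 1.0) <= tol
    linear_ok = np.all(A_ineq @ w <= b_ineq + tol)
    turnover_ok = np.abs(w - w_current).sum() <= max_turnover + tol
    return bool(budget_ok and linear_ok and turnover_ok)


print(is_feasible(np.array([0.28, 0.27, 0.23, 0.22])))  # within every limit
print(is_feasible(np.array([0.40, 0.30, 0.20, 0.10])))  # breaks sector and turnover limits
```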

Let's implement a more realistic optimizer using the cvxpy library:

In[16]:
Code
# uv pip install cvxpy
import cvxpy as cp
import numpy as np


def mean_variance_optimize(
    mu,
    cov_matrix,
    target_return=None,
    risk_aversion=None,
    max_weight=None,
    min_weight=0.0,
    allow_short=False,
):
    """
    Solve mean-variance optimization with practical constraints.

    Parameters
    ----------
    mu : ndarray
        Expected returns
    cov_matrix : ndarray
        Covariance matrix
    target_return : float, optional
        Target portfolio return (for min variance formulation)
    risk_aversion : float, optional
        Risk aversion parameter (for max utility formulation)
    max_weight : float, optional
        Maximum weight per asset
    min_weight : float, optional
        Minimum weight per asset (default 0); only enforced when
        allow_short is False
    allow_short : bool
        Allow negative weights (disables the min_weight bound)

    Returns
    -------
    weights : ndarray
        Optimal portfolio weights
    """
    n = len(mu)
    w = cp.Variable(n)

    # Convert to numpy arrays and ensure proper shape
    mu_arr = np.asarray(mu).flatten()
    cov_arr = np.asarray(cov_matrix)

    # Ensure covariance matrix is PSD by adding small regularization
    cov_arr = cov_arr + np.eye(len(mu_arr)) * 1e-8

    # Portfolio variance (objective to minimize)
    portfolio_variance = cp.quad_form(w, cov_arr)
    portfolio_return = mu_arr @ w

    # Constraints
    constraints = [cp.sum(w) == 1]

    if not allow_short:
        constraints.append(w >= min_weight)

    if max_weight is not None:
        constraints.append(w <= max_weight)

    # Choose formulation
    if target_return is not None:
        # Minimize variance for target return
        constraints.append(portfolio_return >= target_return)
        objective = cp.Minimize(portfolio_variance)
    elif risk_aversion is not None:
        # Maximize utility: return - (γ/2) * variance
        objective = cp.Maximize(
            portfolio_return - (risk_aversion / 2) * portfolio_variance
        )
    else:
        # Default: minimize variance (GMV portfolio)
        objective = cp.Minimize(portfolio_variance)

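    problem = cp.Problem(objective, constraints)
    problem.solve(solver=cp.SCS)

    # w.value is None when the problem is infeasible or the solver fails
    if problem.status not in ("optimal", "optimal_inaccurate"):
        raise ValueError(f"Optimization failed with status: {problem.status}")

    return w.value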
In[17]:
Code
# Unconstrained GMV (allows short selling)
w_unconstrained = global_minimum_variance_portfolio(cov_matrix)

# Constrained: no short selling
w_long_only = mean_variance_optimize(mu, cov_matrix, allow_short=False)

# Constrained: max 40% per asset, no short selling
max_pos = 0.4
w_max_40 = mean_variance_optimize(
    mu, cov_matrix, allow_short=False, max_weight=max_pos
)

# Calculate performance metrics
ret_unconstrained = w_unconstrained @ mu
risk_unconstrained = np.sqrt(w_unconstrained @ cov_matrix @ w_unconstrained)

ret_long_only = w_long_only @ mu
risk_long_only = np.sqrt(w_long_only @ cov_matrix @ w_long_only)

ret_max_40 = w_max_40 @ mu
risk_max_40 = np.sqrt(w_max_40 @ cov_matrix @ w_max_40)
Out[18]:
Console
Portfolio Weights Comparison:
------------------------------------------------------------
Asset             Unconstrained       Long-Only         Max 40%
------------------------------------------------------------
Stocks                    2.95%           2.95%          33.78%
Bonds                    92.03%          92.03%          40.00%
Commodities               5.02%           5.02%          26.22%
------------------------------------------------------------

Unconstrained:
  Return: 3.25%, Risk: 4.82%

Long-Only:
  Return: 3.25%, Risk: 4.82%

Max 40%:
  Return: 5.21%, Risk: 9.49%
Out[19]:
Visualization
Grouped bar chart comparing weights for three assets across three optimization scenarios.
Comparison of portfolio weights under different constraint regimes. The long-only weights coincide with the unconstrained GMV weights because the unconstrained solution contains no short positions. The 40% position limit forces a more even allocation, at the cost of higher risk.

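The two constraints behave very differently here. The long-only portfolio is identical to the unconstrained GMV portfolio because the unconstrained solution already has no short positions, so the constraint is not binding. The 40% position limit, by contrast, forces capital out of bonds and roughly doubles portfolio risk, from 4.82% to 9.49%. This illustrates a general result: adding constraints can never improve the optimal objective value. Constraints restrict the feasible region, preventing the optimizer from reaching solutions that might otherwise be optimal. A non-binding constraint costs nothing, while a binding one shows up as higher portfolio risk. That cost may be acceptable given the practical benefits of avoiding short positions or limiting concentration.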
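
The effect of shrinking the feasible region can be seen without any solver. The sketch below is a brute-force illustration, not the cvxpy approach used above: it enumerates every long-only portfolio of the three assets on a 1% grid and picks the minimum-variance point with and without a 40% cap. The grid resolution and the helper name `best_on_grid` are assumptions for illustration.

```python
import numpy as np

sigma = np.array([0.18, 0.05, 0.20])
corr = np.array([[1.0, 0.1, 0.3], [0.1, 1.0, 0.0], [0.3, 0.0, 1.0]])
cov = np.diag(sigma) @ corr @ np.diag(sigma)

# Enumerate long-only weights on a 1% grid (integer percentages that sum to 100)
grid = np.array([(i, j, 100 - i - j) for i in range(101) for j in range(101 - i)])
weights = grid / 100.0
variances = np.einsum("ij,jk,ik->i", weights, cov, weights)  # w' Sigma w per row


def best_on_grid(mask):
    """Return the minimum-variance grid portfolio among rows where mask is True."""
    idx = np.flatnonzero(mask)
    best = idx[np.argmin(variances[idx])]
    return weights[best], np.sqrt(variances[best])


w_long, vol_long = best_on_grid(np.ones(len(grid), dtype=bool))
w_cap, vol_cap = best_on_grid((grid <= 40).all(axis=1))

print("long-only:", w_long, f"{vol_long:.2%}")
print("40% cap:  ", w_cap, f"{vol_cap:.2%}")
```

Because the capped candidates are a subset of the long-only candidates, the capped minimum can never be lower than the long-only minimum.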

The Mathematics of Diversification

Diversification is the only "free lunch" in finance. To understand why, let's decompose portfolio variance more carefully. The key insight is that not all risk is equal: some risk affects all assets together (systematic risk), while other risk is unique to individual securities (idiosyncratic risk). Diversification can eliminate the latter but not the former.

Variance Decomposition

Consider an equally-weighted portfolio of $n$ assets with identical variance $\sigma^2$ and pairwise correlation $\rho$. While this is a simplification, it reveals the essential mechanics of diversification. The portfolio variance is:

$$
\sigma_p^2 = \sum_{i=1}^n \sum_{j=1}^n \frac{1}{n^2} \sigma_{ij}
$$

where:

  • $\sigma_p^2$: variance of the portfolio
  • $n$: number of assets
  • $\sigma_{ij}$: covariance between asset $i$ and asset $j$

For the diagonal terms ($i = j$): $\sigma_{ii} = \sigma^2$

For the off-diagonal terms ($i \neq j$): $\sigma_{ij} = \rho \sigma^2$

There are $n$ diagonal terms and $n(n-1)$ off-diagonal terms. The diagonal terms represent each asset's own variance contribution, while the off-diagonal terms capture the covariance contributions from all pairs of different assets:

$$
\begin{aligned}
\sigma_p^2 &= \frac{n \sigma^2}{n^2} + \frac{n(n-1) \rho \sigma^2}{n^2} \\
&= \frac{\sigma^2}{n} + \frac{n-1}{n} \rho \sigma^2 && \text{(simplify fractions)}
\end{aligned}
$$

where:

  • $\sigma^2$: variance of each individual asset
  • $\rho$: pairwise correlation between all assets
  • $n$: number of assets in the portfolio

This decomposition separates portfolio variance into two distinct components. The first term, $\sigma^2/n$, represents idiosyncratic risk. It depends on the number of assets and shrinks as $n$ increases. This is the diversifiable component of risk. The second term, $\frac{n-1}{n}\rho\sigma^2$, represents systematic risk. It depends on the correlation between assets and persists regardless of how many assets we hold. Equivalently, the two terms can be regrouped as $\rho\sigma^2 + (1-\rho)\sigma^2/n$, a constant floor plus a remainder that vanishes as $n$ grows.

As $n \to \infty$:

$$
\sigma_p^2 \to \rho \sigma^2
$$

where:

  • $\sigma_p^2$: limiting portfolio variance
  • $\rho$: pairwise correlation between assets
  • $\sigma^2$: variance of individual assets

This reveals a fundamental insight: diversification can eliminate idiosyncratic risk (the $\sigma^2/n$ term that vanishes with $n$), but cannot eliminate systematic risk (the $\rho \sigma^2$ term that persists). No matter how many assets we add to our portfolio, we cannot diversify away the risk that comes from common factors affecting all assets simultaneously. When markets crash, all stocks tend to fall together; this correlated movement represents systematic risk that no amount of stock picking can eliminate.
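
As a sanity check on the algebra, the closed-form expression can be compared with a direct evaluation of $\mathbf{w}^\top \Sigma \mathbf{w}$ on an explicitly constructed covariance matrix. This is a small sketch with illustrative parameter values ($\sigma = 0.25$, $\rho = 0.3$); the function name is made up for this example.

```python
import numpy as np


def equal_weight_variance_from_matrix(n, sigma, rho):
    """Portfolio variance w' Sigma w for n equally weighted, equicorrelated assets."""
    cov = np.full((n, n), rho * sigma**2)   # off-diagonal covariances
    np.fill_diagonal(cov, sigma**2)         # diagonal variances
    w = np.full(n, 1.0 / n)
    return w @ cov @ w


sigma, rho = 0.25, 0.3  # illustrative values
for n in (1, 5, 20, 100, 1000):
    direct = equal_weight_variance_from_matrix(n, sigma, rho)
    formula = sigma**2 / n + (n - 1) / n * rho * sigma**2
    print(f"n={n:5d}  matrix={direct:.6f}  formula={formula:.6f}  floor={rho * sigma**2:.6f}")
```

The two columns should agree to floating-point precision, and both approach the floor $\rho\sigma^2$ as $n$ grows.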

In[20]:
Code
import numpy as np


def calculate_portfolio_variance(n, sigma, rho):
    """Calculate portfolio variance for an equally weighted portfolio."""
    return (sigma**2 / n) + ((n - 1) / n) * rho * sigma**2


# Define parameters for visualization
sigma = 0.25
rho_values = [0.0, 0.2, 0.4, 0.6]
n_assets = np.arange(1, 101)
Out[21]:
Visualization
Line chart showing portfolio standard deviation decreasing as number of assets increases.
Portfolio risk reduction as the number of assets increases across different correlation levels. The horizontal dashed lines represent systematic risk, which persists as a floor that cannot be eliminated through diversification, regardless of the number of assets held. Higher correlation (rho) raises this floor, limiting the total possible risk reduction.
Out[22]:
Visualization
Stacked area chart showing how portfolio variance decomposes into two components.
Decomposition of portfolio variance into diversifiable (idiosyncratic) and non-diversifiable (systematic) components. As portfolio size increases, idiosyncratic risk vanishes while systematic risk remains constant, showing that diversification only eliminates asset-specific uncertainty.

Systematic vs. Idiosyncratic Risk

Systematic risk (also called market risk or undiversifiable risk) affects all assets and cannot be eliminated through diversification. Idiosyncratic risk (also called specific or diversifiable risk) is unique to individual assets and approaches zero as portfolio size increases. We'll explore this distinction further in the next chapter on the Capital Asset Pricing Model.

How Many Assets Are Enough?

The figure shows that most diversification benefits are captured with 20-30 assets, depending on correlation. Beyond that, adding assets provides diminishing marginal benefit. This explains why well-diversified portfolios don't need hundreds of positions. The mathematics tells us that each additional asset beyond a certain point contributes very little to further risk reduction, while potentially adding complexity and transaction costs.

In[23]:
Code
# Calculate percentage of diversification benefit captured
sigma = 0.25
rho = 0.3  # Moderate correlation


def pct_diversification(n, sigma, rho):
    """Calculate percentage of diversifiable risk eliminated."""
    single_asset_var = sigma**2
    portfolio_var = (sigma**2 / n) + ((n - 1) / n) * rho * sigma**2
    systematic_var = rho * sigma**2
    diversifiable_var = single_asset_var - systematic_var
    eliminated_var = single_asset_var - portfolio_var
    return eliminated_var / diversifiable_var * 100


# Calculate benefits for various portfolio sizes
asset_counts = [5, 10, 20, 30, 50, 100]
diversification_benefits = {
    n: pct_diversification(n, sigma, rho) for n in asset_counts
}
Out[24]:
Console
Diversification Benefit Captured (ρ = 0.3):
----------------------------------------
    5 assets:  80.00% of diversifiable risk eliminated
   10 assets:  90.00% of diversifiable risk eliminated
   20 assets:  95.00% of diversifiable risk eliminated
   30 assets:  96.67% of diversifiable risk eliminated
   50 assets:  98.00% of diversifiable risk eliminated
  100 assets:  99.00% of diversifiable risk eliminated
Out[25]:
Visualization
Line chart showing cumulative diversification benefit rising quickly then leveling off.
Cumulative elimination of diversifiable risk as a function of portfolio size. The curve demonstrates that over 90% of diversifiable risk is removed within the first 20 to 30 assets, after which the marginal reduction in risk becomes negligible. This illustrates why well-diversified institutional portfolios often hold a relatively small number of positions.

The table results demonstrate that the majority of diversification benefits are realized early. With just 20 assets, 95% of the diversifiable risk is eliminated, confirming that a portfolio does not need hundreds of positions to be well-diversified. This finding has practical implications for portfolio construction: beyond a certain point, adding more securities increases complexity and costs without meaningfully improving the risk-return profile.
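
The round numbers in the table are not a coincidence. Under the equal-variance, equal-correlation assumptions of this section (and for $\rho < 1$), the fraction of diversifiable variance eliminated simplifies as follows:

$$
\begin{aligned}
\frac{\sigma^2 - \sigma_p^2}{\sigma^2 - \rho\sigma^2}
&= \frac{\sigma^2 - \frac{\sigma^2}{n} - \frac{n-1}{n}\rho\sigma^2}{(1-\rho)\sigma^2} \\
&= \frac{\frac{n-1}{n}(1-\rho)\sigma^2}{(1-\rho)\sigma^2} \\
&= 1 - \frac{1}{n}
\end{aligned}
$$

In this stylized model the percentage therefore depends only on $n$: 5 assets give 80% and 20 assets give 95%, whatever the values of $\sigma$ and $\rho$. Correlation determines the level of the floor $\rho\sigma^2$, not the rate at which the diversifiable part shrinks.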

Practical Implementation with Real Data

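Let's apply mean-variance optimization to a more realistic multi-asset portfolio. To keep the example reproducible, the code below simulates five years of daily returns from known parameters and then treats them as historical data: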

In[26]:
Code
import numpy as np

# Simulated historical returns for 5 assets (252 trading days x 5 years)
np.random.seed(42)
n_days = 252 * 5
n_assets = 5
asset_names = ["Large Cap", "Small Cap", "Intl Stocks", "Bonds", "REITs"]

# True parameters for simulation
true_annual_returns = np.array([0.10, 0.12, 0.08, 0.04, 0.09])
true_annual_vols = np.array([0.16, 0.22, 0.18, 0.04, 0.20])

# Correlation structure
true_corr = np.array(
    [
        [1.00, 0.75, 0.65, 0.10, 0.55],
        [0.75, 1.00, 0.60, 0.05, 0.50],
        [0.65, 0.60, 1.00, 0.15, 0.45],
        [0.10, 0.05, 0.15, 1.00, 0.20],
        [0.55, 0.50, 0.45, 0.20, 1.00],
    ]
)

# Convert to daily parameters
daily_returns_mean = true_annual_returns / 252
daily_vols = true_annual_vols / np.sqrt(252)
true_cov_daily = np.diag(daily_vols) @ true_corr @ np.diag(daily_vols)

# Generate returns using Cholesky decomposition
L = np.linalg.cholesky(true_cov_daily)
z = np.random.randn(n_days, n_assets)
returns = daily_returns_mean + z @ L.T
In[27]:
Code
# Estimate parameters from historical data
estimated_mu = returns.mean(axis=0) * 252  # Annualize
estimated_cov = np.cov(returns.T) * 252  # Annualize

# Compute efficient frontier
frontier_ret, frontier_risk, frontier_w = efficient_frontier(
    estimated_mu, estimated_cov, n_points=100, allow_short=False
)

# Find maximum Sharpe ratio portfolio (assuming rf = 2%)
rf = 0.02
sharpe_ratios = (frontier_ret - rf) / frontier_risk
max_sharpe_idx = np.argmax(sharpe_ratios)
tangent_weights = frontier_w[max_sharpe_idx]
tangent_return = frontier_ret[max_sharpe_idx]
tangent_risk = frontier_risk[max_sharpe_idx]
max_sharpe_ratio = sharpe_ratios[max_sharpe_idx]

# Calculate Capital Market Line (CML)
cml_risks = np.linspace(0, tangent_risk * 1.5, 50)
cml_returns = rf + max_sharpe_ratio * cml_risks
Out[28]:
Visualization
Chart showing efficient frontier curve with tangent portfolio and capital market line.
Efficient frontier constructed from the simulated return history with the tangent portfolio (maximum Sharpe ratio) highlighted. The capital market line shows the risk-return tradeoff when combining the tangent portfolio with the risk-free asset (2%), representing the most efficient set of risky and risk-free asset combinations.

The chart displays the efficient frontier estimated from the simulated return data. The Capital Market Line (green dashed) connects the risk-free rate to the tangent portfolio, illustrating the optimal opportunity set available by combining the risk-free asset with the maximum Sharpe ratio portfolio.

Out[29]:
Console
Tangent Portfolio (Maximum Sharpe Ratio):
--------------------------------------------------
  Large Cap      :  14.49%
  Small Cap      :   0.00%
  Intl Stocks    :   0.00%
  Bonds          :   7.02%
  REITs          :  78.50%
--------------------------------------------------
  Expected Return: 13.45%
  Standard Deviation: 16.79%
  Sharpe Ratio: 0.682
Out[30]:
Visualization
Pie chart showing the weight distribution of the tangent portfolio.
Asset allocation of the tangent portfolio (maximum Sharpe ratio portfolio). Within the mean-variance framework, this allocation represents the optimal combination of risky assets for all investors regardless of risk tolerance. In this simulated example, the allocation is concentrated rather than broadly diversified: REITs receive 78.50%, Large Cap 14.49%, and Bonds 7.02%, while Small Cap and Intl Stocks receive no weight.

The tangent portfolio maximizes the Sharpe ratio, representing the optimal combination of risky assets. You should hold this portfolio combined with lending or borrowing at the risk-free rate, regardless of your risk preference. This insight leads directly to the Capital Asset Pricing Model, which we'll develop in the next chapter.
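
When short selling is allowed, the tangent portfolio also has a textbook closed form: its weights are proportional to $\Sigma^{-1}(\boldsymbol{\mu} - r_f\mathbf{1})$. The sketch below applies that formula to the three-asset inputs from the earlier worked example. It is a different route from the long-only frontier scan used above, and the random-portfolio comparison is only an informal check, not a proof of optimality.

```python
import numpy as np

mu = np.array([0.08, 0.03, 0.05])
sigma = np.array([0.18, 0.05, 0.20])
corr = np.array([[1.0, 0.1, 0.3], [0.1, 1.0, 0.0], [0.3, 0.0, 1.0]])
cov = np.diag(sigma) @ corr @ np.diag(sigma)
rf = 0.02


def sharpe(w):
    """Sharpe ratio of a fully invested portfolio w."""
    return (w @ mu - rf) / np.sqrt(w @ cov @ w)


# Unconstrained tangency weights are proportional to Sigma^{-1} (mu - rf)
raw = np.linalg.solve(cov, mu - rf)
w_tan = raw / raw.sum()   # normalization assumes raw.sum() > 0, true for these inputs

# Compare against randomly drawn fully invested long-only portfolios
rng = np.random.default_rng(0)
random_w = rng.dirichlet(np.ones(3), size=10_000)
best_random = max(sharpe(w) for w in random_w)

print(np.round(w_tan, 4), round(sharpe(w_tan), 4), round(best_random, 4))
```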

Limitations of Mean-Variance Optimization

While mean-variance optimization provides a rigorous framework for portfolio construction, several practical challenges limit its direct application.

Estimation Error

The most significant limitation is sensitivity to input estimation errors. Small changes in expected returns, volatilities, or correlations can lead to dramatically different optimal portfolios. Since we must estimate these parameters from historical data, the resulting portfolios can be highly unstable.

Research has shown that mean-variance optimizers are "error maximizers": they tend to overweight assets with overestimated returns and underweight those with underestimated returns. The expected return estimates are particularly problematic because they have much higher estimation error than covariance estimates.
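
A rough calculation shows why expected returns are so hard to pin down. Assuming independent, identically distributed returns (a simplification), the standard error of an annualized sample mean equals $\sigma/\sqrt{T}$, where $T$ is the sample length in years, so sampling more frequently does not help. The sketch below uses the 16% volatility and five-year window from the simulation earlier in this chapter.

```python
import numpy as np

annual_vol = 0.16      # Large Cap volatility used in the simulation
years = 5              # length of the simulated history
days_per_year = 252

# Standard error of the annualized mean return estimated from daily data
daily_vol = annual_vol / np.sqrt(days_per_year)
n_obs = years * days_per_year
se_annual_mean = days_per_year * daily_vol / np.sqrt(n_obs)

print(f"Standard error of annualized mean: {se_annual_mean:.2%}")
print(f"Same as annual_vol / sqrt(years):  {annual_vol / np.sqrt(years):.2%}")
```

A standard error of roughly 7 percentage points is of the same order as the expected returns being estimated, which is why the optimizer's inputs for $\boldsymbol{\mu}$ deserve particular skepticism.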

In[31]:
Code
import numpy as np

# Demonstrate sensitivity to estimation error
np.random.seed(123)

# Add small noise to expected returns (1% estimation error)
est_error = 0.01
n_simulations = 50
perturbed_weights = []

for _ in range(n_simulations):
    perturbed_mu = estimated_mu + np.random.normal(0, est_error, n_assets)
    w = mean_variance_optimize(
        perturbed_mu, estimated_cov, target_return=0.08, allow_short=False
    )
    perturbed_weights.append(w)

perturbed_weights = np.array(perturbed_weights)

# Calculate statistics
mean_w = perturbed_weights.mean(axis=0)
std_w = perturbed_weights.std(axis=0)
min_w = perturbed_weights.min(axis=0)
max_w = perturbed_weights.max(axis=0)
Out[32]:
Visualization
Box plot showing weight distributions for five assets with wide confidence intervals.
Box plots showing portfolio weight variability across 50 simulations with 1% estimation error in expected returns. The significant spread demonstrates how sensitive optimal weights are to small input changes, with weights for volatile assets like stocks fluctuating significantly due to minor forecast errors.

The box plots reveal significant dispersion in the optimal weights for each asset across the simulations. These wide ranges indicate that optimal portfolios are highly sensitive to even small perturbations in expected return inputs, a phenomenon often described as "error maximization."

Out[33]:
Console
Weight Variability Statistics:
------------------------------------------------------------
Asset                 Mean    Std Dev        Min        Max
------------------------------------------------------------
Large Cap            8.93%      7.64%     -0.00%     28.10%
Small Cap            0.06%      0.29%     -0.00%      1.99%
Intl Stocks          0.08%      0.56%     -0.00%      4.02%
Bonds               53.57%      7.57%     34.95%     69.15%
REITs               37.36%      4.73%     26.69%     46.15%

The weight variability is substantial. Perturbing expected returns with noise of one percentage point (standard deviation) moves the Large Cap weight anywhere between 0% and 28.10% and the Bonds weight between 34.95% and 69.15%. This instability makes raw mean-variance optimization impractical without additional regularization techniques.

Solutions to Estimation Error

Several approaches address the estimation error problem:

  • Shrinkage estimators: Combine sample estimates with a structured prior (e.g., shrink the sample covariance matrix toward a simple target, or shrink expected returns toward a common mean)
  • Robust optimization: Explicitly account for parameter uncertainty in the optimization
  • Resampling methods: Generate multiple parameter sets, optimize each, and average the results
  • Constraints: Position limits and other constraints implicitly regularize the optimization
  • Black-Litterman model: Combine market equilibrium returns with your views
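
To make the first item concrete, here is a minimal sketch of covariance shrinkage toward a scaled identity matrix. The data are synthetic, and the shrinkage intensity `delta = 0.3` is an arbitrary assumption; estimators such as Ledoit-Wolf choose the intensity from the data instead.

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets, n_obs = 20, 40                       # few observations relative to assets
returns = rng.normal(0.0, 0.01, size=(n_obs, n_assets))  # synthetic data

sample_cov = np.cov(returns, rowvar=False)

# Shrink toward a scaled identity matrix with the same average variance
delta = 0.3                                    # assumed intensity, not an optimized value
target = np.eye(n_assets) * np.trace(sample_cov) / n_assets
shrunk_cov = (1 - delta) * sample_cov + delta * target

print(f"condition number (sample): {np.linalg.cond(sample_cov):.1f}")
print(f"condition number (shrunk): {np.linalg.cond(shrunk_cov):.1f}")
```

Shrinkage pulls extreme eigenvalues toward their average, which lowers the condition number and makes the inverse used by the optimizer less sensitive to sampling noise.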

We'll explore advanced portfolio construction techniques that address these issues in Part IV's later chapters.

Assumptions and Limitations

Beyond estimation error, the mean-variance framework rests on several assumptions worth examining:

  • Quadratic utility or normal returns: Mean-variance is optimal only if returns are normally distributed or you have a quadratic utility function. As we discussed in Part III's chapter on stylized facts, return distributions exhibit fat tails and skewness that violate normality.

  • Single-period framework: The basic model ignores multi-period considerations like rebalancing costs, tax consequences, and changing investment horizons.

  • No transaction costs: Real portfolios face trading costs that mean-variance ignores.

  • Unlimited leverage and short selling: Many investors face constraints that prevent them from implementing the theoretically optimal portfolio.

Despite these limitations, mean-variance optimization remains the foundational framework for portfolio construction. Its insights about diversification, the risk-return tradeoff, and the efficient frontier guide both academic research and practical investment management.

Key Parameters

The key parameters for Mean-Variance Optimization are:

  • $\boldsymbol{\mu}$ (Expected Returns): The vector of forecasted returns for each asset. Estimation errors here have the largest impact on portfolio weight stability.
  • $\Sigma$ (Covariance Matrix): Captures the risk of individual assets (diagonal elements) and their co-movements (off-diagonal elements). Positive correlations reduce diversification benefits.
  • $r_f$ (Risk-Free Rate): The return on a risk-free asset, used to calculate Sharpe ratios and the Capital Market Line.
  • $\gamma$ (Risk Aversion): Determines the trade-off between risk and return in the utility maximization formulation ($U = \mathbf{w}^\top \boldsymbol{\mu} - \frac{\gamma}{2}\mathbf{w}^\top \Sigma \mathbf{w}$).
  • $\mathbf{w}$ (Weights): The decision variables representing the fraction of capital allocated to each asset. Constraints typically require $\sum w_i = 1$ and $w_i \ge 0$ (for long-only portfolios).

Summary

This chapter developed the mathematical framework of Modern Portfolio Theory, which transformed investment management from an art to a science.

The key concepts we covered include:

  • Portfolio return and risk: Portfolio expected return is the weighted average of asset returns ($\mu_p = \mathbf{w}^\top \boldsymbol{\mu}$), while portfolio variance depends on the full covariance structure ($\sigma_p^2 = \mathbf{w}^\top \Sigma \mathbf{w}$).

  • The efficient frontier: The set of portfolios offering maximum expected return for each level of risk. You should hold a portfolio on this frontier if you are a rational mean-variance investor.

  • Global minimum variance portfolio: The portfolio with the lowest possible variance, given by $\mathbf{w}_{\text{GMV}} = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^\top \Sigma^{-1}\mathbf{1})$.

  • Two-fund separation: Any efficient portfolio can be expressed as a combination of two distinct efficient portfolios, simplifying your investment decision.

  • Diversification benefits and limits: Combining assets with less-than-perfect correlation reduces portfolio risk. However, systematic risk (captured by the average correlation) cannot be diversified away, no matter how many assets are included.

  • Practical optimization: Quadratic programming enables efficient frontier computation with realistic constraints on short selling, position sizes, and sector exposures.

  • Estimation error: Mean-variance optimization is highly sensitive to input estimation errors, particularly in expected returns. This sensitivity motivates the regularization techniques and robust methods covered in later chapters.

The framework developed here provides the foundation for the Capital Asset Pricing Model in the next chapter, where we'll see how equilibrium considerations determine the pricing of risk in the market portfolio.
