Linear Algebra for Quantitative Finance
Linear algebra forms the mathematical backbone of quantitative finance. Every portfolio optimization, risk calculation, and factor model you encounter relies on vectors and matrices to represent assets, returns, and their relationships. Computing portfolio risk involves matrix multiplication. Identifying the key drivers of market movements means finding eigenvectors of a covariance matrix. When you hedge a derivatives book, you're solving a system of linear equations.
This chapter builds your fluency in linear algebra with a constant eye toward financial applications. We start with vectors and matrices, showing how they naturally represent portfolios and return data. We then tackle systems of linear equations, which appear in hedging problems and arbitrage pricing. Finally, we explore matrix decompositions, the powerful techniques that enable principal component analysis and reveal the hidden structure in financial data.
Vectors in Finance
A vector is an ordered list of numbers. In finance, vectors appear everywhere: asset returns, portfolio weights, and factor exposures. The power of vectors lies in how we can manipulate them mathematically to answer financial questions.
Why do we need vectors rather than simply tracking individual numbers? Consider the alternative: if you manage a portfolio of 500 stocks, you could track 500 separate weight variables and 500 separate return variables. But this approach quickly becomes unwieldy. You'd need to write out 500 terms every time you calculate portfolio return, and any formula involving all assets would span pages. Vectors solve this organizational challenge by packaging related quantities into a single mathematical object that we can manipulate as a unit. This abstraction isn't merely notational convenience; it reveals structure. Operations that would be tedious and error-prone with individual variables become clean and computationally efficient with vectors.
Vector Basics
Consider a portfolio containing three assets. We can represent the weights allocated to each asset as a vector:
$$\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 0.40 \\ 0.35 \\ 0.25 \end{bmatrix}$$

where:
- $\mathbf{w}$: the portfolio weight vector
- $w_i$: the fraction of portfolio value allocated to asset $i$

Here $w_1 = 0.40$ means 40% of the portfolio value is in asset 1. The vector lives in $\mathbb{R}^3$ (three-dimensional real space) because it has three components. Geometrically, each asset corresponds to an axis, and the weight vector points to a specific location in this three-dimensional "asset space." Every possible portfolio allocation corresponds to some point in this space. Portfolio constraints, like requiring weights to sum to one, define surfaces or regions within it.
Similarly, we can represent the returns of these three assets on a given day:

$$\mathbf{r} = \begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix} = \begin{bmatrix} 0.02 \\ -0.01 \\ 0.015 \end{bmatrix}$$

where:
- $\mathbf{r}$: the return vector for a single time period
- $r_i$: the return of asset $i$ (expressed as a decimal, so 0.02 = 2%)
This says asset 1 returned 2%, asset 2 lost 1%, and asset 3 gained 1.5%. Notice how naturally this representation captures a snapshot of market behavior: all three returns belong together because they occurred simultaneously, and the vector keeps them organized as a coherent unit.
Vector Operations
The fundamental vector operations translate directly into financial calculations:
Scalar multiplication scales every element by a constant. If you double your position in everything:

$$2\mathbf{w} = \begin{bmatrix} 2w_1 \\ 2w_2 \\ 2w_3 \end{bmatrix} = \begin{bmatrix} 0.80 \\ 0.70 \\ 0.50 \end{bmatrix}$$
Geometrically, scalar multiplication stretches or shrinks the vector without changing its direction. Financially, this corresponds to leveraging or deleveraging a portfolio, maintaining the same relative allocations. A leveraged portfolio with 2x weights has double the exposure to every asset, magnifying both gains and losses proportionally.
Vector addition combines vectors element-wise. If $\mathbf{r}^{(1)}$ and $\mathbf{r}^{(2)}$ are returns on consecutive days:

$$\mathbf{r}^{(1)} + \mathbf{r}^{(2)} = \begin{bmatrix} r^{(1)}_1 + r^{(2)}_1 \\ r^{(1)}_2 + r^{(2)}_2 \\ r^{(1)}_3 + r^{(2)}_3 \end{bmatrix}$$

where $r^{(d)}_i$ denotes the return of asset $i$ on day $d$.
For small returns, this approximates cumulative returns (the exact formula uses geometric compounding). Vector addition also models combining different portfolios. If two funds are merged, the combined portfolio's weight vector is the capital-weighted average of the individual weight vectors.
The Dot Product: Portfolio Returns
The dot product (or inner product) of two vectors is the sum of the products of corresponding elements:

$$\mathbf{w} \cdot \mathbf{r} = \mathbf{w}^\top \mathbf{r} = \sum_{i=1}^{n} w_i r_i$$

where:
- $n$: the number of assets in the portfolio.
This single number has direct financial meaning: it's the portfolio return. The formula captures exactly what we would compute by hand. Each asset contributes to the portfolio return in proportion to both its individual return and the fraction of capital allocated to it. Asset 1's contribution is its weight times its return, asset 2's contribution is its weight times its return, and so on. The dot product sums these contributions to produce the aggregate portfolio performance.
Why does this work? Consider what happens to a portfolio worth one dollar. The amount invested in asset 1 is $w_1$, and this grows to $w_1(1 + r_1)$. Similarly for each asset. The total portfolio value becomes $\sum_i w_i(1 + r_i) = \sum_i w_i + \sum_i w_i r_i$. Since the weights sum to 1, this equals $1 + \sum_i w_i r_i$, confirming that the portfolio return is indeed the dot product $\mathbf{w} \cdot \mathbf{r}$.
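A minimal NumPy sketch of this calculation, using the weight and return vectors defined above (the variable names are just illustrative):

```python
import numpy as np

# Portfolio weights and single-day returns from the example above
w = np.array([0.40, 0.35, 0.25])    # weights sum to 1
r = np.array([0.02, -0.01, 0.015])  # 2%, -1%, 1.5%

portfolio_return = w @ r            # dot product: sum of w_i * r_i
print(f"Portfolio return: {portfolio_return:.4%}")  # ~0.8250%
```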
The portfolio earned about 0.83% by holding 40% in the 2% gainer, 35% in the 1% loser, and 25% in the 1.5% gainer. This simple calculation, a single dot product, underlies virtually every performance calculation in finance.
Vector Norms: Measuring Size
The norm of a vector measures its "size" in various ways. But what does "size" mean for a vector? There's no single answer: different norms capture different notions of magnitude, each useful in different contexts. The most common is the Euclidean norm (or L2 norm):

$$\|\mathbf{v}\|_2 = \sqrt{\sum_{i=1}^{n} v_i^2}$$

where:
- $\|\mathbf{v}\|_2$: the L2 (Euclidean) norm of vector $\mathbf{v}$
- $v_i$: the $i$-th component of the vector
- $n$: the dimension of the vector
The Euclidean norm corresponds to our intuitive notion of distance: it's the straight-line distance from the origin to the point represented by the vector. In finance, the L2 norm of a return vector relates to volatility. For a vector of deviations from the mean return, the L2 norm (scaled appropriately) gives the standard deviation. This connection between geometric distance and financial risk is one reason the L2 norm appears so frequently in portfolio optimization.
The L1 norm sums absolute values:

$$\|\mathbf{v}\|_1 = \sum_{i=1}^{n} |v_i|$$

The L1 norm measures total "travel distance" if you could only move along coordinate axes, like navigating a city grid where you can only travel along streets, not diagonally through blocks. In portfolio optimization, L1 norms appear in constraints that promote sparse portfolios (holding few assets), as we'll see in later chapters. This is because minimizing the L1 norm of a weight vector tends to push many weights exactly to zero, while the L2 norm spreads weights more evenly across all assets. A third useful measure is the L∞ (maximum) norm, $\|\mathbf{v}\|_\infty = \max_i |v_i|$, which simply picks out the largest absolute component.
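A short sketch computing all three norms of the example return vector, using NumPy's `norm` with the `ord` argument:

```python
import numpy as np

r = np.array([0.02, -0.01, 0.015])

l2 = np.linalg.norm(r)                # Euclidean norm: sqrt of sum of squares, ~0.027
l1 = np.linalg.norm(r, ord=1)         # sum of absolute values = 0.045
linf = np.linalg.norm(r, ord=np.inf)  # largest absolute component = 0.02

print(f"L2: {l2:.4f}, L1: {l1:.4f}, Linf: {linf:.4f}")
```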
The L2 norm of about 0.027 represents the Euclidean magnitude of the return vector, which relates to total variability. The L1 norm of 0.045 sums the absolute returns, useful when constructing sparse portfolios. The L∞ norm of 0.02 identifies the largest absolute return, highlighting the most extreme daily movement.
Matrices in Finance
A matrix is a rectangular array of numbers. While vectors represent single entities (one portfolio, one day of returns), matrices represent collections and relationships. These include return history across time, covariances between assets, and transformations between coordinate systems.
The jump from vectors to matrices is conceptually significant. A vector captures one snapshot, such as today's returns or a single portfolio's weights. A matrix captures an entire dataset or a complete description of how quantities relate to each other. When we write down a covariance matrix, we're encoding not just each asset's volatility but every pairwise relationship in the investment universe. When we write down a return matrix, we're capturing the complete history of how multiple assets performed over multiple time periods. This compression of information into a structured rectangular array is what makes quantitative finance computationally tractable.
Matrix Fundamentals
An $m \times n$ matrix has $m$ rows and $n$ columns. In finance, we commonly organize data with rows as time periods and columns as assets:

$$R = \begin{bmatrix} r_{1,1} & r_{1,2} & r_{1,3} \\ r_{2,1} & r_{2,2} & r_{2,3} \\ r_{3,1} & r_{3,2} & r_{3,3} \\ r_{4,1} & r_{4,2} & r_{4,3} \end{bmatrix}$$

Here $r_{t,i}$ is the return of asset $i$ on day $t$. This matrix contains 4 days of returns for 3 assets.
This convention (time as rows, assets as columns) is deliberate and consequential. It means each row represents a complete cross-section of the market at one moment, while each column represents the complete time series of one asset. Extracting a row gives you all assets' returns on a specific day; extracting a column gives you one asset's return history. Most matrix operations in finance align with this convention, so understanding the layout helps you interpret results correctly.
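A small sketch of this layout with made-up return numbers; rows index days, columns index assets:

```python
import numpy as np

# 4 days x 3 assets of illustrative daily returns
R = np.array([
    [ 0.020, -0.010,  0.015],
    [ 0.005,  0.012, -0.003],
    [-0.008,  0.004,  0.007],
    [ 0.011, -0.002,  0.009],
])

day_2_cross_section = R[1, :]   # all assets' returns on day 2
asset_3_history     = R[:, 2]   # asset 3's full return history
print(day_2_cross_section, asset_3_history)
```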
Matrix Multiplication
Matrix multiplication is the workhorse operation of linear algebra. For matrices $A$ (size $m \times p$) and $B$ (size $p \times n$), the product $C = AB$ is an $m \times n$ matrix where:

$$c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}$$

where:
- $c_{ij}$: element in row $i$, column $j$ of the result matrix
- $a_{ik}$: element in row $i$, column $k$ of matrix $A$
- $b_{kj}$: element in row $k$, column $j$ of matrix $B$

Each element of the result is a dot product of a row from $A$ with a column from $B$.
Each element of the result shows how much the $i$-th row of $A$ aligns with the $j$-th column of $B$. In financial terms, when multiplying a return matrix by a weight vector, each resulting element aggregates the weighted contributions of all assets for that time period. You can think of matrix multiplication as performing many dot products simultaneously. The $(i, j)$ entry of the product answers the question "how does row $i$ of the first matrix relate to column $j$ of the second?"
This interpretation illuminates why matrix multiplication has the dimensional requirements it does. The dot product requires vectors of equal length, so for each row-column pair to produce a dot product, the row length (number of columns in $A$) must equal the column length (number of rows in $B$). The result matrix takes its row count from $A$ and its column count from $B$ because we compute one number for each possible row-column pairing.
For matrix multiplication to be valid, the number of columns in $A$ must equal the number of rows in $B$. The result has the number of rows from $A$ and columns from $B$: $(m \times p)(p \times n) = (m \times n)$.
A critical financial application is computing portfolio returns across multiple days. If $R$ is a $T \times n$ matrix of returns ($T$ days, $n$ assets) and $\mathbf{w}$ is an $n \times 1$ weight vector, then $\mathbf{p} = R\mathbf{w}$ gives a $T \times 1$ vector of portfolio returns:

$$p_t = \sum_{i=1}^{n} r_{t,i}\, w_i$$
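Continuing the sketch from above, multiplying the return matrix by the weight vector produces one portfolio return per day:

```python
import numpy as np

R = np.array([                      # 4 days x 3 assets (illustrative returns)
    [ 0.020, -0.010,  0.015],
    [ 0.005,  0.012, -0.003],
    [-0.008,  0.004,  0.007],
    [ 0.011, -0.002,  0.009],
])
w = np.array([0.40, 0.35, 0.25])    # portfolio weights

p = R @ w                           # (4x3)(3x1) -> 4 daily portfolio returns
print(p)                            # each entry is a dot product of a day's row with w
```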
The Covariance Matrix
The covariance matrix $\Sigma$ captures how assets move together. For $n$ assets, it's an $n \times n$ symmetric matrix where element $(i, j)$ is the covariance between assets $i$ and $j$:

$$\Sigma_{ij} = \mathbb{E}\big[(r_i - \mu_i)(r_j - \mu_j)\big]$$

where:
- $\Sigma_{ij}$: the $(i, j)$ element of the covariance matrix
- $r_i, r_j$: returns of assets $i$ and $j$
- $\mu_i, \mu_j$: expected (mean) returns of assets $i$ and $j$
- $\mathbb{E}[\cdot]$: the expectation operator
The formula shows what covariance measures: we're looking at the product of deviations from the mean. When asset $i$ is above its average and asset $j$ is also above its average, the product is positive. When both are below average, the product is again positive. But when one is above and the other below, the product is negative. By averaging these products across many observations, covariance tells us whether two assets tend to move together (positive covariance), move oppositely (negative covariance), or move independently (covariance near zero).
The diagonal elements are variances of individual assets. The off-diagonal elements measure co-movement. Positive covariance means assets tend to move together. Negative covariance means they move oppositely.
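In practice the covariance matrix is estimated from a return matrix. A sketch using NumPy on simulated returns (`rowvar=False` because our columns are assets):

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.normal(0.0005, 0.01, size=(252, 3))   # simulated year of daily returns, 3 assets

Sigma = np.cov(R, rowvar=False)   # 3x3 sample covariance matrix
print(np.diag(Sigma))             # diagonal: each asset's variance
print(Sigma[0, 1])                # off-diagonal: covariance of assets 1 and 2
```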
Portfolio Variance: The Quadratic Form
Portfolio variance demonstrates the power of matrix notation. For a portfolio with weights $\mathbf{w}$ and asset covariance matrix $\Sigma$, the portfolio variance is:

$$\sigma_p^2 = \mathbf{w}^\top \Sigma \mathbf{w}$$

where:
- $\sigma_p^2$: the variance of the portfolio's returns
- $\mathbf{w}$: the vector of portfolio weights
- $\mathbf{w}^\top$: the transpose of $\mathbf{w}$ (a row vector)
- $\Sigma$: the covariance matrix of asset returns
This compact formula packs a lot of computation: it accounts for each asset's variance and all pairwise covariances, weighted by the portfolio allocations. The expression $\mathbf{w}^\top \Sigma \mathbf{w}$ is called a quadratic form because if you expand it, you get a polynomial where each term involves products of two weights; it's quadratic in the portfolio allocations.
Expanding for two assets:

$$\sigma_p^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \sigma_{12}$$

where:
- $\sigma_1^2, \sigma_2^2$: the variances of assets 1 and 2
- $\sigma_{12}$: the covariance between assets 1 and 2
The first two terms are variance contributions from each asset. The third term involving covariance is why diversification works. When $\sigma_{12} < 0$ (negative correlation), it reduces portfolio variance. Even when the covariance is positive but less than $\sigma_1 \sigma_2$ (the geometric mean of the variances), diversification still helps by ensuring the portfolio variance is less than the weighted average of individual variances.
The matrix formulation generalizes seamlessly to any number of assets. For 500 stocks, the formula remains $\mathbf{w}^\top \Sigma \mathbf{w}$ (the same compact expression) while the expanded version would require summing 250,000 terms (500 variances plus 124,750 unique covariances, each appearing twice). This is the power of linear algebra notation: it scales effortlessly from toy examples to production systems.
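A sketch of the quadratic form with an illustrative covariance matrix; the volatilities and correlations below are assumptions for the example, not market data:

```python
import numpy as np

w = np.array([0.40, 0.35, 0.25])

# Assumed annual volatilities and correlations, for illustration only
vols = np.array([0.20, 0.25, 0.15])
corr = np.array([
    [1.0, 0.3, 0.2],
    [0.3, 1.0, 0.4],
    [0.2, 0.4, 1.0],
])
Sigma = np.outer(vols, vols) * corr       # covariance = D * corr * D

port_var = w @ Sigma @ w                  # quadratic form w' Sigma w
port_vol = np.sqrt(port_var)
print(f"Portfolio vol: {port_vol:.2%} vs weighted-average vol {w @ vols:.2%}")
```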
The portfolio volatility is lower than the weighted average of individual volatilities because the assets are imperfectly correlated. This is the mathematical basis of diversification.
Matrix Transpose and Special Matrices
The transpose of matrix $A$, written $A^\top$, flips rows and columns: $(A^\top)_{ij} = A_{ji}$.
Several special matrix types appear frequently in finance:
- Symmetric matrices: $A = A^\top$. Covariance matrices are always symmetric. This property reflects a fundamental reality: the covariance between assets A and B must equal the covariance between B and A, since we're measuring the same relationship from both directions.
- Diagonal matrices: Non-zero elements only on the diagonal. Used to represent variance contributions when assets are uncorrelated, or to scale different variables by different amounts.
- Identity matrix: Diagonal matrix with 1s on the diagonal. Acts as the "1" of matrix multiplication: $AI = IA = A$. It's the matrix equivalent of multiplying by one, leaving any matrix unchanged.
- Positive definite matrices: $\mathbf{x}^\top A \mathbf{x} > 0$ for all non-zero $\mathbf{x}$. Valid covariance matrices must be positive semi-definite (allowing zero). This property ensures that no portfolio can have negative variance, a mathematical necessity for any coherent risk measure.
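A quick sketch of how these two properties can be checked numerically for a covariance matrix (the matrix below is illustrative):

```python
import numpy as np

Sigma = np.array([
    [0.0400, 0.0150, 0.0060],
    [0.0150, 0.0625, 0.0150],
    [0.0060, 0.0150, 0.0225],
])

is_symmetric = np.allclose(Sigma, Sigma.T)
eigenvalues = np.linalg.eigvalsh(Sigma)    # real eigenvalues of a symmetric matrix
is_psd = np.all(eigenvalues >= -1e-12)     # allow tiny negatives from round-off

print(f"Symmetric: {is_symmetric}, eigenvalues: {eigenvalues}, PSD: {is_psd}")
```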
Both properties confirm we have a valid covariance matrix. Symmetry ensures that the covariance between assets A and B equals the covariance between B and A. Positive semi-definiteness guarantees that no portfolio can have negative variance—a mathematical necessity for any coherent risk measure.
Systems of Linear Equations
Many financial problems reduce to solving systems of linear equations. Hedge ratios, factor exposures, and arbitrage pricing all require solving $A\mathbf{x} = \mathbf{b}$ for an unknown $\mathbf{x}$.
Why is this formulation so ubiquitous? Because linear systems capture the essence of constraints and requirements. In finance, we often face situations where multiple conditions must hold simultaneously. A hedge must neutralize exposure to several risk factors at once, a replicating portfolio must match the payoffs of a target in multiple scenarios, and factor loadings must explain returns across many time periods. Each condition contributes one equation, and the unknowns are the positions or weights we need to determine. Linear algebra provides systematic machinery for finding solutions when they exist and for characterizing what's possible when they don't.
The General Problem
A system of linear equations has the form:

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m
\end{aligned}$$

where:
- $a_{ij}$: the coefficient in equation $i$ for unknown $j$
- $x_j$: the $j$-th unknown variable we're solving for
- $b_i$: the right-hand side constant of equation $i$
- $m$: the number of equations
- $n$: the number of unknowns

In matrix form: $A\mathbf{x} = \mathbf{b}$, where $A$ is $m \times n$, $\mathbf{x}$ is $n \times 1$, and $\mathbf{b}$ is $m \times 1$.
The matrix $A$ encodes the structure of the problem: how each unknown contributes to each equation. Finding $\mathbf{x}$ means finding the combination of unknowns that simultaneously satisfies all constraints. In financial applications, $A$ often represents sensitivities (like Greeks or factor exposures), $\mathbf{x}$ represents positions or weights we're solving for, and $\mathbf{b}$ represents target values we want to achieve.
The geometry of linear systems provides useful intuition. Each equation defines a hyperplane in the space of unknowns (a line in 2D, a plane in 3D, and so on). Solving the system means finding the point (or points, or nothing at all) where all these hyperplanes intersect. When there are exactly as many independent equations as unknowns, and the equations aren't contradictory, the hyperplanes intersect at a single point, giving the unique solution.
Matrix Inverses and Solving Square Systems
When $A$ is square ($m = n$) and invertible, the solution is $\mathbf{x} = A^{-1}\mathbf{b}$. The inverse satisfies $A A^{-1} = A^{-1} A = I$.
The inverse matrix "undoes" the transformation represented by $A$. If $A$ transforms inputs to outputs, then $A^{-1}$ transforms outputs back to inputs. In the context of our linear system, we know the outputs ($\mathbf{b}$, the targets we want to achieve) and need to find the inputs ($\mathbf{x}$, the positions that achieve those targets). Multiplying by the inverse reverses the process, revealing the required inputs.
A square matrix is invertible (or non-singular) when its determinant is non-zero, equivalently when its rows (or columns) are linearly independent. Economically, this means the equations provide truly independent constraints. No equation is redundant or contradictory with others.
Application: Factor Replication
Consider replicating a target portfolio's factor exposures using available assets. Suppose you have three assets with known exposures to two factors (market and size), and you want to construct a portfolio with specific target exposures:

$$\begin{aligned}
\beta^{\text{mkt}}_1 w_1 + \beta^{\text{mkt}}_2 w_2 + \beta^{\text{mkt}}_3 w_3 &= 1 && \text{(target market beta)} \\
\beta^{\text{size}}_1 w_1 + \beta^{\text{size}}_2 w_2 + \beta^{\text{size}}_3 w_3 &= 0 && \text{(target size exposure)} \\
w_1 + w_2 + w_3 &= 1 && \text{(fully invested)}
\end{aligned}$$
We have 3 unknowns (weights) and 3 equations (2 factor constraints + 1 budget constraint). Let's solve:
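A sketch of the solve. The asset exposures below are hypothetical numbers chosen only to illustrate the calculation; the targets (market beta 1, zero size exposure, fully invested) come from the problem statement:

```python
import numpy as np

# Rows: market-beta constraint, size-exposure constraint, budget constraint
# Columns: assets 1, 2, 3 (exposures are illustrative assumptions)
A = np.array([
    [1.20, 0.90, 0.60],   # market betas of the three assets (assumed)
    [0.80, -0.20, 0.30],  # size exposures of the three assets (assumed)
    [1.00, 1.00, 1.00],   # weights must sum to 1
])
b = np.array([1.0, 0.0, 1.0])   # targets: market beta 1, size 0, fully invested

w = np.linalg.solve(A, b)        # solve the 3x3 system A w = b
print(w, A @ w)                  # weights and a check that all constraints are met
```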
The solution tells us exactly how to combine the three assets to achieve our target factor profile: market beta of 1 with zero size exposure, while being fully invested.
Application: Delta Hedging
In derivatives trading, you often need to hedge exposure to multiple risk factors. Suppose you hold a portfolio of options and want to eliminate sensitivity to the underlying price (delta) and volatility (vega). You have two hedging instruments available: the underlying stock (delta of 1, vega of 0) and a liquid option. Writing $x_1$ for the stock position and $x_2$ for the option position, the hedge must satisfy:

$$\begin{aligned}
1 \cdot x_1 + \Delta_{\text{opt}}\, x_2 &= -\Delta_{\text{port}} \\
0 \cdot x_1 + \nu_{\text{opt}}\, x_2 &= -\nu_{\text{port}}
\end{aligned}$$
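A sketch of the hedging solve. The portfolio Greeks and the hedge option's Greeks below are illustrative assumptions, chosen to be consistent with the hedge described next:

```python
import numpy as np

# Assumed current exposures of the options book (illustrative numbers)
portfolio_delta = 130.0   # shares-equivalent exposure to the underlying
portfolio_vega = -5.0     # exposure per 1-point move in implied vol

# Hedging instruments: the stock itself and a liquid option (assumed Greeks)
#             stock   option
A = np.array([[1.0,   0.50],     # delta of each instrument
              [0.0,   0.25]])    # vega of each instrument
b = np.array([-portfolio_delta, -portfolio_vega])  # offset existing exposures

x = np.linalg.solve(A, b)
print(f"Stock position: {x[0]:.0f} shares, option position: {x[1]:.0f} contracts")
```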
We need to short 140 shares of stock and buy 20 options to neutralize both delta and vega exposure. This is precisely the kind of calculation that happens in real-time at trading desks.
Least Squares: Overdetermined Systems
When we have more equations than unknowns ($m > n$), the system is overdetermined and typically has no exact solution. This happens often in finance. We have many data points (days of returns) but few parameters to estimate (factor exposures).
The least squares solution minimizes the sum of squared residuals:

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|A\mathbf{x} - \mathbf{b}\|_2^2$$

where:
- $\hat{\mathbf{x}}$: the least squares estimate of $\mathbf{x}$
- $\|\cdot\|_2^2$: the squared L2 norm (sum of squared elements)
- $A\mathbf{x} - \mathbf{b}$: the residual vector
The solution is given by the normal equations:

$$\hat{\mathbf{x}} = (A^\top A)^{-1} A^\top \mathbf{b}$$

where:
- $A^\top A$: a square matrix that captures the "self-correlation" of the explanatory variables
- $(A^\top A)^{-1}$: inverts this correlation structure to prevent double-counting
- $A^\top \mathbf{b}$: correlates the explanatory variables with the target values

The matrix $(A^\top A)^{-1} A^\top$ is called the Moore-Penrose pseudoinverse of $A$.
Intuitively, least squares finds the $\hat{\mathbf{x}}$ that makes $A\hat{\mathbf{x}}$ as close as possible to $\mathbf{b}$ in Euclidean distance. The residual vector $\mathbf{b} - A\hat{\mathbf{x}}$ is orthogonal to the column space of $A$. We've extracted all the signal that $A$ can explain. What remains (the residual) lies in a direction that no linear combination of our explanatory variables can reach.
This geometric picture explains why least squares produces the best linear fit. Among all possible values of $\mathbf{x}$, the least squares solution generates an $A\hat{\mathbf{x}}$ that is the orthogonal projection of $\mathbf{b}$ onto the subspace spanned by the columns of $A$. Orthogonal projection onto a subspace always yields the closest point in that subspace. This is a basic geometric fact. The residual, being orthogonal to the subspace, represents the irreducible error that cannot be explained by our model.
This is exactly what linear regression computes.
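A sketch that simulates returns from a two-factor model with known loadings and recovers them by least squares; all parameters are made up for the illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
T = 252                                   # days of data
F = rng.normal(0.0, 0.01, size=(T, 2))    # two factor return series
true_beta = np.array([1.1, 0.4])          # "true" factor loadings (assumed)

r = F @ true_beta + rng.normal(0.0, 0.005, size=T)   # asset returns plus noise

# Normal equations: beta_hat = (F'F)^(-1) F'r (equivalent to np.linalg.lstsq)
beta_hat = np.linalg.solve(F.T @ F, F.T @ r)
print(f"True: {true_beta}, estimated: {beta_hat.round(3)}")
```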
The least squares estimates are close to the true values, with small errors due to the noise in returns. With more data (more days), the estimates would converge to the true values.
Matrix Decompositions
Matrix decompositions break a matrix into simpler components, exposing structure that's hidden in the raw numbers. In quantitative finance, decompositions help us understand risk sources, reduce dimensionality, and improve numerical stability.
Think of decomposition as a kind of mathematical X-ray. A covariance matrix appears as a dense array of numbers, but its eigendecomposition reveals the underlying modes of variation, the basic ways in which assets tend to move together. A return matrix might contain hundreds of thousands of numbers, but its singular value decomposition exposes the dominant patterns that generate most of the observed variation. Decompositions transform opaque numerical arrays into interpretable components with clear financial meaning.
Eigenvalue Decomposition
An eigenvalue decomposition expresses a square matrix as:

$$A = V \Lambda V^{-1}$$

where:
- $V$: an $n \times n$ matrix whose columns are the eigenvectors of $A$
- $\Lambda$: a diagonal matrix with the eigenvalues $\lambda_1, \dots, \lambda_n$ on the diagonal
- $V^{-1}$: the inverse of the eigenvector matrix, which "undoes" the coordinate transformation

This decomposition shows that $A$ acts as: (1) rotating into the eigenvector coordinate system via $V^{-1}$, (2) scaling each axis by the corresponding eigenvalue via $\Lambda$, and (3) rotating back via $V$. For covariance matrices, we can decompose complex correlations into independent directions of variation.

The power of this decomposition lies in the simplicity of the diagonal matrix $\Lambda$. A diagonal matrix just scales each coordinate independently, with no mixing between directions. All the complexity of $A$ is absorbed into finding the right coordinate system (the eigenvectors) in which the matrix action becomes simple scaling.

An eigenvector $\mathbf{v}$ of matrix $A$ satisfies $A\mathbf{v} = \lambda\mathbf{v}$, where $\lambda$ is the eigenvalue. The matrix simply scales the eigenvector by $\lambda$, without changing its direction. For symmetric matrices like covariance matrices, eigenvectors are orthogonal and eigenvalues are real.
For a covariance matrix, the eigenvectors represent the principal directions of variation in the data, and the eigenvalues represent the variance along each direction. The largest eigenvalue corresponds to the direction of maximum variance. This interpretation makes eigendecomposition indispensable for understanding portfolio risk: the dominant eigenvector shows which combination of assets contributes most to overall portfolio variance.
Principal Component Analysis (PCA)
PCA uses eigendecomposition to transform correlated variables into uncorrelated principal components. In finance, PCA identifies the main drivers of asset returns: often, a few factors explain most of the variation in a large universe of assets.
The principal components are projections of the data onto the eigenvectors:

$$P = X V$$

where:
- $P$: the matrix of principal component scores
- $X$: the centered data matrix (each row is an observation, each column is a variable minus its mean)
- $V$: the matrix of eigenvectors (principal component loadings)

Each column of $P$ represents a principal component, a new synthetic variable that captures a specific pattern of co-movement in the original data. The first principal component captures the most variance, the second captures the most remaining variance while being uncorrelated with the first, and so on. In finance, these components often correspond to interpretable market factors like overall market direction, sector rotations, or style tilts.
Why does this transformation produce uncorrelated components? The eigenvectors of a symmetric matrix (like a covariance matrix) are orthogonal. They point in perpendicular directions. When we project data onto perpendicular axes, the resulting coordinates are uncorrelated by construction. The eigenvalues tell us how much variance lies along each axis, so ordering by eigenvalue puts the most important component first.
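A minimal PCA sketch via eigendecomposition of the sample covariance matrix of simulated returns; the data is synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
T, n = 500, 4
market = rng.normal(0.0, 0.01, size=T)                      # common market factor
X = market[:, None] + rng.normal(0.0, 0.005, size=(T, n))   # 4 correlated assets
X = X - X.mean(axis=0)                                       # center each column

Sigma = np.cov(X, rowvar=False)
eigvals, V = np.linalg.eigh(Sigma)                 # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]                  # reorder descending
eigvals, V = eigvals[order], V[:, order]

P = X @ V                                          # principal component scores
explained = eigvals / eigvals.sum()
print("Variance explained:", explained.round(3))
print("PC1 loadings:", V[:, 0].round(3))           # roughly equal, all the same sign
```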
The first principal component typically captures broad market movements—all loadings have the same sign. Later components capture relative value or sector bets, and loadings have mixed signs. This structure emerges naturally from the correlation structure of returns.
PCA in Practice: Interest Rate Curves
One of the most powerful applications of PCA in finance is analyzing yield curve movements. Interest rates at different maturities are highly correlated, but PCA reveals that three factors explain over 95% of yield curve variation:
- Level (PC1): Parallel shift in the curve. All rates move together.
- Slope (PC2): Steepening or flattening. Short and long rates move in opposite directions.
- Curvature (PC3): Bending. The middle of the curve moves relative to the ends.
This parsimony (three numbers summarizing an entire term structure) shows the value of PCA for dimensionality reduction. Instead of tracking eight or more rates independently, a fixed income trader can focus on level, slope, and curvature exposures.
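A sketch that builds synthetic yield-curve changes from level, slope, and curvature factors and lets PCA recover them; the maturity grid, factor shapes, and factor volatilities are assumptions made for the illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
maturities = np.array([0.25, 1, 2, 3, 5, 7, 10, 30])   # years (illustrative grid)
m = (maturities - maturities.mean()) / maturities.std()

level = np.ones_like(m)                  # parallel shift
slope = m                                # short vs. long rates
curve = m**2 - (m**2).mean()             # belly vs. wings
shapes = np.vstack([level, slope, curve])

T = 1000
factor_moves = rng.normal(0.0, [0.08, 0.03, 0.01], size=(T, 3))  # assumed factor vols
dY = factor_moves @ shapes + rng.normal(0.0, 0.002, size=(T, len(maturities)))

eigvals, V = np.linalg.eigh(np.cov(dY, rowvar=False))
order = np.argsort(eigvals)[::-1]
explained = eigvals[order] / eigvals.sum()
print("Top 3 PCs explain:", explained[:3].sum().round(3))   # typically above 0.95
print("PC1 loadings:", V[:, order[0]].round(2))             # same sign at every maturity
```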
PC1 shows positive loadings across all maturities, confirming it captures level shifts. PC2 is negative for short maturities and positive for long ones, capturing slope. PC3 has a distinctive "humped" pattern—negative at the ends, positive in the middle—capturing curvature. Bond portfolio managers use these insights to decompose and hedge their interest rate risk.
Singular Value Decomposition (SVD)
While eigendecomposition only applies to square matrices, Singular Value Decomposition works for any matrix. For an $m \times n$ matrix $A$:

$$A = U \Sigma V^\top$$

where:
- $U$: an $m \times m$ orthogonal matrix whose columns are the left singular vectors (patterns in the row space, e.g., time patterns)
- $\Sigma$: an $m \times n$ diagonal matrix with non-negative singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge 0$ on the diagonal
- $V$: an $n \times n$ orthogonal matrix whose columns are the right singular vectors (patterns in the column space, e.g., asset patterns)

The singular values measure the "importance" of each component. For a return matrix with days as rows and assets as columns, $U$ shows when certain patterns occurred, $V$ shows which assets participated in each pattern, and the singular values in $\Sigma$ show how strong each pattern was.
The strength of SVD lies in its universality and interpretability. Any matrix, not just square symmetric ones, can be decomposed into rotations and scalings. The singular values are always non-negative and ordered by magnitude, providing a natural ranking of importance. Truncating the SVD by keeping only the $k$ largest singular values gives the best rank-$k$ approximation to the original matrix in terms of Frobenius norm, a result known as the Eckart-Young theorem. This makes SVD the mathematical foundation for dimensionality reduction and data compression.
SVD is useful for handling non-square return matrices (different numbers of days vs. assets) and for computing the pseudoinverse used in least squares solutions.
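A sketch confirming the SVD–eigendecomposition link on a centered return matrix: squared singular values divided by the number of observations minus one equal the covariance eigenvalues. The simulated returns are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
T, n = 252, 5
X = rng.normal(0.0005, 0.01, size=(T, n))   # simulated daily returns
X = X - X.mean(axis=0)                       # center each column

U, s, Vt = np.linalg.svd(X, full_matrices=False)
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]   # descending order

print("sigma^2/(T-1):", (s**2 / (T - 1)).round(6))
print("eigenvalues:  ", eigvals.round(6))    # should match the line above
```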
The close match between eigenvalues and σ²/(n-1) confirms the mathematical relationship between SVD and eigendecomposition. The largest singular value corresponds to the dominant pattern in the return matrix, typically the market factor. This relationship is why PCA can be computed efficiently via SVD, which is numerically more stable than computing the eigendecomposition of the covariance matrix directly.
Cholesky Decomposition
For positive definite matrices like valid covariance matrices, Cholesky decomposition provides a "square root":

$$\Sigma = L L^\top$$

where:
- $L$: a lower triangular matrix (all entries above the diagonal are zero)
- $L^\top$: the transpose of $L$ (an upper triangular matrix)

The Cholesky factor is unique for positive definite matrices and can be computed efficiently in roughly $n^3/3$ operations, half the cost of a general LU factorization.
To simulate correlated returns, draw a vector $\mathbf{z}$ of independent standard normal variables and form $\mathbf{x} = L\mathbf{z}$, where $L$ is lower triangular; then $\operatorname{Cov}(\mathbf{x}) = L\,\operatorname{Cov}(\mathbf{z})\,L^\top = LL^\top = \Sigma$. This is computationally efficient and important for simulating correlated random variables, essential for Monte Carlo methods in derivatives pricing and risk management.
The lower triangular structure of has a clear interpretation in terms of sequential dependence. The first asset's random component depends only on one source of randomness. The second asset depends on the first asset's randomness through correlation, plus its own independent randomness. The third asset depends on both previous assets' randomness plus a third independent source. This cascading structure mirrors how correlated variables can be constructed. Start with independent random variables and mix them according to the correlation structure.
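A sketch of Cholesky-based sampling with an assumed covariance matrix: draw independent normals, mix them through L, and confirm the sample covariance matches:

```python
import numpy as np

Sigma = np.array([                      # assumed covariance matrix (illustrative)
    [0.0400, 0.0150, 0.0060],
    [0.0150, 0.0625, 0.0150],
    [0.0060, 0.0150, 0.0225],
])
L = np.linalg.cholesky(Sigma)           # lower triangular factor, Sigma = L L'

rng = np.random.default_rng(11)
Z = rng.standard_normal((100_000, 3))   # independent standard normal draws
X = Z @ L.T                             # each row now has covariance Sigma

print("L:\n", L.round(4))
print("Sample covariance:\n", np.cov(X, rowvar=False).round(4))
```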
The simulated covariance matrix closely matches the original, confirming that our Cholesky-based sampling correctly reproduces the correlation structure. The lower triangular form of L shows how each asset's random component builds on the previous ones: Asset 1 has independent randomness, Asset 2 combines Asset 1's randomness with its own, and Asset 3 incorporates components from both. This cascading structure is why Cholesky decomposition efficiently generates correlated samples.
The Cholesky decomposition allows us to generate correlated random returns for Monte Carlo simulation. This is essential for pricing path-dependent derivatives and computing Value-at-Risk.
A Complete Example: Minimum Variance Portfolio
Let's bring together the linear algebra concepts to solve a standard portfolio optimization problem: finding the portfolio with minimum variance subject to being fully invested.
The optimization problem is:

$$\min_{\mathbf{w}} \; \mathbf{w}^\top \Sigma \mathbf{w}$$

subject to:

$$\mathbf{1}^\top \mathbf{w} = 1$$

where $\mathbf{1}$ is a vector of ones (enforcing that weights sum to 1, meaning the portfolio is fully invested).

This problem asks: among all portfolios that invest 100% of capital (no cash, no leverage), which one has the smallest variance? The objective function is the quadratic form we encountered earlier, portfolio variance expressed in matrix notation. The constraint is a linear equation requiring weights to sum to unity.
Using Lagrange multipliers, the solution is:

$$\mathbf{w}^* = \frac{\Sigma^{-1} \mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}$$

where:
- $\mathbf{w}^*$: the optimal weight vector that minimizes portfolio variance
- $\Sigma^{-1}$: the inverse of the covariance matrix
- The numerator $\Sigma^{-1}\mathbf{1}$ gives "risk-adjusted" weights favoring low-variance and negatively-correlated assets
- The denominator $\mathbf{1}^\top \Sigma^{-1} \mathbf{1}$ normalizes these weights to sum to 1
The appearance of $\Sigma^{-1}$ in the solution formula is typical of quadratic optimization with linear constraints. Intuitively, the inverse covariance matrix reweights assets to account for correlation. An asset that seems low-variance in isolation might contribute more risk than expected if it's highly correlated with other holdings. The inverse covariance matrix adjusts for these interactions, identifying the truly risk-efficient allocation.
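A sketch of the closed-form solution with the same illustrative covariance matrix used earlier, solving the linear system rather than forming the inverse explicitly:

```python
import numpy as np

Sigma = np.array([                      # illustrative covariance matrix
    [0.0400, 0.0150, 0.0060],
    [0.0150, 0.0625, 0.0150],
    [0.0060, 0.0150, 0.0225],
])
ones = np.ones(3)

sigma_inv_ones = np.linalg.solve(Sigma, ones)       # Sigma^{-1} 1 without an explicit inverse
w_min = sigma_inv_ones / (ones @ sigma_inv_ones)    # normalize so weights sum to 1

port_vol = np.sqrt(w_min @ Sigma @ w_min)
print("Minimum variance weights:", w_min.round(3))
print(f"Portfolio volatility: {port_vol:.2%}")
```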
The minimum variance portfolio allocates more to lower-volatility assets and exploits negative or low correlations to reduce overall portfolio risk. Notice that this solution used matrix inversion ($\Sigma^{-1}$) and multiple matrix-vector multiplications: fundamental linear algebra operations that power quantitative portfolio construction.
Key Parameters
The key parameters for linear algebra operations in quantitative finance are:
- Portfolio weights (w): The fraction of portfolio value allocated to each asset. Must sum to 1 for a fully invested portfolio.
- Covariance matrix (Σ): Captures variance of individual assets (diagonal) and co-movement between assets (off-diagonal). Must be positive semi-definite.
- Eigenvalues (λ): Represent variance along each principal direction. Larger eigenvalues indicate more important risk factors.
- Eigenvectors (V): Define the principal directions of variation. For covariance matrices, these are orthogonal and reveal underlying factor structure.
- Condition number: Ratio of largest to smallest eigenvalue. High values indicate numerical instability in matrix inversion.
Limitations and Practical Considerations
Linear algebra provides clean solutions, but real-world implementation requires caution around several issues.
Estimation error is the most significant challenge. Covariance matrices estimated from historical data are noisy, especially with many assets relative to the number of observations. A 500-stock universe with 252 days of data means estimating 125,250 covariance parameters from only 126,000 data points, almost one parameter per observation. This leads to unstable matrix inverses and portfolio weights that swing wildly with small changes in the data. Practitioners address this through shrinkage estimators, factor models, and regularization techniques that we'll explore in later chapters.
Numerical stability compounds the estimation problem. Covariance matrices with eigenvalues near zero relative to the largest produce high condition numbers (the ratio of largest to smallest eigenvalue) that make matrix inversion unreliable. Double-precision floating point arithmetic can introduce errors that propagate through calculations. Using SVD-based pseudoinverses rather than direct matrix inversion, and checking condition numbers before inverting, helps avoid numerical problems.
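A small sketch of the kind of check described here: inspect the condition number before inverting, and fall back to a pseudoinverse when it is large. The threshold and function name are arbitrary illustrations:

```python
import numpy as np

def safe_inverse(Sigma: np.ndarray, max_cond: float = 1e8) -> np.ndarray:
    """Invert a covariance matrix, falling back to the SVD-based pseudoinverse
    when the condition number suggests a direct inverse would be unreliable."""
    cond = np.linalg.cond(Sigma)
    if cond > max_cond:
        return np.linalg.pinv(Sigma)     # truncated-SVD pseudoinverse
    return np.linalg.inv(Sigma)
```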
Non-stationarity presents a major challenge to our matrix-based models. The covariance structure of asset returns shifts over time as market conditions change. A covariance matrix estimated during calm markets will underestimate risk during crises. This is exactly when accurate risk measurement matters most. Rolling window estimation, exponentially weighted covariance, and regime-switching models attempt to capture time-varying structure, but no approach perfectly solves the problem of predicting future relationships from past data.
Summary
This chapter covered the linear algebra foundations that underpin quantitative finance. Key concepts include:
Vectors represent portfolios, returns, and factor exposures. The dot product of weight and return vectors gives portfolio return. Vector norms measure size and appear in regularization constraints.
Matrices organize returns over time and across assets. The covariance matrix describes asset relationships, and the quadratic form computes portfolio variance. Matrix multiplication transforms between spaces and aggregates calculations efficiently.
Systems of linear equations solve hedging problems, factor replication, and regression. Hedging finds instrument quantities to neutralize risk, replication matches target exposures, and regression estimates factor loadings. Least squares handles overdetermined systems by minimizing squared residuals.
Matrix decompositions expose structure hidden in raw data. Eigendecomposition of covariance matrices identifies principal risk directions. PCA reduces dimensionality by projecting onto the dominant eigenvectors. Cholesky decomposition enables efficient simulation of correlated variables.
These tools form the computational backbone for the portfolio optimization, factor models, and risk management techniques developed in subsequent chapters. Understanding them lets you implement quantitative strategies from first principles rather than treating standard library functions as black boxes.