Position Sizing & Leverage: Kelly Criterion Strategy

Michael Brenndoerfer · January 23, 2026 · 46 min read

Master optimal position sizing using the Kelly Criterion, risk budgeting, and volatility targeting. Learn how leverage impacts drawdowns and long-term growth.


Position Sizing and Leverage Management

A brilliant trading strategy with a genuine edge can still lead to ruin if positions are sized incorrectly. Conversely, a mediocre strategy with excellent position sizing can outperform a superior strategy with poor sizing over time. Position sizing, determining how much capital to allocate to each trade, is one of the most underappreciated aspects of quantitative trading, yet it fundamentally determines whether a strategy compounds wealth or destroys it.

In earlier chapters, we developed strategies for mean reversion, momentum, factor investing, and other approaches. We learned to backtest these strategies and measure their performance. But we largely sidestepped a crucial question: given a strategy with a positive expected return, how much should you bet? The answer isn't "as much as possible." Aggressive betting increases variance and can lead to catastrophic drawdowns from which recovery becomes mathematically improbable. If position sizing is too conservative, you leave substantial returns on the table, failing to adequately utilize your edge.

This chapter addresses the mathematics and practice of optimal position sizing. We begin with the Kelly Criterion, a foundational result from information theory that provides a principled answer to optimal bet sizing. We then extend to multi-strategy portfolios through risk budgeting and capital allocation frameworks. Finally, we examine the practical constraints of leverage limits, margin requirements, and the sobering lessons from funds that have blown up due to excessive leverage.

The Kelly Criterion

The Kelly Criterion, developed by John L. Kelly Jr. at Bell Labs in 1956, answers a simple question: Given an edge in a repeated game, what fraction of capital should be wagered to maximize long-term wealth growth? Kelly's insight, originally applied to information transmission over noisy channels, has profound implications for trading and investment. The criterion emerges from a fundamental tension in betting: bet too small and you fail to exploit your edge, but bet too large and you risk catastrophic losses that compound negatively over time. Kelly's mathematical framework resolves this tension by identifying the unique betting fraction that maximizes the expected geometric growth rate of capital.

The Single-Bet Case

Consider a simple gambling scenario where you repeatedly face a bet with the following characteristics:

  • Win probability: $p$
  • Loss probability: $q = 1 - p$
  • Win payoff: $b$ (for every dollar wagered, you receive $b$ dollars profit)
  • Loss payoff: $-1$ (you lose your entire wager)

To understand why position sizing matters so critically, consider what happens when you repeatedly face this bet. If you bet fraction $f$ of your current capital on each round, the multiplicative nature of returns creates a fundamentally different dynamic than additive returns. After a win, your capital multiplies by $(1 + bf)$, and after a loss it multiplies by $(1 - f)$. This multiplicative structure means that the sequence of wins and losses matters far less than you might expect: what matters is the geometric growth rate.

After $n$ trials with $w$ wins and $l = n - w$ losses, your final wealth $W_n$ starting from initial wealth $W_0$ is:

$$W_n = W_0 (1 + bf)^w (1 - f)^l$$

where:

  • $W_n$: final wealth after $n$ trials
  • $W_0$: initial wealth
  • $n$: number of trials
  • $b$: win payoff (profit per dollar wagered)
  • $f$: fraction of capital wagered per trial
  • $w$: number of wins
  • $l$: number of losses

This formula captures the essence of compounding. Notice that wealth is a product of factors, not a sum. This multiplicative structure has profound implications: a single devastating loss can overwhelm many small gains. If you bet your entire capital ($f = 1$) and lose even once, you're wiped out regardless of how many wins preceded or followed that loss.

Taking logarithms transforms the multiplicative relationship into an additive one, which proves essential for analysis. The logarithm of wealth growth becomes:

$$\log\left(\frac{W_n}{W_0}\right) = w \log(1 + bf) + l \log(1 - f)$$

where:

  • $W_n$: final wealth
  • $W_0$: initial wealth
  • $w$: number of wins
  • $l$: number of losses
  • $b$: win payoff
  • $f$: fraction of capital wagered

This transformation reveals why logarithmic returns are natural for analyzing betting and investment: they turn compound growth into a sum of independent contributions from each trial.

The expected log growth per trial, which we call the geometric growth rate, captures the long-run behavior of the strategy. By the law of large numbers, over many trials the actual growth rate converges to this expected value. The geometric growth rate is:

$$G(f) = \mathbb{E}\left[\frac{1}{n}\log\left(\frac{W_n}{W_0}\right)\right] = p \log(1 + bf) + q \log(1 - f)$$

where:

  • $G(f)$: expected geometric growth rate per trial
  • $p$: probability of winning
  • $q$: probability of losing ($1 - p$)
  • $b$: win payoff
  • $f$: fraction of capital wagered

This function $G(f)$ is the central object of Kelly's analysis. It is concave in $f$, meaning it rises to a unique maximum and then falls. Betting nothing ($f = 0$) yields zero growth, while betting everything ($f = 1$) yields negative expected growth for any realistic bet because the logarithm of zero (which you face after a loss with $f = 1$) is negative infinity.

To find the optimal fraction $f^*$ that maximizes $G(f)$, we apply standard calculus, differentiating with respect to $f$ and setting the derivative to zero:

$$\begin{aligned} \frac{dG}{df} &= \frac{d}{df}\left(p \log(1 + bf) + q \log(1 - f)\right) \\ &= \frac{pb}{1 + bf} - \frac{q}{1 - f} \end{aligned}$$

The first term represents the marginal benefit of increased betting: with probability $p$, you win, and increasing your bet increases your logarithmic gain. The second term represents the marginal cost: with probability $q$, you lose, and increased betting amplifies your logarithmic loss. At the optimal point, these marginal effects balance perfectly.

Setting the derivative to zero gives us the first-order condition:

$$\frac{pb}{1 + bf} - \frac{q}{1 - f} = 0$$

where:

  • $G$: expected growth rate
  • $f$: fraction of capital wagered
  • $p$: win probability
  • $q$: loss probability
  • $b$: win payoff

Solving for $f^*$ requires straightforward algebraic manipulation. We rearrange to isolate $f^*$:

$$\begin{aligned} \frac{pb}{1 + bf^*} &= \frac{q}{1 - f^*} \\ \implies pb(1 - f^*) &= q(1 + bf^*) && \text{(cross-multiply)} \\ \implies pb - pbf^* &= q + qbf^* && \text{(expand terms)} \\ \implies pb - q &= pbf^* + qbf^* && \text{(group constants and variables)} \\ \implies pb - q &= bf^*(p + q) && \text{(factor out } bf^* \text{)} \\ \implies pb - q &= bf^* && \text{(since } p + q = 1 \text{)} \end{aligned}$$

The final step uses the fundamental probability constraint that win and loss probabilities must sum to one. This yields the celebrated Kelly formula:

$$f^* = \frac{pb - q}{b} = \frac{p(b + 1) - 1}{b}$$

where:

  • $f^*$: optimal fraction of capital to wager
  • $p$: win probability
  • $q$: loss probability
  • $b$: win payoff (odds)

This result admits a clear interpretation. The numerator $pb - q$ represents the expected profit per dollar wagered, often called the "edge." The denominator $b$ represents the odds. The optimal bet fraction is simply the edge divided by the odds. When the odds are generous (large $b$), you can afford to bet more conservatively. When the odds are stingy (small $b$), you must bet more aggressively to exploit a given edge.

This is the Kelly Criterion for a simple bet. For the special case of even odds ($b = 1$), the formula simplifies even further:

$$f^* = 2p - 1 = p - q$$

where:

  • $f^*$: optimal fraction for even odds
  • $p$: win probability
  • $q$: loss probability

The result is intuitive: bet a fraction equal to the edge (win probability minus loss probability). If you have a 60% chance of winning an even-money bet, you should wager 20% of your capital. This provides a concrete, actionable rule that balances the benefit of exploiting your edge against the risk of overbetting.
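
These formulas are simple enough to verify directly. Here is a minimal sketch, using the illustrative 60% win probability from above:

import numpy as np

# Kelly fraction for a general binary bet: f* = (p*b - q) / b
def kelly_fraction(p, b):
    """Optimal fraction of capital for win probability p and odds b."""
    q = 1 - p
    return (p * b - q) / b

# Even-money bet (b = 1) with a 60% win probability
f_star = kelly_fraction(0.60, 1.0)  # 2p - 1 = 0.20

# Expected geometric growth rate at the optimum
p, b, q = 0.60, 1.0, 0.40
growth = p * np.log(1 + b * f_star) + q * np.log(1 - f_star)
print(f"f* = {f_star:.2f}, growth per bet = {growth:.4f}")  # ~0.0201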

Out[2]:
Visualization
Expected geometric growth rate as a function of bet size for a wager with 55% win probability and 1:1 odds. The optimal Kelly bet is 10% ($f^*=0.1$). Betting more than roughly 20% leads to negative long-term growth despite the positive expected value of the bet, illustrating the 'Kelly cliff' where over-betting guarantees ruin.
Kelly Criterion

The Kelly Criterion states that to maximize the long-term geometric growth rate of capital, bet a fraction $f^* = \frac{pb - q}{b}$ of your capital on each wager, where $p$ is the win probability, $q = 1 - p$ is the loss probability, and $b$ is the odds (profit per dollar wagered on a win).

Continuous Returns: Kelly for Trading

Real trading doesn't involve discrete binary outcomes. Instead, returns are continuous and approximately normally distributed over short horizons. We can derive a continuous version of the Kelly formula that applies directly to realistic trading scenarios where positions may gain or lose varying amounts.

Suppose a strategy has expected return $\mu$ and volatility $\sigma$ per period. If you apply leverage $\ell$ (meaning you bet $\ell$ times your capital), your leveraged return is $r_\ell = \ell \cdot r$ where $r \sim N(\mu, \sigma^2)$. Leverage scales both the expected return and the volatility. The expected leveraged return is $\mathbb{E}[r_\ell] = \ell\mu$ and the variance is $\text{Var}(r_\ell) = \ell^2\sigma^2$. Notice that variance scales with the square of leverage, which foreshadows the extreme danger of high leverage.

For small returns, a reasonable assumption for daily or weekly trading, the expected log return, which determines geometric growth, is approximately:

$$\begin{aligned} \mathbb{E}[\log(1 + r_\ell)] &\approx \mathbb{E}[r_\ell] - \frac{1}{2}\text{Var}(r_\ell) \\ &= \ell\mu - \frac{1}{2}\ell^2\sigma^2 \end{aligned}$$

where:

  • $r_\ell$: leveraged return
  • $\ell$: leverage ratio
  • $\mu$: expected return of the strategy
  • $\sigma^2$: variance of the strategy

This approximation uses the Taylor expansion $\log(1 + x) \approx x - \frac{x^2}{2}$ for small $x$. The formula has a compelling structure: the first term $\ell\mu$ is the expected arithmetic return, which increases linearly with leverage. The second term $\frac{1}{2}\ell^2\sigma^2$ is the variance penalty, sometimes called "volatility drag," which increases quadratically with leverage. This quadratic penalty is why unlimited leverage is disastrous: at high enough leverage, the variance penalty overwhelms any expected return.

To maximize this geometric growth rate, we differentiate with respect to $\ell$:

$$\begin{aligned} \frac{d}{d\ell}\left(\ell\mu - \frac{1}{2}\ell^2\sigma^2\right) &= \frac{d}{d\ell}(\ell\mu) - \frac{1}{2}\sigma^2 \frac{d}{d\ell}(\ell^2) \\ &= \mu - \frac{1}{2}\sigma^2(2\ell) \\ &= \mu - \ell\sigma^2 \end{aligned}$$

The derivative has two terms: $\mu$ represents the marginal benefit of increased leverage (more expected return), while $\ell\sigma^2$ represents the marginal cost (more variance penalty). Setting the derivative to zero identifies where these opposing forces balance:

$$\mu - \ell\sigma^2 = 0$$

where:

  • $\ell$: leverage ratio
  • $\mu$: expected return
  • $\sigma^2$: return variance

Solving this simple equation yields the continuous Kelly leverage formula:

$$\ell^* = \frac{\mu}{\sigma^2}$$

where:

  • $\ell^*$: optimal leverage ratio
  • $\mu$: expected return
  • $\sigma^2$: return variance

This formula is intuitive. Optimal leverage increases with expected return, as you should bet more aggressively when the edge is larger. Optimal leverage decreases with variance, as you should bet more conservatively when uncertainty is higher. The formula can be rewritten using the Sharpe ratio $S = \mu/\sigma$:

$$\ell^* = \frac{\mu}{\sigma^2} = \frac{S}{\sigma}$$

where:

  • $\ell^*$: optimal leverage ratio
  • $\mu$: expected return
  • $S$: Sharpe ratio ($\mu/\sigma$)
  • $\sigma$: return volatility

This alternative form reveals that optimal leverage is the Sharpe ratio divided by volatility. A strategy with higher Sharpe ratio warrants more aggressive position sizing because the edge relative to risk is larger. A strategy with lower volatility also warrants higher leverage because the same proportional bet involves less absolute risk. This formula provides the foundational insight for sizing trading positions optimally.
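
To see the magnitudes involved, here is a minimal sketch computing optimal leverage for an illustrative strategy (the return and volatility figures are assumptions for demonstration, not recommendations):

# Continuous Kelly leverage: l* = mu / sigma^2 = Sharpe / sigma
mu = 0.08      # assumed annual expected excess return
sigma = 0.16   # assumed annual volatility

kelly_leverage = mu / sigma**2          # 3.125x
sharpe = mu / sigma                     # 0.50
assert abs(kelly_leverage - sharpe / sigma) < 1e-12

# Growth rate at full Kelly: mu^2 / (2 sigma^2)
growth_full = mu**2 / (2 * sigma**2)    # 12.5% log growth per year
print(f"l* = {kelly_leverage:.2f}x, Sharpe = {sharpe:.2f}, G = {growth_full:.3f}")

Note that even a Sharpe 0.5 strategy has a full-Kelly leverage above 3x; the fractional Kelly discussion below explains why running at that level is rarely advisable.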

Properties of Kelly Betting

Kelly betting has several important mathematical properties that make it theoretically attractive:

  • Maximizes geometric growth rate: No other fixed-fraction strategy achieves higher long-term wealth growth. This optimality is exact under the model assumptions.
  • Never goes bankrupt: Since you only bet a fraction of capital, you always have something left, though it can become arbitrarily small. This contrasts with fixed-dollar betting, which can lead to complete ruin.
  • Variance increases with leverage: At Kelly leverage, the variance of log returns equals the squared Sharpe ratio: $\text{Var}(\log W) = S^2$ per unit time. This provides a direct link between strategy quality and wealth volatility.

However, Kelly betting also has significant drawbacks that limit its practical applicability:

  • Extreme drawdowns: Full Kelly can produce drawdowns of 50% or more with high probability, even for profitable strategies. These drawdowns are mathematically expected, not rare events, and they create severe psychological challenges.
  • Parameter sensitivity: The optimal fraction depends critically on $\mu$ and $\sigma$, which must be estimated. Overestimating $\mu$ or underestimating $\sigma$ leads to overbetting, which can be catastrophic because overbetting beyond Kelly has worse expected growth than underbetting by the same proportional amount.
  • Assumes ergodicity: Kelly assumes you face the same bet repeatedly forever with unchanging parameters. Finite horizons or changing conditions violate this assumption, and the infinite-horizon optimal strategy may perform poorly over realistic investment horizons.

Fractional Kelly

Given the practical dangers of full Kelly betting, most practitioners use fractional Kelly, betting a fraction (typically 0.25 to 0.5) of the Kelly-optimal amount. This sacrifices some expected growth for substantially reduced variance and drawdown risk. The rationale is that we never know the true parameters $\mu$ and $\sigma$, so using full Kelly based on estimates is almost certainly betting too aggressively.

If we denote the Kelly fraction multiplier as $\kappa \in (0, 1]$, the leveraged position becomes $\ell = \kappa \cdot \ell^* = \kappa \frac{\mu}{\sigma^2}$. The multiplier $\kappa$ represents our conservatism: $\kappa = 1$ is full Kelly, $\kappa = 0.5$ is half-Kelly, and $\kappa = 0.25$ is quarter-Kelly.

The expected geometric growth rate at fractional Kelly follows from substituting the scaled leverage into our growth rate formula:

$$\begin{aligned} G(\kappa) &= \kappa \ell^* \mu - \frac{1}{2}(\kappa \ell^*)^2 \sigma^2 \\ &= \kappa \frac{\mu^2}{\sigma^2} - \frac{1}{2}\kappa^2 \frac{\mu^2}{\sigma^2} && \text{(substitute } \ell^* = \mu/\sigma^2 \text{)} \\ &= \frac{\mu^2}{\sigma^2}\left(\kappa - \frac{\kappa^2}{2}\right) && \text{(factor out } \mu^2/\sigma^2 \text{)} \end{aligned}$$

where:

  • $G(\kappa)$: expected geometric growth rate with fractional Kelly
  • $\kappa$: Kelly fraction multiplier ($\kappa \in (0, 1]$)
  • $\ell^*$: optimal full Kelly leverage
  • $\mu$: expected return
  • $\sigma$: volatility

This formula reveals the fundamental tradeoff in fractional Kelly. The term $\kappa$ in the parentheses represents growth that increases linearly with aggressiveness, while the term $\kappa^2/2$ represents the variance penalty that increases quadratically. As $\kappa$ increases from zero, growth initially increases faster than the penalty, but eventually the penalty dominates.

At full Kelly ($\kappa = 1$), the growth rate is $G(1) = \frac{\mu^2}{2\sigma^2}$. At half-Kelly ($\kappa = 0.5$), the growth rate is:

$$\begin{aligned} G(0.5) &= \frac{\mu^2}{\sigma^2}\left(0.5 - \frac{0.5^2}{2}\right) \\ &= \frac{\mu^2}{\sigma^2}(0.5 - 0.125) \\ &= 0.375 \frac{\mu^2}{\sigma^2} \\ &= 0.75 \left( \frac{\mu^2}{2\sigma^2} \right) \\ &= 0.75 \, G(1) \end{aligned}$$

where:

  • $G(0.5)$: expected geometric growth rate at half-Kelly
  • $G(1)$: expected geometric growth rate at full Kelly
  • $\mu$: expected return
  • $\sigma$: volatility

This calculation shows the efficiency of half-Kelly: it achieves 75% of the growth rate but with only 25% of the variance, since variance scales as $\kappa^2$. This tradeoff is highly attractive for risk-averse investors who value smoother wealth paths over maximum expected growth. The insight that you can sacrifice only 25% of growth while eliminating 75% of variance explains why fractional Kelly dominates practical applications.

In[3]:
Code
import numpy as np

# Analyze fractional Kelly tradeoffs
kappa = np.linspace(0.01, 2, 200)  # Kelly fraction from 0.01x to 2x

# Normalized growth rate (as fraction of max Kelly growth)
# G(kappa) = kappa - 0.5*kappa^2 (normalized)
growth_rate = kappa - 0.5 * kappa**2

# Variance (as fraction of full Kelly variance)
variance = kappa**2

# Growth-to-variance ratio (efficiency)
efficiency = growth_rate / variance
efficiency[variance < 0.01] = np.nan  # Avoid division issues
Out[4]:
Visualization
Three panel chart showing growth rate, variance, and efficiency versus Kelly fraction.
Expected geometric growth rate versus Kelly fraction. Growth peaks at the optimal fraction ($f^*$) and declines thereafter.
Variance of returns versus Kelly fraction. Variance increases quadratically with leverage, meaning Full Kelly has four times the variance of Half Kelly.
Capital efficiency (growth/variance) versus Kelly fraction. Efficiency is highest at low leverage levels and decreases as leverage increases.

The efficiency plot reveals a crucial insight: smaller Kelly fractions provide better return per unit of risk taken. As the fraction increases toward full Kelly and beyond, efficiency deteriorates rapidly. This mathematical relationship explains why experienced practitioners almost universally advocate for conservative position sizing.

Risk Budgeting and Capital Allocation

Real portfolios contain multiple strategies, each with its own expected return, volatility, and correlations with other strategies. How should capital be allocated across strategies to maximize overall portfolio performance? This question extends Kelly's single-bet framework to the multi-dimensional setting that characterizes real trading operations.

Multi-Strategy Framework

Consider a portfolio of $n$ strategies with return vector $\mathbf{r} = (r_1, \ldots, r_n)^T$, expected return vector $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_n)^T$, and covariance matrix $\boldsymbol{\Sigma}$. Let $\mathbf{w} = (w_1, \ldots, w_n)^T$ be the weight, or capital allocation, vector. The notation uses vectors and matrices because the interactions between strategies, captured by the covariance matrix, fundamentally affect optimal allocation.

The portfolio expected return is $\mu_p = \mathbf{w}^T \boldsymbol{\mu}$, which is simply the weighted average of individual strategy returns. The portfolio variance is $\sigma_p^2 = \mathbf{w}^T \boldsymbol{\Sigma} \mathbf{w}$, which accounts not only for individual strategy variances but also for all pairwise covariances. When strategies are negatively correlated, diversification reduces portfolio variance below the weighted average of individual variances.

As we discussed in Modern Portfolio Theory and Mean-Variance Optimization, the maximum Sharpe ratio portfolio solves:

$$\max_{\mathbf{w}} \frac{\mathbf{w}^T \boldsymbol{\mu}}{\sqrt{\mathbf{w}^T \boldsymbol{\Sigma} \mathbf{w}}}$$

where:

  • $\mathbf{w}$: portfolio weight vector
  • $\boldsymbol{\mu}$: expected return vector
  • $\boldsymbol{\Sigma}$: covariance matrix of returns

This optimization seeks the portfolio with the best risk-adjusted return, balancing expected return in the numerator against risk in the denominator. For unconstrained optimization, which allows both leverage and short selling, the solution has a closed form:

$$\mathbf{w}^* \propto \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}$$

where:

  • $\mathbf{w}^*$: optimal weight vector
  • $\boldsymbol{\Sigma}^{-1}$: inverse covariance matrix
  • $\boldsymbol{\mu}$: expected return vector

This is the multi-asset generalization of the Kelly criterion. Each strategy receives weight proportional to its expected return, adjusted by the inverse covariance matrix, which accounts for both individual volatility and correlations. The inverse covariance matrix effectively adjusts weights downward for volatile strategies and for strategies that are highly correlated with others, because these provide less diversification benefit.

Independent Strategies

When strategies are independent, meaning they have zero correlation with each other, the covariance matrix is diagonal: $\boldsymbol{\Sigma} = \text{diag}(\sigma_1^2, \ldots, \sigma_n^2)$. The inverse of a diagonal matrix is simply the diagonal matrix of reciprocals. In this special case, the optimal weights simplify dramatically to:

$$w_i^* \propto \frac{\mu_i}{\sigma_i^2}$$

where:

  • $w_i^*$: optimal weight for strategy $i$
  • $\mu_i$: expected return of strategy $i$
  • $\sigma_i^2$: variance of strategy $i$

This formula states that each strategy should be sized according to its individual Kelly criterion, independent of other strategies. This independence is powerful because it means you can optimize each strategy separately without worrying about interactions. The total portfolio leverage is then simply the sum of individual leverages.

For correlated strategies, the picture is more complex. Positive correlation between strategies reduces diversification benefits, and the optimal allocation accounts for this by reducing weights on highly correlated strategies. Intuitively, having two highly correlated strategies is almost like having twice the position in a single strategy, which may violate risk limits even when individual position sizes appear reasonable.

Risk Budgeting Framework

An alternative to return-based allocation is risk budgeting, where we allocate a "risk budget" to each strategy rather than capital directly. This approach is particularly useful when expected returns are uncertain but risk estimates are more reliable. In practice, volatilities and correlations tend to be more persistent and easier to estimate than expected returns, making risk budgeting a more robust framework.

Risk Contribution

The marginal risk contribution (MRC) of strategy $i$ to portfolio volatility measures how much portfolio risk increases when you slightly increase the weight of strategy $i$. Formally, it is defined as:

$$\text{MRC}_i = \frac{\partial \sigma_p}{\partial w_i} = \frac{(\boldsymbol{\Sigma} \mathbf{w})_i}{\sigma_p}$$

where:

  • $\text{MRC}_i$: marginal risk contribution of strategy $i$
  • $\sigma_p$: portfolio volatility
  • $w_i$: weight of strategy $i$
  • $\boldsymbol{\Sigma}$: covariance matrix of returns
  • $(\boldsymbol{\Sigma} \mathbf{w})_i$: $i$-th element of the marginal covariance vector

The term $(\boldsymbol{\Sigma} \mathbf{w})_i$ is the $i$-th element of the vector obtained by multiplying the covariance matrix by the weight vector. It represents the covariance of strategy $i$ with the overall portfolio.

The total risk contribution (TRC) measures how much of the portfolio's total risk is attributable to a particular strategy. It combines the marginal contribution with the position size:

$$\text{TRC}_i = w_i \cdot \text{MRC}_i = \frac{w_i (\boldsymbol{\Sigma} \mathbf{w})_i}{\sigma_p}$$

where:

  • $\text{TRC}_i$: total risk contribution of strategy $i$
  • $w_i$: weight of strategy $i$
  • $\text{MRC}_i$: marginal risk contribution
  • $\sigma_p$: portfolio volatility

A fundamental property of risk contributions is that they decompose portfolio risk additively. The sum of total risk contributions equals the portfolio volatility:

$$\sum_i \text{TRC}_i = \sigma_p$$

where:

  • $\text{TRC}_i$: total risk contribution
  • $\sigma_p$: portfolio volatility

This decomposition property means that risk contributions provide a complete accounting of where portfolio risk comes from.
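
The additivity is easy to verify numerically. Below is a minimal sketch using an arbitrary illustrative covariance matrix and weight vector (all numbers are made up for demonstration):

import numpy as np

# Arbitrary illustrative covariance matrix and weights
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.02, 0.005],
                [0.00, 0.005, 0.015]])
w = np.array([0.4, 0.35, 0.25])

port_vol = np.sqrt(w @ cov @ w)
mrc = (cov @ w) / port_vol        # marginal risk contributions
trc = w * mrc                     # total risk contributions

# The decomposition is exact: the TRCs sum to portfolio volatility
assert np.isclose(trc.sum(), port_vol)
print(trc / port_vol)             # fraction of risk from each strategy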

In a risk budgeting framework, we specify target risk contributions $b_i$ (summing to 1) and find weights such that:

$$\frac{\text{TRC}_i}{\sigma_p} = b_i$$

where:

  • $\text{TRC}_i$: total risk contribution of strategy $i$
  • $\sigma_p$: portfolio volatility
  • $b_i$: target risk contribution proportion for strategy $i$

This constraint says that strategy $i$ should contribute fraction $b_i$ of total portfolio risk. Combining this with the definition of TRC leads to solving:

$$w_i (\boldsymbol{\Sigma} \mathbf{w})_i = b_i \cdot \sigma_p^2 \quad \text{for all } i$$

where:

  • $w_i$: weight of strategy $i$
  • $(\boldsymbol{\Sigma} \mathbf{w})_i$: $i$-th element of the vector $\boldsymbol{\Sigma} \mathbf{w}$ (marginal covariance)
  • $b_i$: target risk budget
  • $\sigma_p^2$: portfolio variance

This system of equations is nonlinear in the weights and typically requires numerical optimization to solve. The most common special case is equal risk contribution, where $b_i = 1/n$ for all strategies. This equal risk contribution constraint is the foundation of risk parity approaches, which have gained substantial popularity in institutional investing.

In[5]:
Code
import numpy as np
from scipy.optimize import minimize


def portfolio_volatility(weights, cov_matrix):
    """Calculate portfolio volatility."""
    return np.sqrt(weights @ cov_matrix @ weights)


def risk_contributions(weights, cov_matrix):
    """Calculate the risk contribution of each asset."""
    port_vol = portfolio_volatility(weights, cov_matrix)
    marginal_contrib = cov_matrix @ weights / port_vol
    risk_contrib = weights * marginal_contrib
    return risk_contrib


def risk_parity_objective(weights, cov_matrix):
    """
    Objective function for risk parity: minimize deviation from equal risk contribution.
    """
    risk_contrib = risk_contributions(weights, cov_matrix)
    target_risk = np.sum(risk_contrib) / len(weights)  # Equal risk
    return np.sum((risk_contrib - target_risk) ** 2)


def optimize_risk_parity(cov_matrix, initial_weights=None):
    """Find risk parity weights given a covariance matrix."""
    n = cov_matrix.shape[0]
    if initial_weights is None:
        initial_weights = np.ones(n) / n

    # Constraints: weights sum to 1, all positive
    constraints = {"type": "eq", "fun": lambda w: np.sum(w) - 1}
    bounds = [(0.01, 1) for _ in range(n)]  # No short selling

    result = minimize(
        risk_parity_objective,
        initial_weights,
        args=(cov_matrix,),
        method="SLSQP",
        constraints=constraints,
        bounds=bounds,
    )

    return result.x

Let's compare different allocation approaches for a three-strategy portfolio.

In[6]:
Code
# Define three strategies with different characteristics
strategy_names = ["Momentum", "Mean Reversion", "Factor"]
expected_returns = np.array([0.12, 0.08, 0.10])  # Annual expected returns
volatilities = np.array([0.20, 0.15, 0.12])  # Annual volatilities

# Correlation matrix (momentum and mean reversion tend to be negatively correlated)
correlation = np.array([[1.0, -0.3, 0.2], [-0.3, 1.0, 0.1], [0.2, 0.1, 1.0]])

# Build covariance matrix
cov_matrix = np.outer(volatilities, volatilities) * correlation

# Method 1: Equal weight allocation
equal_weights = np.ones(3) / 3

# Method 2: Kelly-optimal (inverse variance weighted by expected return)
# For simplicity, use the uncorrelated approximation first
kelly_raw = expected_returns / volatilities**2
kelly_weights = kelly_raw / np.sum(kelly_raw)  # Normalize to sum to 1

# Method 3: True mean-variance optimal (accounts for correlations)
cov_inv = np.linalg.inv(cov_matrix)
mv_raw = cov_inv @ expected_returns
mv_weights = mv_raw / np.sum(mv_raw)  # Normalize

# Method 4: Risk parity
rp_weights = optimize_risk_parity(cov_matrix)


# Calculate portfolio statistics for each method
def portfolio_stats(weights, exp_ret, cov_matrix):
    port_return = weights @ exp_ret
    port_vol = np.sqrt(weights @ cov_matrix @ weights)
    sharpe = port_return / port_vol
    risk_contrib = risk_contributions(weights, cov_matrix)
    return port_return, port_vol, sharpe, risk_contrib


methods = {
    "Equal Weight": equal_weights,
    "Kelly (uncorr)": kelly_weights,
    "Mean-Variance": mv_weights,
    "Risk Parity": rp_weights,
}
Out[7]:
Console
Strategy Characteristics:
--------------------------------------------------
Momentum: E[r] = 12.0%, σ = 20.0%, Sharpe = 0.60
Mean Reversion: E[r] = 8.0%, σ = 15.0%, Sharpe = 0.53
Factor: E[r] = 10.0%, σ = 12.0%, Sharpe = 0.83

======================================================================
Allocation Comparison:
======================================================================

Equal Weight:
  Weights: ['33.3%', '33.3%', '33.3%']
  Risk Contributions: ['49.7%', '21.2%', '29.1%']
  Portfolio: E[r] = 10.00%, σ = 8.95%, Sharpe = 1.12

Kelly (uncorr):
  Weights: ['22.2%', '26.3%', '51.4%']
  Risk Contributions: ['25.4%', '16.2%', '58.4%']
  Portfolio: E[r] = 9.92%, σ = 8.88%, Sharpe = 1.12

Mean-Variance:
  Weights: ['25.7%', '34.2%', '40.1%']
  Risk Contributions: ['31.4%', '27.8%', '40.8%']
  Portfolio: E[r] = 9.83%, σ = 8.66%, Sharpe = 1.14

Risk Parity:
  Weights: ['27.3%', '37.8%', '34.9%']
  Risk Contributions: ['33.7%', '33.6%', '32.7%']
  Portfolio: E[r] = 9.79%, σ = 8.65%, Sharpe = 1.13
Out[8]:
Visualization
Portfolio weights by allocation method. Mean-Variance and Kelly concentrate capital in the Factor strategy, while Risk Parity and Equal Weight provide more balanced allocations.
Risk contributions by allocation method. The Equal Weight portfolio is dominated by Momentum risk, whereas Risk Parity achieves equal risk contributions across all strategies.

The results highlight key differences between allocation approaches:

  • Equal weight treats all strategies identically regardless of their risk or return characteristics.
  • Kelly/Mean-variance tilts heavily toward strategies with better risk-adjusted returns, potentially creating concentrated bets.
  • Risk parity equalizes risk contribution, resulting in higher weights to lower-volatility strategies and more balanced risk exposure.

Notice that mean-variance optimization produces the highest Sharpe ratio by construction, but this comes with concentrated positions that are sensitive to estimation errors in expected returns. Risk parity sacrifices some expected return for more diversified risk exposure.

Leverage Limits and Margin Requirements

Leverage amplifies both gains and losses. While optimal sizing theory suggests an ideal leverage level, practical constraints impose hard limits on how much leverage can actually be employed. Understanding these constraints is essential for translating theoretical optimal positions into executable trades.

Understanding Margin and Leverage

When trading on margin, you borrow funds from your broker to increase position size beyond your capital. The key concepts are:

  • Initial margin: The minimum equity required to open a position, typically 25-50% for stocks.
  • Maintenance margin: The minimum equity required to keep a position open, typically 25-30%.
  • Margin call: When equity falls below maintenance margin, requiring additional funds or position reduction.
  • Leverage ratio: The ratio of total position size to equity. With 50% initial margin, maximum leverage is 2x.

For derivatives, margin works differently. Futures require "performance bond" margin representing a small percentage of notional value, enabling leverage of 10x-20x or more. Options require margin based on potential loss scenarios.

Reg T Margin

Regulation T, established by the Federal Reserve, sets the initial margin requirement for most U.S. securities at 50%, implying maximum leverage of 2x. Portfolio margin accounts may receive more favorable treatment based on hedged positions and overall portfolio risk.
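
To make these mechanics concrete, here is a small worked sketch; the account size, leverage, and 25% maintenance level are hypothetical:

# Hypothetical example: $100k equity at 2x leverage -> $200k position, $100k loan
equity = 100_000
leverage = 2.0
position = equity * leverage
loan = position - equity
maintenance_margin = 0.25  # assumed maintenance requirement

# Equity after a price decline of x: position*(1-x) - loan
# Margin call when equity / position value falls below the maintenance margin:
#   (position*(1-x) - loan) / (position*(1-x)) < 0.25
# Solving for the decline x that triggers the call:
x_call = 1 - loan / (position * (1 - maintenance_margin))
print(f"Margin call after a {x_call:.1%} price decline")  # ~33.3%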

The Mathematics of Leverage and Drawdown

Leverage has a nonlinear relationship with drawdown risk, making high leverage more dangerous than intuition suggests. Consider a strategy with return volatility $\sigma$. At leverage $\ell$, the leveraged volatility is $\ell \sigma$. This linear scaling of volatility translates into highly nonlinear effects on drawdown probability.

Under geometric Brownian motion assumptions, the expected maximum drawdown over time horizon $T$ for a strategy with Sharpe ratio $S$ and leveraged volatility $\ell\sigma$ is approximately:

$$\mathbb{E}[\text{MaxDD}] \approx 2\ell\sigma\sqrt{T} \cdot \Phi^{-1}\left(1 - e^{-\ell^2 S^2 T / 2}\right)^{-1}$$

where:

  • $\mathbb{E}[\text{MaxDD}]$: expected maximum drawdown
  • $\ell$: leverage ratio
  • $\sigma$: strategy volatility
  • $T$: time horizon
  • $S$: Sharpe ratio
  • $\Phi^{-1}$: inverse cumulative standard normal distribution function

The formula combines two components: a baseline volatility scaling term ($2\ell\sigma\sqrt{T}$) and a risk-adjusted multiplier (the inverse normal term) that accounts for how the strategy's Sharpe ratio and leverage interact to determine tail risk depth. The baseline term grows with the square root of time, while the multiplier depends on the interaction between leverage and strategy quality.

A simpler approximation for the probability of experiencing a drawdown of at least $d$ follows from the reflection principle for Brownian motion:

$$P(\text{DD} \geq d) \approx e^{-2d^2 / (\ell^2 \sigma^2 T)}$$

where:

  • $P(\text{DD} \geq d)$: probability of a drawdown exceeding $d$
  • $d$: drawdown threshold
  • $\ell$: leverage ratio
  • $\sigma$: volatility
  • $T$: time horizon

This shows that drawdown probability is highly sensitive to leverage. Doubling leverage quadruples the exponent's denominator, dramatically increasing the probability of severe drawdowns. A drawdown that is virtually impossible at 1x leverage may be almost certain at 4x leverage.
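
As a rough sketch, we can evaluate this approximation directly; the volatility below is an assumption chosen to roughly match the simulation that follows:

import numpy as np

sigma = 0.16   # assumed annual volatility (close to the simulation's ~15.9%)
T = 1.0        # one-year horizon
d = 0.50       # 50% drawdown threshold

for lev in [1, 2, 3, 4, 5]:
    p_dd = np.exp(-2 * d**2 / (lev**2 * sigma**2 * T))
    print(f"{lev}x leverage: P(DD >= 50%) ~ {p_dd:.1%}")

The approximation is crude (it ignores drift, among other things), but it reproduces the steep leverage sensitivity that the simulation below confirms.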

In[9]:
Code
import numpy as np


def simulate_drawdowns(n_sims, n_days, daily_return, daily_vol, leverage):
    """Simulate maximum drawdowns for a leveraged strategy."""
    np.random.seed(42)
    max_drawdowns = []

    for _ in range(n_sims):
        # Generate daily returns
        returns = np.random.normal(
            leverage * daily_return, leverage * daily_vol, n_days
        )
        # Calculate cumulative wealth (starting at 1)
        wealth = np.cumprod(1 + returns)
        # Calculate running maximum
        running_max = np.maximum.accumulate(wealth)
        # Calculate drawdown
        drawdown = (running_max - wealth) / running_max
        max_drawdowns.append(np.max(drawdown))

    return np.array(max_drawdowns)


# Strategy parameters
daily_return = 0.0004  # ~10% annual
daily_vol = 0.01  # ~16% annual
n_days = 252  # One year
n_sims = 5000

# Test different leverage levels
leverage_levels = [1, 2, 3, 4, 5]
drawdown_results = {}

for lev in leverage_levels:
    dd = simulate_drawdowns(n_sims, n_days, daily_return, daily_vol, lev)
    drawdown_results[lev] = {
        "mean": np.mean(dd),
        "median": np.median(dd),
        "p95": np.percentile(dd, 95),
        "p99": np.percentile(dd, 99),
        "prob_50pct": np.mean(dd > 0.5),
    }

# Calculate base strategy statistics for display
annual_return = daily_return * 252
annual_vol = daily_vol * np.sqrt(252)
base_sharpe = annual_return / annual_vol
Out[10]:
Console
Impact of Leverage on Maximum Drawdown (1-year simulation)
======================================================================
Base strategy: 10.1% annual return, 15.9% annual vol
Sharpe ratio: 0.63
----------------------------------------------------------------------
Leverage   Mean DD      Median DD    95th %ile    99th %ile    P(DD>50%) 
----------------------------------------------------------------------
1x         14.3%        13.2%        25.0%        30.6%        0.0%      
2x         26.8%        25.2%        44.8%        52.7%        1.9%      
3x         37.7%        36.1%        60.0%        68.4%        16.6%     
4x         47.2%        45.8%        71.5%        79.3%        39.0%     
5x         55.4%        54.4%        80.2%        86.7%        61.1%     
Out[11]:
Visualization
Distribution of maximum drawdowns over 1 year at different leverage levels. The box plots show the median (orange line), interquartile range (box), and tails (whiskers). Note the non-linear increase in tail risk as leverage increases; at 5x leverage, drawdowns exceeding 50% become the norm rather than the exception.

The simulation illustrates how leverage transforms a reasonable strategy into a dangerous one. At 1x leverage, this strategy with a Sharpe ratio around 0.63 has modest drawdowns. At 5x leverage, the probability of a 50%+ drawdown in a single year exceeds 50%. Recovery from a 50% drawdown requires a 100% gain, which at 10% annual returns (before the drawdown) would take nearly 7 years.
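
This recovery arithmetic follows directly from compound growth; a two-line check:

import numpy as np

# A drawdown of dd requires a gain of dd/(1-dd) to recover
dd = 0.50
required_gain = dd / (1 - dd)          # 100% gain after a 50% loss

# Years to recover at a 10% annual return: (1.10)^t = 2
annual_return = 0.10
years = np.log(1 + required_gain) / np.log(1 + annual_return)
print(f"Required gain: {required_gain:.0%}, recovery time: {years:.1f} years")  # ~7.3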

Leverage and the Risk of Ruin

Beyond drawdowns, excessive leverage creates genuine risk of ruin: losing so much capital that continuing to trade becomes impossible. Several mechanisms create this risk:

Margin calls and forced liquidation: When losses erode equity below maintenance margin, positions are forcibly closed at unfavorable prices. This "stop out" crystallizes losses that might otherwise recover.

Gap risk: Markets can move discontinuously, especially over weekends or during crises. A 3x leveraged position in an asset that gaps down 35% overnight faces a 105% loss, exceeding total capital.

Volatility expansion: Leverage is often sized based on historical volatility, but volatility can spike dramatically during crises precisely when you're already losing money.

Correlation breakdown: Strategies that appear diversified in normal conditions often become highly correlated during market stress, magnifying portfolio-level losses.

Case Studies: Leverage Disasters

History provides examples of leverage-induced failures:

Long-Term Capital Management (1998): LTCM's strategies had estimated Sharpe ratios of 1-2, but they applied leverage of 25x or more. When the Russian debt crisis triggered a flight to quality, correlations spiked and spreads widened dramatically. LTCM lost $4.6 billion and required a $3.6 billion bailout coordinated by the Federal Reserve to prevent systemic contagion.

Amaranth Advisors (2006): This multi-strategy fund concentrated heavily in natural gas futures. When positions moved against them, leverage of approximately 8x transformed a significant loss into a $6 billion catastrophe, wiping out the fund entirely.

XIV and Volatility ETNs (2018): Leveraged inverse volatility products lost nearly all their value in a single day during the "Volmageddon" event. A 115% spike in the VIX caused products designed to profit from calm markets to collapse.

The common thread: strategies that worked well in normal conditions failed when leverage combined with adverse market moves.

Risk Parity and Volatility Targeting

Given the dangers of leverage and the difficulties of estimating expected returns, alternative allocation frameworks focus on risk rather than return.

Risk Parity Principles

Risk parity, pioneered by Ray Dalio's Bridgewater Associates, allocates capital so that each asset contributes equally to portfolio risk. The core insight is that expected returns are notoriously difficult to estimate, but volatilities and correlations are relatively stable and predictable. By focusing on what we can estimate reliably, risk parity sidesteps much of the estimation error that plagues return-based optimization.

For a long-only portfolio where each asset contributes equally to risk ($\text{TRC}_i = \sigma_p/n$), we require:

$$w_i (\boldsymbol{\Sigma} \mathbf{w})_i = \frac{\sigma_p^2}{n} \quad \text{for all } i$$

where:

  • $w_i$: weight of asset $i$
  • $\boldsymbol{\Sigma}$: covariance matrix of returns
  • $(\boldsymbol{\Sigma} \mathbf{w})_i$: marginal covariance of asset $i$
  • $\sigma_p^2$: portfolio variance
  • $n$: number of assets

When assets are uncorrelated, this simplifies to inverse-volatility weighting:

$$w_i \propto \frac{1}{\sigma_i}$$

where:

  • $w_i$: weight of asset $i$
  • $\sigma_i$: volatility of asset $i$

Lower-volatility assets receive higher weights, which for traditional portfolios means bonds receive much larger allocations than stocks. To achieve competitive returns, risk parity portfolios typically apply leverage to the entire portfolio, bringing total volatility to a target level (often 10-15% annually).
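
A minimal sketch of this two-step recipe, using illustrative asset volatilities and the simplifying assumption of zero correlations:

import numpy as np

# Illustrative annualized volatilities (assumed, not calibrated)
vols = np.array([0.16, 0.05, 0.18])           # stocks, bonds, commodities
names = ["Stocks", "Bonds", "Commodities"]

# Step 1: inverse-volatility weights (risk parity for uncorrelated assets)
w = (1 / vols) / np.sum(1 / vols)

# Step 2: lever the whole portfolio up to a target volatility; with zero
# correlations, portfolio variance is the sum of w_i^2 * sigma_i^2
port_vol = np.sqrt(np.sum(w**2 * vols**2))
target_vol = 0.10
leverage = target_vol / port_vol               # ~1.8x in this example

for n, wi in zip(names, w):
    print(f"{n}: {wi:.1%} unlevered, {wi * leverage:.1%} levered")

The unlevered portfolio is dominated by the low-volatility asset and has low overall volatility, which is exactly why risk parity portfolios apply leverage to reach their target.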

Volatility Targeting

A related approach is volatility targeting, where position sizes are adjusted dynamically to maintain constant portfolio volatility. This approach recognizes that volatility varies substantially over time, so static position sizing produces varying levels of actual risk. If target volatility is $\sigma^*$ and current estimated volatility is $\hat{\sigma}_t$, the position multiplier is:

$$\text{multiplier}_t = \frac{\sigma^*}{\hat{\sigma}_t}$$

where:

  • $\text{multiplier}_t$: leverage scaling factor at time $t$
  • $\sigma^*$: target volatility
  • $\hat{\sigma}_t$: estimated current volatility

This approach automatically reduces positions during high-volatility periods (when losses are most likely) and increases positions during calm periods. Research shows this can improve risk-adjusted returns and reduce drawdowns compared to constant position sizing.

In[12]:
Code
import numpy as np


def simulate_volatility_targeting(returns, target_vol, lookback=20):
    """
    Simulate a volatility-targeting strategy.

    Parameters:
    - returns: array of daily returns
    - target_vol: target daily volatility
    - lookback: days for volatility estimation
    """
    n = len(returns)
    leverages = np.ones(n)
    targeted_returns = np.zeros(n)

    for t in range(lookback, n):
        # Estimate volatility from recent returns
        recent_vol = np.std(returns[t - lookback : t])

        # Calculate leverage to hit target volatility
        if recent_vol > 0:
            leverage = min(target_vol / recent_vol, 3.0)  # Cap at 3x
        else:
            leverage = 1.0

        leverages[t] = leverage
        targeted_returns[t] = leverage * returns[t]

    return targeted_returns, leverages


# Simulate a strategy with time-varying volatility
np.random.seed(123)
n_days = 1000

# Create volatility regime (low vol, then high vol, then low again)
vol_regime = np.concatenate(
    [
        np.ones(400) * 0.01,  # Low vol period
        np.ones(200) * 0.03,  # High vol period (crisis)
        np.ones(400) * 0.012,  # Return to normal
    ]
)

# Generate returns with a constant expected return but time-varying vol
base_returns = 0.0003 + np.random.normal(0, 1, n_days) * vol_regime

# Apply volatility targeting
target_daily_vol = 0.01  # Target 1% daily vol (~16% annual)
targeted_returns, leverages = simulate_volatility_targeting(
    base_returns, target_daily_vol, lookback=20
)
Out[13]:
Visualization
Three panel chart showing cumulative returns, rolling volatility, and leverage over time.
Cumulative returns for static versus volatility-targeted positions. The targeted strategy (orange) avoids the severe drawdown experienced by the static strategy (blue).

The visualization demonstrates volatility targeting's key benefit: automatic risk reduction during dangerous periods. During the simulated crisis (days 400-600), the strategy reduced leverage to well below 1x, limiting losses. After the crisis, leverage gradually increased as volatility estimates came down.

Practical Position Sizing Implementation

Let's build a comprehensive position sizing system that integrates the concepts we've covered.

Position Sizing Framework

A production position sizing system needs to handle:

  1. Signal to target position conversion: Translate strategy signals into desired position sizes
  2. Risk scaling: Apply Kelly, fractional Kelly, or volatility targeting
  3. Constraint enforcement: Respect leverage limits, position limits, and risk budgets
  4. Dynamic adjustment: Update sizes as volatility and portfolio state change
In[14]:
Code
import numpy as np
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StrategyParams:
    """Parameters for a trading strategy."""

    name: str
    expected_return: float  # Annualized expected return
    volatility: float  # Annualized volatility
    max_weight: float = 0.5  # Maximum portfolio weight
    min_weight: float = -0.5  # Minimum portfolio weight (negative = short)


@dataclass
class PortfolioConstraints:
    """Constraints for the overall portfolio."""

    max_gross_leverage: float = 2.0  # Max sum of absolute weights
    max_net_leverage: float = 1.0  # Max sum of signed weights
    target_volatility: float = 0.15  # Target portfolio volatility
    kelly_fraction: float = 0.5  # Fractional Kelly multiplier


class PositionSizer:
    """
    Position sizing engine that combines Kelly criterion, risk budgeting,
    and practical constraints.
    """

    def __init__(
        self,
        strategies: List[StrategyParams],
        constraints: PortfolioConstraints,
        correlation_matrix: np.ndarray,
    ):
        self.strategies = {s.name: s for s in strategies}
        self.constraints = constraints
        self.n_strategies = len(strategies)
        self.correlation = correlation_matrix

        # Build covariance matrix
        vols = np.array([s.volatility for s in strategies])
        self.cov_matrix = np.outer(vols, vols) * correlation_matrix

    def kelly_weights(self) -> Dict[str, float]:
        """Calculate Kelly-optimal weights accounting for correlations."""
        exp_returns = np.array(
            [s.expected_return for s in self.strategies.values()]
        )

        # Full Kelly: w* = Sigma^{-1} * mu
        cov_inv = np.linalg.inv(self.cov_matrix)
        raw_weights = cov_inv @ exp_returns

        # Apply fractional Kelly
        raw_weights *= self.constraints.kelly_fraction

        return {name: w for name, w in zip(self.strategies.keys(), raw_weights)}

    def apply_constraints(self, weights: Dict[str, float]) -> Dict[str, float]:
        """Apply portfolio and position-level constraints."""
        constrained = {}

        # First, apply position-level constraints
        for name, w in weights.items():
            strategy = self.strategies[name]
            constrained[name] = np.clip(
                w, strategy.min_weight, strategy.max_weight
            )

        # Check gross leverage constraint
        gross_leverage = sum(abs(w) for w in constrained.values())
        if gross_leverage > self.constraints.max_gross_leverage:
            scale = self.constraints.max_gross_leverage / gross_leverage
            constrained = {k: v * scale for k, v in constrained.items()}

        # Check net leverage constraint
        net_leverage = sum(constrained.values())
        if abs(net_leverage) > self.constraints.max_net_leverage:
            # Scale down all positions proportionally
            scale = self.constraints.max_net_leverage / abs(net_leverage)
            constrained = {k: v * scale for k, v in constrained.items()}

        return constrained

    def volatility_scale(self, weights: Dict[str, float]) -> float:
        """Calculate scale factor to hit target volatility."""
        w = np.array(list(weights.values()))
        port_vol = np.sqrt(w @ self.cov_matrix @ w)

        if port_vol > 0:
            return self.constraints.target_volatility / port_vol
        return 1.0

    def compute_positions(
        self, volatility_target: bool = True
    ) -> Dict[str, float]:
        """
        Compute final position sizes.

        Parameters:
        - volatility_target: If True, scale positions to hit target volatility

        Returns:
        - Dictionary of strategy name to weight
        """
        # Start with Kelly-optimal weights
        weights = self.kelly_weights()

        # Apply constraints
        weights = self.apply_constraints(weights)

        # Optionally scale to target volatility
        if volatility_target:
            scale = self.volatility_scale(weights)
            # Don't scale up beyond max leverage
            if scale > 1:
                max_scale = self.constraints.max_gross_leverage / sum(
                    abs(w) for w in weights.values()
                )
                scale = min(scale, max_scale)
            weights = {k: v * scale for k, v in weights.items()}
            # Re-apply constraints after scaling
            weights = self.apply_constraints(weights)

        return weights

    def risk_decomposition(
        self, weights: Dict[str, float]
    ) -> Dict[str, Dict[str, float]]:
        """Decompose portfolio risk by strategy."""
        w = np.array(list(weights.values()))
        port_var = w @ self.cov_matrix @ w
        port_vol = np.sqrt(port_var)

        # Risk contribution of each strategy
        marginal_contrib = self.cov_matrix @ w / port_vol
        risk_contrib = w * marginal_contrib

        result = {}
        for i, name in enumerate(weights.keys()):
            result[name] = {
                "weight": weights[name],
                "marginal_risk_contrib": marginal_contrib[i],
                "total_risk_contrib": risk_contrib[i],
                "pct_of_risk": risk_contrib[i] / port_vol * 100,
            }

        return result

Let's test this framework with our three-strategy portfolio:

In[15]:
Code
# Define strategies
strategies = [
    StrategyParams(
        "Momentum",
        expected_return=0.12,
        volatility=0.20,
        max_weight=0.6,
        min_weight=-0.1,
    ),
    StrategyParams(
        "MeanRev",
        expected_return=0.08,
        volatility=0.15,
        max_weight=0.5,
        min_weight=-0.1,
    ),
    StrategyParams(
        "Factor",
        expected_return=0.10,
        volatility=0.12,
        max_weight=0.5,
        min_weight=0.0,
    ),
]

# Correlation matrix
corr = np.array([[1.0, -0.3, 0.2], [-0.3, 1.0, 0.1], [0.2, 0.1, 1.0]])

# Define constraints
constraints = PortfolioConstraints(
    max_gross_leverage=1.5,
    max_net_leverage=1.0,
    target_volatility=0.12,
    kelly_fraction=0.5,
)

# Create position sizer
sizer = PositionSizer(strategies, constraints, corr)

# Compute positions
positions = sizer.compute_positions(volatility_target=True)
risk_decomp = sizer.risk_decomposition(positions)

# Calculate portfolio summary statistics
port_return = sum(positions[s.name] * s.expected_return for s in strategies)
weights_array = np.array(list(positions.values()))

# Reconstruct full covariance for portfolio vol calc
vols = np.array([s.volatility for s in strategies])
full_cov = np.outer(vols, vols) * corr
port_vol = np.sqrt(weights_array @ full_cov @ weights_array)
port_sharpe = port_return / port_vol
gross_leverage = sum(abs(w) for w in positions.values())
net_leverage = sum(positions.values())
Out[16]:
Console
Position Sizing Results
============================================================

Constraints Applied:
  Kelly Fraction: 0.5
  Max Gross Leverage: 1.5
  Target Volatility: 12.0%

Final Positions:
------------------------------------------------------------
  Momentum:
    Weight:    37.5%
    Risk Contribution:   59.7%
  MeanRev:
    Weight:    31.2%
    Risk Contribution:   15.3%
  Factor:
    Weight:    31.2%
    Risk Contribution:   24.9%

Portfolio Summary:
------------------------------------------------------------
  Gross Leverage: 1.00
  Net Leverage: 1.00
  Expected Return: 10.1%
  Volatility: 9.3%
  Sharpe Ratio: 1.09

After constraints and volatility scaling, the position sizer produces a fairly balanced allocation: Momentum receives the largest weight but accounts for nearly 60% of portfolio risk, reflecting its high volatility. The Factor strategy, which has the best risk-adjusted returns (Sharpe ratio 0.83), delivers its contribution with only about a quarter of the risk budget. The Mean Reversion strategy's negative correlation with Momentum provides diversification benefits, earning it a meaningful allocation despite its lower expected return.

Drawdown-Based Position Adjustment

Many practitioners reduce position sizes following drawdowns, either as a risk management discipline or to preserve capital for potential mean reversion opportunities. This can be formalized:

In[17]:
Code
def drawdown_adjusted_sizing(
    base_weight: float,
    current_dd: float,
    dd_threshold: float = 0.10,
    max_reduction: float = 0.5,
) -> float:
    """
    Reduce position size based on current drawdown.

    Parameters:
    - base_weight: Normal position size
    - current_dd: Current drawdown (positive number, e.g., 0.15 = 15% drawdown)
    - dd_threshold: Drawdown level at which reduction begins
    - max_reduction: Maximum reduction factor at extreme drawdowns

    Returns:
    - Adjusted position size
    """
    if current_dd <= dd_threshold:
        return base_weight

    # Linear reduction from threshold to 2x threshold
    reduction_range = dd_threshold  # Full reduction at 2x threshold
    excess_dd = current_dd - dd_threshold
    reduction_pct = min(excess_dd / reduction_range, 1.0) * max_reduction

    return base_weight * (1 - reduction_pct)


# Example: show position adjustment across drawdown levels
drawdowns = np.linspace(0, 0.30, 50)
base_position = 1.0
adjusted_positions = [
    drawdown_adjusted_sizing(base_position, dd) for dd in drawdowns
]
Out[18]:
Visualization
Line chart showing position size versus drawdown with linear reduction after threshold.
Drawdown-based position sizing adjustment. Position sizes remain constant during normal operations but reduce linearly once drawdown exceeds the 10% threshold, reaching 50% reduction at 20% drawdown.

This drawdown-based adjustment provides automatic de-risking during losing periods. While it may reduce returns during recoveries, it helps preserve capital and reduces the psychological burden of maintaining full positions during painful drawdowns.
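A minimal sketch of how this adjustment might plug into a backtest loop, tracking drawdown from a running equity peak (the equity curve here is purely illustrative, and drawdown_adjusted_sizing is the function defined above):

import numpy as np

# Illustrative equity curve with a losing stretch
equity = np.array([1.00, 1.05, 1.10, 1.02, 0.95, 0.91, 0.88, 0.93])

running_peak = np.maximum.accumulate(equity)
drawdowns = 1.0 - equity / running_peak  # positive, e.g., 0.15 = 15% drawdown

for t, dd in enumerate(drawdowns):
    weight = drawdown_adjusted_sizing(base_weight=1.0, current_dd=dd)
    print(f"t={t}: drawdown={dd:5.1%}, position={weight:.2f}")
# At the trough (20% drawdown), the position is cut to 0.50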

Limitations and Practical Considerations

Position sizing theory provides valuable guidance, but several limitations affect real-world implementation.

Parameter estimation uncertainty: The Kelly formula requires accurate estimates of expected return and volatility. In practice, expected returns are notoriously difficult to estimate. Because optimal leverage \ell^* = \mu/\sigma^2 is linear in the expected return, an overestimate of \mu by 50% leads to a 50% overestimate of optimal leverage, potentially disastrous during adverse conditions. This uncertainty is the primary reason practitioners use fractional Kelly, treating the formula's output as an upper bound rather than a target.
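To make this concrete, here is a small sketch (with illustrative parameters) comparing the long-run growth rate when leverage is set from a 50% overestimate of \mu versus the true value, using the continuous-Kelly growth rate g(\ell) = \ell\mu - \frac{1}{2}\ell^2\sigma^2:

# True parameters vs. a 50% overestimate of expected return (illustrative)
mu_true, sigma = 0.06, 0.15
mu_est = 1.5 * mu_true

lev_opt = mu_true / sigma**2   # Kelly leverage under the true parameters
lev_used = mu_est / sigma**2   # leverage actually taken: also 50% too high

def growth(lev, mu, sig):
    """Long-run geometric growth rate under constant leverage."""
    return lev * mu - 0.5 * (lev * sig) ** 2

print(f"Kelly leverage: {lev_opt:.2f}, leverage used: {lev_used:.2f}")
print(f"Growth at Kelly:          {growth(lev_opt, mu_true, sigma):.4f}")
print(f"Growth when over-levered: {growth(lev_used, mu_true, sigma):.4f}")
# Betting 1.5x Kelly retains only 75% of the achievable growth rate;
# at 2x Kelly the growth rate drops to zero, and beyond that it turns negative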

Non-stationarity: Financial return distributions change over time. Volatility clusters, correlations spike during crises, and regime changes alter expected returns. Position sizing calibrated to historical parameters may be dramatically wrong for future conditions. Dynamic approaches like volatility targeting partially address this by continuously re-estimating parameters.
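A simple version of that continuous re-estimation is an exponentially weighted volatility estimate feeding a volatility-target weight. The sketch below uses simulated returns and a RiskMetrics-style decay of 0.94, both illustrative choices:

import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, 500)       # placeholder daily return series

target_vol = 0.10 / np.sqrt(252)           # 10% annual target in daily units
lam = 0.94                                 # EWMA decay factor

var_est = target_vol ** 2                  # prior guess for initial variance
weights = []
for r in returns:
    var_est = lam * var_est + (1 - lam) * r**2      # update variance estimate
    weights.append(min(target_vol / np.sqrt(var_est), 2.0))  # cap at 2x leverage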

Model misspecification: The continuous Kelly derivation assumes normally distributed returns. Real returns exhibit fat tails, meaning extreme events occur far more frequently than Gaussian models predict. A "5-sigma" event that should occur once in 7,000 years under normality happens roughly once per decade in markets. Tail risk makes any fixed-fraction betting strategy vulnerable to catastrophic losses during extreme events.
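The scale of the understatement is easy to check. The sketch below compares the frequency of a 5-sigma daily move (in either direction) under a normal distribution and under a Student-t with 4 degrees of freedom, an illustrative fat-tailed choice rescaled to unit variance:

import numpy as np
from scipy import stats

z = 5.0                                    # a "5-sigma" daily move
df = 4
scale = np.sqrt((df - 2) / df)             # rescales t(4) to unit variance

p_normal = 2 * stats.norm.sf(z)            # two-sided tail probability, Gaussian
p_t = 2 * stats.t.sf(z / scale, df)        # same-sized move under fat tails

print(f"Normal: about one day in {1 / p_normal / 252:,.0f} years")
print(f"t(4):   about one day in {1 / p_t / 252:,.1f} years")
# Roughly 7,000 years under normality vs. about two years under t(4)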

Liquidity and market impact: As discussed in Transaction Costs and Market Impact, large positions affect prices. The theoretical optimal position may not be achievable without significant market impact, and forced liquidation during drawdowns occurs at the worst possible prices. Position sizing must account for realistic execution constraints, particularly for strategies operating in less liquid markets.

Correlation instability: Diversification benefits assumed when allocating across strategies depend on correlation estimates. During market crises, correlations typically spike toward 1.0, exactly when diversification is most needed. Portfolio-level position sizing should stress-test performance under elevated correlation scenarios.
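A minimal version of such a stress test, reusing the volatilities and final weights from the example above and pushing every pairwise correlation to a crisis level of 0.8 (an illustrative choice):

import numpy as np

vols = np.array([0.20, 0.15, 0.12])             # strategy volatilities from above
weights = np.array([0.375, 0.312, 0.312])       # final weights from above

corr_normal = np.array([[1.0, -0.3, 0.2], [-0.3, 1.0, 0.1], [0.2, 0.1, 1.0]])
corr_crisis = np.full((3, 3), 0.8)              # all correlations spike
np.fill_diagonal(corr_crisis, 1.0)

for label, corr in [("normal", corr_normal), ("crisis", corr_crisis)]:
    cov = np.outer(vols, vols) * corr
    vol = np.sqrt(weights @ cov @ weights)
    print(f"Portfolio vol ({label}): {vol:.1%}")
# Vol rises from about 9.3% to about 14.9%, breaching the 12% target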

Despite these limitations, the frameworks presented remain valuable for several reasons. First, they provide quantitative discipline, replacing intuition-based sizing with principled analysis. Second, they clarify the tradeoffs between growth and risk, helping you choose appropriate points on the risk spectrum. Third, they highlight the extreme sensitivity of outcomes to leverage, encouraging conservative approaches. Even imperfect Kelly estimates, scaled down by fractional multipliers and capped by leverage constraints, produce more robust position sizing than ad hoc approaches.

Summary

Position sizing determines whether a trading edge compounds wealth or destroys it. This chapter developed the mathematical foundations and practical frameworks for optimal sizing:

Kelly Criterion fundamentals: The Kelly formula maximizes long-term geometric growth by betting a fraction f^* = \frac{pb - q}{b} in the discrete case or applying leverage \ell^* = \frac{\mu}{\sigma^2} in the continuous case. Full Kelly betting is aggressive, often too aggressive for practical use.

Fractional Kelly: Using 25-50% of Kelly-optimal sizing sacrifices modest growth for substantially reduced variance and drawdown risk. Half-Kelly achieves 75% of Kelly growth with 25% of the variance, an attractive tradeoff for most investors.
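The 75%/25% tradeoff follows directly from the quadratic shape of the growth rate. With g(\ell) = \ell\mu - \frac{1}{2}\ell^2\sigma^2 and Kelly leverage \ell^* = \mu/\sigma^2, betting a fraction c of Kelly gives

\frac{g(c\,\ell^*)}{g(\ell^*)} = \frac{c - c^2/2}{1/2} = 2c - c^2,

which equals 3/4 at c = 1/2. Return variance scales as c^2, so half-Kelly carries (1/2)^2 = 1/4 of full-Kelly variance.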

Multi-strategy allocation: For portfolios of strategies, optimal allocation follows \mathbf{w}^* \propto \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}, generalizing Kelly to account for correlations. Strategies with better risk-adjusted returns and lower correlation receive higher weights.
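As a quick sketch using the three strategies from this chapter's example (unconstrained weights; a real implementation would apply the fractional-Kelly scaling and leverage caps discussed above):

import numpy as np

mu = np.array([0.12, 0.08, 0.10])               # expected returns
vols = np.array([0.20, 0.15, 0.12])             # volatilities
corr = np.array([[1.0, -0.3, 0.2], [-0.3, 1.0, 0.1], [0.2, 0.1, 1.0]])

cov = np.outer(vols, vols) * corr               # covariance matrix
w_kelly = np.linalg.solve(cov, mu)              # unconstrained Kelly weights
w_half = 0.5 * w_kelly                          # fractional (half) Kelly

print("Full Kelly:", np.round(w_kelly, 2))
print("Half Kelly:", np.round(w_half, 2))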

Risk budgeting: When expected returns are uncertain, risk parity approaches allocate based on risk contribution rather than expected return. This produces more robust portfolios, though potentially at the cost of expected return.

Leverage dangers: Leverage amplifies both gains and losses nonlinearly. Maximum drawdown probability increases dramatically with leverage, and numerous historical examples demonstrate how excessive leverage has destroyed sophisticated investors. Practical constraints on gross leverage, position limits, and margin requirements provide essential guardrails.

Volatility targeting: Dynamic position sizing based on current volatility estimates automatically reduces risk during dangerous periods. This approach has demonstrated ability to improve risk-adjusted returns across many asset classes and strategies.

The next chapter addresses ethical and regulatory considerations in quantitative trading, examining how position sizing and trading practices intersect with market integrity and investor protection requirements.

Quiz

Ready to test your understanding? Take this quick quiz to reinforce what you've learned about position sizing, the Kelly Criterion, and leverage management.

