# The Kelly Criterion: You Don’t Know the Half of It

Despite expending substantial resources on a formal financial education, I did not encounter the Kelly criterion in business school or the CFA curriculum. I came across it almost by accident, in William Poundstone’s delightful book *Fortune’s Formula*.

Created in 1956 by John Kelly, a Bell Labs scientist, the Kelly criterion is a formula for sizing bets or investments from which the investor expects a positive return. It is the only formula I’ve seen that comes with a mathematical proof explaining why it can deliver higher long-term returns than any alternative.

In my view, the formula is consistent with the value investing concept of a margin of safety and leads to concentrated portfolios in which the dominant ideas have the greatest edge and smallest downside.

Despite its relative obscurity and lack of mainstream academic support, the Kelly criterion has attracted some of the best-known investors on the planet, Warren Buffett, Charlie Munger, Mohnish Pabrai, and Bill Gross among them. The Kelly formula requires an estimate of the probability distribution of investment outcomes ahead of time, i.e., a crystal ball. But its mainstream alternative, Harry Markowitz’s mean/variance optimization, calls for an estimate of the covariance matrix, which, for a bottom-up investor, I believe is much more difficult to obtain.

After reading Poundstone’s book, I wanted to apply the Kelly criterion to my own investing. I learn by example and my math is rusty, so I looked for a short, non-technical article about how the formula can work in an equity-like investment.

Unfortunately, most of the sources I found use the wrong formula.

The top article in a Google search for “Kelly calculator equity” presents a simple, stylized investment with a 60% chance of gaining 20% and a 40% chance of losing 20% in each simulation. No other outcomes are possible, and the investment can be repeated across many simulations, or periods.

It’s clearly a good investment, with a positive expectation: E(x) = 60% * 20% + 40% * (-20%) = 4%. But what share of the portfolio should it take up? Too small an allocation and the portfolio will lose out on growth. Too large and a few unlucky outcomes, or even a single one, could depress it beyond recovery or wipe it out altogether. So what percentage allocation, consistently applied, maximizes the portfolio’s potential long-term growth rate?

The article I found and many like it use the formula **Kelly % = W – [(1 – W) / R]**, where W is the win probability and R is the ratio between profit and loss in the scenario.

For this investment, W is 60% and R is 1 (20%/20%). The loss is expressed as a positive. Plugging in the numbers, the Kelly % = 60% – [(1 – 60%) / (20%/20%)] = 20%. In other words, a 20% allocation to the investment maximizes the portfolio’s potential long-term growth.

This is simply incorrect. The error is intuitive, empirical, and mathematical. The formula does not account for the magnitude of potential profits and losses (volatility), only their ratio to each other. Indeed, the article does not even list the potential gain or loss. Change the potential profit and loss from 20% each to 200% each, and the investment becomes 10 times more volatile. Yet the ratio R stays the same — 200%/200% = 1 — as does the formula’s resulting 20% optimal allocation.

This does not add up.
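One way to see the problem is to compute the expected log growth per period at the allocation the popular formula recommends. The following is a minimal sketch under the two-outcome assumptions of the example. The helper names `kelly_popular` and `log_growth` are my own and do not come from any of the cited articles.

```r
# Popular formula: depends only on the win probability W and the ratio R.
kelly_popular <- function(W, R) W - (1 - W) / R

# Expected log growth per period when a fraction k is invested in a bet that
# gains `gain` with probability W and loses `loss` (a positive number) otherwise.
log_growth <- function(k, W, gain, loss) {
  W * log(1 + k * gain) + (1 - W) * log(1 - k * loss)
}

W <- 0.6
k_small <- kelly_popular(W, R = 0.2 / 0.2)  # +/-20% bet
k_large <- kelly_popular(W, R = 2 / 2)      # +/-200% bet, same ratio

# The formula returns the same allocation for both bets.
print(c(k_small, k_large))

# Growth at that allocation differs sharply between the two bets.
g_small <- log_growth(k_small, W, gain = 0.2, loss = 0.2)
g_large <- log_growth(k_large, W, gain = 2, loss = 2)
print(c(g_small, g_large))
```

Under these assumptions, the 20% allocation has positive expected log growth for the ±20% bet but slightly negative expected log growth for the ±200% bet, even though the formula treats the two as identical.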

Consider a simulation with three different allocation scenarios, all replicating the same investment over and over: Red allocates 20% of the portfolio, as the article suggests, Blue goes all in at 100%, and Green levers up to 150%. The chart below visualizes how the simulation plays out after 100 rounds.

In the Red, “Kelly optimal” scenario, a 20% allocation earned a relatively puny 2x return. The Blue, all-in option generated a 6.2x return. Green outpaced Blue for a time but a string of losses in the later rounds led to a 3.4x return.

This wasn’t just a lucky outcome for Blue. Run the simulation 1,000 times and Blue beats Red 79% and Green 67% of the time. Blue’s median return was at least 3x better than Red’s and almost 2x better than Green’s. In short, the 20% allocation is too conservative and the Green option too aggressive.
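The simulation result can be cross-checked without random numbers. As a rough sketch, assuming the payoffs and probabilities of the example, the expected log growth per period can be computed directly for each allocation. The variable names are illustrative and are not part of the original simulation code.

```r
# Assumed inputs, taken from the example in the text.
W <- 0.6                       # win probability
gain <- 0.2                    # return on a win
loss <- 0.2                    # loss on a failure, expressed as a positive
fractions <- c(Red = 0.2, Blue = 1, Green = 1.5)

# Expected log growth per period for each allocation.
growth <- W * log(1 + fractions * gain) + (1 - W) * log(1 - fractions * loss)
print(round(growth, 4))

# Rank allocations from fastest to slowest expected long-run growth.
ranking <- names(sort(growth, decreasing = TRUE))
print(ranking)
```

The ordering that comes out (Blue, then Green, then Red) matches the simulation: the all-in allocation has the highest expected log growth, the levered one gives some of it back, and the 20% allocation leaves the most on the table.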

**Ending Portfolio Value after 1,000 Simulations (In Dollars, Starting with $1 in Period 1)**

The Kelly formula in the first scenario, **Kelly % = W – [(1 – W)/R]**, is not an anomaly. It turns up in many other sources, including NASDAQ, Morningstar, Wiley’s For Dummies series, and Old School Value, and is analogous to the one in *Fortune’s Formula*: **Kelly % = edge/odds**.

But the formula works only for binary bets where the downside scenario is a total loss of capital, as in -100%. Such an outcome may apply to blackjack and horse racing, but rarely to capital markets investments.

If the downside-case loss is less than 100%, as in the scenario above, a different Kelly formula is required: **Kelly % = W/A – (1 – W)/B**, where W is the win probability, B is the profit in the event of a win (20%), and A is the potential loss (also 20%).

Plugging in the values for our scenario: Kelly % = 60%/20% – (1 – 60%)/20% = 100%, which was Blue’s winning allocation.
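The closed-form answer can be compared with a brute-force numerical search. This is a sketch rather than a definitive implementation: `kelly_general` and `log_growth` are hypothetical helper names, and the search uses base R's `optimize()` over allocations between 0 and just under 1/A.

```r
# Closed-form Kelly fraction for a two-outcome bet:
# gain B with probability W, loss A (positive) with probability 1 - W.
kelly_general <- function(W, B, A) W / A - (1 - W) / B

# Expected log growth per period at allocation k.
log_growth <- function(k, W, B, A) {
  W * log(1 + k * B) + (1 - W) * log(1 - k * A)
}

W <- 0.6; B <- 0.2; A <- 0.2

k_formula <- kelly_general(W, B, A)

# Numerical cross-check: search allocations between 0 and just under 1/A,
# the point at which a single loss would wipe out the portfolio.
fit <- optimize(log_growth, interval = c(0, 1 / A - 1e-6),
                W = W, B = B, A = A, maximum = TRUE)

print(c(formula = k_formula, numerical = fit$maximum))
```

Both approaches should land on an allocation of about 1, i.e., 100% of the portfolio, up to the tolerance of the numerical search.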

The theoretical downside for all capital market investments is -100%. Bad things happen. Companies go bankrupt. Bonds default and are sometimes wiped out. Fair enough.

But to analyze a security within the binary framework implied by the **edge/odds** formula, the downside-scenario probability must be set to the probability of a *total* capital loss, not the much larger probability of *some* loss.

There are many criticisms of the Kelly criterion. And while most are beyond the scope of this article, one is worth addressing. A switch to the “correct” Kelly formula — **Kelly % = W/A – (1 – W)/B** — often leads to significantly higher allocations than the more popular version.

Most investors won’t tolerate the volatility and resulting drawdowns and will opt to reduce the allocation. That’s well and good — both variations of the formula can be scaled down — but the “correct” version is still superior. Why? Because it explicitly accounts for and encourages investors to think through the downside scenario.

And in my experience, a little extra time spent thinking about that is richly rewarded.
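On the point about scaling down, here is a small illustration of what a reduced allocation costs in expected growth. It assumes the same 60%/20%/20% example, and the scale factors are my own illustrative choices.

```r
# Assumed inputs from the example: 60% chance of +20%, 40% chance of -20%.
W <- 0.6; B <- 0.2; A <- 0.2

full_kelly <- W / A - (1 - W) / B          # 100% for this bet

# Expected log growth per period at allocation k.
log_growth <- function(k) W * log(1 + k * B) + (1 - W) * log(1 - k * A)

scales <- c(0.25, 0.5, 0.75, 1)            # fraction of full Kelly (illustrative)
allocations <- scales * full_kelly

result <- data.frame(
  scale = scales,
  allocation = allocations,
  growth = log_growth(allocations),
  share_of_max_growth = log_growth(allocations) / log_growth(full_kelly)
)
print(round(result, 4))
```

In this example, half the Kelly allocation keeps roughly three-quarters of the maximum expected growth rate, which is one reason scaling down can be a reasonable trade for lower volatility.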

#### Appendix: Supporting Math

Here is a derivation of the Kelly formula: An investor begins with $1 and invests a fraction (k) of the portfolio in an investment with two potential outcomes. If the investment succeeds, it returns B and the portfolio will be worth 1 + kB. If it fails, it loses A and the portfolio will be worth 1 – kA.

The investment’s probability of success is w. The investor can repeat the investment as often as desired but must invest the same fraction (k) each time. What fraction k will maximize the portfolio in the long term?

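In the long term, after n repetitions where n is large, the investor is expected to have w * n wins and (1 – w) * n losses. The portfolio P will be worth:

$$P = (1 + kB)^{wn}\,(1 - kA)^{(1 - w)n}$$

We would like to solve for the optimal k. Because the logarithm is an increasing function, maximizing P is equivalent to maximizing log(P):

$$\log P = n\left[\,w \log(1 + kB) + (1 - w)\log(1 - kA)\,\right]$$

To maximize log(P), we take its derivative with respect to k and set it to 0:

$$\frac{d \log P}{dk} = n\left[\frac{wB}{1 + kB} - \frac{(1 - w)A}{1 - kA}\right] = 0$$

Solving for k:

$$wB(1 - kA) = (1 - w)A(1 + kB) \quad\Longrightarrow\quad k = \frac{w}{A} - \frac{1 - w}{B}$$

Note that if the downside-scenario loss is total (A = 1), this formula simplifies to the more popular version quoted above because R = B/A = B/1 = B, so:

$$k = w - \frac{1 - w}{R}$$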
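The relationship between the two formulas can be sanity-checked numerically. The sketch below uses hypothetical helper names and an arbitrary grid of inputs of my own choosing. It checks that the general form matches the popular form when A = 1, and that for other values of A it equals the popular form divided by A.

```r
# Popular "edge/odds" form: W is the win probability, R the profit-to-loss ratio.
kelly_popular <- function(W, R) W - (1 - W) / R

# General two-outcome form: gain B on a win, loss A (positive) otherwise.
kelly_general <- function(W, B, A) W / A - (1 - W) / B

# Illustrative grid of inputs (my choice, not from the article).
grid <- expand.grid(W = c(0.55, 0.6, 0.7), B = c(0.2, 0.5, 1, 2))

# With a total loss (A = 1), the two forms should agree.
gap_total_loss <- with(grid, kelly_general(W, B, A = 1) - kelly_popular(W, R = B))
print(max(abs(gap_total_loss)))

# More generally, the general form equals the popular form divided by A.
A <- 0.2
gap_scaled <- with(grid, kelly_general(W, B, A) - kelly_popular(W, R = B / A) / A)
print(max(abs(gap_scaled)))
```

Both gaps should be zero up to floating-point rounding. Read this way, the popular formula gives the share of the portfolio to put *at risk*, and dividing by the loss size A converts that into a position size.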

#### Appendix: Supporting Code

Below is the R code used to produce the simulation and the charts above.

```r
##########################################################
# Kelly Simulation, Binary Security
# by Alon Bochman
##########################################################

trials = 1000              # Repeat the simulation this many times
periods = 100              # Periods per simulation
winprob = 0.6              # Win probability per period
returns = c(0.2, -0.2)     # Profit if win, loss if lose
fractions = c(0.2, 1, 1.5) # Competing allocations to test

library(ggplot2)
library(reshape2)
library(ggrepel)

# Format a number as a percentage string, e.g. 0.2 -> "20%" with digits = 0
percent <- function(x, digits = 2, format = "f", ...) {
  paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}

set.seed(136)

# wealth[trial, allocation, period]; every portfolio starts at 1 in period 1
wealth = array(data = 0, dim = c(trials, length(fractions), periods))
wealth[, , 1] = 1

# Simulation loop
for (trial in 1:trials) {
  outcome = rbinom(n = periods, size = 1, prob = winprob)  # 1 = win, 0 = loss
  ret = ifelse(outcome, returns[1], returns[2])
  for (i in 2:length(ret)) {
    for (j in 1:length(fractions)) {
      bet = fractions[j]
      wealth[trial, j, i] = wealth[trial, j, i - 1] * (1 + bet * ret[i])
    }
  }
}

# Trial 1 results
view.trial = 1
d <- melt(wealth)
colnames(d) = c('Trial', 'Fraction', 'Period', 'Eq')
d = subset(d, Trial == view.trial)
d$Fraction = as.factor(d$Fraction)
levels(d$Fraction) = paste("Invest ", percent(fractions, digits = 0), sep = "")
# Label only the final period so each line gets a single end-point label
d[d$Period == periods, 'Label'] = d[d$Period == periods, 'Eq']

ggplot(d, aes(x = Period, y = Eq, col = Fraction)) +
  geom_line(size = 1) + scale_y_log10() +
  labs(y = "Portfolio Value", x = "Period") +
  guides(col = guide_legend(title = "Allocation")) +
  theme(legend.position = c(0.1, 0.9)) +
  scale_color_manual(values = c("red", "blue", "green")) +  # One color per allocation
  geom_label_repel(aes(label = round(Label, digits = 2)),
                   nudge_x = 1, show.legend = FALSE, na.rm = TRUE)

# All-trial results
d = data.frame(wealth[, , periods])  # Last period only
colnames(d) = paste("Invest ", percent(fractions, digits = 0), sep = "")
summary(d)
nrow(subset(d, d[, 2] > d[, 1])) / trials  # Blue ahead of red
nrow(subset(d, d[, 2] > d[, 3])) / trials  # Blue ahead of green
```


*All posts are the opinion of the author. As such, they should not be construed as investment advice, nor do the opinions expressed necessarily reflect the views of CFA Institute or the author’s employer.*


The general case, wherein the same result as yours is derived, is discussed in the Wikipedia entry for the Kelly criterion.

https://en.wikipedia.org/wiki/Kelly_criterion

Thanks Gregor. Wikipedia has it right. Most other sites – even some professionals – got the formula wrong.

In 1997 my father famously wrote the article “Debunking the Kelly Criterion.” Sports bettors and investors alike stand to gain a lot of wisdom from the article. https://professionalgambler.com/debunking.html

Mr. Miller, I have your book but it is sorely in need of updating. The latest edition is over 14 years old. I wish you would release a new edition or version because the info is critically outdated. Thank you for your time.

I am confused by your article. I am either misunderstanding something, or your article is incorrect. The point of the Kelly Criterion is, if you know the correct value of the inputs, the output will give you the optimum percentage of your total funds to invest. In the example you gave, the Kelly formula said to bet 20%. However, you said it is more optimal to bet 100%. But if you bet 100%, if you lose once, you are broke, and can’t bet again. So, I don’t see why your chart doesn’t show the bet of 100% flatlining to 0 after 1 loss (same with betting 150%).

If you have a positive expected value for a bet, betting 100% will always yield a better expected return than betting 20%, but the problem, or issue is, after one bet you will be broke, and not be able to ever bet anything again.

If you bet 100%, one loss and you are broke. Same with betting 150%.

If you bet 100% and lose, you are not broke because in this scenario, your loss is -20%, not -100%. See the payoff table near the top of the article. This is typical of several capital markets investments, not so much in Blackjack.

What happens if the loss is only 10%, all other parameters remain the same?

You get an expected value of 8% but doesn’t the Kelly% turn negative or have I miscalculated. If so what does it mean ?

A Kelly% equal to or below zero means you don’t have a positive expectation and should thus not bet anything at all! So yes, you have likely miscalculated at some point in that case.

I read the question as “what happens if you are able to cut the loss shorter at only 10%?”.

Surely this should improve results. I don’t know where the 8% comes from or what the “instead of” original figure was, but clearly a 60% chance of a 20% gain versus a 40% chance of only losing 10% means you’d invest way more.

Here k = 60/10 – 40/20 = 4 meaning you should gear up your investment 4 times (and watch those stop-losses like a hawk), which is what spread-betters and CFD traders aim to do.

The problem in the real world is twofold – first that the leverage comes at a profit-eroding daily cost which is hard to factor in to this form of the equation as it does not have a time element. Second, your 10% loss-limit is much more likely to be hit than if it was a 20% limit so you can’t assume “all other parameters remain the same”.

In theory though there would be an optimal amount to gear up, but you’d have to keep adjusting it, buying more when in profit and selling when losing, which is what is often done in the real world by geared funds. Whether it is “ideal” to buy on the way up and sell on the way down is another discussion, but Kelly says you “should” to maintain the optimal gearing.

The simulation shown suggests green came out by far the best on average, so would it therefore not be better to have several geared-up separately managed groups of investments that were as uncorrelated as possible, in case of a bad run for one or more of them, rather than just one class of investments with 100% of your money and no gearing?

Plugging in the values for our scenario: Kelly % = 60%/20% – (1 – 60%)/20% = 100%, which was Blue’s winning allocation.

Your calculation is wrong.

The correction is

Kelly% = 60%/20%-(1-60%)/20%=20%

No YOUR calculation is wrong; that equation does come out to 100%. You can see this by simplifying: 3-2=1 (aka 100%)

Thank you for putting your article together – it raised some thought-provoking points.

I believe you overlooked what the Kelly Criterion is ultimately meant to represent. Namely, the Kelly Criterion states what amount you should wager for a bet based on the edge/odds under the assumption that you can lose 100% of your wager. Your wager is your risk. You’ll notice in your example the Kelly Criterion says you should wager 20%. If you take the result to mean you should risk 20% of your bankroll instead of wagering 20%, your formula and the Kelly Criterion provide the same answer. Your reworked formula states that you should place 100% of your bankroll on the bet. Ultimately, this is only 20% of your bankroll at risk, which is exactly what the original formula came up with. It seems to me that if you interpret the Kelly Criterion to provide the percentage of bankroll you should risk there is not a need to rework the formula. Your simulations look to be equal to 0.2x Kelly, 1x Kelly and 1.5x Kelly. I believe your formula is the same as the original Kelly multiplied by (1/loss percentage).

The article brings up a few issues with the Kelly Criterion in the application to markets. I’d love to hear your thoughts on these points.

1) Leverage is not infinite so in an example where you wanted to place 5 independent market wagers at 20% bankroll risk and each had 20% downside risk, you would need to have access to at least 5x leverage.

2) The Kelly Criterion assumes you can infinitely divide your minimum bet. Securities markets generally have some minimum wager. With a large enough portfolio, the effect may be close to having the option of infinitely divisible bets but I think it is an important point to call out. How should the Kelly Criterion adjust for the minimum bet size as a % of bankroll?

*My comments are not meant to be investment advice of any kind. I am only looking to add thoughtful discussion to the article.

Good points! The reworked formula saves an additional step of figuring out the position size based on the position risk.

Agree with your initial comments that if you allow for the fact that you can lose 100% of your bet (eg blackjack), then the basic Kelly formula works quite well. It still seems to offer a more aggressive bet size than I may be comfortable with though I think that’s the point – to encourage a proper bet size, or something closer to “ideal”. For some that will mean reducing the amount wagered and some, increasing it.

Thank you for sharing your ideas.

Jason

Hi – I’m trying to run the code.

But it breaks on the first function

percent <- function(x, digits = 2, format = "f", …) {

paste0(formatC(100 * x, format = format, digits = digits, …), “%”)

}

What do the … in the first line of the function mean? I've never seen that before.

Thanks!

Actually – I figured it out. In my version of R language the quotation marks ” ” and ‘ ‘ are reversed. I’ve had to go through and reverse your usage.

Not sure why that is … I thought this kind of nonsense only happened in different Python versions 🙂

Got it working now!

What a waste of time.

Foremostly, you did not even bring the correct formula to the table.

Explicit laziness on your part for not even reading E.Thorp’s implementation.

Errors:

1. You modeled the portfolio with discrete probabilities

2. Did not account for individual drift rates or variance rates.

3. No dynamical reallocation between securities and fixed income.

I could go on but I’d be wasting my time on this.

A very interesting article. Indeed the blue strategy maximizes the growth rate of your bankroll in the long run. A more general Kelly formula, which leads to this strategy, is discussed among other practical properties of Kelly betting in Chapter 16 of my book “Surprises in Probability: Seventeen Short Stories,” CRC Press, 2019.

Thanks a lot for the article. It certainly helps to understand the logic behind the formula…

I think one can argue a lot about the exact numbers here. However, given all the assumptions that go into the calculation I’d see the result more like a rough indication regarding portfolio allocation…

Thanks for all the kind words, folks. It is amazing to me that Investopedia is *still* showing the wrong Kelly formula, two years later: https://www.investopedia.com/articles/trading/04/091504.asp

I think your model is wrong. If you don’t get profit with A% with probability w, that doesn’t mean you always lost B% with 1- w.

So, the model should be modified like this

k is Kelly %,

1^(1 – w – w’)n means the current HYP is between kr and ks, we holding.

w and w’ are win and loss probability

F = (1 + kr)^wn * (1 – ks)^w’n * 1^(1 – w – w’)n

log(F) = wn * log(1 + kr ) + w’n * log(1 – ks) + 0

k = w/s(w + w’) – w’/r(w + w’)