# The Kelly Criterion: You Don’t Know the Half of It

Despite expending substantial resources on a formal financial education, I did not encounter the Kelly criterion in business school or the CFA curriculum. I came across it almost by accident, in William Poundstone’s delightful book *Fortune’s Formula*.

Created in 1956 by John Kelly, a Bell Labs scientist, the Kelly criterion is a formula for sizing bets or investments from which the investor expects a positive return. It is the only formula I’ve seen that comes with a mathematical proof explaining why it can deliver higher long-term returns than any alternative.

In my view, the formula is consistent with the value investing concept of a margin of safety and leads to concentrated portfolios in which the dominant ideas have the greatest edge and smallest downside.

Despite its relative obscurity and lack of mainstream academic support, the Kelly criterion has attracted some of the best-known investors on the planet, among them Warren Buffett, Charlie Munger, Mohnish Pabrai, and Bill Gross. The Kelly formula requires an estimate of the probability distribution of investment outcomes ahead of time, i.e., a crystal ball. Its mainstream alternative, Harry Markowitz’s mean/variance optimization, calls for an estimate of the covariance matrix, which, for a bottom-up investor, I believe is much more difficult to obtain.

After reading Poundstone’s book, I wanted to apply the Kelly criterion to my own investing. I learn by example and my math is rusty, so I looked for a short, non-technical article about how the formula can work in an equity-like investment.

Unfortunately, most of the sources I found use the wrong formula.

The top article in a Google search for “Kelly calculator equity” presents a simple, stylized investment with a 60% chance of gaining 20% and a 40% chance of losing 20% in each period. No other outcomes are possible, and the investment can be repeated across many periods.

It’s clearly a good investment, with a positive expectation: E(x) = 60% * 20% + 40% * (-20%) = 4%. But what share of the portfolio should it take up? Too small an allocation and the portfolio will lose out on growth. Too large and a few unlucky outcomes, or even a single one, could depress it beyond recovery or wipe it out altogether. So what percentage allocation, consistently applied, maximizes the portfolio’s potential long-term growth rate?

The article I found and many like it use the formula **Kelly % = W – [(1 – W) / R]**, where W is the win probability and R is the ratio between profit and loss in the scenario.

For this investment, W is 60% and R is 1 (20%/20%). The loss is expressed as a positive. Plugging in the numbers, the Kelly % = 60% – [(1 – 60%) / (20%/20%)] = 20%. In other words, a 20% allocation to the investment maximizes the portfolio’s potential long-term growth.

This is simply incorrect. The error is intuitive, empirical, and mathematical. The formula does not account for the magnitude of potential profits and losses (volatility), only their ratio to each other. Indeed, the article does not even list the potential gain or loss. Change the potential profit and loss from 20% each to 200% each, and the investment becomes 10 times more volatile. Yet the ratio R stays the same — 200%/200% = 1 — as does the formula’s resulting 20% optimal allocation.

This does not add up.
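A minimal R sketch makes the point concrete. The helper name `kelly_popular` and its argument names are illustrative labels, not something taken from the cited articles; the function simply evaluates W – [(1 – W) / R] for the two payoff sizes discussed above.

```r
# Popular Kelly formula: W - (1 - W) / R
# kelly_popular is a hypothetical helper name used only for this illustration.
kelly_popular <- function(W, profit, loss) {
  R <- profit / loss  # ratio of profit to loss (loss given as a positive number)
  W - (1 - W) / R
}

small <- kelly_popular(W = 0.6, profit = 0.20, loss = 0.20)  # 20% up / 20% down
large <- kelly_popular(W = 0.6, profit = 2.00, loss = 2.00)  # 200% up / 200% down

print(c(small = small, large = large))
```

Both calls return the same 20% allocation, because only the ratio of profit to loss enters the formula.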

Consider a simulation with three different allocation scenarios, all replicating the same investment over and over: Red allocates 20% of the portfolio, as the article suggests; Blue goes all in at 100%; and Green levers up to 150%. The chart below visualizes how the simulation plays out after 100 rounds.

In the Red, “Kelly optimal” scenario, a 20% allocation earned a relatively puny 2x return. The Blue, all-in option generated a 6.2x return. Green outpaced Blue for a time but a string of losses in the later rounds led to a 3.4x return.

This wasn’t just a lucky outcome for Blue. Run the simulation 1,000 times and Blue beats Red 79% and Green 67% of the time. Blue’s median return was at least 3x better than Red’s and almost 2x better than Green’s. In short, the 20% allocation is too conservative and the Green option too aggressive.
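The ranking in the simulation can also be checked without random numbers. The sketch below computes the expected log growth per period, W * log(1 + kB) + (1 – W) * log(1 – kA), for the three allocations. The helper name `growth_rate` is an illustrative assumption and is not part of the simulation code in the appendix.

```r
# Expected log growth per period for allocation k:
#   g(k) = W * log(1 + k * B) + (1 - W) * log(1 - k * A)
# growth_rate is an illustrative helper, not part of the simulation code.
growth_rate <- function(k, W = 0.6, B = 0.2, A = 0.2) {
  W * log(1 + k * B) + (1 - W) * log(1 - k * A)
}

fractions <- c(Red = 0.2, Blue = 1, Green = 1.5)
g <- sapply(fractions, growth_rate)

print(round(g, 4))             # expected log growth per period
print(names(g)[which.max(g)])  # allocation with the highest growth rate
```

Under these inputs the growth rates come out at roughly 0.7% per period for Red, 2.0% for Blue, and 1.5% for Green, which matches the ordering seen in the simulated medians.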

**Ending Portfolio Value after 1,000 Simulations (In Dollars, Starting with $1 in Period 1)**

The Kelly formula in the first scenario, **Kelly % = W – [(1 – W)/R]**, is not an anomaly. It turns up in many other sources, including NASDAQ, Morningstar, Wiley’s For Dummies series, and Old School Value, and is analogous to the one in *Fortune’s Formula*: **Kelly % = edge/odds**.

But the formula works only for binary bets where the downside scenario is a total loss of capital, as in -100%. Such an outcome may apply to blackjack and horse racing, but rarely to capital markets investments.

If the downside-case loss is less than 100%, as in the scenario above, a different Kelly formula is required: **Kelly % = W/A – (1 – W)/B**, where W is the win probability, B is the profit in the event of a win (20%), and A is the potential loss (also 20%).

Plugging in the values for our scenario: Kelly % = 60%/20% – (1 – 60%)/20% = 100%, which was Blue’s winning allocation.
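As a hedged illustration, the general formula can be written as a one-line R function. The name `kelly_general` is hypothetical, and A and B are assumed to be positive fractions of the amount invested. The second call sets A = 1 (a total loss) with an even-money payoff to show that the general formula then gives the same 20% as the popular version.

```r
# General binary Kelly formula: W / A - (1 - W) / B
# kelly_general is a hypothetical helper name; A and B are positive fractions.
kelly_general <- function(W, A, B) {
  stopifnot(W >= 0, W <= 1, A > 0, B > 0)
  W / A - (1 - W) / B
}

k_article <- kelly_general(W = 0.6, A = 0.20, B = 0.20)  # the article's scenario
k_total   <- kelly_general(W = 0.6, A = 1.00, B = 1.00)  # total loss, even-money payoff

print(c(article = k_article, total_loss = k_total))
```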

The theoretical downside for all capital market investments is -100%. Bad things happen. Companies go bankrupt. Bonds default and are sometimes wiped out. Fair enough.

But for an analysis of the securities in the binary framework implied by the **edge/odds** formula, the downside-scenario probability must be set to the probability of a *total* capital loss, not the much larger probability of *some* loss.

There are many criticisms of the Kelly criterion. And while most are beyond the scope of this article, one is worth addressing. A switch to the “correct” Kelly formula — **Kelly % = W/A – (1 – W)/B** — often leads to significantly higher allocations than the more popular version.

Most investors won’t tolerate the volatility and resulting drawdowns and will opt to reduce the allocation. That’s well and good — both variations of the formula can be scaled down — but the “correct” version is still superior. Why? Because it explicitly accounts for and encourages investors to think through the downside scenario.

And in my experience, a little extra time spent thinking about that is richly rewarded.
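On the point about scaling the allocation down, the sketch below compares the expected log growth at a quarter, half, and full Kelly allocation for the article's scenario. The scaling factors 0.25 and 0.5 are illustrative assumptions, not recommendations from the article.

```r
# Scale the full Kelly allocation down and compare expected log growth.
# The 0.25 and 0.5 scaling factors are illustrative assumptions.
W <- 0.6; A <- 0.2; B <- 0.2

full_kelly <- W / A - (1 - W) / B  # 100% in the article's scenario
growth <- function(k) W * log(1 + k * B) + (1 - W) * log(1 - k * A)

scales <- c(quarter = 0.25, half = 0.5, full = 1)
k <- scales * full_kelly
share_of_max_growth <- growth(k) / growth(full_kelly)

print(round(data.frame(allocation = k, share_of_max_growth), 2))
```

In this scenario, halving the allocation keeps roughly three-quarters of the maximum growth rate, which is one reason a scaled-down allocation can be an acceptable trade for lower volatility.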

#### Appendix: Supporting Math

Here is a derivation of the Kelly formula: An investor begins with $1 and invests a fraction (k) of the portfolio in an investment with two potential outcomes. If the investment succeeds, it returns B and the portfolio will be worth 1 + kB. If it fails, it loses A and the portfolio will be worth 1 – kA.

The investment’s probability of success is w. The investor can repeat the investment as often as desired but must invest the same fraction (k) each time. What fraction k will maximize the portfolio in the long term?

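In the long term, after n repetitions where n is large, the investor is expected to have wn wins and (1 – w)n losses. The portfolio P will be worth:

$$P = (1 + kB)^{wn}\,(1 - kA)^{(1 - w)n}$$

We would like to solve for the optimal k:

$$\max_k \; \log P = wn \log(1 + kB) + (1 - w)n \log(1 - kA)$$

To maximize log(P), we take its derivative with respect to k and set it to 0:

$$\frac{d \log P}{dk} = \frac{wnB}{1 + kB} - \frac{(1 - w)nA}{1 - kA} = 0$$

Solving for k:

$$k = \frac{w}{A} - \frac{1 - w}{B}$$

Note that if the downside-scenario loss is total (A = 1), this formula simplifies to the more popular version quoted above because R = B/A = B/1 = B, so:

$$k = w - \frac{1 - w}{R}$$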
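The closed-form result Kelly % = W/A – (1 – W)/B can be sanity-checked numerically. The sketch below uses base R's `optimize()` to maximize the per-period log growth for the article's inputs and compares the answer with the formula. The search interval and the small `eps` margin are assumptions chosen to keep both 1 + kB and 1 – kA positive.

```r
# Numerically maximize the per-period log growth and compare with the closed form.
W <- 0.6; A <- 0.2; B <- 0.2

log_growth <- function(k) W * log(1 + k * B) + (1 - W) * log(1 - k * A)

# k must keep both 1 + k*B and 1 - k*A positive, so search inside (-1/B, 1/A).
eps <- 1e-6
numeric_opt <- optimize(log_growth, interval = c(-1 / B + eps, 1 / A - eps),
                        maximum = TRUE)$maximum
closed_form <- W / A - (1 - W) / B

print(c(numeric = numeric_opt, closed_form = closed_form))
```

The two values should agree to within the optimizer's default tolerance.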

#### Appendix: Supporting Code

Below is the R code used to produce the simulation and the charts above.

```r
##########################################################
# Kelly Simulation, Binary Security
# by Alon Bochman
##########################################################

trials = 1000               # Repeat the simulation this many times
periods = 100               # Periods per simulation
winprob = 0.6               # Win probability per period
returns = c(0.2, -0.2)      # Profit if win, loss if lose
fractions = c(0.2, 1, 1.5)  # Competing allocations to test

library(ggplot2)
library(reshape2)
library(ggrepel)

# Format a fraction as a percentage string, e.g. 0.2 -> "20%"
percent <- function(x, digits = 2, format = "f", ...) {
  paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}

set.seed(136)

# wealth[trial, allocation, period]
wealth = array(data = 0, dim = c(trials, length(fractions), periods))
wealth[, , 1] = 1  # Eq = 1 in period 1

# Simulation loop
for (trial in 1:trials) {
  outcome = rbinom(n = periods, size = 1, prob = winprob)
  ret = ifelse(outcome, returns[1], returns[2])
  for (i in 2:length(ret)) {
    for (j in 1:length(fractions)) {
      bet = fractions[j]
      wealth[trial, j, i] = wealth[trial, j, i - 1] * (1 + bet * ret[i])
    }
  }
}

# Trial 1 Results
view.trial = 1
d <- melt(wealth)
colnames(d) = c('Trial', 'Fraction', 'Period', 'Eq')
d = subset(d, Trial == view.trial)
d$Fraction = as.factor(d$Fraction)
levels(d$Fraction) = paste("Invest ", percent(fractions, digits = 0), sep = "")
d[d$Period == periods, 'Label'] = d[d$Period == periods, 'Eq']

ggplot(d, aes(x = Period, y = Eq, col = Fraction)) +
  geom_line(size = 1) + scale_y_log10() +
  labs(y = "Portfolio Value", x = "Period") +
  guides(col = guide_legend(title = "Allocation")) +
  theme(legend.position = c(0.1, 0.9)) +
  scale_color_manual(values = c("red", "blue", "green")) +  # One color per allocation
  geom_label_repel(aes(label = round(Label, digits = 2)),
                   nudge_x = 1, show.legend = F, na.rm = TRUE)

# All-Trial Results
d = data.frame(wealth[, , periods])  # Last period only
colnames(d) = paste("Invest ", percent(fractions, digits = 0), sep = "")
summary(d)
nrow(subset(d, d[, 2] > d[, 1])) / trials  # Blue ahead of red
nrow(subset(d, d[, 2] > d[, 3])) / trials  # Blue ahead of green
```

**If you liked this post, don’t forget to subscribe to the Enterprising Investor.**

*All posts are the opinion of the author. As such, they should not be construed as investment advice, nor do the opinions expressed necessarily reflect the views of CFA Institute or the author’s employer.*


The general case, wherein the same result as yours is derived, is discussed in the Wikipedia entry for the Kelly criterion.

https://en.wikipedia.org/wiki/Kelly_criterion

Thanks Gregor. Wikipedia has it right. Most other sites – even some professionals – got the formula wrong.

In 1997 my father famously wrote the article “Debunking the Kelly Criterion.” Sports bettors and investors alike stand to gain a lot of wisdom from the article. https://professionalgambler.com/debunking.html

Mr. Miller, I have your book but it is sorely in need of updating. The latest edition is over 14 years old. I wish you would release a new edition or version because the info is critically outdated. Thank you for your time.

I am confused by your article. I am either misunderstanding something, or your article is incorrect. The point of the Kelly Criterion is, if you know the correct value of the inputs, the output will give you the optimum percentage of your total funds to invest. In the example you gave, the Kelly formula said to bet 20%. However, you said it is more optimal to bet 100%. But if you bet 100% and lose once, you are broke and can’t bet again. So I don’t see why your chart doesn’t show the bet of 100% flatlining to 0 after one loss (same with betting 150%).

If you have a positive expected value for a bet, betting 100% will always yield a better expected return than betting 20%, but the problem, or issue is, after one bet you will be broke, and not be able to ever bet anything again.

If you bet 100%, one loss and you are broke. Same with betting 150%.

If you bet 100% and lose, you are not broke because in this scenario, your loss is -20%, not -100%. See the payoff table near the top of the article. This is typical of several capital markets investments, not so much in Blackjack.

What happens if the loss is only 10%, all other parameters remain the same?

You get an expected value of 8% but doesn’t the Kelly% turn negative or have I miscalculated. If so what does it mean ?

A Kelly% equal to or below zero means you don’t have a positive expectation and should thus not bet anything at all! So yes, you have likely miscalculated at some point in that case.

I read the question as “what happens if you are able to cut the loss shorter at only 10%?”.

Surely this should improve results. I don’t know where the 8% comes from or what the “instead of” original figure was, but clearly a 60% chance of a 20% gain versus a 40% chance of only losing 10% means you’d invest way more.

Here k = 60/10 – 40/20 = 4 meaning you should gear up your investment 4 times (and watch those stop-losses like a hawk), which is what spread-betters and CFD traders aim to do.

The problem in the real world is twofold – first that the leverage comes at a profit-eroding daily cost which is hard to factor in to this form of the equation as it does not have a time element. Second, your 10% loss-limit is much more likely to be hit than if it was a 20% limit so you can’t assume “all other parameters remain the same”.

In theory though there would be an optimal amount to gear up, but you’d have to keep adjusting it, buying more when in profit and selling when losing, which is what is often done in the real world by geared funds. Whether it is “ideal” to buy on the way up and sell on the way down is another discussion, but Kelly says you “should” to maintain the optimal gearing.

The simulation shown suggests green came out by far the best on average, so would it therefore not be better to have several geared-up separately managed groups of investments that were as uncorrelated as possible, in case of a bad run for one or more of them, rather than just one class of investments with 100% of your money and no gearing?

Plugging in the values for our scenario: Kelly % = 60%/20% – (1 – 60%)/20% = 100%, which was Blue’s winning allocation.

Your calculation is wrong.

The correction is

Kelly% = 60%/20%-(1-60%)/20%=20%

No YOUR calculation is wrong; that equation does come out to 100%. You can see this by simplifying: 3-2=1 (aka 100%)

Thank you for putting your article together – it raised some thought-provoking points.

I believe you overlooked what the Kelly Criterion is ultimately meant to represent. Namely, the Kelly Criterion states what amount you should wager for a bet based on the edge/odds under the assumption that you can lose 100% of your wager. Your wager is your risk. You’ll notice in your example the Kelly Criterion says you should wager 20%. If you take the result to mean you should risk 20% of your bankroll instead of wagering 20%, your formula and the Kelly Criterion provide the same answer. Your reworked formula states that you should place 100% of your bankroll on the bet. Ultimately, this is only 20% of your bankroll at risk, which is exactly what the original formula came up with. It seems to me that if you interpret the Kelly Criterion to provide the percentage of bankroll you should risk, there is no need to rework the formula. Your simulations look to be equal to 0.2x Kelly, 1x Kelly and 1.5x Kelly. I believe your formula is the same as the original Kelly multiplied by (1/loss percentage).

The article brings up a few issues with the Kelly Criterion in the application to markets. I’d love to hear your thoughts on these points.

1) Leverage is not infinite so in an example where you wanted to place 5 independent market wagers at 20% bankroll risk and each had 20% downside risk, you would need to have access to at least 5x leverage.

2) The Kelly Criterion assumes you can infinitely divide your minimum bet. Securities markets generally have some minimum wager. With a large enough portfolio, the effect may be close to having the option of infinitely divisible bets but I think it is an important point to call out. How should the Kelly Criterion adjust for the minimum bet size as a % of bankroll?

*My comments are not meant to be investment advice of any kind. I am only looking to add thoughtful discussion to the article.

Good points! The reworked formula saves an additional step of figuring out the position size based on the position risk.

Agree with your initial comments that if you allow for the fact that you can lose 100% of your bet (eg blackjack), then the basic Kelly formula works quite well. It still seems to offer a more aggressive bet size than I may be comfortable with though I think that’s the point – to encourage a proper bet size, or something closer to “ideal”. For some that will mean reducing the amount wagered and some, increasing it.

Thank you for sharing your ideas.

Jason

I’m very poor in mathematics but I agree with the answer of Aaron. I guess Kelly % = W – [(1 – W)/R] expresses the risk %. Open a position for this risk %. If you lose your bet, you will lose risk % of your capital. In your example, you should open a position with 100% and if you lose your bet then you will lose 20% of your capital.

Correct?

Hi – I’m trying to run the code.

But it breaks on the first function

percent <- function(x, digits = 2, format = "f", …) {

paste0(formatC(100 * x, format = format, digits = digits, …), “%”)

}

What do the … in the first line of the function mean? I've never seen that before.

Thanks!

Actually – I figured it out. In my version of R language the quotation marks ” ” and ‘ ‘ are reversed. I’ve had to go through and reverse your usage.

Not sure why that is … I thought this kind of nonsense only happened in different Python versions 🙂

Got it working now!

What a waste of time.

Foremostly, you did not even bring the correct formula to the table.

Explicit laziness on your part for not even reading E.Thorp’s implementation.

Errors:

1. You modeled the portfolio with discrete probabilities

2. Did not account for individual drift rates nor variance rates.

3. No dynamical reallocation between securities and fixed income.

I could go on but I’d be wasting my time on this.

A very interesting article. Indeed the blue strategy maximizes the growth rate of your bankroll in the long run. A more general Kelly formula, which leads to this strategy, is discussed among other practical properties of Kelly betting in Chapter 16 of my book “Surprises in Probability: Seventeen Short Stories” (CRC Press, 2019).

Thanks a lot for the article. It certainly helps to understand the logic behind the formula…

I think one can argue a lot about the exact numbers here. However, given all the assumptions that go into the calculation I’d see the result more like a rough indication regarding portfolio allocation…

Thanks for all the kind words, folks. It is amazing to me that Investopedia is *still* showing the wrong Kelly formula, two years later: https://www.investopedia.com/articles/trading/04/091504.asp

I think your model is wrong. If you don’t get profit with A% with probability w, that doesn’t mean you always lost B% with 1- w.

So, the model should be modified like this

k is Kelly %,

1^(1 – w – w’)n means the current HYP is between kr and ks, so we are holding.

w and w’ are win and loss probability

F = (1 + kr)^wn * (1 – ks)^w’n * 1^(1 – w – w’)n

log(F) = wn * log(1 + kr ) + w’n * log(1 – ks) + 0

k = w/s(w + w’) – w’/r(w + w’)

I may be mistaken (I’ve been out of school for sometime)

but doesn’t (w +w’) = 1

K = w/s – w’/r

I’m not sure what s or r stand for

but the formula looks similar to Mr. Bochman’s

Thank you for the article Alon.

I’d like to double check my numbers with you to make sure I understand properly.

My system has a winning rate of 79%.

The average win is 3%.

Average loss 0.8%

=(0.79/0.008-(1-0.79)/0.03)

The Kelly % according to your adjusted formula is 9175%.

This doesn’t make much sense to me. Am I making a mistake?

I got a similar answer that doesn’t seem to make sense either. Does that mean you should bet the house on your system?

Also, I’d like to know, Alon, when you say “A is the potential loss ” in the formula:

does this refer to the average loss or the maximum expected loss on a single trade?

I have made a spreadsheet that compares the two calculation methods next to each other. You can download it here: http://bit.ly/Kelly2ver

I agree Alon’s method is an improvement. But as a financial trader I can tell you there is still an element missing. Many investments/trades have two variables that should be considered: the potential win versus potential loss (you do that), but also the chance on each of those happening. If you rate those equally (by not including them), you are back in the coin-flip realm.

I therefore use your formula, but with a WEIGHTED potential profit vs. weighted potential loss.

Mike, How would your formula look?

Mr. Bochman, you are one of the fewest of the few writers on this subject that actually acknowledge the occurrence of partial losses, rather than the “if you lose a little, you lose everything” that most writers or commentators express in their math. Probably the oddest thing I’ve ever run across in my albeit limited exposure to what others think about the Kelly Criterion. In his paper “The Kelly Criterion in Blackjack, Sports Betting, and the Stock Market”, author Ed Thorp derives the biased coin-toss model for even money in which the betting fraction f*=p-q, or the probability of winning minus that of losing, but in the situation of uneven money it’s f*=p/a-q/b. where “a” and “b” are the amounts to be lost or gained, respectively, and by minimizing “a”, the only variable over which the player has any direct control, it’s possible to send f* to the moon. Seeing how so many writers and commentators just blindly set “a” to a value of 1 brings home to me a quote of Thorp’s from his early days in the stock market that he was both surprised and encouraged at how little was known by so many. Also a pretty good rebuttal against the efficient market hypothesis if there ever was one. Thanks, and congratulations!

Thanks, very informative, but you do not seem to be aware that the architects of Kelly, Ziemba and Thorp, use for equities the Kelly criterion of

Kelly = (u – r) / sigma^2

where u is the geometric Brownian motion drift and r is the risk-free interest rate. You use the drift, not the mean of log returns.

They have multiple variations of this formula, including one for multiple shares in a single portfolio, and Ziemba utilizes a stochastic dynamic programming approach to dynamic rebalancing through intertemporal investment periods. The equation you discussed was only used by them for horse racing and blackjack, not the stock market; it is applicable to option trading, NOT shares.

Mr. Bochman,

Thank you for your views on the Kelly Criterion. I’m “surprised and encouraged”, as Ed Thorp would say at how little is known by so many – in this case on the Kelly Criterion itself, as evidenced by formulas such as

f* = p – q/b rather than f* = p/a – q/b and the like, with “a” being the fraction of the player’s account they stand to lose. With sloppy math like that, why should anyone trust an “investment advisor” with their hard-earned money? It’s just amazing how far up the academic ladder this goes. By minimizing “a”, you can amplify “f*” in a scientifically precise way and reap the benefits – by maximizing it as far as it says you can and by using any remainder as insurance. Nice to meet an MBA who can do math!

It’s probably useful to understand the difference between, say, a biased coin (a discrete calculation) and a stock price (a continuous situation). A stock price is an independent variable, with variance. A coin which is biased to return heads 53% of the time, requires only p-q=f*.

It’s interesting, though, that options are more similar to biased coins -in that the delta is a useful approximation of the likelihood that an option expires essentially worthless. Note though that the Black-Scholes calculation of delta allows us to skip several statistical steps-so all we need to do is assess the option chain information. For example, an option showing a delta of .47 suggests a ‘biased coin’ of p=.53. The basic allocation of our wealth is 6%.

I think a major psychological impediment is to extrapolate based on the ‘law of small numbers.’ If you study Thorp, Ziemba and numerous academic articles, the simulations are in the thousands. There is a similarity with the simulations, in that ‘full Kelly’ results are extremely volatile.

In looking at a simulation, we see the final outcome. The results seem ‘obvious’. However, the vast majority of people, unable to visualize the final outcome, will likely throw in the towel after a couple of severe downturns.

Could you explain how you calculate A and B if you were analysing data over the period of a year?

Good work re-deriving the general form of the Kelly Criterion. I am glad to see a more accessible example derivation available on the web thanks to you.

I am a fan of the formula you are calling “correct.” I too struggled to find this version when I first started looking into the KC and ended up deriving it myself, but I also later found a PDF of Kelly’s original derivation, which matches this “correct” version.

I’m here because I am looking at modifying it to maximize expected utility rather than maximizing growth rate (and wondering if it will be any different, since I use ln to model my personal utility function, lol). Anyway, keep up the good work.

Hi Saxon, I am interested in the PDF of Kelly’s original derivation.

I am not sure if you have seen this thread on Twitter about Expected Utility Theory https://twitter.com/breakingthemark/status/1339570230662717441

Hi Tom, try this link: https://www.princeton.edu/~wbialek/rome/refs/kelly_56.pdf

Thanks, I’ll probably give Bernoulli’s paper a read. 🙂