While executing some statistical detective work for a client we had a major “aha!” moment and realized something like “Amdahl’s Law” rephrased in terms of probability would solve everything. We finished our work using direct methods and moved on. But it is an interesting question: what is the probabilist’s (or gambler’s) equivalent of Amdahl’s Law?
Amdahl’s Law is famous idea due to computer architect Gene Amdahl. It is a simple technique that computer scientists use to re-direct their work back to important parts of problems. Suppose you have a complicated system you wish to speed up. Suppose this system is spending a p-fraction of its time in an important sub-process and that you have an idea that would speed up the sub-process by a factor of k. Should you invest the effort?
Amdahl’s Law says (by simple arithmetic): the speed-up (the ratio of the old run-time over the new run-time) the entire system would achieve if you implemented your improvement is not the factor of k you would hope for, but instead:
For example if p = 1/3 then you can only speed up the over all system by at most a factor of 33%, even your idea is so astoundingly good that you have k=1000.
Amdahl’s Law reminds us that speeding up a component you do not lose much time to is not an important accomplishment. In fact Amdahl’s Law directly prescribes looking at your most expensive components as being the largest opportunities for improvement. Appealing to Amdahl’s Law is an important nerd-tool to end “color of the bike shed” arguments (and concentrate only on the design of systems that actually have an impact on outcomes).
It is clear there are similar principles for managing expenses, revenue, effort and so on (such as the Pareto Principle).
But what is the equivalent statement in the harder and more complicated world of probabilities and gambling systems? There are a lot of candidate statements and theorems (such as “look for horses not for zebras”, the Kraft Inequality, Kullback Leibler Distance, Cross Entropy and the Asymptotic Equipartition Principle) but I think the most powerful and direct analogue is: the Kelly Betting System. The Kelly Betting System is a remarkable system that, like Amdahl’s Law, tells us exactly what to look at (and surprisingly some things to ignore).
Kelly’s original paper: “A New Interpretation of Information Rate” J. L. Jr Kelly, AT&T Technical Journal (1956) phrases the problem as betting at a horse race. The technique applies more generally (other forms of gambling, portfolio management, even explaining the preferences of lab-mice) but the clearest example remains a horse race.
We follow the excellent discussion of the problem from Cover and Thomas “Information Theory” Wiley (1991). Consider a simplified horse race where there is only one payoff offered: picking the winning horse. Suppose the (unknown) true probability of the i-th horse winning is p_i. Further suppose the track publishes a set of payoffs for each horse such that if you bet a dollar on the i-th horse and it wins: you are given o_i dollars back.
Now a gambler that has no estimate of the p_i might put all of their money on “the highest paying horse.” That is picking the i such that o_i is maximal (“going for big score”). A somewhat more informed gambler might put all of their money on the “horse with the best expected return” that is a horse i that maximizes p_i * o_i. But this betting strategy “invites ruin”: you have probability of 1 – p_i of losing all of your money. Kelly starts with the controversial idea of trying to maximize expected log-return (instead of maximizing expected return). Maximizing expected log-return avoids ruin, maximizes the exponential rate your wealth grows and maximizes the median wealth over all outcomes (see: “The Kelly System Maximizes Median Fortune” S N Ethier, Journal of Applied Probability (2004) vol. 41 (4) pp. 1230-1236). Even the observation that you don’t always want to put all of your money in a “favorable bet” (that is one with expectation p_i * o_i >1) is an important one.
To get the next part of Kelly’s system consider the sum of reciprocals of track offered payoffs:
At any real track this sum will be greater than 1 (i.e. the o_i will be small, making the sum large). The larger the sum the more clearly unfair the track’s published payoff schedule is. Let us assume we were at a fantastically generous track where this sum is exactly 1 (admittedly unrealistic, and both the paper and the book work beyond this limitation). In this case we can write r_i = 1/o_i and we know r_i > 0 and the r_i sum to 1. That is we can interpret the r_i as the track’s estimate of the probability of the i-th horse winning. If o_i = 100 (the track is paying off 100:1) we then can infer they think the i-th horse has no more than a 1 in 100 chance of winning (else they could not afford to offer the bet). Kelly’s system gives (and proves correct) the following remarkable advice: if the sum given above is 1 (i.e. the track is paying off at least a fair rate) then you can safely bet all of your money and you should bet a p_i fraction of your money on the i-th horse.
That is: if you decide the track is paying off so much that it is worth your while to gamble then you should then completely ignore the track’s payoff schedule in making your bet. You might use the track’s published payoffs as some of your evidence when trying to estimate the p_i (the probability of each horse winning), but once you have estimated these probabilities you then ignore the track’s payoff rates in designing your bets. In fact your expected rate of winning is exactly proportional to how much closer to the true probabilities your estimate is than the track’s estimate is (Cover/Thomas example 6.1.1, so if unless you know something the track does not know you should not bet). Also you should bet even on unlikely and underpaying horses to help cover the possibilities (this is because you are making a series of bets, not just a single bet- so each bet’s value is computed under the assumption that your other bets have failed). This (provably correct) advice is contrary to many obvious and traditional betting systems.
The Kelly System is simultaneously very precise and broadly applicable. For example: it has be extended to many other games and the stock market (see: “The Kelly Criterion and the Stock Market” Louis M Rotando, Edward O Thorp, The American Mathematical Monthly (1992) vol. 99 (10) pp. 922-931). The Kelly System gives actionable advice (exact amounts to bet or exact amounts of effort to invest) and is very specific in saying what to look at.
Just as Amdahl’s law shows us component speedup is a distraction the Kelly System shows us that published rates of return are siren songs. Thus the Kelly System is the gambler’s equivalent of Amdahl’s Law.
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.