# Probability, Baseball and the WSJ

As if we needed further demonstration that we suck at both math and probability, The Wall Street Journal foolishly suggests via headline that the Los Angeles Angels’ season is already effectively over because the team had an MLB-worst 10-20 spring training record. This isn’t the first time that the Journal has committed mathematical malpractice, of course. But it still demonstrates a fundamental misunderstanding of how to use probabilities – whether in sports, gambling, or investments.

As the great Daniel Kahneman pointed out recently (video here), “to compute probabilities you need to keep several possibilities in your mind at once. It’s difficult for most people. Typically, we have a single story with a theme. People have a sense of propensity, that the system is more likely to do one thing than the other, but it’s quite different from the probabilities where you have to think of two possibilities and weigh their relative chances of happening.”

Instead, the Journal sees the spring data as pointing to a single conclusion – “a single story with a theme” – that the Angels are doomed. Even a cursory look at the numbers shows why that suggestion is so flawed.

Here is the crux of the argument from the Journal: “The Angels finished a MLB-worst 10-20 in the exhibition season. Over the last 10 years, the teams with the worst spring record have averaged only 70 regular-season wins, with the 2011 Astros and the 2010 Pirates finishing with the league’s worst record. None of those teams reached the postseason.”

Right off the bat (pardon the pun), the idea that a mere ten years of data can support the definitive conclusion that the spring’s worst team won’t make the play-offs is ludicrous; the sample is much too small. Moreover, such a one-to-one comparison (worst spring training = no play-offs) makes way too much of the data we do have. There are lots of reasons why good teams might perform poorly in the spring, including experimentation, extensive use and testing of marginal or fringe players, and simple randomness.
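
To see why ten seasons prove so little, here is a minimal sketch in Python. The per-season play-off probabilities are illustrative assumptions, not figures from the Journal, and the function name is hypothetical. It computes how often a team would miss the play-offs ten years running purely by chance, treating each season as independent.

```python
def prob_no_playoffs(p_playoff: float, seasons: int = 10) -> float:
    """Chance that a team with per-season play-off probability
    p_playoff misses the play-offs in every one of `seasons` years,
    assuming the seasons are independent."""
    return (1.0 - p_playoff) ** seasons


# Illustrative (assumed) per-season play-off probabilities.
for p in (0.125, 0.25, 0.375):
    print(f"p = {p:.3f}: P(zero play-off trips in 10 years) = "
          f"{prob_no_playoffs(p):.3f}")
```

Under these assumptions, a team with a one-in-eight chance each year would still go ten for ten on missing the play-offs roughly a quarter of the time, so "none of those teams reached the postseason" is weak evidence of doom.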

The most that can be said from the data is that spring success (or lack thereof) shows a weak correlation (0.22) with regular season success. In practical terms, the top five spring teams are generally three times as likely to make the play-offs as the last five (37.5 percent versus 12.5 percent). Earlier analysis got similar results.
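
As a quick check on how little that correlation carries, the sketch below squares it and recomputes the three-to-one ratio. The variable names are mine, and reading the squared correlation as "share of variance explained" assumes a simple linear relationship, which the cited data may or may not support.

```python
# Figures quoted in the text.
r = 0.22                 # correlation of spring and regular-season success
top_five_rate = 0.375    # play-off rate, top five spring teams
bottom_five_rate = 0.125 # play-off rate, bottom five spring teams

# Under a simple linear model (an assumption), r squared is the share of
# the variation in regular-season success associated with spring results.
r_squared = r ** 2
relative_likelihood = top_five_rate / bottom_five_rate

print(f"r^2 = {r_squared:.4f} (about {r_squared:.0%} of the variance)")
print(f"top five vs. bottom five: {relative_likelihood:.1f}x as likely")
```

On that reading, spring results line up with only about 5 percent of the variation in regular-season results, which is a nudge to the odds and not a verdict.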

Accordingly, since most projections have the Angels among the favorites to reach the post-season (this projection puts their play-off probability at 66.4 percent, essentially two in three), one might reasonably use the spring data to question whether that likelihood is too high, or whether the estimated play-off probability of a division rival is too low, and bet accordingly. Or the Angels’ spring performance may simply be a reminder to Halo apologists to check their work.

The Journal’s approach is typical of gamblers everywhere – take the idea or two you deem best and bet heavily on them. It makes intuitive sense but it’s wrong.  There are way too many variables involved to bet the farm on one or two results.  Too much can go wrong — like a smoked line drive hit right at the shortstop.

As I have noted before, legendary gambler Billy Walters is an exception to the nearly universal rule that gamblers don’t win in the aggregate (those opulent Las Vegas casinos were built with gambling losses, after all). Walters was a crucial member of the famous “computer group,” which used a careful, computerized process to make a fortune and, as a result, revolutionized sports betting in the 1980s. He uses multiple consultants – mostly mathematicians – in making his probability-based projections and makes a ton of money betting on them. His process is to create his own line, largely using statistical measures of the teams, and then to bet when his view of a game differs significantly from the available commercial betting lines.

Walters is staggeringly rich (according to 60 Minutes, he is “worth hundreds of millions of dollars”), but he claims a lifetime winning percentage of only 57 percent, as compared to the break-even of 52.38 percent (the winning percentage sports gamblers need to hit to offset paying out a 10 percent vig on their losing bets).  Even so, while he has had losing months (randomness can overcome a good process for substantial periods of time), he has never had a losing year over the last 30.
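
The break-even figure can be reproduced with a few lines of Python. This is a sketch that assumes the standard arrangement behind a 10 percent vig, namely risking 110 to win 100; the function names and the stake sizes are illustrative, not anything Walters has described.

```python
def break_even_win_rate(risk: float = 110.0, win: float = 100.0) -> float:
    """Win rate at which expected profit is zero when each bet
    risks `risk` to win `win` (assumed 110-to-100 pricing)."""
    return risk / (risk + win)


def expected_profit_per_bet(p_win: float, risk: float = 110.0,
                            win: float = 100.0) -> float:
    """Average profit per bet for a bettor who wins with probability p_win."""
    return p_win * win - (1.0 - p_win) * risk


print(f"break-even win rate: {break_even_win_rate():.2%}")
print(f"expected profit at 57%: {expected_profit_per_bet(0.57):.2f} "
      f"per 110 risked")
```

At a 57 percent win rate the expected profit works out to about 9.70 for every 110 risked, a return of a little under 9 percent per bet. That is a thin edge, which is why it only pays over a very large number of bets.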

But that 30-year winning streak only started after he made a major change in his approach. By his own account, Walters lost his shirt many times over before becoming focused, data-driven and careful to play the long game (making a profit while losing 43 percent of the time requires it, especially because the losing streaks can be very long indeed). The key is not to bet on just a couple of seemingly great ideas but, instead, to make lots of bets – whenever the data suggest a significant edge – with bet size determined by the extent of the edge.
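
One standard way to tie bet size to edge is the Kelly criterion. To be clear, nothing here says Walters uses it; it is swapped in purely as an illustration. The sketch below assumes 110-to-100 pricing, uses hypothetical function names, and also simulates how long a losing streak a 57 percent bettor might hit (the seed and the number of bets are arbitrary choices).

```python
import random


def kelly_fraction(p_win: float, risk: float = 110.0,
                   win: float = 100.0) -> float:
    """Kelly criterion: fraction of bankroll to stake, given win
    probability p_win and net odds b = win / risk. Returns 0 when
    there is no edge."""
    b = win / risk
    q = 1.0 - p_win
    return max(0.0, (b * p_win - q) / b)


def longest_losing_streak(n_bets: int, p_win: float,
                          rng: random.Random) -> int:
    """Simulate n_bets independent bets and return the longest
    run of consecutive losses."""
    longest = current = 0
    for _ in range(n_bets):
        if rng.random() < p_win:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


# Bigger edge, bigger stake; no edge, no bet.
for p in (0.50, 0.55, 0.57, 0.60):
    print(f"win rate {p:.0%}: stake {kelly_fraction(p):.1%} of bankroll")

rng = random.Random(42)  # arbitrary seed, for repeatability only
streak = longest_losing_streak(10_000, 0.57, rng)
print(f"longest losing streak in 10,000 simulated bets: {streak}")
```

At a 57 percent win rate the full Kelly stake comes to a bit under 10 percent of bankroll, and below the break-even rate it drops to zero. The simulated streak is a reminder that a sound process can still look terrible for long stretches.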

A single bet, even one with a high probability of success, is going to go wrong far too often for comfort. An 80 percent chance of sun still means it will rain on 20 out of 100 days with that forecast, and no sports betting line offers anything like an 80 percent chance of success. Gambling – like investing and sports – is probability driven. So long as we don’t learn to think and act probabilistically, we are doomed to fail at them all.
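
The difference between one good bet and many can be made concrete. The following sketch assumes equal-sized, independent bets at 110-to-100 pricing with a 57 percent win rate (the figures quoted above for Walters); the function name is hypothetical. It computes the exact binomial probability of ending up in the red after a given number of bets.

```python
import math


def prob_net_loss(n_bets: int, p_win: float = 0.57,
                  risk: float = 110.0, win: float = 100.0) -> float:
    """Exact probability that n_bets equal, independent bets
    finish with a net loss, summing the binomial distribution
    over every win count that leaves the bettor behind."""
    total = 0.0
    for wins in range(n_bets + 1):
        losses = n_bets - wins
        if win * wins - risk * losses < 0:
            # Work in logs so large n_bets does not overflow.
            log_pmf = (math.lgamma(n_bets + 1) - math.lgamma(wins + 1)
                       - math.lgamma(losses + 1)
                       + wins * math.log(p_win)
                       + losses * math.log(1.0 - p_win))
            total += math.exp(log_pmf)
    return total


for n in (1, 10, 100, 1000):
    print(f"{n:>5} bets: P(net loss) = {prob_net_loss(n):.4f}")
```

A single bet loses 43 percent of the time; under these assumptions the chance of being behind shrinks steadily as the number of bets grows, which is the whole case for spreading an edge across many wagers.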
