Tina K. Russell

March 31, 2008

What are the odds of that?

Filed under: Uncategorized | Tina Russell @ 7:38 pm

A Journey to Baseball’s Alternate Universe – New York Times

[Baseball’s] most mythic achievement is Joe DiMaggio’s 56-game hitting streak, a feat that has never come even close to being matched. Fans and scientists alike, including Edward M. Purcell, a Nobel laureate in physics, and Stephen Jay Gould, the evolutionary biologist, have described the streak as well-nigh impossible.

In a fit of scientific skepticism, we decided to calculate how unlikely Joltin’ Joe’s achievement really was. Using a comprehensive collection of baseball statistics from 1871 to 2005, we simulated the entire history of baseball 10,000 times in a computer. In essence, we programmed the computer to construct an enormous set of parallel baseball universes, all with the same players but subject to the vagaries of chance in each one.

To tease out the meaningful lessons from random effects (fluky streaks that happen by luck), we redid the whole thing 10,000 times. In each of these simulated histories, somebody holds the record for the longest hitting streak. We tabulated who that player was, when he did it, and how long his streak was.

And suddenly the unlikely becomes likely: we get a very long streak each time we run baseball history.
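To get a feel for what a simulation like that involves, here is a toy sketch in Python. It is not the authors' code, which the article excerpt doesn't show. I'm assuming each game is an independent coin toss with a fixed per-game chance of getting a hit, and the function names and the player probabilities in the usage note are made up for illustration.

```python
import random


def longest_streak(hit_prob, games, rng):
    """Longest run of consecutive games with a hit, for one simulated career.

    Assumption: every game is an independent trial with the same chance
    (hit_prob) of producing at least one hit.
    """
    longest = current = 0
    for _ in range(games):
        if rng.random() < hit_prob:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def simulate_records(hit_probs, games, runs, seed=0):
    """Replay a toy 'baseball history' `runs` times.

    hit_probs is a list of hypothetical per-game hit probabilities, one per
    player. Returns the record-holding streak length from each replay.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(runs):
        records.append(max(longest_streak(p, games, rng) for p in hit_probs))
    return records
```

Calling something like `simulate_records([0.80, 0.75, 0.70], games=1500, runs=1000)` gives one record streak per replay. The point isn't the exact numbers, which depend entirely on the invented inputs, but that every replay has a record holder.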

This is a very important statistical lesson: somebody has to have the longest hitting streak in baseball history, and it makes sense that it would be a talented player like DiMaggio. A similar logic plays into the “birthday problem”: most people would be surprised to know that, in a room of 23 people, there’s a slightly greater chance than not that two people have the same birthday. The thing is, when you turn that question around, it becomes: what are the odds that no two people share a birthday, that each person’s is unique? Then you realize that the odds of a match are not one in ~365¼ (the chance that one person has a specific, arbitrary birthday), but more like one in two (the chance that anyone in the room shares a birthday with anyone else). These odds climb quickly as you add people, because the number of possible pairs grows much faster than the head count: 23 people make 253 possible pairs.
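If you'd like to check the birthday numbers yourself, here is a short Python sketch. It assumes 365 equally likely birthdays and ignores leap days, which is the usual simplification; the function names are my own.

```python
def prob_shared_birthday(n, days=365):
    """Chance that at least two of n people share a birthday.

    Assumption: all `days` birthdays are equally likely (no leap days).
    Works by computing the chance that everyone is unique, then flipping it.
    """
    p_all_unique = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_unique *= (days - k) / days
    return 1.0 - p_all_unique


def pair_count(n):
    """Number of distinct pairs of people in a group of n."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    for n in (10, 22, 23, 40):
        print(n, pair_count(n), round(prob_shared_birthday(n), 3))
```

With 22 people the chance of a match is still under one half; the 23rd person tips it just over.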

This is important because a lot of fallacious arguments are based on a loose understanding of statistics. (Here’s an example. Here’s another.) Remember that “unlikely” is not the same thing as “impossible,” and a curious statistic is not necessarily a definite trend.
