An Applied Math buddy of mine and I were discussing probability theory, statistics, and coincidence. He pointed out that in a room of just 23 people, there’s about a 50% chance that at least two of them share a birthday.

No way!

Way. Here’s an article from The Mathematical Association of America that illustrates this nicely:

One simple example of a coincidence that often surprises people involves birthdays.

It’s rather unlikely that you and I share the same birthday (month and date). The more people you pull into the group, however, the more likely it is that at least two people will have matching dates.

Ignoring the minor technicality of leap years, it’s clear that in a group of 366 people, at least two must share a birthday. Yet it seems counterintuitive to many that only 23 people are needed in a group to have a 50-50 chance of at least one coincidental birthday.

To see why it takes just 23 people to reach even odds on sharing a birthday, you have to look at the probabilities. Assume that all 365 days have an equal chance of being a birthday. For a party of one, there is no possibility of a coincidence. So, the probability of that particular date being a unique birthday is 365/365. For a second person to have a birthday that doesn’t match that of the first, he or she must be born on any one of the other 364 days of the year.

You obtain the probability of no match between the birthdays of two people by multiplying 365/365 times 364/365, which equals .9973. Hence, the probability of a match is 1 – .9973, or .0027, which is much less than 1 percent.

With two people, there are 363 unused birthdays. The probability that a third person has a birthday that differs from the other two distinct birthdays is 363/365. So, for three people, the chance of having no pair of matching birthdays is 365/365 x 364/365 x 363/365, or .9918.

As the number of people brought into the group increases, the chance of there being no match decreases. By the time the crowd numbers 23 people, the probability of no matching birthdays is .4927. Thus, the chance of at least one match within a group of 23 people is .5073, or slightly better than 50 percent.
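The running product described above is easy to check numerically. Here is a minimal Python sketch, assuming 365 equally likely birthdays as the article does; the function name `p_no_match` is just an illustrative choice.

```python
def p_no_match(n, days=365):
    """Probability that n people all have distinct birthdays,
    assuming each of `days` dates is equally likely."""
    p = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k dates already taken.
        p *= (days - k) / days
    return p


for n in (2, 3, 23):
    none = p_no_match(n)
    print(n, round(none, 4), round(1 - none, 4))
```

Rounded to four places, this reproduces the .9973, .9918, and .4927 figures quoted in the text.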

The reason the number is as low as 23 is that you aren’t looking for a specific match. It doesn’t have to be two particular people or a given date. Any match involving any date or any two people is enough to create a coincidence. Indeed, there are 253 different pairings possible among 23 people, any of which could lead to a match.
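The pair count is a one-liner to verify. The sketch below also tries a rough heuristic that is not in the article: pretend the 253 pairs are independent, each with a 364/365 chance of not matching. The pairs are not truly independent, so this is only a sanity check on the intuition, not the exact calculation.

```python
from math import comb

pairs = comb(23, 2)  # number of distinct pairs among 23 people

# Heuristic only: treats every pair as an independent 1-in-365 shot.
rough_match = 1 - (364 / 365) ** pairs

print(pairs, round(rough_match, 3))
```

The heuristic lands close to one half, which is why counting pairs rather than people makes 23 feel less surprising.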

Variations of the birthday problem serve as useful models for analyzing coincidences, says statistician Persi Diaconis of Stanford University.

It’s possible to derive a simple formula that gives an approximate answer for the number required to get a 50-percent chance of a match. That number is 1.2 times the square root of the number of categories. For 365 categories (days of the year), the number is 1.2 multiplied by the square root of 365, or 23.

Using the formula, it’s easy to calculate that, if you were born on a planet where a year is 687 days long, you would need at least 31 people to make the odds of a match better than 50:50.

To get a 95-percent chance of a match, multiply the square root of the number of categories by 2.5. In the case of terrestrial birthdays, you would need about 48 people.
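The article does not say where the multipliers 1.2 and 2.5 come from. One plausible origin, offered here as an assumption, is the standard approximation P(no match) ≈ exp(−n² / (2 × categories)); solving it for a target probability gives multipliers of √(2 ln 2) ≈ 1.18 for 50 percent and √(2 ln 20) ≈ 2.45 for 95 percent. The sketch below compares that approximation with the exact threshold. Both function names are illustrative.

```python
from math import log, sqrt


def rule_of_thumb(categories, target=0.5):
    """Approximate group size, from exp(-n^2 / (2 * categories)) = 1 - target."""
    return sqrt(2 * log(1 / (1 - target)) * categories)


def exact_threshold(categories, target=0.5):
    """Smallest n whose exact match probability reaches `target`,
    assuming all categories are equally likely."""
    p_none, n = 1.0, 1  # one person: no match possible
    while 1 - p_none < target:
        p_none *= (categories - n) / categories  # add one more person
        n += 1
    return n


for categories in (365, 687):
    for target in (0.5, 0.95):
        print(categories, target,
              round(rule_of_thumb(categories, target), 2),
              exact_threshold(categories, target))
```

Because the rule is an approximation, it can differ from the exact threshold by a person or so.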

What about triple matches? That’s a trickier calculation, but the answer turns out to be 88 people. Here’s a table showing the number, N, required to have a probability greater than .5 that k or more people share a birthday, with 365 categories.

k    2   3    4    5    6    7    8    9    10    11    12    13
N   23  88  187  313  460  623  798  985  1181  1385  1596  1813

Thus, in an audience of 1,000 people, there’s a good chance that at least nine people have the same birthday.

Several years ago, Diaconis and Fred Mosteller of Harvard University derived a formula that covers multiple matches involving multiple categories. For example, what are the chances that three members of a family have a birthday on the same day of a month (though not necessarily the same month)? Taking the number of days per month to be 30, the formula gives the approximate answer that a triple match in day of the month has about a 50-50 chance if at least 18 people are included in the group.

Now, suppose that certain coincidences involve matches that are close but not exact. It turns out, for example, that it takes just 14 people in a room to have even odds of finding two birthdays that are identical or fall on consecutive days. Among seven people, there is about a 60 percent probability that two will have birthdays within a week of each other. Among four people, the probability that two will have birthdays within 30 days of each other is about 70 percent.
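The three near-match figures can be reproduced with a closed form for the probability that no two of n birthdays fall within k days of each other on a circular year of d equally likely days: (d − nk − 1)! / (d^(n−1) × (d − n(k+1))!). That formula is not given in the article, so treat its use here as an assumption, along with reading "within a week" as at most 7 days apart and "within 30 days" as at most 30 days apart. The function name is illustrative.

```python
def p_near_match(n, k, d=365):
    """Probability that at least two of n people have birthdays at most
    k days apart, for a circular year of d equally likely days.
    k = 0 is the ordinary birthday problem."""
    if n * (k + 1) > d:
        return 1.0  # not enough room to keep everyone more than k days apart
    # (d - n*k - 1)! / (d - n*(k+1))! has exactly n - 1 factors,
    # so pair each factor with one power of d to avoid huge numbers.
    p_none = 1.0
    for i in range(d - n * (k + 1) + 1, d - n * k):
        p_none *= i / d
    return 1 - p_none


print(round(p_near_match(13, 1), 3), round(p_near_match(14, 1), 3))
print(round(p_near_match(7, 7), 3))
print(round(p_near_match(4, 30), 3))
```

Under these assumptions, 14 is the first group size to pass even odds for same-or-consecutive days, and the other two values come out near 60 and 70 percent.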

“Changing the conditions for coincidence slightly can change the numbers a lot,” Diaconis and Mosteller contend in a 1989 paper in the Journal of the American Statistical Association. “In day-to-day coincidences even without a perfect match, enough aspects often match to surprise us.”

What about the fact that birthdays aren’t actually uniformly distributed throughout the year? In the United States, the data show a seasonal pattern, varying between about 5 percent below and 7 percent above the average daily frequency.

Average Daily Birth Frequencies in the United States, 1978-1987

Month        Daily Frequency
January      .0026123
February     .0026785
March        .0026838
April        .0026426
May          .0026702
June         .0027424
July         .0028655
August       .0028954
September    .0029407
October      .0027705
November     .0026842
December     .0026864

Any deviation from a uniform distribution improves the chances of a match, Diaconis says.

Nonetheless, the month-to-month variation of U.S. births is sufficiently small that match probabilities barely budge from those calculated by assuming a uniform distribution.
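To see how little the seasonal pattern matters, the match probability can be computed exactly for unequal day probabilities: the chance that n birthdays are all distinct is n! times the n-th elementary symmetric polynomial of the day probabilities. The sketch below feeds in the monthly figures from the table above. It rests on several assumptions of mine: every day within a month gets that month's daily frequency, February has 28 days, the values are renormalized to sum to 1, and every table entry is read with a leading decimal point.

```python
from math import factorial

# (month, days in month, average daily frequency) from the table above.
MONTHS = [
    ("January", 31, 0.0026123), ("February", 28, 0.0026785),
    ("March", 31, 0.0026838), ("April", 30, 0.0026426),
    ("May", 31, 0.0026702), ("June", 30, 0.0027424),
    ("July", 31, 0.0028655), ("August", 31, 0.0028954),
    ("September", 30, 0.0029407), ("October", 31, 0.0027705),
    ("November", 30, 0.0026842), ("December", 31, 0.0026864),
]


def p_match(day_probs, n):
    """P(at least two of n people share a day) for arbitrary day probabilities."""
    # e[j] accumulates the j-th elementary symmetric polynomial.
    e = [1.0] + [0.0] * n
    for p in day_probs:
        for j in range(n, 0, -1):
            e[j] += p * e[j - 1]
    return 1 - factorial(n) * e[n]


raw = [freq for _name, days, freq in MONTHS for _ in range(days)]
total = sum(raw)
seasonal = [p / total for p in raw]  # renormalize so the year sums to 1
uniform = [1 / 365] * 365

print(round(p_match(uniform, 23), 4), round(p_match(seasonal, 23), 4))
```

With 23 people, the seasonal value should come out only slightly above the uniform one, consistent with both of the article's statements.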

Source: Ivars Peterson, 1998

References:

Diaconis, P., and F. Mosteller. 1989. Methods for studying coincidences. Journal of the American Statistical Association 84(December):853.

Gardner, M. 1986. Coincidence. In Knotted Doughnuts and Other Mathematical Entertainments. New York: W.H. Freeman.

Hanley, J.A. 1992. Jumping to coincidences: Defying odds in the realm of the preposterous. American Statistician 46(August):197.

Matthews, R., and F. Stones. 1998. Coincidences: The truth is out there. Teaching Statistics 20(No. 1):17. (Available at http://ourworld.compuserve.com/homepages/rajm/tscoin.htm.)

Nunnikhoven, T.S. 1992. A birthday problem solution for nonuniform birth frequencies. American Statistician 46(November):270.

Peterson, I. 1998. The Jungles of Randomness: A Mathematical Safari. New York: Wiley.

Stewart, I. 1998. What a coincidence! Scientific American (June):95.