Arizona United States Member #165073 March 24, 2015 220 Posts Offline

Posted: April 15, 2015, 8:17 pm - IP Logged

"At any rate, I'm sure there's somebody in the academic world that would have an explanation as to why it happens as much as it does. I myself have no clue, but I'd love to hear somebody with an advanced degree in math, statistics and probability explain why it does."

It's known as the "birthday paradox". If you have a group of just 23 people, it's more likely than not that two of them will share a birthday. The reason is that you have to consider the number of pairs of people, not just the number of people.

For the same reason, in a set of 894 drawings for a 5/39 matrix, it's more likely than not that two of them will have the same numbers.
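
If anyone wants to check both figures, here is a rough sketch that computes the exact chance of at least one match. The function names are made up for illustration, and it assumes every birthday (or every combination) is equally likely and that drawings are independent.

```python
from math import comb

def prob_some_match(n, outcomes):
    """Chance that n independent, equally likely draws contain at least one repeat."""
    p_all_different = 1.0
    for k in range(n):
        # The (k+1)-th draw must avoid the k distinct values already seen.
        p_all_different *= (outcomes - k) / outcomes
    return 1.0 - p_all_different

def smallest_group(outcomes):
    """Smallest n for which a repeat is more likely than not."""
    n = 1
    while prob_some_match(n, outcomes) <= 0.5:
        n += 1
    return n

print(smallest_group(365))          # birthdays: prints 23
print(smallest_group(comb(39, 5)))  # 5/39 matrix (575,757 combinations): prints 894
```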

mid-Ohio United States Member #9 March 24, 2001 19828 Posts Offline

Posted: April 15, 2015, 11:36 pm - IP Logged

Quote: Originally posted by GiveFive on April 14, 2015

Well, many folks think it shouldn't happen due to the odds. 575,757 to 1 odds in a 5/39 matrix makes it tough for some people to accept that it happens with as much frequency as it apparently does across many states with 5/39 games.

Due to the fact that it happens as much as it does, it seems to me that it's a lot more probable than most people realize. I'm wondering if it's a "probability thing". Statistics and probability, while similar in nature, aren't exactly the same thing. Maybe statistics says it shouldn't happen all that much, but probability says it can and should.

At any rate, I'm sure there's somebody in the academic world that would have an explanation as to why it happens as much as it does. I myself have no clue, but I'd love to hear somebody with an advanced degree in math, statistics and probability explain why it does.

You don't have to be an academic to explain the obvious. As you said the odds of hitting all five winning numbers in a 5/39 game are 1:575757 and as I said Ohio's RC5(5/39) had 3700 drawings. Over the period of those 3700 drawings the odds of a repeat went from 1:575757 to 3700:575757 or 1:156. After 3700 drawings you can expect 1 in 156 drawings to be a repeat combination.

* you don't need to buy more tickets, just buy a winning ticket *

NY State United States Member #92609 June 10, 2010 3702 Posts Offline

Posted: April 16, 2015, 4:37 pm - IP Logged

Quote: Originally posted by RJOh on April 15, 2015

You don't have to be an academic to explain the obvious. As you said the odds of hitting all five winning numbers in a 5/39 game are 1:575757 and as I said Ohio's RC5(5/39) had 3700 drawings. Over the period of those 3700 drawings the odds of a repeat went from 1:575757 to 3700:575757 or 1:156. After 3700 drawings you can expect 1 in 156 drawings to be a repeat combination.

In New York, Take5 (a 5/39 game) has had 44 duplicate sets of five numbers since 1/17/92 with 6,584 drawings.

So that's 6584/44, or about 1:150 drawings. (very close to what Ohio RC5 has experienced)

I checked Ohio Rolling Cash5 and I see 11 sets of duplicate winners in 3706 drawings or 3706/11 which equals 337. So I'm confused because one way it's once every 156 draws and the other way it's 337.

Is there something wrong with the way I'm doing my calculation? (I ask that because I see that your math is correct.) Thanks! I appreciate your input.

About playing the lottery -- You will lose more than you win. Until you hit a jackpot. Then everything changes!

mid-Ohio United States Member #9 March 24, 2001 19828 Posts Offline

Posted: April 16, 2015, 6:46 pm - IP Logged

Quote: Originally posted by GiveFive on April 16, 2015

In New York, Take5 (a 5/39 game) has had 44 duplicate sets of five numbers since 1/17/92 with 6,584 drawings.

So that's 6584/44, or about 1:150 drawings. (very close to what Ohio RC5 has experienced)

I checked Ohio Rolling Cash5 and I see 11 sets of duplicate winners in 3706 drawings or 3706/11 which equals 337. So I'm confused because one way it's once every 156 draws and the other way it's 337.

Is there something wrong with the way I'm doing my calculation? (I ask that because I see that your math is correct.) Thanks! I appreciate your input.

"Is there something wrong with the way I'm doing my calculation? (I ask that because I see that your math is correct.) Thanks! I appreciate your input."

No there isn't, you just didn't read my post correctly. I posted that at 3700 drawings the odds of a repeat are 1:156, and at the beginning the odds were 1:575757, or 0 since you can't have a repeat until after the first drawing. If you took the average of 0 and 3700, or 1850, and calculated for that many drawings, the odds of a repeat are 1:311 (not 337, but closer). I imagine the relationship is not a straight line but a curve that accelerates as the number of drawings increases.

*NY game with 6584 drawings can expect a repeat every 1:87 drawings.
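
Here is a rough sketch of the difference between the two ways of counting. It assumes the earlier drawings were all different (close enough while repeats are rare), and the function names are just ones I made up. The first function is the chance that the next drawing is a repeat; the second averages that chance over the whole history, which is what dividing drawings by duplicates measures.

```python
COMBOS = 575_757  # number of 5/39 combinations

def one_in(n_drawings):
    """Approximate '1 in X' chance that the next drawing repeats an earlier one."""
    return COMBOS / n_drawings

def average_one_in(n_drawings):
    """'1 in X' repeat rate averaged over the whole history of n_drawings."""
    # The k-th drawing (counting from 0) has k earlier drawings it could match.
    expected_repeats = sum(k / COMBOS for k in range(n_drawings))
    return n_drawings / expected_repeats

for n in (3700, 6584):
    print(n, round(one_in(n)), round(average_one_in(n)))
```

For 3700 drawings this gives about 1 in 156 for the next drawing and about 1 in 311 averaged over the history; for 6584 drawings, about 1 in 87 and about 1 in 175.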


Arizona United States Member #165073 March 24, 2015 220 Posts Offline

Posted: April 16, 2015, 8:39 pm - IP Logged

Quote: Originally posted by RJOh on April 16, 2015

"Is there something wrong with the way I'm doing my calculation? (I ask that because I see that your math is correct.) Thanks! I appreciate your input."

No there isn't, you just didn't read my post correctly. I posted that at 3700 drawings the odds of a repeat are 1:156, and at the beginning the odds were 1:575757, or 0 since you can't have a repeat until after the first drawing. If you took the average of 0 and 3700, or 1850, and calculated for that many drawings, the odds of a repeat are 1:311 (not 337, but closer). I imagine the relationship is not a straight line but a curve that accelerates as the number of drawings increases.

*NY game with 6584 drawings can expect a repeat every 1:87 drawings.

The probability that a given drawing will match the numbers of a previous drawing is almost linear with the total number of drawings. I say "almost" because previous duplicates affect it a little - it's exactly linear with the number of distinct sets drawn in previous drawings.

The expected total number of duplicates in a set of n drawings for a 5/39 matrix is close to n*(n-1)/1,151,514. For 3700 drawings, that's about 12; for 6584, 38; for 7740, 52. The actual numbers of 11, 44, and 50 are all well within reasonable variation.
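
For anyone who wants to reproduce those numbers, here is a small sketch. The "reasonable variation" check treats the duplicate count as roughly Poisson, so its standard deviation is about the square root of the expected count; that is an assumption for illustration, not an exact result, and the function name is made up.

```python
from math import comb, sqrt

COMBOS = comb(39, 5)  # 575,757 combinations in a 5/39 matrix

def expected_duplicates(n):
    """Expected number of matching pairs among n drawings."""
    return n * (n - 1) / (2 * COMBOS)

# Drawings -> duplicates actually seen, as quoted in this thread.
observed = {3700: 11, 6584: 44, 7740: 50}

for n, seen in observed.items():
    mean = expected_duplicates(n)
    spread = sqrt(mean)  # assumption: count is roughly Poisson
    print(f"{n}: expected {mean:.1f}, seen {seen}, "
          f"off by {(seen - mean) / spread:+.1f} standard deviations")
```

All three actual counts come out roughly one standard deviation or less from the expected value.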

NY State United States Member #92609 June 10, 2010 3702 Posts Offline

Posted: April 17, 2015, 7:16 am - IP Logged

Quote: Originally posted by Murgatroyd on April 16, 2015

The probability that a given drawing will match the numbers of a previous drawing is almost linear with the total number of drawings. I say "almost" because previous duplicates affect it a little - it's exactly linear with the number of distinct sets drawn in previous drawings.

The expected total number of duplicates in a set of n drawings for a 5/39 matrix is close to n*(n-1)/1,151,514. For 3700 drawings, that's about 12; for 6584, 38; for 7740, 52. The actual numbers of 11, 44, and 50 are all well within reasonable variation.

Thanks so very much for your post!!

I'm curious as to the origin of the formula that calculates the expected total number of duplicates. Did you work it out yourself, or is it an equation that has been around for years?

Please know that I'm a layman when it comes to math!! I realize that any explanation you might post could be well beyond my capability to comprehend! Unfortunately for me, you may have to "dumb it down" a bit, if that's even something that can be done!

Would you know if there's something I could Google? I've tried to Google an explanation for all the repeating winning numbers in a 5/39 matrix that have happened in many states, but because I don't know precisely what to Google, the hits that come back haven't helped me much. I saw your post stating that The Birthday Paradox is responsible for all of the duplicate sets of winners that 5/39 games have experienced over the years, and I Googled that. Although it was terrific reading, the Wikipedia article I read started to go well beyond my ability to comprehend why it happens. Thanks again!


NY United States Member #23835 October 16, 2005 3474 Posts Offline

Posted: April 17, 2015, 6:53 pm - IP Logged

Just to expand on what's been said, in case it's not clear.

"you have to consider the number of pairs of people, not just the number of people."

A group of 23 people can make 253 possible pairs. A can be paired with B, C, D and so on, for 22 pairs. B can be paired with C, D, E and so on, for 21 pairs. The process repeats until V can be paired with W, making only one pair. That makes the total 22+21+20 ... +2 +1 = 253. Any particular person in a group of 23 has only 22 chances to find another person in the group with the same birthday as them, but across all of the possible pairs there are 253 chances that some two people will share a birthday. With an average of about 365.25 days per year, the chances of a given pair sharing a birthday are about 1 in 365.25, so the chances of that pair not sharing a birthday are 364.25 in 365.25. Treating the pairs as independent (a close approximation), the chance that none of the 253 pairs will share a birthday is 364.25/365.25 raised to the 253rd power, or 49.98%, leaving a 50.02% chance that there will be a pair with the same birthday.
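
The arithmetic above can be checked with a few lines. This is only a sketch: the pair shortcut treats every pair as independent, and the exact comparison assumes a plain 365-day year with equally likely birthdays, which is my assumption rather than something from the thread.

```python
people = 23
pairs = people * (people - 1) // 2   # 22 + 21 + ... + 1

# Pair shortcut: treat every pair as an independent 1-in-365.25 chance.
p_no_match_shortcut = (364.25 / 365.25) ** pairs

# Exact version for comparison, assuming a 365-day year.
p_no_match_exact = 1.0
for k in range(people):
    p_no_match_exact *= (365 - k) / 365

print(pairs)
print(f"shortcut: {1 - p_no_match_shortcut:.2%} chance of a shared birthday")
print(f"exact:    {1 - p_no_match_exact:.2%} chance of a shared birthday")
```

The two answers differ slightly because the pairs are not truly independent and the year lengths differ, but both land just over 50%.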

It's exactly the same with the ever-increasing set of winning combinations. If there have been 3700 combinations drawn the first combination can match any of the next 3699. The 2nd can match any of the next 3698, and so on. You can add that up the long way: 3699+3698+3697 ... +3 +2 +1 to see that there are 6,843,150 pairs. You can also calculate it more easily as (3700*3699)/2 = 6,843,150. Making that generic, we get (n * (n-1))/2 = number of pairs for n drawings.
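
A quick sketch to confirm that the long way and the shortcut agree (the function names are just for illustration):

```python
from math import comb

def pairs_long_way(n):
    """Add 1 + 2 + ... + (n-1), one term per earlier drawing."""
    return sum(range(1, n))

def pairs_shortcut(n):
    """The generic formula n * (n-1) / 2."""
    return n * (n - 1) // 2

n = 3700
# All three values are the same: 6843150
print(pairs_long_way(n), pairs_shortcut(n), comb(n, 2))
```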

"I'm curious as to the origin of the formula [n*(n-1)/1,151,514] that calculates the expected total number of duplicates"

Notice that the formula starts out the same as the formula for calculating the number of pairs: n * (n-1). The divisor, 1,151,514, is 2 times 575,757, the number of possible combinations. So the formula is the number of pairs, n * (n-1) / 2, divided by the number of combinations: each pair of drawings has a 1 in 575,757 chance of matching, so the number of repeats expected as a result of simple probability is (number of pairs / number of combinations). In general, the higher the number of drawings the more likely it is that the actual number of repeats will closely match the expected number in percentage terms.
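
One way to sanity-check the expected count is a small simulation. The sketch below uses a hypothetical tiny 3/10 game (120 combinations) so it runs quickly, numbers each combination 0 to 119, and assumes every combination is equally likely. The function names, the seed and the game itself are all just for illustration.

```python
import random
from collections import Counter
from math import comb

def matching_pairs(draws):
    """Count pairs of drawings that produced the same combination."""
    counts = Counter(draws)
    return sum(c * (c - 1) // 2 for c in counts.values())

def average_matching_pairs(n, combos, trials, seed=1):
    """Average number of matching pairs over many simulated histories of n drawings."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        draws = [rng.randrange(combos) for _ in range(n)]
        total += matching_pairs(draws)
    return total / trials

combos = comb(10, 3)  # hypothetical 3/10 game: 120 combinations
n = 50
print("simulated:", average_matching_pairs(n, combos, trials=2000))
print("formula:  ", n * (n - 1) / (2 * combos))
```

The simulated average should land close to the formula's value, and it gets closer as the number of trials goes up.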

"I imagine the relationship is not a straight line but a curve that accelerate as the number of drawings increases."

n * (n-1) is the same as n^2 - n. The number of pairs is a function of the square of the number of drawings, so it grows quadratically: doubling the number of drawings roughly quadruples the number of pairs. After 100 drawings there will be 4,950 pairs. After 200 drawings there will be 19,900 pairs, or a bit more than 4 times as many. Double it again, to 400 drawings, and there will be 79,800 pairs, which is a bit more than 16 times as many as after 100 drawings. That means that the chances that there will be a repeat somewhere in the drawing history follow an accelerating curve (until they level off near 100%). The chances that a particular drawing or a particular set of drawings (such as the next 87) will produce a repeat are a different matter, and are a simple function of the percentage of combinations that have already been drawn. When the percentage of combinations that have already been drawn is very low the linear relationship that Murgatroyd describes will be true. As the percentage of combinations that have been drawn increases and repeats start occurring, the relationship will depart further and further from a straight line. At the extreme, once every combination in a small game such as Pick 3 (P3) has been drawn, the chance that the next drawing is a repeat stays at 100% even if the number of drawings increases by a factor of a million.
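
Here is a rough sketch of both points: how the pair count grows, and how the chance that the next drawing repeats an earlier one drifts away from the straight-line estimate. It assumes independent, equally likely drawings and a Pick 3 game with 1,000 straight combinations; the function names are made up for illustration.

```python
def pairs(n):
    """Number of pairs among n drawings."""
    return n * (n - 1) // 2

for n in (100, 200, 400):
    print(n, pairs(n), round(pairs(n) / pairs(100), 2))

def chance_next_repeats(n_previous, combos):
    """Chance the next drawing matches at least one of n_previous earlier drawings."""
    return 1 - (1 - 1 / combos) ** n_previous

def linear_estimate(n_previous, combos):
    """Straight-line estimate; only sensible while n_previous is far below combos."""
    return n_previous / combos

for label, combos, n in (("5/39", 575_757, 3700), ("Pick 3", 1_000, 5_000)):
    print(label, chance_next_repeats(n, combos), linear_estimate(n, combos))
```

For the 5/39 game the two numbers are nearly identical. For Pick 3 after 5,000 drawings the straight-line estimate is a meaningless 5.0, while the actual chance sits just under 100%.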

NY State United States Member #92609 June 10, 2010 3702 Posts Offline

Posted: April 17, 2015, 9:34 pm - IP Logged

Quote: Originally posted by KY Floyd on April 17, 2015

Just to expand on what's been said, in case it's not clear.

"you have to consider the number of pairs of people, not just the number of people."

A group of 23 people can make 253 possible pairs. A can be paired with B, C, D and so on, for 22 pairs. B can be paired with C, D, E and so on, for 21 pairs. The process repeats until V can be paired with W, making only one pair. That makes the total 22+21+20 ... +2 +1 = 253. Any particular person in a group of 23 has only 22 chances to find another person in the group with the same birthday as them, but across all of the possible pairs there are 253 chances that some two people will share a birthday. With an average of about 365.25 days per year, the chances of a given pair sharing a birthday are about 1 in 365.25, so the chances of that pair not sharing a birthday are 364.25 in 365.25. Treating the pairs as independent (a close approximation), the chance that none of the 253 pairs will share a birthday is 364.25/365.25 raised to the 253rd power, or 49.98%, leaving a 50.02% chance that there will be a pair with the same birthday.

It's exactly the same with the ever-increasing set of winning combinations. If there have been 3700 combinations drawn the first combination can match any of the next 3699. The 2nd can match any of the next 3698, and so on. You can add that up the long way: 3699+3698+3697 ... +3 +2 +1 to see that there are 6,843,150 pairs. You can also calculate it more easily as (3700*3699)/2 = 6,843,150. Making that generic, we get (n * (n-1))/2 = number of pairs for n drawings.

"I'm curious as to the origin of the formula [n*(n-1)/1,151,514] that calculates the expected total number of duplicates"

Notice that the formula starts out the same as the formula for calculating the number of pairs: n * (n-1). The divisor, 1,151,514, is 2 times 575,757, the number of possible combinations. So the formula is the number of pairs, n * (n-1) / 2, divided by the number of combinations: each pair of drawings has a 1 in 575,757 chance of matching, so the number of repeats expected as a result of simple probability is (number of pairs / number of combinations). In general, the higher the number of drawings the more likely it is that the actual number of repeats will closely match the expected number in percentage terms.

"I imagine the relationship is not a straight line but a curve that accelerate as the number of drawings increases."

n * (n-1) is the same as n^2 - n. The number of pairs is a function of the square of the number of drawings, so it grows quadratically: doubling the number of drawings roughly quadruples the number of pairs. After 100 drawings there will be 4,950 pairs. After 200 drawings there will be 19,900 pairs, or a bit more than 4 times as many. Double it again, to 400 drawings, and there will be 79,800 pairs, which is a bit more than 16 times as many as after 100 drawings. That means that the chances that there will be a repeat somewhere in the drawing history follow an accelerating curve (until they level off near 100%). The chances that a particular drawing or a particular set of drawings (such as the next 87) will produce a repeat are a different matter, and are a simple function of the percentage of combinations that have already been drawn. When the percentage of combinations that have already been drawn is very low the linear relationship that Murgatroyd describes will be true. As the percentage of combinations that have been drawn increases and repeats start occurring, the relationship will depart further and further from a straight line. At the extreme, once every combination in a small game such as Pick 3 (P3) has been drawn, the chance that the next drawing is a repeat stays at 100% even if the number of drawings increases by a factor of a million.

WOW!! Thanks! (I think)

I guess I was right when I suspected that it wouldn't be an easy task to dumb it down for a guy like me. ;-) G5
