All times shown are Eastern Time (GMT-5:00)
Standard deviation of lotto sets
S.Windsor, CT United States Member #4580 May 4, 2004 119 Posts Offline  Posted: December 22, 2004, 7:33 am  IP Logged  
For each lotto game one can add the numbers in a set and get a sum, and these sums will form a nearly perfect normal curve if all combinations are used. But each set also has a mean and a std.dev., and the std.dev. values will not form a normal curve because there will be more of the lowest value than of the highest. E.g. in the 5/53 lotto part of Powerball the lowest std.dev. will be 1.58 for the set 1-2-3-4-5 and all other sets of five consecutive numbers, of which there must be 49, while the highest value must be 27.67 for the sets 1-2-3-52-53 and 1-2-51-52-53. The mean of these two extremes will be 14.63. But a test of 205 sets from 10/12/02 yielded 14.87, while a computer simulation of 10,000 draws gave 14.77. Does anybody know a formula for predicting the mean std.dev. for any lotto?
This mean itself must have a variance, which could be used to set a filter at 95%. If we divide the range 27.67 - 1.58 into six units we get 4.35 as a std.dev. for the mean, and if we divide the spread into eight units we get 3.26. The average of these two is 3.8. It is interesting to note that for the 205 sets it was 3.78. Thus if we use the value 3.8 we could use the range 14.8 +/- 7.6 for 95%. Any comments? Bertil   
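The figures in this post can be checked with a short simulation. The sketch below is not from the thread: it assumes Python's standard library, the sample (n - 1) form of the standard deviation, and an arbitrary fixed seed, and the function name `simulate_set_sds` is made up for illustration.

```python
import random
import statistics

def simulate_set_sds(pool=53, picks=5, draws=10_000, seed=1):
    """Return the sample standard deviation of each simulated draw."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    numbers = range(1, pool + 1)
    return [statistics.stdev(rng.sample(numbers, picks)) for _ in range(draws)]

# Extremes quoted in the post
lowest = statistics.stdev([1, 2, 3, 4, 5])      # any five consecutive numbers
highest = statistics.stdev([1, 2, 3, 52, 53])   # same value for 1-2-51-52-53

sds = simulate_set_sds()
mean_sd = statistics.mean(sds)     # average std.dev. over all simulated draws
spread_sd = statistics.stdev(sds)  # how much the std.dev. varies between draws

print(f"lowest {lowest:.2f}, highest {highest:.2f}")
print(f"mean std.dev. {mean_sd:.2f}, spread {spread_sd:.2f}")
```

The printed mean and spread can be compared with the 14.77 and 3.8 quoted above; a different seed or draw count will shift them slightly.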
Canada Member #6394 August 21, 2004 97 Posts Offline  Posted: December 24, 2004, 5:18 pm  IP Logged  
How did you get 1.58 for the 1-2-3-4-5 set?   
I am The Avatar... SC United States Member #4355 April 15, 2004 345 Posts Online  Posted: December 24, 2004, 5:32 pm  IP Logged  
Bertil, There is no formula that I know of to "predict" the mean of the standard deviation. One can try to "bootstrap" the mean of the standard deviation. You seem to be quite adept at simulating random sets. Bootstrapping allows you to simulate the distribution of the standard deviation from which you can "estimate" the mean standard deviation. Happy bootstrapping. 'chaser   
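A minimal sketch of the bootstrap idea, assuming Python's standard library. The real draw history is not available here, so the 205 sets are simulated stand-ins, and the function name `bootstrap_mean_sd`, the seeds, and the resample count are all illustrative choices rather than anything from the thread.

```python
import random
import statistics

def bootstrap_mean_sd(set_sds, resamples=2000, seed=7):
    """Bootstrap the mean of per-draw standard deviations.

    Returns the observed mean and a 95% percentile interval.
    """
    rng = random.Random(seed)
    n = len(set_sds)
    # Resample the observed values with replacement and record each mean
    means = sorted(
        statistics.fmean(rng.choices(set_sds, k=n)) for _ in range(resamples)
    )
    lower = means[int(0.025 * resamples)]
    upper = means[int(0.975 * resamples) - 1]
    return statistics.fmean(set_sds), lower, upper

# Stand-in data: 205 simulated 5/53 draws (NOT the real draw history)
data_rng = random.Random(42)
history = [statistics.stdev(data_rng.sample(range(1, 54), 5)) for _ in range(205)]

observed, lower, upper = bootstrap_mean_sd(history)
print(f"mean std.dev. {observed:.2f}, 95% interval {lower:.2f} to {upper:.2f}")
```

With real results, `history` would hold the standard deviation of each actual draw instead of simulated ones.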
Canada Member #2192 August 29, 2003 27 Posts Offline  Posted: December 24, 2004, 9:15 pm  IP Logged  
Go here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda3662.htm or search for Uniform Distribution. For a 6/49 game, Standard Deviation = SQRT((B-A)**2/12), where B=49 and A=1.   
S.Windsor, CT United States Member #4580 May 4, 2004 119 Posts Offline  Posted: December 25, 2004, 9:07 am  IP Logged  
Quote: Originally posted by Fenix on December 24, 2004
How did you get 1.58 for the 1-2-3-4-5 set?
I got the numbers from my handheld calculator. If you calculate the s.d. by hand you will get sqrt(10/4) = 1.58 treating the set as a sample, but sqrt(10/5) = 1.41 treating it as a population. Here we are dealing with samples. Bertil   
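The two divisors can be seen side by side in a few lines. This is only a sketch using Python's `statistics` module, not the calculator method described above.

```python
import statistics

numbers = [1, 2, 3, 4, 5]

sample_sd = statistics.stdev(numbers)       # divides by n - 1: sqrt(10/4)
population_sd = statistics.pstdev(numbers)  # divides by n:     sqrt(10/5)

print(f"sample: {sample_sd:.2f}, population: {population_sd:.2f}")
```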
S.Windsor, CT United States Member #4580 May 4, 2004 119 Posts Offline  Posted: December 25, 2004, 9:14 am  IP Logged  
Quote: Originally posted by Nick Koutras on December 24, 2004
Go here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda3662.htm or search for Uniform Distribution. For a 6/49 game, Standard Deviation = SQRT((B-A)**2/12), where B=49 and A=1.
In his book H.Schneider determined the s.d. for 52 draws of the Florida and the UK 6/49 game and got 14.1. This value does not agree with your formula, which confuses me. I'm familiar with the formula SQRT((49^2-1)/12) for a single integer draw from 49, but not with yours. Please clarify. Bertil
  
United States Member #9059 November 26, 2004 128 Posts Offline  Posted: December 25, 2004, 1:55 pm  IP Logged  
Hi,
13.85641 is very close to 14.1; the difference is common probability vs actual statistics.
Regards   
S.Windsor, CT United States Member #4580 May 4, 2004 119 Posts Offline  Posted: December 25, 2004, 6:00 pm  IP Logged  
Quote: Originally posted by Hyperdimension on December 25, 2004
Hi,
13.85641 is very close to 14.1; the difference is common probability vs actual statistics.
Regards
Hi Hype, The value 13.856 must come from SQRT(192), which comes from (49-1)^2/12. But this formula applies to a continuous uniform distribution. Here we are dealing with a discrete uniform distribution, for which the variance is (N^2-1)/12, and SQRT((49^2-1)/12) yields 14.14. But that value refers to a single integer sample, not to six. When sampling without replacement we must include the finite population correction factor SQRT((N-n)/(N-1)), or else there would be no difference between taking 5 or 6 samples. We can now predict the spread of the sum of the samples: in the 6/49 game the std.dev. is 32.8 (variance 1075) about the mean sum of 150. But we are still short a formula for the variance of the std.dev. I suspect we must settle for an approximation. Bertil
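The three numbers in this reply (13.856, 14.14 and 32.8) can be reproduced as follows. This is a sketch in Python; the function names `single_draw_sd` and `sum_sd` are made up for illustration.

```python
import math

def single_draw_sd(pool):
    """Std.dev. of one number drawn uniformly from 1..pool (discrete uniform)."""
    return math.sqrt((pool**2 - 1) / 12)

def sum_sd(pool, picks):
    """Std.dev. of the sum of `picks` numbers drawn without replacement."""
    fpc = math.sqrt((pool - picks) / (pool - 1))  # finite population correction
    return math.sqrt(picks) * single_draw_sd(pool) * fpc

pool, picks = 49, 6
mean_sum = picks * (pool + 1) / 2  # six numbers, each averaging 25

print(f"continuous formula: {math.sqrt((pool - 1)**2 / 12):.3f}")
print(f"discrete formula:   {single_draw_sd(pool):.2f}")
print(f"sum: mean {mean_sum:.0f}, std.dev. {sum_sd(pool, picks):.1f}")
```

Calling `sum_sd(49, 5)` gives a different value from `sum_sd(49, 6)`, so the correction factor does distinguish between taking 5 or 6 samples.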
  
United States Member #9059 November 26, 2004 128 Posts Offline  Posted: December 26, 2004, 1:14 am  IP Logged  
Hi,
Ion Saliu has programs called FORMULA.exe and Superformula; both calculate the standard deviation for an event of probability p in N binomial events.
I'll use Superformula for the next example.
The program takes p as a fraction of two values, 6 in 49 in this case:
1st element of the fraction p = 6
2nd element of the fraction p = 49
Enter the number of trials, N = 2000
Results:
The standard deviation for an event of probability
p = .12244898
in 2000 binomial experiments is:
BSD = 14.66
The expected (theoretical) number of successes is: 245
Based on the Normal Probability Rule:
* 68.2% of the successes will fall within 1 Standard Deviation
from 245 - i.e., between 230 - 260
** 95.4% of the successes will fall within 2 Standard Deviations
from 245 - i.e., between 215 - 275
*** 99.7% of the successes will fall within 3 Standard Deviations
from 245 - i.e., between 200 - 290
Regards
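The quoted output matches the ordinary binomial formula SQRT(N p (1-p)). The sketch below reproduces it in Python; it is not Superformula's code, and the function name `binomial_sd` is made up. One reading, offered as an assumption, is that p = 6/49 is the chance that one particular number appears in a given 6/49 draw.

```python
import math

def binomial_sd(p, trials):
    """Standard deviation of the number of successes in `trials` binomial trials."""
    return math.sqrt(trials * p * (1 - p))

p = 6 / 49      # assumed meaning: chance a given number appears in a draw
trials = 2000
expected = trials * p
bsd = binomial_sd(p, trials)

print(f"p = {p:.8f}, expected successes = {expected:.0f}, BSD = {bsd:.2f}")
for k in (1, 2, 3):
    print(f"within {k} SD: {expected - k * bsd:.0f} - {expected + k * bsd:.0f}")
```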
  
S.Windsor, CT United States Member #4580 May 4, 2004 119 Posts Offline  Posted: December 26, 2004, 8:24 am  IP Logged  
Hi again, I respect anybody who can make sense out of Saliu's writings, because I'm unable to decide if he is a genius or a crackpot. The idea of treating lotto draws as binomial events strikes me as unsound. They are hypergeometric events and thus we need the finite population correction factor. If we look at each draw we need to focus on one parameter to describe it, and sum or mean would seem to be reasonable. In a 6/49 game there are 14 million combinations and their sums or means form a perfect normal curve, not a binomial one. The number 14.77 caught my attention but I'm unable to understand what 245 means, nor do I see what the ranges represent. Can you please clarify? Bertil
  
United States Member #9059 November 26, 2004 128 Posts Offline  Posted: December 26, 2004, 4:41 pm  IP Logged  
Hi,
The right person to answer your question is Mr. Ion Saliu.
I found an interesting article about the binomial distribution, with the following example:
Poisson Distribution
In extreme cases, very small p so that the standard deviation is not much less than the mean, the Gaussian Distribution is not appropriate, but a different approximation is: the Poisson Distribution. Going back to the Binomial Distribution (which is still exact), we only need to worry about values of n much smaller than N.
In the Indiana Lottery, people choose 6 numbers from 1-49. There are
49!/(43! 6!) = 13983816
combinations; more than one person can buy a ticket with the same number. Suppose 28 million tickets are sold in a given week, what is the probability for zero winners? one winner? etc...
One number is selected, we have to assume that all numbers are purchased with equal likelihood, so that on any given purchase the probability that it will be the winner is about 1 in 14 million (that is p=1/14000000). We have:
a = N p = 28000000 (1/14000000) = 2
P_a(n) ≅ e^(-a) a^n / n!
P_2(0) ≅ e^(-2) 2^0/0! = e^(-2) = 0.1353
P_2(1) ≅ e^(-2) 2^1/1! = 2 e^(-2) = 0.2707
P_2(2) ≅ e^(-2) 2^2/2! = 2 e^(-2) = 0.2707
P_2(3) ≅ e^(-2) 2^3/3! = (4/3) e^(-2) = 0.1804
P_2(4) ≅ 0.0902
P_2(5) ≅ 0.0361
P_2(6) ≅ 0.0120
P_2(7) ≅ 0.0034
P_2(8) ≅ 0.0009
P_2(9) ≅ 0.0002
P_2(10) ≅ 0.00004
We see that it is most likely that there are 1 or 2 winners, but 0, 3, and 4 would not be surprising. The probability for more than 10 winners is about 0.000008 (be very suspicious if this occurs!).
What is the probability to sell 50 million tickets without a winner? The expected number of winners is
a = N p = (50000000)(1/14000000) = 3.57
The chance to get zero is:
P_3.57(0) ≅ e^(-3.57) 3.57^0/0! = e^(-3.57) = 0.028
That is, about 1/35 (not completely unlikely).
What is the probability to sell 100 million tickets without a winner? The expected number of winners is
a = N p = (10^8)(1/14000000) = 7.14
The chance to get zero is:
P_7.14(0) ≅ e^(-7.14) 7.14^0/0! = e^(-7.14) = 0.00079
We don't expect to see this very often (1/1265).
Sounds interesting   
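The Poisson figures in the quoted article can be recomputed directly. This is a sketch in Python; the function name `poisson_pmf` is made up, and p uses the article's rounded 1 in 14 million rather than the exact count of combinations.

```python
import math

def poisson_pmf(n, a):
    """Poisson probability of exactly n winners when a winners are expected."""
    return math.exp(-a) * a**n / math.factorial(n)

combinations = math.comb(49, 6)  # 6 numbers from 1-49
p = 1 / 14_000_000               # rounded odds used in the quoted article

a = 28_000_000 * p               # expected winners for 28 million tickets
table = [poisson_pmf(n, a) for n in range(11)]
more_than_ten = 1 - sum(table)

for n, prob in enumerate(table):
    print(f"P({n} winners) = {prob:.5f}")
print(f"P(more than 10 winners) = {more_than_ten:.6f}")

# Chance of no winner at all for larger ticket counts
for tickets in (50_000_000, 100_000_000):
    print(f"{tickets:,} tickets: P(0 winners) = {poisson_pmf(0, tickets * p):.5f}")
```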
S.Windsor, CT United States Member #4580 May 4, 2004 119 Posts Offline  Posted: December 26, 2004, 9:39 pm  IP Logged  
Hi, your comment is unrelated to the problem we were trying to solve. So in the same spirit let me mention a few other facts about the 6/49 game. If 14 million tickets are sold before the draw, about 36.8% of the combinations are likely to remain unsold; with 21 million sold, 22.3% remain unsold; with 42 million sold, 5%; with 56 million sold, 1.83%; and with 70 million sold, 0.674%. All of these calculations are theoretical and have little bearing on reality. Bertil
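These percentages follow if every ticket is an independent, uniformly random pick. The sketch below makes that assumption explicit and also reads "14 million" as one ticket per combination (13,983,816), which is an interpretation rather than something stated in the post. The function name `unsold_fraction` is made up.

```python
import math

COMBINATIONS = math.comb(49, 6)  # 13,983,816 possible tickets

def unsold_fraction(tickets, combos=COMBINATIONS):
    """Expected share of combinations that no ticket covers."""
    return (1 - 1 / combos) ** tickets

# Ticket counts as multiples of the number of combinations
for multiple in (1, 1.5, 3, 4, 5):
    tickets = round(multiple * COMBINATIONS)
    print(f"{tickets:>11,} tickets: {unsold_fraction(tickets):.3%} unsold")
```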
  
CA United States Member #2987 December 10, 2003 832 Posts Offline  Posted: December 27, 2004, 12:55 am  IP Logged  
Bertil - hyperdimension's math is based on probability, not possibility. If a lottery is completely random in its generation of tickets, and if there are 13,983,816 possibilities and the same number of tickets are sold, it is almost a mathematical certainty that the tickets sold will not cover all the possibilities in the lottery. The odds against that happening far exceed the odds against winning.
When the first ticket is sold, it is unique - there is no other ticket currently in that lottery with that set of numbers. The second ticket, therefore, has 1 chance in 13,983,816 of duplicating the first ticket. The third ticket sold has 2 chances in 13,983,816 of duplicating either of the first two tickets. And so would go the progression. And, given this, you would think that around the point when 50% of the tickets were sold, the odds would have it that the ticket you bought would have an even chance of duplicating any of the previously sold tickets. Not so - that would only apply if none of the previously sold tickets had already duplicated another ticket.
Probability is computed in fractions or decimals that are added progressively - the second ticket in this given lottery has a 1/13,983,816 chance of being an exact duplicate of the first ticket. The third ticket has a 1/13,983,816 chance of being an exact duplicate of the first ticket plus a 1/13,983,816 chance of being an exact duplicate of the second ticket. And so it goes from that point. Somewhere between the 30% to 40% total sales point the fractions or decimals add up to 1/2 or 0.5 - at that point if a number hasn't already been duplicated the odds are 1::1 that that ticket will duplicate one of the previous tickets.
Given this, if a number generator is truly random, some sets of numbers may appear six or seven times and more in a lottery of this size, and other sets will not appear at all. Hope this helps to clear this up. gl john
Blessed Saint Leibowitz, keep 'em dreamin' down there..... 
Next week's convention for Psychics and Prognosticators has been cancelled due to unforeseen circumstances. =^.^=   
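A related quantity can be computed exactly: the chance that at least two tickets match somewhere among the first k tickets sold, when every ticket is an independent random pick. The sketch below is an illustration of that one calculation only, not a restatement of the argument above; the function name `tickets_until_duplicate` is made up.

```python
import math

COMBINATIONS = math.comb(49, 6)  # 13,983,816 possible tickets

def tickets_until_duplicate(target=0.5, combos=COMBINATIONS):
    """Smallest number of random tickets for which the chance that
    at least two of them match reaches `target`."""
    log_no_dup = 0.0  # log of P(all tickets sold so far are different)
    sold = 1
    while 1 - math.exp(log_no_dup) < target:
        # The next ticket must miss every one of the `sold` distinct tickets
        log_no_dup += math.log1p(-sold / combos)
        sold += 1
    return sold

even_chance = tickets_until_duplicate()
print(f"{even_chance:,} tickets give an even chance of at least one duplicate")
```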
United States Member #9059 November 26, 2004 128 Posts Offline  Posted: December 27, 2004, 1:51 am  IP Logged  
Hi,
Thank you for the explanation johnph77,
Continuing with the problem, first I created the full 6/49 wheel, 13,983,816 tickets in total, with the following results:
       Valid N    Mean      Minimum  Maximum  Std.Dev.  Variance
1var   13983816    7.14286  1        44       5.736564  32.90817
2var   13983816   14.28571  2        45       7.405872  54.84694
3var   13983816   21.42857  3        46       8.112726  65.81633
4var   13983816   28.57143  4        47       8.112726  65.81633
5var   13983816   35.71429  5        48       7.405872  54.84694
6var   13983816   42.85714  6        49       5.736564  32.90817
Then my computer crashed.   
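The same per-position figures can be obtained without generating all 13,983,816 tickets, by counting how many combinations put each value in each sorted position. This is a sketch in Python under the assumption that 1var to 6var are the six numbers of a ticket in ascending order; the function name `position_stats` is made up.

```python
import math

def position_stats(position, pool=49, picks=6):
    """Mean and variance of the number in a given sorted position.

    P(value = k) = C(k-1, position-1) * C(pool-k, picks-position) / C(pool, picks)
    """
    total = math.comb(pool, picks)
    mean = 0.0
    second_moment = 0.0
    for k in range(1, pool + 1):
        # Choose the smaller numbers below k and the larger numbers above k
        ways = math.comb(k - 1, position - 1) * math.comb(pool - k, picks - position)
        prob = ways / total
        mean += k * prob
        second_moment += k * k * prob
    return mean, second_moment - mean**2

for position in range(1, 7):
    mean, variance = position_stats(position)
    print(f"{position}var  mean {mean:.5f}  std.dev. {math.sqrt(variance):.6f}  "
          f"variance {variance:.5f}")
```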
CA United States Member #2987 December 10, 2003 832 Posts Offline  Posted: December 27, 2004, 2:39 am  IP Logged  
This is starting to bug me for some reason. I'm going to try this one more time as I feel my previous explanation was inadequate. I have a random number generator generating tickets for the lottery example given above. Defying all the odds in the known universe, the RNG has given me 13,983,815 different sets of numbers in the same number of draws. What are the odds of drawing that last possibility in the 13,983,816th draw? You got it - the same odds as predicting any given set of numbers in the lottery - 1::13,983,816. That means that the random number generator has 13,983,815 chances of duplicating one of the previously drawn sets of numbers. That's the other end of determining probability. gl john   
