Standard deviation of lotto sets

Bertil

S.Windsor, CT
United States
Member #4,580
May 4, 2004
119 Posts
Offline

Dec 22, 2004, 7:33 am

For each lotto game one can add the numbers in a set and get a sum,
which will form a nearly perfect normal curve if all combinations are used.
But each set also has a mean and a std.dev., and the std.dev. will not form a
normal curve because there will be more of the lowest value than of the highest.
E.g. in the 5/53 lotto part of Powerball the lowest std.dev. will be 1.58
for the set 1-2-3-4-5 and all other sets of five consecutive numbers, of
which there are 49, while the highest value must be 27.67 for the
sets 1-2-3-52-53 and 1-2-51-52-53. The mean of these two extremes is
14.63. But a test of 205 sets from 10/12/02 yielded 14.87, while a
computer simulation of 10,000 draws gave 14.77.
Does anybody know a formula for predicting the mean std.dev. for any lotto?
The values must themselves vary around this mean, and that variance could be used to set a filter at 95%.
If we divide the range 27.67-1.58 into six units we get 4.35 as a std.dev. around the mean,
and if we divide the spread into eight units we get 3.26. The average of these two is 3.8.
It is interesting to note that for the 205 sets it was 3.78. Thus if we use the value 3.8
we could use the range 14.8 +/- 7.6 for 95%. Any comments?
Bertil
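
The figures in this post can be checked with a short script. This is only a sketch using Python's standard library; the function names, the seed and the choice of 10,000 simulated draws are assumptions made for illustration, and the simulated mean will vary slightly with the seed.

```python
import random
import statistics


def count_consecutive_sets(pool=53, picks=5):
    """Number of sets made of `picks` consecutive numbers in 1..pool."""
    return pool - picks + 1


def simulate_set_stdevs(pool=53, picks=5, draws=10_000, seed=1):
    """Sample std.dev. of each of `draws` random sets drawn without replacement."""
    rng = random.Random(seed)
    balls = range(1, pool + 1)
    return [statistics.stdev(rng.sample(balls, picks)) for _ in range(draws)]


# Extremes quoted in the post (sample std.dev., divisor n - 1).
lowest = statistics.stdev([1, 2, 3, 4, 5])
highest = statistics.stdev([1, 2, 3, 52, 53])

values = simulate_set_stdevs()
mean_sd = statistics.mean(values)      # estimate of the mean std.dev.
spread_sd = statistics.stdev(values)   # spread of the std.dev. around that mean

print(f"lowest={lowest:.2f} highest={highest:.2f}")
print(f"consecutive sets={count_consecutive_sets()}")
print(f"simulated mean={mean_sd:.2f} spread={spread_sd:.2f}")
```

The spread printed on the last line is the quantity the post approximates by dividing the range into six or eight units.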
Fenix

Canada
Member #6,394
August 21, 2004
97 Posts
Offline

Dec 24, 2004, 5:18 pm

How did you get 1.58 for the 1-2-3-4-5 set?
straightchaser

I am The Avatar...

SC
United States
Member #4,355
April 15, 2004
384 Posts
Online

Dec 24, 2004, 5:32 pm

Bertil,
There is no formula that I know of to "predict" the mean of the standard deviation. One can try to "bootstrap" the mean of the standard deviation. You seem to be quite adept at simulating random sets. Bootstrapping allows you to simulate the distribution of the standard deviation from which you can "estimate" the mean standard deviation. Happy bootstrapping.
'chaser
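
A rough sketch of the bootstrap idea, in Python. The 205 real draws are not given in the thread, so the `observed` list below is a simulated stand-in, and the helper name `bootstrap_mean`, the seeds and the 2,000 resamples are all assumptions for illustration.

```python
import random
import statistics


def bootstrap_mean(values, resamples=2000, seed=0):
    """Resample `values` with replacement and return the bootstrap
    estimate of the mean plus a 95% percentile interval."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(resamples)
    )
    low = means[int(0.025 * resamples)]
    high = means[int(0.975 * resamples) - 1]
    return statistics.fmean(means), (low, high)


# Stand-in for 205 real 5/53 draws: simulated, not actual lottery history.
rng = random.Random(42)
observed = [statistics.stdev(rng.sample(range(1, 54), 5)) for _ in range(205)]

estimate, (low, high) = bootstrap_mean(observed)
print(f"mean std.dev. ~ {estimate:.2f}, 95% interval {low:.2f} to {high:.2f}")
```

With real draw history, `observed` would simply be replaced by the per-draw std.dev. values.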
Nick Koutras

Canada
Member #2,192
August 29, 2003
27 Posts
Offline

Dec 24, 2004, 9:15 pm

Quote: Originally posted by Bertil on December 22, 2004

Does anybody know a formula for predicting the mean std.dev. for any lotto?


Go here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda3662.htm

or search for Uniform Distribution

For a 6/49 game Standard Deviation = SQRT((B-A)**2/12)

where B=49 A=1
Bertil

Dec 25, 2004, 9:07 am

Quote: Originally posted by Fenix on December 24, 2004

How did you get 1.58 for the 1-2-3-4-5 set?

I got the numbers from my hand-held calculator. If you calculate the s.d.
by hand you will get sqrt(10/4) = 1.58 as a sample but sqrt(10/5) = 1.41 as a
population. Here we are dealing with samples.
Bertil
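
The two divisors can be compared directly; a minimal sketch, assuming Python's `statistics` module, where `stdev` divides by n - 1 and `pstdev` divides by n:

```python
import statistics

numbers = [1, 2, 3, 4, 5]

# Sum of squared deviations from the mean (3) is 4 + 1 + 0 + 1 + 4 = 10.
sample_sd = statistics.stdev(numbers)       # sqrt(10 / 4)
population_sd = statistics.pstdev(numbers)  # sqrt(10 / 5)

print(f"sample: {sample_sd:.2f}, population: {population_sd:.2f}")
```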
Bertil

Dec 25, 2004, 9:14 am

Quote: Originally posted by Nick Koutras on December 24, 2004

Go here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda3662.htm

or search for Uniform Distribution

For a 6/49 game Standard Deviation = SQRT((B-A)**2/12)
where B=49 A=1

In his book H. Schneider determined the s.d. for 52 draws of the Florida
and the UK 6/49 games and got 14.1. This value does not agree with
your formula, which confuses me. I'm familiar with the formula
SQRT((49^2-1)/12) for a single integer draw from 49 but not with yours.
Please clarify.
Bertil
Hyperdimension

United States
Member #9,059
November 26, 2004
221 Posts
Offline

Dec 25, 2004, 1:55 pm

Hi,

13.85641 is very close to 14.1; the difference is common probability vs. actual statistics.

Regards

Consciousness order chaos.

http://thefootballpools.blogspot.com
Bertil

Dec 25, 2004, 6:00 pm

Quote: Originally posted by Hyperdimension on December 25, 2004

Hi,

13.85641 is very close to 14.1; the difference is common probability vs. actual statistics.

Regards

Hi Hype,
The value 13.856 must come from SQRT(192), which comes from (49-1)^2/12.
But this formula applies to a continuous uniform distribution. Here we are
dealing with a discrete uniform distribution, for which the variance is
(N^2-1)/12, so the std.dev. is SQRT((49^2-1)/12) = 14.14. But that value
refers to a single integer sample, not to six.
When sampling without replacement we must include the finite population
correction factor SQRT((N-n)/(N-1)), or else there would be no difference between
taking 5 or 6 samples. We can now predict the std.dev. of the sum of the samples. In the
6/49 game it is 32.8 around the mean sum of 150. But we are still short of a formula for
the variance of the std.dev. I suspect we must settle for an approximation.
Bertil
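
The formulas discussed here can be put side by side in a few lines of Python. This is a sketch only; the function names are made up for illustration, and `sum_sd` applies the finite population correction to the variance of the sum of n numbers drawn without replacement from 1..N.

```python
import math


def continuous_uniform_sd(a, b):
    """Std.dev. of a continuous uniform distribution on [a, b]."""
    return math.sqrt((b - a) ** 2 / 12)


def discrete_uniform_sd(n_balls):
    """Std.dev. of a single integer drawn uniformly from 1..n_balls."""
    return math.sqrt((n_balls ** 2 - 1) / 12)


def sum_mean(n_balls, picks):
    """Mean of the sum of `picks` numbers drawn from 1..n_balls."""
    return picks * (n_balls + 1) / 2


def sum_sd(n_balls, picks):
    """Std.dev. of that sum when drawing without replacement."""
    single_variance = (n_balls ** 2 - 1) / 12
    correction = (n_balls - picks) / (n_balls - 1)  # finite population correction (variance form)
    return math.sqrt(picks * single_variance * correction)


print(f"continuous 1..49: {continuous_uniform_sd(1, 49):.3f}")
print(f"discrete   1..49: {discrete_uniform_sd(49):.2f}")
print(f"6/49 sum: mean {sum_mean(49, 6):.0f}, std.dev. {sum_sd(49, 6):.1f}")
```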
Hyperdimension

Dec 26, 2004, 1:14 am

Hi,

Ion Saliu has programs called FORMULA.exe and Superformula; both calculate the standard deviation for an event of probability p in N binomial events.

I'll use Superformula for the next example.

The program takes p as a fraction of two values, 6 in 49 in this case:

1st element of the fraction p = 6
2nd element of the fraction p = 49
Enter the number of trials, N = 2000

Results:

The standard deviation for an event of probability
p = .12244898
in 2000 binomial experiments is:
BSD = 14.66

The expected (theoretical) number of successes is: 245

Based on the Normal Probability Rule:

* 68.2% of the successes will fall within 1 Standard Deviation
from 245 - i.e., between 230 - 260
** 95.4% of the successes will fall within 2 Standard Deviations
from 245 - i.e., between 215 - 275
*** 99.7% of the successes will fall within 3 Standard Deviations
from 245 - i.e., between 200 - 290

Regards
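
The output above matches the textbook binomial formulas, mean N*p and standard deviation sqrt(N*p*(1-p)). The sketch below reproduces the numbers in Python; it is not Saliu's program, and the function name `binomial_summary` is an assumption for illustration.

```python
import math


def binomial_summary(p_numerator, p_denominator, trials):
    """Return (p, expected successes, standard deviation) for a binomial count."""
    p = p_numerator / p_denominator
    expected = trials * p
    sd = math.sqrt(trials * p * (1 - p))
    return p, expected, sd


p, expected, sd = binomial_summary(6, 49, 2000)
print(f"p = {p:.8f}, expected = {expected:.0f}, BSD = {sd:.2f}")

# Ranges of 1, 2 and 3 standard deviations around the expected count.
for k in (1, 2, 3):
    print(f"{k} SD: {expected - k * sd:.0f} - {expected + k * sd:.0f}")
```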

Bertil

Dec 26, 2004, 8:24 am

Quote: Originally posted by Hyperdimension on December 26, 2004


Hi again,
I respect anybody who can make sense out of Saliu's writings, because I'm
unable to decide if he is a genius or a crackpot.
The idea of treating lotto draws as binomial events strikes me as unsound.
They are hypergeometric events and thus we need the finite population
correction factor. If we look at each draw we need to focus on one parameter
to describe it, and the sum or the mean would seem to be reasonable. In a 6/49
game there are 14 million combinations and their sums or means form a nearly
perfect normal curve, not a binomial one.
The number 14.66 caught my attention, but I'm unable to understand what 245
means, nor do I see what the ranges represent. Can you please clarify?
Bertil
Hyperdimension

Dec 26, 2004, 4:41 pm

Hi,

The right person to answer your question is Mr. Ion Saliu.

I found an interesting article about the binomial distribution, with the following example:

Poisson Distribution

In extreme cases, when p is so small that the standard deviation is not much less than the mean, the Gaussian distribution is not appropriate, but a different approximation is: the Poisson distribution. Going back to the binomial distribution (which is still exact), we only need to worry about values of n much smaller than N.

In the Indiana Lottery, people choose 6 numbers from 1-49. There are

49!/(43! 6!) = 13983816

combinations; more than one person can buy a ticket with the same combination. Suppose 28 million tickets are sold in a given week. What is the probability of zero winners? One winner? Etc.

One combination is selected. We have to assume that all combinations are purchased with equal likelihood, so that on any given purchase the probability that it will be the winner is about 1 in 14 million (that is, p = 1/14000000). We have:

a = N p = 28000000 (1/14000000) = 2

P_a(n) ≅ e^(-a) a^n / n!

P_2(0) ≅ e^(-2) 2^0/0! = e^(-2) = 0.1353
P_2(1) ≅ e^(-2) 2^1/1! = 2 e^(-2) = 0.2707
P_2(2) ≅ e^(-2) 2^2/2! = 2 e^(-2) = 0.2707
P_2(3) ≅ e^(-2) 2^3/3! = (4/3) e^(-2) = 0.1804
P_2(4) ≅ 0.0902
P_2(5) ≅ 0.0361
P_2(6) ≅ 0.0120
P_2(7) ≅ 0.0034
P_2(8) ≅ 0.0009
P_2(9) ≅ 0.0002
P_2(10) ≅ 0.00004

We see that it is most likely that there are 1 or 2 winners, but 0, 3, and 4 would not be surprising. The probability of more than 10 winners is about 0.000008 (be very suspicious if this occurs!).

What is the probability of selling 50 million tickets without a winner? The expected number of winners is

a = N p = (50000000)(1/14000000) = 3.57

The chance of getting zero is:

P_3.57(0) ≅ e^(-3.57) 3.57^0/0! = e^(-3.57) = 0.028

That is, about 1/35 (not completely unlikely).

What is the probability of selling 100 million tickets without a winner? The expected number of winners is

a = N p = (10^8)(1/14000000) = 7.14

The chance of getting zero is:

P_7.14(0) ≅ e^(-7.14) 7.14^0/0! = e^(-7.14) = 0.00079

We don't expect to see this very often (1/1265).

Sounds interesting
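
The Poisson figures in the quoted article can be recomputed with a few lines of Python. This is a sketch; the function names are illustrative, and like the article it rounds the number of combinations to 14 million when computing the expected number of winners.

```python
import math

COMBINATIONS = math.comb(49, 6)   # exact count of 6/49 combinations
ROUNDED = 14_000_000              # the rounded figure the article uses for p


def poisson_pmf(a, n):
    """Probability of exactly n winners when a winners are expected."""
    return math.exp(-a) * a ** n / math.factorial(n)


def expected_winners(tickets, combos=ROUNDED):
    return tickets / combos


a = expected_winners(28_000_000)
table = {n: poisson_pmf(a, n) for n in range(11)}
more_than_ten = 1 - sum(table.values())

for n, prob in table.items():
    print(f"P({n} winners) = {prob:.5f}")
print(f"P(more than 10 winners) = {more_than_ten:.6f}")

for tickets in (50_000_000, 100_000_000):
    a = expected_winners(tickets)
    print(f"{tickets:,} tickets: P(no winner) = {poisson_pmf(a, 0):.5f}")
```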

Bertil

Dec 26, 2004, 9:39 pm

Quote: Originally posted by Hyperdimension on December 26, 2004


Hi, your comment is unrelated to the problem we were trying to solve. So in
the same spirit let me mention a few other facts about the 6/49 game.
If 14 million tickets are sold before the draw, about 36.8% of the combinations
are likely to remain unsold; with 21 million sold, 22.3% remain unsold; with
42 million sold, 5%; with 56 million sold, 1.83%; and with 70 million sold, 0.674%.
All of these calculations are theoretical and have little bearing on reality.
Bertil
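
These percentages follow from the chance that a given combination is missed by every randomly chosen ticket, (1 - 1/C)^T, which is very close to e^(-T/C). A minimal Python sketch, assuming the number of combinations is rounded to 14 million as the figures above imply; the function names are illustrative.

```python
import math

COMBOS = 14_000_000  # rounded from 13,983,816, as the quoted figures imply


def unsold_fraction_exact(tickets, combos=COMBOS):
    """(1 - 1/combos) ** tickets, computed stably through logarithms."""
    return math.exp(tickets * math.log1p(-1 / combos))


def unsold_fraction_approx(tickets, combos=COMBOS):
    """The exponential approximation e^(-tickets/combos)."""
    return math.exp(-tickets / combos)


for millions in (14, 21, 42, 56, 70):
    tickets = millions * 1_000_000
    pct = 100 * unsold_fraction_approx(tickets)
    print(f"{millions} million sold: {pct:.3g}% of combinations unsold")
```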
johnph77

CA
United States
Member #2,987
December 10, 2003
832 Posts
Offline

Dec 27, 2004, 12:55 am

Bertil -
hyperdimension's math is based on probability, not possibility. If a lottery is completely random in its generation of tickets, and if there are 13,983,816 possibilities and the same number of tickets are sold, it is almost a mathematical certainty that the tickets sold will not cover all the possibilities in the lottery. The odds against full coverage are far longer than the odds against winning.
When the first ticket is sold, it is unique - there is no other ticket currently in that lottery with that set of numbers. The second ticket, therefore, has 1 chance in 13,983,816 of duplicating the first ticket. The third ticket sold has 2 chances in 13,983,816 of duplicating either of the first two tickets. And so the progression goes. Given this, you would think that around the point when 50% of the tickets were sold, the ticket you bought would have an even chance of duplicating one of the previously sold tickets. Not so - that would only apply if none of the previously sold tickets had already duplicated another ticket.
Probability is computed in fractions or decimals that are added progressively - the second ticket in this lottery has a 1/13,983,816 chance of being an exact duplicate of the first ticket. The third ticket has a 1/13,983,816 chance of being an exact duplicate of the first ticket plus a 1/13,983,816 chance of being an exact duplicate of the second ticket. And so it goes from that point. The fractions add up quickly: after only about 4,400 tickets the odds are already 1::1 that at least two of the tickets sold are identical, while any single new ticket reaches 1::1 odds of duplicating one of the previous tickets only at about 9.7 million tickets sold (roughly 69% of 13,983,816).
Given this, if a number generator is truly random, some sets of numbers may appear six or seven times and more in a lottery of this size, and other sets will not appear at all.
Hope this helps to clear this up.
gl
john
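
The duplicate-ticket reasoning is a version of the birthday problem and can be checked numerically. The sketch below assumes every ticket is an independent, uniformly random combination; the function names are illustrative. It computes two different thresholds: when it becomes even odds that some pair of tickets already matches, and when a single new ticket has even odds of matching an earlier one.

```python
import math

COMBOS = math.comb(49, 6)  # 13,983,816


def tickets_until_any_duplicate_is_even_odds(combos=COMBOS):
    """Smallest number of tickets for which P(all tickets distinct) <= 0.5."""
    p_all_distinct = 1.0
    sold = 0
    while p_all_distinct > 0.5:
        # The next ticket must avoid the `sold` distinct tickets already out.
        p_all_distinct *= 1 - sold / combos
        sold += 1
    return sold


def tickets_until_new_ticket_is_even_odds(combos=COMBOS):
    """Smallest T with 1 - (1 - 1/combos)**T >= 0.5, i.e. about combos * ln 2."""
    return math.ceil(math.log(0.5) / math.log1p(-1 / combos))


any_duplicate = tickets_until_any_duplicate_is_even_odds()
new_ticket = tickets_until_new_ticket_is_even_odds()
print(f"even odds of some duplicate pair after {any_duplicate:,} tickets")
print(f"even odds that one new ticket is a duplicate after {new_ticket:,} tickets")
print(f"that is {100 * new_ticket / COMBOS:.0f}% of all combinations")
```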

Blessed Saint Leibowitz, keep 'em dreamin' down there.....

Next week's convention for Psychics and Prognosticators has been cancelled due to unforeseen circumstances.

=^.^=
Hyperdimension

Dec 27, 2004, 1:51 am

Hi,

Thank you for the explanation johnph77,

Continuing with the problem, first I created the full wheel 6/49, in total 13,983,816 tickets, obtaining the following results:

       Valid N    Mean      Minimum  Maximum  Std.Dev.  Variance
1var   13983816    7.14286        1       44  5.736564  32.90817
2var   13983816   14.28571        2       45  7.405872  54.84694
3var   13983816   21.42857        3       46  8.112726  65.81633
4var   13983816   28.57143        4       47  8.112726  65.81633
5var   13983816   35.71429        5       48  7.405872  54.84694
6var   13983816   42.85714        6       49  5.736564  32.90817

Then my computer crashed.
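
The same table can be obtained without building the full wheel. With the numbers of each ticket sorted, the k-th number equals x in exactly C(x-1, k-1) * C(49-x, 6-k) of the combinations, so the mean and variance of each position follow from a short sum. A Python sketch under that counting argument; the function name is an assumption for illustration, and the variance here divides by the number of combinations rather than by one less, which changes the result only in the last decimals.

```python
import math


def position_stats(k, pool=49, picks=6):
    """Mean, variance, minimum and maximum of the k-th smallest number
    over all combinations of `picks` numbers from 1..pool."""
    total = math.comb(pool, picks)
    lowest, highest = k, pool - (picks - k)
    first = second = 0
    for x in range(lowest, highest + 1):
        # Combinations whose k-th smallest number is exactly x.
        count = math.comb(x - 1, k - 1) * math.comb(pool - x, picks - k)
        first += x * count
        second += x * x * count
    mean = first / total
    variance = second / total - mean ** 2
    return mean, variance, lowest, highest


for k in range(1, 7):
    mean, variance, lowest, highest = position_stats(k)
    print(f"{k}var  mean={mean:.5f}  min={lowest}  max={highest}  "
          f"sd={math.sqrt(variance):.6f}  var={variance:.5f}")
```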

johnph77

Dec 27, 2004, 2:39 am

This is starting to bug me for some reason. I'm going to try this one more time as I feel my previous explanation was inadequate.
I have a random number generator generating tickets for the lottery example given above. Defying all the odds in the known universe, the RNG has given me 13,983,815 different sets of numbers in the same number of draws. What are the odds of drawing that last possibility in the 13,983,816th draw?
You got it - the same odds as predicting any given set of numbers in the lottery - 1::13,983,816. That means that the random number generator has 13,983,815 chances in 13,983,816 of duplicating one of the previously drawn sets of numbers. That's the other end of determining probability.
gl
john

