
Standard deviation of lotto sets

Topic closed. 17 replies. Last post 12 years ago by Nick Koutras.

S.Windsor, CT
United States
Member #4580
May 4, 2004
119 Posts
Offline
Posted: December 22, 2004, 7:33 am - IP Logged

 

      For each lotto game one can add the numbers in a set and get a sum, and these sums will form a nearly perfect normal curve if all combinations are used. But each set also has a mean and a std.dev., and the std.dev. values will not form a normal curve, because there will be more of the lowest value than of the highest.

      E.g. in the 5/53 lotto part of Powerball the lowest std.dev. will be 1.58 for the set 1-2-3-4-5 and for all other sets of five consecutive numbers, of which there must be 48, while the highest value must be 27.67 for the sets 1-2-3-52-53 and 1-2-51-52-53. The mean of these two extremes will be 14.63. But a test of 205 sets from 10/12/02 yielded 14.87, while a computer simulation of 10 000 draws gave 14.77.

      Does anybody know a formula for predicting the mean std.dev. for any lotto?

      This mean itself must have a variance, which could be used to set a filter at 95%. If we divide the range 27.67 - 1.58 = 26.09 into six units we get 4.35 as a std.dev. for the mean, and if we divide the spread into eight units we get 3.26. The average of these two is 3.8. It is interesting to note that for the 205 sets it was 3.78. Thus if we use the value 3.8 we could use the range 14.8 +/- 7.6 for 95%. Any comments?

      Bertil

    Avatar

    Canada
    Member #6394
    August 21, 2004
    97 Posts
    Offline
    Posted: December 24, 2004, 5:18 pm - IP Logged

    How did you get 1.58 for the 1-2-3-4-5 set?

      straightchaser's avatar - avatar
      I am The Avatar...
      SC
      United States
      Member #4355
      April 15, 2004
      345 Posts
      Online
      Posted: December 24, 2004, 5:32 pm - IP Logged

      Bertil,

      There is no formula that I know of to "predict" the mean of the standard deviation. One can try to "bootstrap" the mean of the standard deviation. You seem to be quite adept at simulating random  sets.  Bootstrapping allows you to simulate the distribution of the standard deviation from which you can "estimate" the mean standard deviation.  Happy bootstrapping.

      'chaser
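      A minimal sketch of the simulation idea, assuming a 5/53 game and the sample (n-1) standard deviation used in the first post. Strictly speaking this is a plain Monte Carlo simulation of random draws rather than a bootstrap of historical results, and the function name, seed, and draw count are illustrative choices, not anything taken from the thread.

```python
import random
import statistics

def mean_set_stdev(pool=53, picks=5, draws=10_000, seed=1):
    """Estimate the mean and the spread of the per-set sample std.dev.

    Each simulated draw picks `picks` distinct numbers from 1..pool,
    and the sample standard deviation (n - 1 divisor) is taken per set.
    """
    rng = random.Random(seed)
    values = [
        statistics.stdev(rng.sample(range(1, pool + 1), picks))
        for _ in range(draws)
    ]
    return statistics.mean(values), statistics.stdev(values)

# The two extremes quoted in the first post
lowest = statistics.stdev([1, 2, 3, 4, 5])
highest = statistics.stdev([1, 2, 3, 52, 53])

mean_sd, spread = mean_set_stdev()
print(f"lowest = {lowest:.2f}, highest = {highest:.2f}")
print(f"simulated mean std.dev. = {mean_sd:.2f}, spread = {spread:.2f}")
```

      Changing `pool` and `picks` gives the same estimate for other games; a larger `draws` value narrows the simulation error.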

        Avatar

        Canada
        Member #2192
        August 29, 2003
        27 Posts
        Offline
        Posted: December 24, 2004, 9:15 pm - IP Logged
        Go here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda3662.htm



        or search for Uniform Distribution



        For a 6/49 game Standard Deviation =      SQRT((B-A)**2/12)

        where B=49   A=1



          S.Windsor, CT
          United States
          Member #4580
          May 4, 2004
          119 Posts
          Offline
          Posted: December 25, 2004, 9:07 am - IP Logged
          Quote: Originally posted by Fenix on December 24, 2004


          How did you get 1.58 for the 1-2-3-4-5 set?



                    I got the numbers from my hand-held calculator. If you calculate the s.d. by hand you will get sqrt(10/4) = 1.58 as a sample but sqrt(10/5) = 1.41 as a population.

                    Here we are dealing with samples.

                    Bertil
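                    The sample versus population distinction above can be checked with Python's standard library; this is only an illustration of the two divisors, not the calculator procedure Bertil used.

```python
import statistics

numbers = [1, 2, 3, 4, 5]

# Sum of squared deviations from the mean (3) is 4 + 1 + 0 + 1 + 4 = 10
sample_sd = statistics.stdev(numbers)       # sqrt(10 / 4), divisor n - 1
population_sd = statistics.pstdev(numbers)  # sqrt(10 / 5), divisor n

print(round(sample_sd, 2), round(population_sd, 2))  # 1.58 1.41
```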

            S.Windsor, CT
            United States
            Member #4580
            May 4, 2004
            119 Posts
            Offline
            Posted: December 25, 2004, 9:14 am - IP Logged
            Quote: Originally posted by Nick Koutras on December 24, 2004



            Go here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda3662.htm

            or search for Uniform Distribution

            For a 6/49 game Standard Deviation =      SQRT((B-A)**2/12)
            where B=49   A=1



                    In his book, H. Schneider determined the s.d. for 52 draws of the Florida and the UK 6/49 games and got 14.1. This value does not agree with your formula, which confuses me. I'm familiar with the formula SQRT((49^2-1)/12) for a single integer drawn from 49, but not with yours.

                    Please clarify.

                    Bertil

              Hyperdimension's avatar - latest trace_171.gif

              United States
              Member #9059
              November 26, 2004
              128 Posts
              Offline
              Posted: December 25, 2004, 1:55 pm - IP Logged

              Hi,



              13.85641 is very close to 14.1; the difference is common probability vs. actual statistics.

              Regards

              Thought brings order to chaos..

              http://1x2quinielas.blogspot.com

                S.Windsor, CT
                United States
                Member #4580
                May 4, 2004
                119 Posts
                Offline
                Posted: December 25, 2004, 6:00 pm - IP Logged
                Quote: Originally posted by Hyperdimension on December 25, 2004


                Hi,

                13.85641 is very close to 14.1; the difference is common probability vs. actual statistics.


                Regards



                        Hi Hype,

                        The value 13.856 must come from SQRT(192), which comes from (49-1)^2/12. But this formula applies to a continuous uniform distribution. Here we are dealing with a discrete uniform distribution, for which the variance is (N^2-1)/12, and its square root yields 14.14. But that value refers to a single integer sample, not to six.

                        When sampling without replacement we must include the finite population correction factor SQRT((N-n)/(N-1)), or else there would be no difference between taking 5 or 6 samples. We can now predict the std.dev. of the sum of samples: in the 6/49 game it is 32.8 around the mean sum of 150. But we are still short a formula for the variance of the std.dev. I suspect we must settle for an approximation.

                      Bertil
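                      A short sketch of the calculation described above, assuming the standard discrete-uniform variance (N^2-1)/12 and the finite population correction (N-n)/(N-1) applied to the variance. The function name is hypothetical, and the small 3/10 game is used only because all of its combinations can be enumerated quickly as a cross-check.

```python
import itertools
import math
import statistics

def sum_stats(pool, picks):
    """Mean and std.dev. of the sum of `picks` numbers drawn
    without replacement from 1..pool."""
    single_var = (pool**2 - 1) / 12        # variance of one number from 1..pool
    fpc = (pool - picks) / (pool - 1)      # finite population correction (variance form)
    mean_sum = picks * (pool + 1) / 2
    sd_sum = math.sqrt(picks * single_var * fpc)
    return mean_sum, sd_sum

mean_649, sd_649 = sum_stats(49, 6)
print(f"6/49: mean sum = {mean_649:.0f}, std.dev. of sum = {sd_649:.1f}")

# Brute-force check on a small, hypothetical 3/10 game (120 combinations)
sums = [sum(c) for c in itertools.combinations(range(1, 11), 3)]
brute_mean = statistics.mean(sums)
brute_sd = statistics.pstdev(sums)
formula_mean, formula_sd = sum_stats(10, 3)
print(brute_mean, formula_mean, round(brute_sd, 4), round(formula_sd, 4))
```

                      Without the correction factor the 6/49 figure would be sqrt(6 * 200), which is about 34.6, so the correction is what separates sampling without replacement from six independent picks.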

                       

                       

                  Hyperdimension's avatar - latest trace_171.gif

                  United States
                  Member #9059
                  November 26, 2004
                  128 Posts
                  Offline
                  Posted: December 26, 2004, 1:14 am - IP Logged

                  Hi,



                  Ion Saliu has a program called FORMULA.exe and another called Superformula; both calculate the standard deviation for an event of probability p in N binomial events.

                  I'll use Superformula for the next example.

                  The program takes p as a fraction of 2 values, 6 in 49 in this case:

                  1st element of the fraction p = 6
                  2nd element of the fraction p = 49
                  Enter the number of trials, N = 2000

                  Results:

                  The standard deviation for an event of probability
                  p = .12244898
                  in 2000 binomial experiments is:
                                     BSD = 14.66

                  The expected (theoretical) number of successes is: 245

                  Based on the Normal Probability Rule:

                  * 68.2% of the successes will fall within 1 Standard Deviation
                    from 245 - i.e., between 230 - 260
                  * 95.4% of the successes will fall within 2 Standard Deviations
                    from 245 - i.e., between 215 - 275
                  * 99.7% of the successes will fall within 3 Standard Deviations
                    from 245 - i.e., between 200 - 290

                  Regards
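                  The figures in that output are consistent with the textbook binomial formulas, mean N*p and standard deviation sqrt(N*p*(1-p)). The sketch below reproduces them under that assumption; it is not Saliu's program, and `binomial_summary` is a hypothetical helper name.

```python
import math

def binomial_summary(numerator, denominator, trials):
    """Return p, the expected number of successes, and the binomial std.dev."""
    p = numerator / denominator
    expected = trials * p
    sd = math.sqrt(trials * p * (1 - p))
    return p, expected, sd

p, expected, sd = binomial_summary(6, 49, 2000)
print(f"p = {p:.8f}, expected = {expected:.0f}, BSD = {sd:.2f}")

# Ranges of 1, 2 and 3 standard deviations around the expected value
for k in (1, 2, 3):
    print(k, round(expected - k * sd), round(expected + k * sd))
```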




                    S.Windsor, CT
                    United States
                    Member #4580
                    May 4, 2004
                    119 Posts
                    Offline
                    Posted: December 26, 2004, 8:24 am - IP Logged

                            Hi again,

                            I respect anybody who can make sense out of Saliu's writings, because I'm unable to decide if he is a genius or a crackpot.

                            The idea of treating lotto draws as binomial events strikes me as unsound. They are hypergeometric events and thus we need the finite population correction factor. If we look at each draw we need to focus on one parameter to describe it, and the sum or the mean would seem to be reasonable. In a 6/49 game there are 14 million combinations, and their sums or means form a perfect normal curve, not a binomial one.

                            The number 14.77 caught my attention, but I'm unable to understand what 245 means, nor do I see what the ranges represent. Can you please clarify?

                            Bertil 

                      Hyperdimension's avatar - latest trace_171.gif

                      United States
                      Member #9059
                      November 26, 2004
                      128 Posts
                      Offline
                      Posted: December 26, 2004, 4:41 pm - IP Logged

                      Hi,



                      The right person to answer your question is Mr. Ion Saliu,



                      I found an interesting article about the binomial distribution, with the following example:



                      Poisson Distribution



                      In extreme cases, very small p so that the standard deviation is not much less than the mean, the Gaussian Distribution is not appropriate, but a different approximation is: the Poisson Distribution. Going back to the Binomial Distribution (which is still exact), we only need to worry about values of n much smaller than N.



                      In the Indiana Lottery, people choose 6 numbers from 1-49. There are



                          49!/(43! 6!) = 13983816



                      combinations; more than one person can buy a ticket with the same number. Suppose 28 million tickets are sold in a given week, what is the probability for zero winners? one winner? etc...

                      One combination is drawn. We have to assume that all combinations are purchased with equal likelihood, so that on any given purchase the probability that it will be the winner is about 1 in 14 million (that is, p = 1/14000000). We have:



                          a = N p = 28000000 (1/14000000) = 2

                          P_a(n) ≅ e^(-a) a^n / n!

                          P_2(0) ≅ e^(-2) 2^0/0! = e^(-2) = 0.1353
                          P_2(1) ≅ e^(-2) 2^1/1! = 2 e^(-2) = 0.2707
                          P_2(2) ≅ e^(-2) 2^2/2! = 2 e^(-2) = 0.2707
                          P_2(3) ≅ e^(-2) 2^3/3! = (4/3) e^(-2) = 0.1804
                          P_2(4) ≅ 0.0902
                          P_2(5) ≅ 0.0361
                          P_2(6) ≅ 0.0120
                          P_2(7) ≅ 0.0034
                          P_2(8) ≅ 0.0009
                          P_2(9) ≅ 0.0002
                          P_2(10) ≅ 0.00004



                      We see that it is most likely that there are 1 or 2 winners, but 0, 3, and 4 would not be surprising. The probability for more than 10 winners is about 0.000008 (be very suspicious if this occurs!).



                      What is the probability to sell 50 million tickets without a winner? The expected number of winners is



                          a = N p = (50000000)(1/14000000) = 3.57



                      The chance to get zero is:



                          P_3.57(0) ≅ e^(-3.57) 3.57^0/0! = e^(-3.57) = 0.028



                      That is, about 1/35 (not completely unlikely).



                      What is the probability to sell 100 million tickets without a winner? The expected number of winners is



                          a = N p = (10^8)(1/14000000) = 7.14

                      The chance to get zero is:

                          P_7.14(0) ≅ e^(-7.14) 7.14^0/0! = e^(-7.14) = 0.00079



                      We don't expect to see this very often (1/1265).
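                      The Poisson figures in the example can be recomputed with a few lines of Python. This sketch keeps the article's rounded value p = 1/14,000,000; the helper name `poisson_pmf` and the constant names are illustrative.

```python
import math

def poisson_pmf(a, n):
    """Poisson probability of exactly n winners when a winners are expected."""
    return math.exp(-a) * a**n / math.factorial(n)

COMBINATIONS = math.comb(49, 6)   # exact number of 6/49 combinations
P_WIN = 1 / 14_000_000            # rounded value used in the article

# 28 million tickets sold: a = 2 expected winners
a_28m = 28_000_000 * P_WIN
for n in range(11):
    print(n, round(poisson_pmf(a_28m, n), 5))

# Chance of no winner at all for larger sales
p_none_50m = poisson_pmf(50_000_000 * P_WIN, 0)
p_none_100m = poisson_pmf(100_000_000 * P_WIN, 0)
print(f"{p_none_50m:.3f} {p_none_100m:.5f}")
```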



                      Sounds interesting


                        S.Windsor, CT
                        United States
                        Member #4580
                        May 4, 2004
                        119 Posts
                        Offline
                        Posted: December 26, 2004, 9:39 pm - IP Logged

                                  Hi, your comment is unrelated to the problem we were trying to solve. So in the same spirit let me mention a few other facts about the 6/49 game.

                                  If 14 million tickets are sold before the draw, about 36.8% of the combinations are likely to remain unsold. With 21 million sold there will be 22.3% unsold, with 42 million sold there will remain 5%, with 56 million sold there will still remain 1.83%, and with 70 million sold there will be 0.674% remaining.

                                  All of these calculations are theoretical and have little bearing on reality.

                                  Bertil
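                                  These percentages match e^(-T/C) for T tickets and C combinations, with C rounded to 14 million. The sketch below assumes every ticket is an independent, uniformly random pick, which is the simplification behind the figures rather than a claim about real sales; the names are illustrative.

```python
import math

ROUNDED_COMBINATIONS = 14_000_000  # round figure assumed to match the post

def unsold_fraction(tickets, combinations=ROUNDED_COMBINATIONS):
    """Expected fraction of combinations that nobody bought.

    exact:  each combination is missed by every ticket, (1 - 1/C) ** T
    approx: the usual exponential approximation, exp(-T / C)
    """
    exact = (1 - 1 / combinations) ** tickets
    approx = math.exp(-tickets / combinations)
    return exact, approx

results = {}
for millions in (14, 21, 42, 56, 70):
    exact, approx = unsold_fraction(millions * 1_000_000)
    results[millions] = approx
    print(f"{millions} million sold: {100 * approx:.3g}% unsold")
```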

                                   

                          johnph77's avatar - avatar
                          CA
                          United States
                          Member #2987
                          December 10, 2003
                          832 Posts
                          Offline
                          Posted: December 27, 2004, 12:55 am - IP Logged

                          Bertil -

                          hyperdimension's math is based on probability, not possibility. If a lottery is completely random in its generation of tickets, and if there are 13,983,816 possibilities and the same number of tickets are sold, it is almost a mathematical certainty that all the tickets sold will not cover all the possibilities in the lottery. The odds of that happening far exceed the chances of winning.

                          When the first ticket is sold, it is unique - there is no other ticket currently in that lottery with that set of numbers. The second ticket, therefore, has 1 chance in 13,983,816 of duplicating the first ticket. The third ticket sold has 2 chances in 13,983,816 of duplicating either of the first two tickets. And so would go the progression. And, given this, you would think that around the point when 50% of the tickets were sold, the odds would have it that the ticket you bought would have an even chance of duplicating any of the previously sold tickets. Not so - that would only apply if none of the previously sold tickets has already duplicated another ticket.

                          Probability is computed in fractions or decimals that are added progressively - the second ticket in this given lottery has a 1/13,983,816 chance of being an exact duplicate of the first ticket. The third ticket has a 1/13,983,816 chance of being an exact duplicate of the first ticket plus a 1/13,983,816 chance of being an exact duplicate of the second ticket. And so it goes from that point. Somewhere between the 30% to 40% total sales point the fractions or decimals add up to 1/2 or 0.5 - at that point, if a number hasn't already been duplicated, the odds are 1::1 that that ticket will duplicate one of the previous tickets.

                          Given this, if a number generator is truly random, some sets of numbers may appear six or seven times and more in a lottery of this size, and other sets will not appear at all.

                          Hope this helps to clear this up.

                          gl

                          john

                          Blessed Saint Leibowitz, keep 'em dreamin' down there..... 

                          Next week's convention for Psychics and Prognosticators has been cancelled due to unforeseen circumstances.

                           =^.^=

                            Hyperdimension's avatar - latest trace_171.gif

                            United States
                            Member #9059
                            November 26, 2004
                            128 Posts
                            Offline
                            Posted: December 27, 2004, 1:51 am - IP Logged

                            Hi,



                            Thank you for the explanation johnph77,



                            Continuing with the problem, I first created the full 6/49 wheel, 13,983,816 tickets in total, and obtained the following results for each of the six sorted positions:



                                  Valid N     Mean       Minimum    Maximum    Std.Dev.   Variance
                            1var  13983816    7.14286    1.000000   44.00000   5.736564   32.90817
                            2var  13983816    14.28571   2.000000   45.00000   7.405872   54.84694
                            3var  13983816    21.42857   3.000000   46.00000   8.112726   65.81633
                            4var  13983816    28.57143   4.000000   47.00000   8.112726   65.81633
                            5var  13983816    35.71429   5.000000   48.00000   7.405872   54.84694
                            6var  13983816    42.85714   6.000000   49.00000   5.736564   32.90817

                            Then my computer crashed.
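                            The per-position means and variances in that table do not require building the full wheel. A sketch, assuming the known closed forms for the k-th smallest of n numbers drawn without replacement from 1..N: mean k(N+1)/(n+1) and variance k(n-k+1)(N+1)(N-n)/((n+1)^2 (n+2)). The function name is hypothetical, and the small 3/10 game is enumerated only as a sanity check of the formulas.

```python
import itertools
import statistics

def order_stat_moments(pool, picks, k):
    """Mean and variance of the k-th smallest number in a picks/pool draw."""
    mean = k * (pool + 1) / (picks + 1)
    var = (k * (picks - k + 1) * (pool + 1) * (pool - picks)
           / ((picks + 1) ** 2 * (picks + 2)))
    return mean, var

# 6/49: compare with the Mean and Variance columns of the table
table = {k: order_stat_moments(49, 6, k) for k in range(1, 7)}
for k, (mean, var) in table.items():
    print(f"{k}var  mean = {mean:.5f}  variance = {var:.5f}")

# Brute-force check on a small, hypothetical 3/10 game.
# combinations() over a sorted range yields each tuple in ascending order.
combos = list(itertools.combinations(range(1, 11), 3))
max_error = 0.0
for k in (1, 2, 3):
    values = [c[k - 1] for c in combos]
    mean, var = order_stat_moments(10, 3, k)
    max_error = max(max_error,
                    abs(statistics.mean(values) - mean),
                    abs(statistics.pvariance(values) - var))
print("largest deviation from brute force:", max_error)
```

                            The table's variance column differs from the closed form only in the last printed digit for some rows, which is what one would expect if the statistics package divided by N-1 instead of N.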


                              johnph77's avatar - avatar
                              CA
                              United States
                              Member #2987
                              December 10, 2003
                              832 Posts
                              Offline
                              Posted: December 27, 2004, 2:39 am - IP Logged

                              This is starting to bug me for some reason. I'm going to try this one more time as I feel my previous explanation was inadequate.

                              I have a random number generator generating tickets for the lottery example given above. Defying all the odds in the known universe, the RNG has given me 13,983,815 different sets of numbers in the same number of draws. What are the odds of drawing that last possibility in the 13,983,816th draw?

                              You got it - the same odds as predicting any given set of numbers in the lottery - 1::13,983,816. That means that the random number generator has 13,983,815 chances in 13,983,816 of duplicating one of the previously drawn sets of numbers. That's the other end of determining probability.

                              gl

                              john
