
Chi testing on non-public powerball data

Topic closed. 15 replies. Last post 7 years ago by TinFoilHat85705.

TinFoilHat85705's avatar - DiscoBallGlowing
New Member
Tucson
United States
Member #86504
February 5, 2010
28 Posts
Offline
Posted: February 17, 2010, 6:50 pm - IP Logged

I've run some chi-testing on Powerball data, more specifically on the non-public sets. The light blue dotted line (53g mod-208) is for two ball sets ago (#21-24). The thick light maroon line (55g mod-301) is for the very last ball set (#25-28). The light green thick dashed line (59g-67) is for our current ball set (#29-32) up to 2/16/2010. The thin light purple line (59g R) is from random draw data for additional comparison purposes. Time goes from left to right, meaning that newer draws are found on the right. The y axis indicates chi-test values between 0.55 and 1.

53g and 55g were stretched inwards to fit against the current ball set, 59g. 55g had the most draws, 53g the next most, and the current set has a depth of 67 points. Chi tests were run after each new group of 6 non-public draws, separately for each individual ball set. That is, for each draw a ball set is picked and 6 draws are made with it, so the chi test was recomputed each time a ball set's counts were incremented by a full group of 6, instead of one individual non-public draw at a time. Once completed, the first 50 results were discarded for each set examined (116 groups of six draws minus the first 50 results gives the depth of 67 for our current set).

53g appeared steady. 55g took a big dive right before they changed that ball set and increased the ball count on 1/7/2009. That brings us to our current set, 59g, which appears to be failing as badly as 55g, even though 59g is only about 28% as deep (in time/duration) as 55g was when it got to that point.

I would like to work with others on these findings that can shed additional light on the data.
[Graph: Comps]

I posted another topic about quad-duplication. Here are those results:

1>> Quad-duplication shows up in 62% of all draws (public & non-public). Of these, 84% did not end up being a "public" draw.

2>> There were also 3% more crossing quad-duplications found than there were draws (all public & non-public).

3>> In regards to 5 matching numbers, only two have ever been public, one of them with a span of only 513 draws between the matching draws. 32 other 5-number matches (duplications) are not public (the spans between them range from 36 to 1,243). One, with a span of 468 draws between, even matched down to the powerball itself! Others (31%) seem to cause a ball set change and/or an increase in the height of the ball numbers.

*Span = public draws, count doesn't include non-public

    TinFoilHat85705's avatar - DiscoBallGlowing
    New Member
    Tucson
    United States
    Member #86504
    February 5, 2010
    28 Posts
    Offline
    Posted: February 18, 2010, 4:00 pm - IP Logged

    Here's an updated graph that is not stretched to fit. Again, here's the break-down:

    53g / light blue little dotted line = Two ball sets ago (#21-24)
    55g / dark red thick line = Last ball set (#25-28)
    59g / light green thick dashed line = Current ball set (#29-32)

    Each interval represents a chi-test on one draw (one draw meaning the actual "public" draw plus the other 5 non-public testing draws), compared against calculated expected values for the ranges as they went from earliest (left) to most recent (right).

    [Graph: Comps2]

      Avatar

      United States
      Member #83701
      December 13, 2009
      225 Posts
      Offline
      Posted: March 13, 2010, 10:26 pm - IP Logged

      So how would the constraint that five numbers are drawn from a set each time relate to a distribution-based analysis? The constraint implies that no single number can appear more than once within a set of 5. Because these are discrete sets of 5, there are boundary situations in the published data where two occurrences of a number are fewer than 5 balls apart, but only because one is near the beginning of one set while the other is near the end of another. Ultimately the constraint means there is always an artificial limit: no number can ever make up more than 1/5 of the sample set. Clearly this constraint isn't reflected by the chi-square distribution or the normal distribution.
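The point about the five-numbers-per-draw constraint can be checked with a small simulation. What follows is a rough sketch, not anyone's actual test procedure: it assumes a fair draw of 5 distinct balls from a pool of 59, and the names `pearson_stat` and `mean_stat` are made up for illustration. Under those assumptions each ball's count is Binomial(n, 5/59), so the average Pearson statistic works out to pool minus pick (54) rather than the 58 degrees of freedom a textbook chi-square test assumes.

```python
import random

def pearson_stat(draws, pool):
    """Pearson chi-square statistic of ball counts against a flat expectation."""
    counts = [0] * pool
    for draw in draws:
        for ball in draw:
            counts[ball - 1] += 1
    expected = sum(counts) / pool
    return sum((c - expected) ** 2 / expected for c in counts)

def mean_stat(pool=59, pick=5, n_draws=50, trials=2000, seed=1):
    """Average statistic over many simulated histories of fair draws.

    Each draw picks `pick` distinct balls (no repeats within a draw),
    which is the constraint discussed in the post above.
    """
    rng = random.Random(seed)
    balls = range(1, pool + 1)
    total = 0.0
    for _ in range(trials):
        draws = [rng.sample(balls, pick) for _ in range(n_draws)]
        total += pearson_stat(draws, pool)
    return total / trials

if __name__ == "__main__":
    # Theory under these assumptions: about pool - pick = 54, not pool - 1 = 58.
    print(round(mean_stat(), 2))
```

If this holds for the real data, a standard chi test would tend to report probabilities that look somewhat too comfortable, since the statistic runs lower than the reference distribution expects.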

        TinFoilHat85705's avatar - DiscoBallGlowing
        New Member
        Tucson
        United States
        Member #86504
        February 5, 2010
        28 Posts
        Offline
        Posted: March 15, 2010, 12:16 pm - IP Logged

        I don't fully understand what you are saying. I don't know much about chi other than self-teachings and mostly because I got confirmation they use it for their abnormality testing. I do have 18 years of non-public data though and wish I had others that could help in dissecting it.

        When you were responding, were you aware that there are 4 sets of balls, each numbered 1 through XX, and that for each draw 6 draws actually occur, with only one being labeled the public winning drawn numbers? So you have four different sets of data times 6 to allow for distribution-based analysis....no?

          dr65's avatar - black panther.jpg
          Pennsylvania
          United States
          Member #74096
          May 2, 2009
          23217 Posts
          Offline
          Posted: March 15, 2010, 12:20 pm - IP Logged

          Sorry, but I think I need a tin foil hat now.

          You might have to simplify some data or at least explain it.

            Avatar

            United States
            Member #83701
            December 13, 2009
            225 Posts
            Offline
            Posted: March 15, 2010, 12:52 pm - IP Logged

            My understanding is that they run test draws first, and if the criteria they've set for independence aren't met, they switch to a set that did pass those criteria before making the draw. At least that's what the Texas lottery commission website says; their criterion is that no number be drawn more than five times in six draws. The idea is that if a ball set has been compromised, it would be unlikely to pass the pre-test and the alternate ball set would be used in the draw. Of course the alternate ball set must also pass the pre-test.

            Something like the chi-square distribution simply gives you a confidence value for the similarity of the observed distribution and the expected distribution. In truth, with lottery draws using balls, I would suspect that the sample set is so small and the expected distribution is so flat that an exact analysis would be possible and preferable to a chi-square test. I can see them running chi-square tests on pseudo-random number generators as a pre-test to a computerized draw.

            The pre-test draws are public information on the Texas website if you know where to look for them; the link is labeled "Pre-Test Results". I suspect that the data is not intentionally kept from the public, it's just that the public rarely realizes that it's publicly available. The ball sets in Texas are identified by numbers and the draw machines by letters. The machine and ball set used for the draw are also indicated in the public data, and since they rotate through the machines and ball sets on a schedule, you could break out your statistics by ball set and machine and make an educated guess as to which ball set and machine would be used in the next draw. Of course this divides down your data set tremendously.

            Technically speaking all the data is public, as it's a state organization. So if there's information that you want and you can't find it, you can file an "Open Records Request", but you may have to pay the wages of the intern they assign to compile the information for you.

              konane's avatar - wallace
              Atlanta, GA
              United States
              Member #1265
              March 13, 2003
              3348 Posts
              Offline
              Posted: March 15, 2010, 1:24 pm - IP Logged

              Welcome to Lottery Post!  Love your screen name!

              We have several really good number crunchers here who may be able to assist you.  I certainly hope so.    

              I am aware of 4 ball sets each for white and red balls.  They recently confirmed that number, and confirmed the same number a year ago when I inquired.  6 draws prior to the jackpot draw is new information, thank you for letting us know.

              Good luck to everyone!

                TinFoilHat85705's avatar - DiscoBallGlowing
                New Member
                Tucson
                United States
                Member #86504
                February 5, 2010
                28 Posts
                Offline
                Posted: March 16, 2010, 12:20 pm - IP Logged

                You could say that my journey started when I wanted to know if forecast predictions could be made on "random" numbers. This quest will probably carry with me throughout my life. It is currently being applied to the lottery though, despite all those who have told me there's absolutely nothing to analyze (when the programmed word "random" is applied).

                Seven years ago, I pulled up the PowerBall® and the Fantasy 5 (now called The Pick) number histories. They were both similar in the idea that random numbers were drawn from a pool of numbers and people placed bets.

                What I noticed almost immediately is that the Fantasy 5 was very evenly distributed when summing counts on individual ball numbers (over a range) versus the mountain peaks and ranges of the PB. Of the other analyses I've conducted since then, the PB is still the only one that seems to have some things that are not easily explainable (and is the only one that I know of or have looked at that still uses real balls versus computer simulation).

                I've tried counts on ranges, with focus on one level too many, and have decided to call it quits several times, until a little crack of light dawns and spreads new energy. One big example is when I found the hidden part of PB's site that not only shows which draw came from which ball set, but also indicates that 6 draws occur per draw. I spent countless hours trying to separate the ball sets for examination, to not much avail.

                Then I found what appeared to be some typos or discrepancies and contacted MUSL. To my surprise, I had indeed found typos, and in return I asked if there was any way to get data prior to 2005. I now have data going back 18 years, and I still found more typos (about 55 data points out of 11,000), half of which they told me are too old to get definitive answers on (these alter some of my results, but only by a hair of a margin).

                The most recent findings that baffled me (besides the fact that since 1992 they've had 4 ball sets and 6 draws per draw) are: quad/fifth duplication and chi testing results. I'll try to concisely simplify my findings on both right here.

                Quad/duplication (this is when I only had 4,000 rows of data versus the now 11,000):

                First, I wanted to know how often 4 of the 5 drawn white ball #'s matched elsewhere in any other draw (choosing 4 numbers out of even 55 gives about 340,000 combinations, which implies that it shouldn't happen very often). I created a macro in Excel to point out every occurrence and found that it happened far more than anticipated: 103% to be exact. In other words, out of 4,000 rows of draw data, 4 out of 5 numbers on any one row match another row a total of 4,120 times, because some draws/rows can have multiple quad-duplications elsewhere.
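As a rough cross-check on how often 4-of-5 matches should turn up by chance, here is a small sketch. It assumes fair, independent draws of 5 balls from a single pool of 55 (the pool size mentioned above); the function names are invented for illustration. Note that it counts matching pairs of draws, whereas the macro described above counts per-row matches across data that may span more than one pool size, so the two figures are not directly comparable.

```python
from itertools import combinations
from math import comb

def match_probability(pool, pick=5, min_match=4):
    """Chance that two independent fair draws share at least min_match numbers."""
    hits = sum(comb(pick, k) * comb(pool - pick, pick - k)
               for k in range(min_match, pick + 1))
    return hits / comb(pool, pick)

def expected_matching_pairs(n_draws, pool, pick=5, min_match=4):
    """Expected number of draw pairs sharing at least min_match numbers."""
    return comb(n_draws, 2) * match_probability(pool, pick, min_match)

def count_matching_pairs(draws, min_match=4):
    """Brute-force count of draw pairs sharing at least min_match numbers."""
    sets = [frozenset(d) for d in draws]
    return sum(1 for a, b in combinations(sets, 2) if len(a & b) >= min_match)

if __name__ == "__main__":
    print(comb(55, 4))                       # ways to choose 4 numbers from 55
    print(expected_matching_pairs(4000, 55)) # roughly 577 pairs for 4,000 draws
```

The takeaway is the birthday-paradox effect: 4,000 rows give nearly 8 million pairs of rows, so several hundred 4-of-5 matches are expected under these assumptions even though any single pair is very unlikely to match.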

                Second, I wanted to know how often 5 of the 5 matched. So far (when I was only examining 4,000 rows), only 2 appear to be public. Interestingly enough, numbers generated from atmospheric noise by a random # producing site yielded similar results, meaning that in a span of only 1,500 consecutive draws, all 5 can match identically. But back to the 5 of 5: even though there are 5 million combinations, why did it appear in only 2 of the "public" winning drawn numbers, when 31 other identical 5-number matches just happened to fall on the pre and post tests instead of "public" draws? Or possibly, they didn’t want the public to see that that kind of duplication does happen (system by design, anyone?).

                Chi:

                After probing further with MUSL on the typos they couldn’t pull up data for, they also shared that they break down the ball sets separately and run chi tests. I immediately hit the internet for what chi was and how I could use it. Nothing really helped except my subconscious soaking on it for a couple of hours, when I realized that each ball set is one row of counts (moving history) by columns of ball #, and that expected values can be calculated, all in Excel. Manually, you can get it to give you a chi probability test percentage based on one setting of how deep to look. For me, I went from start to finish on a particular ball set and could get the final chi result for that whole set.

                Then I realized I could create a macro that would step in intervals of 6 draws at a time, by the ball set they occurred on (grouping one actual draw together versus separating it one line at a time). What I mean by this is taking the very first ball set used (all 6 actual draws, not just one), calculating the chi, and recording it at its interval level/depth (in this instance, #1). Next, the macro steps to the next ball set chosen and re-calculates, adding those 6 new rows of 5-balls-drawn data. Because you've got the separation of ball sets AND 6 draws per draw, it doesn’t take more than 5 weeks to discard the first 50 results (instead of one full year), which most of the chi tutorials indicate is a needed examination step. What you get is the charts I had already posted to this thread (and the comments I included explaining how oddly they behave). When compared to #'s generated from the random sample source mentioned earlier, PB seems controlled.
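The stepping procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Excel macro: it assumes a flat expected count for every ball, a pool of 59, groups of 6 draws, and 58 degrees of freedom (pool minus one, which is what a one-row chi test conventionally uses). The names `chi2_sf` and `cumulative_chi_tests` are hypothetical, and the input here is simulated fair draws rather than real data.

```python
import math
import random

def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the
    regularized lower incomplete gamma function (fine for moderate x)."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    term = total = 1.0 / a
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))

def cumulative_chi_tests(draws, pool, group_size=6, burn_in=0):
    """Recompute the chi probability after every full group of draws.

    Counts accumulate over the whole history of one ball set; the first
    `burn_in` results are discarded, as described in the post.
    """
    counts = [0] * pool
    results = []
    for i, draw in enumerate(draws, start=1):
        for ball in draw:
            counts[ball - 1] += 1
        if i % group_size == 0:
            expected = sum(counts) / pool  # flat expectation (assumption)
            stat = sum((c - expected) ** 2 / expected for c in counts)
            results.append(chi2_sf(stat, pool - 1))
    return results[burn_in:]

if __name__ == "__main__":
    rng = random.Random(0)
    fake_draws = [rng.sample(range(1, 60), 5) for _ in range(6 * 116)]
    series = cumulative_chi_tests(fake_draws, pool=59, burn_in=50)
    print(len(series), round(series[-1], 5))
```

Running the same function on simulated draws and on the real rows for one ball set would give the kind of side-by-side comparison shown in the charts, as long as both series use the same grouping and burn-in.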

                So now that I've got 18 years of data, down to the ball set #'s and all 6 draws per every draw since then (despite having 4 draws from 1997 with errors in them, which affect 45 other draws later), my plans are to recalculate quad/fifth duplication and chi results, because before I was only doing it on 36% of the history.

                Of course, if anyone cares enough to want to help dissect in any aspect... I'm willing to share the data I have (but most don't care about "every little detail", whereas I have to know why they've done what they've done in order to attempt to come up with quality bets).

                  TinFoilHat85705's avatar - DiscoBallGlowing
                  New Member
                  Tucson
                  United States
                  Member #86504
                  February 5, 2010
                  28 Posts
                  Offline
                  Posted: March 16, 2010, 4:05 pm - IP Logged

                  My reference that there have been thirty-something duplications of 5 white balls is accurate for all 18 years (back to 1992). I thought I had found that many in only 4,000 draws (2005 to 2010), but there are actually 33, with only 2 being "public" draws (in all 11,202 draws), as of 3/10/2010.

                  So they're telling us that, against seemingly impossible odds (at minimum 1.2 million combinations), only 6% of the fifth duplications just so happened to land on "public" draws and the other 31 (or 94%) fell on pre & post-tests!?

                  All together, it happened about 0.3% of the time (33 times in 11,202 draws since 1992), and with combination odds of at least 1.2 million, is that why they would want to control what the public sees (or rather, how it looks)?
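For a sense of scale, the expected number of exact 5-of-5 repeats can be estimated the same way as a birthday problem. The sketch below assumes fair, independent draws and a single pool size for the whole history, which is a simplification since the ball count changed over the years. The pool sizes 55 and 59 come from this thread; 45 is an assumption inferred from the "1.2 million combinations" figure, because C(45,5) = 1,221,759. The function name is made up for illustration.

```python
from math import comb

def expected_identical_pairs(n_draws, pool, pick=5):
    """Expected number of draw pairs with all `pick` numbers identical,
    assuming every combination is equally likely on every draw."""
    return comb(n_draws, 2) / comb(pool, pick)

if __name__ == "__main__":
    n = 11202  # total draws (public and non-public) cited in the post
    for pool in (45, 55, 59):
        print(pool, comb(pool, 5), expected_identical_pairs(n, pool))
```

Under these assumptions the expectation runs from roughly a dozen pairs (pool of 59) to around fifty (pool of 45), because 11,202 draws contain about 62.7 million pairs of draws. A count of 33 would sit inside that range, so the repeats by themselves do not obviously require an explanation beyond chance.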

                    TinFoilHat85705's avatar - DiscoBallGlowing
                    New Member
                    Tucson
                    United States
                    Member #86504
                    February 5, 2010
                    28 Posts
                    Offline
                    Posted: March 17, 2010, 8:09 pm - IP Logged

                    Chi Update:

                    My prediction is that any day now (literally), they are going to do away with the current ball set #29-32 (went in place on 1/7/2009). Its values are falling far lower than the previous set's, and unless their unknown rules are for it to behave like the set from 2000-2002, it just hasn't rebounded at all in the last 8 weeks. Before tonight's draw, the chi probability test result (compared to expected values) is at 0.58140.

                      TinFoilHat85705's avatar - DiscoBallGlowing
                      New Member
                      Tucson
                      United States
                      Member #86504
                      February 5, 2010
                      28 Posts
                      Offline
                      Posted: March 18, 2010, 2:38 pm - IP Logged

                      With the results from 3/17/2010, it continued to decline (now): 0.56556

                        Avatar
                        NASHVILLE, TENN
                        United States
                        Member #33372
                        February 20, 2006
                        1044 Posts
                        Offline
                        Posted: March 20, 2010, 8:36 pm - IP Logged

                        The Powerball drawing is shown on TV.  One can see the balls fall into the chute.  This, IMHO, negates all that went before. 

                        They can pre-test 24/7, change the ball set, mix and match all they want.  The only draw that counts is the one they televise.  The only draw we should be concerned with is the one they televise. 

                        Hopefully someone will take the time to explain why those pre-tests are of such great concern.  I don't see the significance.

                          TinFoilHat85705's avatar - DiscoBallGlowing
                          New Member
                          Tucson
                          United States
                          Member #86504
                          February 5, 2010
                          28 Posts
                          Offline
                          Posted: March 23, 2010, 12:29 pm - IP Logged

                          I personally don't believe everything I see or get told, or I wouldn't have found the data I now have... it just personally interests me to dissect it and find things that don't seem to make sense. As of 3/15/2010... out of 11,202 draws (since 1992), 1,122 had the PB # match one of the white balls (10%). However, 83% of those coincidentally fell on a test draw versus a public draw. So the public only knows of this type of thing happening 1/5 of the time. Suurrre no one's in control of what the public gets. It's also kinda like how on 4/3/1993 the numbers that were drawn were: 15, 22, 24, 32, 39 and PB of 18 (no winner). Then, on 12/27/2000, the same exact numbers were drawn down to the PB matching as well. This time, there happened to be two winners. Shouldn't the odds of that happening be slimmer than slim? Out of 31 occurrences of all 5 white ball numbers matching, this is only one of the two that were not a test draw.
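Two of the percentages above can be compared against simple baselines. This is a back-of-the-envelope sketch, assuming (as described earlier in the thread) 6 draws per drawing with exactly 1 of them public, and assuming the powerball comes from a separate drum whose numbers all fall within the white-ball range. The function names are hypothetical.

```python
from fractions import Fraction

def share_of_test_draws(draws_per_event=6, public_per_event=1):
    """Fraction of all draws that are non-public test draws."""
    return Fraction(draws_per_event - public_per_event, draws_per_event)

def powerball_matches_white(white_pool, pick=5):
    """Chance that the powerball number equals one of the white balls,
    assuming the PB number always lies within 1..white_pool."""
    return Fraction(pick, white_pool)

if __name__ == "__main__":
    print(float(share_of_test_draws()))          # 5/6, about 0.833
    for pool in (53, 55, 59):
        print(pool, float(powerball_matches_white(pool)))
```

Under these assumptions, any event that is unrelated to whether a draw is public should land on a test draw about 83% of the time, simply because 5 of every 6 draws are tests. Likewise, a powerball matching a white ball is expected in roughly 8% to 10% of draws for pools of 53 to 59 balls.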

                            TinFoilHat85705's avatar - DiscoBallGlowing
                            New Member
                            Tucson
                            United States
                            Member #86504
                            February 5, 2010
                            28 Posts
                            Offline
                            Posted: March 23, 2010, 4:21 pm - IP Logged

                            Or how about on 4/30/1997, numbers drawn were: 19, 16, 17, 15, 18 and PB# 17... and yet it was only a pre-test! What are the chances that it happened, and didn't also happen to be a public winning draw?

                              rdgrnr's avatar - walt
                              Way back up in them dadgum hills, son!
                              United States
                              Member #73904
                              April 28, 2009
                              14903 Posts
                              Offline
                              Posted: April 2, 2010, 5:05 pm - IP Logged

                              I just poured 4 fingers of Jack Daniels in a glass.

                              Those are the only numbers in this thread that I can comprehend.

                              Now I'm gonna slam that sucker and see if I can understand.

                              And then repeat as necessary.
