hypersoniq's Blog

A search for correlations to raw frequency and follower frequency...

Spawning an idea... process followers the same exact way... via classification! Same rules, +/- one standard deviation around the 10% expectancy... classify via Hot Cold and Neutral the same way... 

Then...

Record the intersections of the resultant neutral sets ! 

Not as safe of a reduction, but it might help identify a correlation between the raw frequency and follower frequency. Would need to increase the follower set to 1,500 because holding the followers to their theoretical 10% expectancy will make 150 raw draws perfectly mesh with 1,500 follower draws... apples to apples.
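The 150-to-1,500 scaling can be checked with a little arithmetic. This is a rough sketch under one reading of "follower" (the digit drawn in the same column on the draw right after a given lead digit), which is an assumption here. The two neutral sets at the bottom are made-up placeholders, not real output.

```python
def expected_samples(draws, digits=10):
    """Expected number of times one digit appears in a column over `draws` draws."""
    return draws / digits

# Raw frequency: 150 draws spread evenly over 10 digits.
raw_per_digit = expected_samples(150)         # 15 appearances per digit

# Followers: each appearance of a lead digit contributes one follower,
# so 1,500 draws give about 150 followers per lead digit.
followers_per_lead = expected_samples(1500)   # 150 followers per lead digit

# Hypothetical neutral sets for one column (placeholders only).
raw_neutrals = {0, 3, 4, 5, 7, 8}
follower_neutrals = {3, 4, 6, 7, 9}
both_neutral = sorted(raw_neutrals & follower_neutrals)
print(raw_per_digit, followers_per_lead, both_neutral)
```

Under that reading, each lead digit's follower list holds about 150 samples, the same size as the raw 150-draw window.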

If there is no direct correlation in frequency, the data should show that also. Then I will know... one way or the other.

This will take some time to implement, so there may be some "radio silence" after the test is done. The follower set may not be as uniform... but it should be, in theory anyway.

Entry #396

Midway through the 7 day test, 50% success.

As 5-0-1 was drawn in last night's PA Pick3 Evening game, that number was indeed a NNN draw. (All from the neutral set, which is what remains when eliminating the Hot and Cold numbers).

Still looking for that NNN combo to be drawn in the day game, but there are still 4 draws remaining.

There were 7 numbers in each column in the neutral set for evening, while the day draw has 6 in each.

IF the test matches literally every back test observation made, then this is the first and only "safe" elimination I have found in over 2 decades. It will prove that over a span of draws, the NNN combination WILL appear at least once. This is the very first step, as any subsequent number eliminations introduce the chance of throwing out the winners.

Since I am not playing until I can narrow down to one pick, I am going to first freeze the data by not updating the draw sheets all week, and seeing what further eliminations would be a good idea. I am also still searching for ANY correlations between the hot/cold data and the follower data.

For the next test I will be looking to eliminate the high and low neutral from each set... not the high and low numbers, but the high and low frequencies. This will also be back tested at several points in draw history to see how many of the NNNs become eliminated by such a reduction.

Every single system before this one was based on the full history set. The systems would either produce a hit in the first month and then not again for over a year, OR I would get bored of how far off they were and give up after only a few days.

This time is different, because the NNN pattern cycles every 7 draws (or fewer!), which is useful as the base of any further system development. If you could start off number selection with a group of numbers that produces a straight hit every 7 days, while safely eliminating between 60% and 70% of the 1,000 possible straight combos, then why not move forward with that?

And the best part is, it is pure statistics that got me to see this pattern that occurs so frequently. It was counting the frequency of drawn digits over a relatively short draw span (150). One standard deviation (calculated on the fly by the Python script) in either direction of the 10% expectancy of each number's frequency yields a set that will produce a straight hit within 7 draws. There is no gray area, and this was made possible by classification. The H N and C are classifiers of frequency.
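Here is a minimal sketch of that classification step, not the actual script. The function name is hypothetical, and using the population standard deviation (`statistics.pstdev`) on the ten percentages is an assumption. It does reproduce the 2.48 printed for column A in the 90-draw run shown further down, which is the data used here.

```python
from statistics import pstdev

def classify_column(counts, expectancy=10.0):
    """Label each digit H, N or C from its share of the draws.

    counts: list of 10 ints, counts[d] = times digit d was drawn in this column.
    """
    total = sum(counts)
    percents = [100 * c / total for c in counts]
    sd = pstdev(percents)                  # population SD of the ten percentages
    hot, cold = expectancy + sd, expectancy - sd
    labels = []
    for p in percents:
        if p >= hot:
            labels.append("H")
        elif p <= cold:
            labels.append("C")
        else:
            labels.append("N")
    return labels, sd

# Column A counts from the 90-draw run shown further down.
col_a = [9, 12, 9, 13, 11, 6, 9, 7, 7, 7]
labels, sd = classify_column(col_a)
print(round(sd, 2), "".join(labels))       # 2.48 NHNHNCNNNN
```

One thing to watch: the Pandas `.std()` method defaults to the sample standard deviation, which gives a slightly larger value and therefore slightly wider neutral bands.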

It is going to be difficult to pick just one from the rather large list of resultant combos (may be over 300 in some cases), but it is better than starting out with all 1,000 combos. This is where I will need to go next, to find what other systems can be fed this list of numbers in the range and help narrow down further.

Maybe there will be a correlation between the frequencies and their distance from the expectancy, or maybe from the median, which SHOULD be 10% in a truly random discrete uniform distribution. Let's not forget that both the pick 3 day and pick 3 evening full data sets fail the chi-squared test for true randomness (but not by much).

The betting strategies are also such that even playing the number at a .50 straight/.50 box could allow play of up to 5 picks and still have the potential to pay for itself with the possibility of a decent profit on a straight hit.

As always, this will be worked out in the pick 3 before any attempt is made in the pick 5. The NNNNN does not appear as frequently in that game as the similar NNN in pick 3, though it is by far the most represented classification.

Eliminating numbers is part of the game. When you make just one pick, you are rejecting the other 999 combinations anyway, I am just trying to do that with a bit more purpose.

Entry #395

A 7 day test of the hot and cold classifier.

As I start out trying to turn my most recent code into playable picks (far from that yet), I have made many observations at many different points in the draw history. I have noticed that within 7 draws, you will see at least 1 drawing with numbers classified as neutral in each position.

As a hypothesis, I am stating that based on observation, a Neutral Neutral Neutral drawing will happen at least once every 7 draws. I have to prove that... this is what I will hope to do over the next 7 days. I am only attempting to prove the very first part of a multi step process that is still being worked out... so you are where I am. The following numbers were generated and classified using a draw history size of 150, and an offset of 0, which means it is the data that would be available when making selections.

The Tests

Mid Day Pennsylvania Pick 3 Neutrals

Column A = 0, 3, 4, 5, 7, 8

Column B = 2, 3, 4, 5, 8, 9

Column C = 2, 3, 5, 6, 8, 9

 

Evening Pennsylvania Pick 3 Neutrals

Column A = 0, 2, 4, 5, 6, 7, 8

Column B = 0, 4, 5, 6, 7, 8, 9

Column C = 1, 2, 5, 6, 7, 8, 9

The assertion (based on observation) is that we will see a straight hit from the numbers above at least once over the next 7 days, on BOTH the day and the night numbers.

Why this is important... yeah, it seems like one of those things that produces too many picks to play, BUT this is the start of the selection process. IF it holds true here is what the data tells us... hots and colds that exist above or below one standard deviation from the 10% expectancy can be SAFELY ELIMINATED!!!!!

For the mid day, there are 6 neutrals in each column (6x6x6= 216) and we have effectively eliminated 784 of the 1,000 combos!

Since the evening uses different draw data (7x7x7 = 343) we end up effectively eliminating 657 of the 1,000 possible combos (10x10x10 = 1,000)

Of course, any further steps WILL potentially eliminate the winning combos, but we have to start somewhere, and this is the only potential SAFE elimination.
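The combo counts above are easy to double check in code. A small sketch using the neutral columns listed in The Tests; the helper names are hypothetical.

```python
from math import prod

def combo_count(columns):
    """Number of straight combos that can be built from per-column digit lists."""
    return prod(len(col) for col in columns)

def is_nnn(draw, columns):
    """True if every digit of the draw is in its column's neutral list."""
    return all(digit in col for digit, col in zip(draw, columns))

midday = [[0, 3, 4, 5, 7, 8], [2, 3, 4, 5, 8, 9], [2, 3, 5, 6, 8, 9]]
evening = [[0, 2, 4, 5, 6, 7, 8], [0, 4, 5, 6, 7, 8, 9], [1, 2, 5, 6, 7, 8, 9]]

for name, cols in (("midday", midday), ("evening", evening)):
    kept = combo_count(cols)
    print(name, kept, 1000 - kept)   # combos kept, combos eliminated
```

The same `is_nnn` helper can score each new draw during the 7 day test, for example `is_nnn((5, 0, 1), evening)`.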

So, over the next 7 draws of each, we will be looking for at least one straight hit from the mid day data AND at least one straight hit from the night game. That is the only success scenario. If either set fails to produce a winner, then it is back to the drawing board. After 20+ years of this I am getting tired of that trip!

Let's see where it goes from here...

NOTE: I am not playing anything here, this test is just to prove that I got the first part right. 200 to 300 combos is still far too many to get a return on.

Entry #394

How quartiles will be used.

In statistics, the cut points that divide a sorted set of numbers into equal parts are known as quantiles; dividing into 100 parts, for example, gives percentiles. We are using the special case of quartiles: our observed draw frequency list will be divided into 4 parts, giving 3 cut points.

The first quartile exists at the 25% mark.

The second quartile, also known as the median, exists at the 50% mark.

The third quartile sits at the 75% mark.

This data is not looking at drawn numbers, only the frequencies of those numbers (how many times were they drawn in the selected date range). These should be easy to calculate, as they are built into the available statistics functions in the Pandas library.

Of course due diligence will require me to enter the frequency data into R Studio to compare, ensuring accurate data. Once we can trust the results, the hope is to incorporate all of it together.
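As a quick illustration of the three cut points, here is a sketch using only the Python standard library rather than Pandas. The `method="inclusive"` option uses linear interpolation, which should line up with the defaults of the Pandas `quantile` method and R's `summary()`, but that is worth confirming against R Studio as described above. The counts are column A from the 90-draw run shown further down.

```python
from statistics import quantiles

def frequency_quartiles(counts):
    """Return (Q1, median, Q3) of a column's ten draw counts."""
    q1, q2, q3 = quantiles(counts, n=4, method="inclusive")
    return q1, q2, q3

col_a = [9, 12, 9, 13, 11, 6, 9, 7, 7, 7]   # column A, 90-draw run
print(frequency_quartiles(col_a))            # (7.0, 9.0, 10.5)
```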

Remember, any time you choose to eliminate data, you are probably throwing out the winning combo, that is just the nature of these games.

The odds of winning are still 1:1000, but you have to start somewhere.

When eliminating hot and cold, for example, the per-column output of neutral numbers might look like 7N, 7N, 6N. You can calculate the number of combos by multiplying these counts, so 7x7x6 = 294.

The assertion (based on observation) so far is that within 7 draws of the calculated data, from one to four combos will come from this reduced set. NNN is the only regular pattern to be seen.

Further reduction may be had by eliminating the high and low frequency neutrals... that could look like 5x5x4=100 combos, but that may eliminate the winning combo. Part of the observation has been that the NNN combos observed tend to sit between the first and third quartiles... it is "filling in" from the middle!

If we could use this info to eliminate 70% to 90% of the combos, a paper play test might be in order. And we may as well post that here so anyone watching can get an idea of how systems are tested BEFORE making any actual bets. I will start posting data sets so we can follow along as soon as I have the quartiles set up in the code, so most likely tomorrow.

Entry #393

Making sense of the few clues in the data.

As would be expected, the data generated from the combined follower and hot/cold scripts takes some time to study. Leaving the ability to specify the number of draws being studied and the classifier offset so the follower data can also be analyzed from the same end draw has been a big help in such topics as observing follower delay, or how many draws until the followers hit.

One thing that pops out is the need to code in some more statistics. I moved the output to R Studio to gain more info, but that is a process. Turns out there is a Python library that allows access to R calculations! If you have ever used R, you know the commands are powerful. I can enter the draw frequencies as a set and simply run 'summary(set)' to gain information such as the quartiles and the mean (average).

Here are the exciting discovery highlights so far...

1. There are 27 total draw profiles that can be observed using H, C and N... by FAR the one that occurs the most is NNN. 

2. No matter where in history I took a sample, there are from 1 to 4 NNN draws in the first 7 that follow! Why does this matter? Imagine the choices being pared down by ELIMINATING Hot and cold numbers... if you need to pick from all, there are 10x10x10 = 1,000 possible combos... a sample that happens quite frequently is one or two hots and colds per column, leaving a neutral count such as 7,7,6... 7x7x6 = 294 !!! This part eliminates just over 70% of the combos! ONE to FOUR of the combos in that set will appear in the next 7 draws.

3. Further eliminations may be possible by removing from the choices those neutrals that exist outside of the first and third quartiles! All of the NNN combos observed so far fell within that range. Imagine your choices now get lowered to 4x4x4 = 64 or something similar... that is a reduction of over 90% from the original 1,000.
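The quartile cut in item 3 could be sketched like this. Two assumptions are baked in: the quartiles are taken over all ten digit counts in the column, and a count sitting exactly on Q1 or Q3 is kept. The function name is hypothetical, and the data is column A from the 90-draw run shown further down.

```python
from statistics import quantiles

def middle_neutrals(counts, neutral_digits):
    """Keep neutral digits whose draw count lies between Q1 and Q3 (inclusive)."""
    q1, _, q3 = quantiles(counts, n=4, method="inclusive")
    return [d for d in neutral_digits if q1 <= counts[d] <= q3]

counts = [9, 12, 9, 13, 11, 6, 9, 7, 7, 7]   # column A, 90-draw run
neutrals = [0, 2, 4, 6, 7, 8, 9]             # digits labeled N in that run
print(middle_neutrals(counts, neutrals))     # [0, 2, 6, 7, 8, 9]
```

With these counts only digit 4 drops out, so how the boundary cases are treated clearly changes how much gets eliminated.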

This is the next phase... using the follower data to further focus on the one best guess. A higher position on the follower list may be the piece of the puzzle that helps zero in on the one best guess that matches the NNN criteria with the best performing follower number that also lands between Q1 and Q3.

A guarantee? There is no such thing in these games. A better chance to catch a pick 3 straight more than once a year? Maybe. Still a cheap system, total cost if done at .50 st/ .50 box is only $14 for a week... there would be a separate evening and day combo.

I am using the first 7 draws because that is the max number of days you can play in advance at the PA lottery kiosk. This would definitely change for a game like the PA Match 6 where they allow play for 26 days. But I am not getting too far ahead right now... focusing on the pick 3. The next challenge will be much easier as it is a per column basis... the pick 5, which has been the target for over 2 years now. 

There is always much to learn, and eventually I will grow weary of this system if it fails to do better than previous ones.

I still have coding to do so I can print out the quartiles for each column... should not be too difficult as I already have it spitting out the standard deviation for each. Then I need to find a direct correlation between the neutral numbers and the follower list... there may not be any, but it does have its place here at the beginning to help narrow down to one combo.

Sometimes things take a while to "click" so I don't rush things. For each hour coding, there are weeks to even months of thought and research. Writing code becomes much easier when you have an idea of what to look for.

Entry #392

Maybe instead of looking at the "most" or "least", the answers could be closer to the middle...

Studying where the numbers in the "neutral zone" are sitting with respect to their 10% expectancy may offer more answers than studying the extremes. It might take a re-framing of long held beliefs in higher frequency being the better bet (one of the pillars of the Gail Howard systems). It is true that the drawn numbers don't know if they are hot or overdue, but what if some larger truths are governing how the number frequencies fill in around the central tendency of the expectancy?

Something I would have missed without adding the short term analysis to the last few scripts.

Pick 3 will be the proving ground, and being a per column function, eventually moving to add the pick 5. Still a long way from a live test, still trying to plan out interpretation of the output, but work is underway!

Entry #391

There was no need to include any "divide by zero" logic in the hot/cold script!

The divisor in the program when calculating the percent to auto range hot and cold is always equal to the number of draws (X). The dividend is the number of times a particular number was counted. If a number does not appear, 0 divided by X is a legal operation that returns 0.

Had this been a situation where the divisor MIGHT equal zero, there are 2 ways (at least) to deal with it in Python. One is an if statement to check that the divisor is > 0; the other is a try/except block that attempts the division and runs the code in the except block if it hits a ZeroDivisionError.
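For reference, the two guards might look like the sketch below. The function names are hypothetical, and returning 0.0 on a zero divisor is just one possible choice.

```python
def percent_if(count, draws):
    """Guard with an if statement: return 0.0 when there are no draws."""
    if draws > 0:
        return 100 * count / draws
    return 0.0

def percent_try(count, draws):
    """Guard with try/except: let the division fail, then recover."""
    try:
        return 100 * count / draws
    except ZeroDivisionError:
        return 0.0

print(percent_if(0, 5), percent_try(0, 5))   # a zero count is fine: 0.0 0.0
print(percent_if(3, 0), percent_try(3, 0))   # a zero divisor is caught: 0.0 0.0
```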

To test, I ran the code with a small value of X=5, and the numbers that had a count of 0 were classified as cold with a percent of 0.

When I added that as an item to the checklist (to check for possible errors) I was just trying to cover everything that could go wrong. So, stopping to think about what was actually happening resulted in adding no extra lines of code because the scenario of division BY zero was not a thing that could happen (unless for some reason I set X to 0, then I guess I would deserve that error).

Now it is time to freeze the updating and study the output to find any links between the classifications of hot and cold and the follower distributions. The first experiments have determined that using hot and cold as a reduction technique still leaves a great deal of numbers in the neutral zone, but the neutral classification NNN appears way more than any other classification combo.

Knowing that there is no chance for the division by zero error means that this script will work with the jackpot games as well... a place where eliminating numbers could be more helpful!

Entry #390

What the hot and cold data is showing

After running many passes with varying short term durations one thing is clear. The most popular classification is Neutral, Neutral, Neutral. Regardless of whether I classify over 7 draws or 700, NNN is the most common, not even close with any other set.

Why this is problematic... with one standard deviation above and below classified as hot or cold, the numbers that are closer to expectancy far outnumber those that hit H or C.

Adding the limited set of follower data, I hope to make it possible to have a better idea of which 3 neutral numbers would appear together within 7 draws. When the classification set is limited to 7 draws, there have been as many as 4 out of 7 showing up as NNN. The basic premise being that picking 3 neutral numbers would yield a better shot at a hit than incorporating the hots or colds... the playable set is reduced by eliminating the "outliers". Add to that picking from these many neutral numbers by cross referencing the frequency of followers might yet yield a winning combo.

It remains to find the sweet spot for draw numbers, for the X sample base that the hot and cold are derived from AND the shortest period of follower draws to get reliable data.

The current set uses 1,000 draws for follower data and 150 draws for hot-cold calculation.

The play system to accompany would be to pick one NNN combo and play 50/50 for 7 days.

Not quite ready for a live test, but getting ever closer.

Entry #389

Another coding session...

Today I altered the follower script to both allow for a draw offset (so it can match the offset of the hot/cold script) and to limit the draws to the number entered so it can deal with shorter term trends as well. It was a bit more difficult than anticipated because the follower program is a bit more complex. Getting the function to run with a limited set of draws and make it ignore the offset when set to 0 was one of the easier parts. The code ran every time, but the offsets were wrong to start (Pandas iLoc hell) and there were remnants of an older experiment in there causing the mismatch. Since I control the input file which is cleaned and free of blanks, there was no need to keep the logic that was dealing with NaNs due to uneven column lengths. Works as expected and now reduced to 70 lines of code!
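The offset and draw-limit logic can be sketched with plain list slicing; the same start and end positions would work with Pandas `iloc`. This is not the actual 70-line script. It assumes the history is in date-ascending order, and the function name is hypothetical.

```python
def draw_window(history, size, offset=0):
    """Return (window, held_back).

    history: draws in date-ascending order (oldest first).
    size: number of draws to analyze.
    offset: number of most recent draws to hold back for checking.
    """
    end = len(history) - offset           # offset 0 means "up to the latest draw"
    start = max(0, end - size)
    return history[start:end], history[end:]

history = list(range(1, 21))              # stand-in for 20 draws, oldest first
window, held_back = draw_window(history, size=5, offset=7)
print(window, held_back)
```

Computing `end` explicitly sidesteps a common slicing trap: a negative-index slice such as `history[-offset:]` returns the whole list when the offset is 0.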

Now the fun part, to splice the function together with the follower script and look at results to see if any new information can be gained.

So now it gets complex. After many runs I know that the most frequent follower does not necessarily come up in the next draw. Using the short term follower setup, we can restrict the follower data to more recent trends. The hot and cold script will use pure statistics to classify hot and cold numbers. Putting the information together will hopefully lead to a better pick.

The process, look for recurring HNC (hot neutral cold) patterns, then cross reference them with those numbers on the follower list. If it is a recurring pattern of NNN, pick the Ns that did best on the follower lists per column... then play that combo for a week.
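A bare-bones follower count for one column might look like this. It assumes a follower is simply the digit drawn in the same column on the very next draw, which may not match the full follower script. The digits are made up.

```python
from collections import Counter, defaultdict

def follower_counts(column):
    """Map each lead digit to a Counter of the digits drawn right after it.

    column: one position's digits in date-ascending order.
    """
    followers = defaultdict(Counter)
    for lead, nxt in zip(column, column[1:]):
        followers[lead][nxt] += 1
    return followers

column = [6, 2, 9, 5, 3, 2, 1, 2, 9]      # made-up digits, oldest first
table = follower_counts(column)
last_digit = column[-1]
print(table[last_digit].most_common())    # followers seen after the latest digit
```

To cross reference with the classifier, the ranked followers would be filtered down to the digits currently labeled N in that column.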

Because of the offset, I can see the short term data available AND its effect against the next number of draws offset. Then, by setting the offset to 0, I can work with all the current info available to make a pick. 

Difficult process that again may amount to nothing, but making the changes was enjoyable. I tried as much as possible over the last year or so to adhere to software engineering best practices of modular reusable code and atomic functions, so this is the chance to put it into practice.

Entry #388

Why do I just not "get it" that the lotteries are unpredictable?

It has been over 20 years chasing the impossible, generating one pick for a chance at a straight hit... pick 2 to power ball, it does not matter...

Systems, statistics, programs, spreadsheets... and in reality, the few that had a hit were coincidental at best.

So far this journey went from having ideas that I did not know how to implement to being able to write scripts and build spreadsheets to do just about everything but win on a regular basis.

I never was about wheels or trapping because I am too cheap of a player. In most cases I would play one number on one game because any more cash output and it would not be fun anymore.

Granted, I learned more about coding and problem solving with this hobby than I learned taking classes, so it was certainly not all wasted time, but...

What is it that motivates continued work towards what I am sure is an impossible problem? I have walked away for years at a time only to wind up updating severely outdated history files for another try.

Maybe it is an ego thing? I can't be too dumb to figure this out, or can I? I have gained more experience through this foolish quest in spreadsheets and now Python, so it was not ALL wasted time, I just don't get why I don't just finally admit what I want to do can't be done and move that time to a different hobby... I would probably be a better guitar player for sure.

As I write this, I am about to fire up the computer and go back to the Hot/Cold script to make sure it does not encounter a "division by zero" error, that was a checklist item. The worst part is that I can generate tons of data that I am unsure how to interpret...

I am sure any of us who try to solve the lottery problem have low moments like this... what keeps you going?

Entry #387

The latest output of the Hot Cold Classifier

The standard deviation of column A is 0.52.
The standard deviation of column B is 0.57.
The standard deviation of column C is 0.56.

Distribution counts of 3458 draws for each column:
Value        A        B        C
0    350 N (10.12%)    347 N (10.03%)    347 N (10.03%)
1    338 N (9.77%)    338 N (9.77%)    324 C (9.37%)
2    332 N (9.6%)    337 N (9.75%)    368 H (10.64%)
3    325 C (9.4%)    330 N (9.54%)    321 C (9.28%)
4    389 H (11.25%)    343 N (9.92%)    352 N (10.18%)
5    337 N (9.75%)    395 H (11.42%)    369 H (10.67%)
6    351 N (10.15%)    354 N (10.24%)    362 N (10.47%)
7    365 H (10.56%)    315 C (9.11%)    336 N (9.72%)
8    339 N (9.8%)    353 N (10.21%)    316 C (9.14%)
9    332 N (9.6%)    346 N (10.01%)    363 N (10.5%)


Final classifier count summary:
A: 2 H - 7 N - 1 C
B: 1 H - 8 N - 1 C
C: 2 H - 5 N - 3 C

Classifications for the last 7 draws:
6 N    6 N    2 H
2 N    1 N    6 N
9 N    4 N    0 N
5 N    2 N    3 C
3 C    3 N    6 N
2 N    8 N    5 H
1 N    5 H    7 N

 

The output now adds both calculating the standard deviation and displaying it for each column. I no longer set the Hot and Cold thresholds by passing arguments, but rather by direct calculation. The number of draws (3,458) is interesting: for a discrete uniform distribution with a 10% expectancy, this is the sample size you would need for a confidence level of 95% with a 1% margin of error. I am not exactly sure why I chose to use the Z score to come up with that number, but this script WAS written for experimentation.
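One calculation that reproduces the 3,458 figure is the standard sample size formula for a proportion, n = z² · p(1 − p) / E². The sketch assumes p = 0.10 (the digit expectancy) and E = 0.01; whether this is exactly how the number was first derived is a guess.

```python
from math import ceil
from statistics import NormalDist

def sample_size(p=0.10, margin=0.01, confidence=0.95):
    """Draws needed to estimate a proportion p within +/- margin."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return ceil(z * z * p * (1 - p) / (margin * margin))

print(sample_size())   # 3458
```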

Look how low the standard deviation got as the number of draws increased... at under 100 draws, it was giving a much higher standard deviation. Here is the run for 90 draws...

The standard deviation of column A is 2.48.
The standard deviation of column B is 3.44.
The standard deviation of column C is 2.77.

Distribution counts of 90 draws for each column:
Value        A        B        C
0    9 N (10.0%)    9 N (10.0%)    6 C (6.67%)
1    12 H (13.33%)    10 N (11.11%)    10 N (11.11%)
2    9 N (10.0%)    9 N (10.0%)    9 N (10.0%)
3    13 H (14.44%)    5 C (5.56%)    11 N (12.22%)
4    11 N (12.22%)    12 N (13.33%)    7 N (7.78%)
5    6 C (6.67%)    14 H (15.56%)    12 H (13.33%)
6    9 N (10.0%)    6 N (6.67%)    10 N (11.11%)
7    7 N (7.78%)    7 N (7.78%)    4 C (4.44%)
8    7 N (7.78%)    5 C (5.56%)    9 N (10.0%)
9    7 N (7.78%)    13 H (14.44%)    12 H (13.33%)


Final classifier count summary:
A: 2 H - 7 N - 1 C
B: 2 H - 6 N - 2 C
C: 2 H - 6 N - 2 C

Classifications for the last 7 draws:
6 N    6 N    2 N
2 N    1 N    6 N
9 N    4 N    0 C
5 C    2 N    3 N
3 H    3 C    6 N
2 N    8 C    5 H
1 H    5 H    7 C

Still trying to find that sweet spot to get just the right amount of variance...
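The seven classified draws from the 90-draw run above can be tallied by profile with a few lines. A small sketch; the names are hypothetical.

```python
from collections import Counter
from itertools import product

ALL_PROFILES = ["".join(p) for p in product("HNC", repeat=3)]   # 27 profiles

def tally_profiles(rows):
    """Count how often each three-letter H/N/C profile occurs."""
    return Counter("".join(row) for row in rows)

# Labels of the last 7 draws from the 90-draw run above.
last7 = ["NNN", "NNN", "NNC", "CNN", "HCN", "NCH", "HHC"]
tally = tally_profiles(last7)
print(len(ALL_PROFILES), tally["NNN"])   # 27 2
```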

Entry #386

So many decisions for one script! Thought hot/cold would be easier...

As the calculation of the standard deviation is almost complete, the need becomes apparent to create a back test, but at what intervals?

There are 2 that move to the front of the pack right away...

1. Process the data in chunks so the function is fed X+Y draws so the chunks are larger but separated by X+Y or...

2. Process the data by chunking only Y draws back. This would allow the process to be back tested and provide a full output of observed HNC patterns for the entire draw history (to X+Y from the origin draw)

The second option would be a more thorough exploration data set to find common HNC classification patterns because it only deals with data at a time that you would NOT have known beforehand, no a priori knowledge, which is ideal because moving forward you would not have that information.

This would simplify the process of getting a pick by simply counting the most frequent HNC patterns. All that would need to be recorded are the Y classifications, which would match the draw history.
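One reading of the second option, sketched as a generator: classify on X draws, label the next Y draws, then step forward by Y and repeat until fewer than X+Y draws remain. The function name is hypothetical and the history is a stand-in list in date-ascending order.

```python
def backtest_chunks(history, x, y):
    """Yield (base, following) pairs, stepping forward y draws at a time.

    history: draws in date-ascending order.
    base: the x draws used to classify; following: the next y draws to label.
    """
    start = 0
    while start + x + y <= len(history):      # stop when fewer than x+y draws remain
        base = history[start:start + x]
        following = history[start + x:start + x + y]
        yield base, following
        start += y

history = list(range(30))                     # stand-in for 30 draws
chunks = list(backtest_chunks(history, x=10, y=7))
print(len(chunks), chunks[-1][1])
```

Because each step only ever looks at draws that came before the ones being labeled, no future information leaks into the classification.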

A few challenges to overcome...

1. Counting the remaining chunk sizes to ensure there are X+Y draws remaining to process. A simple count of the remaining data frame rows should handle this.

2. Writing the Y classifications to a CSV file, because this requires buffering the output of each column and writing complete rows after all columns have run. This is mostly solved in the output, but I have to ensure it writes the data appropriately so that it is the same date ascending order of the original data.

3. Figuring out the HNC to play, as the classifications are mostly a one to many relationship... there could, for example, be 3 Hots in a column, so which Hot to play?

4. Refactor the input arguments. I will no longer need to specify the hot & cold threshold percents, so perhaps input the expectancy so the calculated standard deviation can be added and subtracted from it to set the thresholds.

And that is just for the pick N games... followers are a future add...

Busy Hobby!

Entry #385

Lottery results from the statistics point of view

Lottery results are what statisticians call a "Discrete Uniform Distribution". The premise that each value in a discrete ( limited membership, like 0 through 9) set has an equal chance of appearing makes the graph of results different from, say, a Bell Curve.

For this distribution, I am treating one standard deviation as statistically significant. The standard deviation is simply the square root of the variance, which is a measure of how much each data point sits away from the central tendency (the mean).

For the purpose of the difference between Hot, Cold and Neutral classification, the expectancy is that each digit should appear 10% of the time in a pick N game where the set is (0, 1, 2, 3, 4, 5, 6, 7, 8, 9).

In the development of the Python script, I went with a gut feeling of >= 12% to be Hot and 8% or less to be Cold, with anything in between to be classified as neutral.

After a few runs of the script, I took the distribution counts and placed them into R Studio to run some simple tests on the standard deviation and found that it was usually between 2.5 and 3. I was not far off! So, it is relatively simple to calculate standard deviation at run time using the stats library or Pandas. The next update will be incorporating this functionality into the script. By doing this, I would no longer be unsure of the statistical significance of the Hots and Colds, as it would be correct for each column... one standard deviation or more above the expectancy for HOT and one standard deviation or more below it for COLD.

Here is the interesting part... when taking a sample of draws, the larger X becomes (X being the number of past draws), the more the results occurred nearer their expectancy. This would result in fewer overall Hots or Colds and more Neutrals. The bottom line is that too few draws produces volatile variance and too many produce too steady of a variance.
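The shrinking spread matches what the binomial model predicts. If each digit truly has a 10% chance, the standard deviation of its observed share over X draws is sqrt(p(1 − p) / X). The sketch below just evaluates that formula in percentage points; it is a theoretical yardstick and assumes a fair, independent draw.

```python
from math import sqrt

def expected_percent_sd(draws, p=0.10):
    """Standard deviation, in percentage points, of one digit's observed share."""
    return 100 * sqrt(p * (1 - p) / draws)

for draws in (90, 150, 3458):
    print(draws, round(expected_percent_sd(draws), 2))
```

Because the spread falls with the square root of X, quadrupling the number of draws only halves the standard deviation.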

What does this mean? I need to find the "sweet spot" of the number of draws where Hots and Colds are produced. That yet unknown range where there is just enough volatility in the variance to have a shot at gaining actionable output.

So, no finish line yet, but progress is being made!

Entry #384

The expectancy for numbers in a 6/49

The pick N games are simple: there are 10 possible digits that can be selected in each position, each having an equal chance, so each digit has a 10% chance, and therefore a 10% expectancy over any number of samples. Having looked at this through several runs of the hot/cold script, this mostly holds true. While the largest group of digits between 0 and 9 hold around 10%, the hots exceed 12% and the colds are under 8%.

So what if I were to use the program for a run at the PA Match 6? That is a classic 6/49 game.

The expectancy changes!

In a 6/49 game in sorted order, each column has 44 possible numbers, and 5 that cannot appear in that column when sorted. With 44 possible numbers, the expectancy (expressed as a percent) would be (1÷44)×100 = 2.27%
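One caution on the 2.27% figure: it treats all 44 candidates in a column as equally likely. In a sorted draw that may not hold, because low numbers are far more likely to land in the first column. The sketch below compares the flat figure with the exact chance that a number k is the lowest of 6 drawn from 49, which is C(49 − k, 5) / C(49, 6). The function name is hypothetical and only the first column is covered.

```python
from math import comb

def lowest_ball_probability(k, pool=49, picks=6):
    """Chance that k is the smallest number in a sorted draw of `picks` from `pool`."""
    return comb(pool - k, picks - 1) / comb(pool, picks)

possible = range(1, 49 - 6 + 2)                  # 1..44 can land in the first column
flat = 100 / len(possible)                       # the flat 2.27% figure
exact = [100 * lowest_ball_probability(k) for k in possible]
print(round(flat, 2), round(exact[0], 2))        # 2.27 12.24
```

So a per-number expectancy, rather than one flat value per column, may be needed before the hot and cold thresholds mean the same thing in a sorted jackpot game as they do in the pick N games.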

If I run the program as written, it would classify every number as cold! Also, if a number were picked 0 times in the selected window, the program would crash with a "division by zero" error.

All I need to do to make the same program safe to run is a simple if statement that will only process when the count of the digit is greater than 0, otherwise output "0%". The other step would be to auto range the expectancy and calculate the hot & cold thresholds at run time, depending on the range of data in each column of the history file. This will mean more work, but ultimately more flexibility as the expectancy for the 5 white balls and the bonus ball in both MM and PB are different.

One program to do it all...

Entry #383

Always more questions than answers, but short term trends can be back tested

As the integration of a limited set of follower distributions is added to the hot/cold script, it becomes important to try and select the proper sample sizes from the whole history files...

It is one thing to calculate by counting, another entirely to learn how to use the information output... and it becomes obvious that no single system is a magic bullet.

Sticking with the core concept of frequency: the short term hot and cold data is based on frequency, and the short term followers are based on frequency.

Once the code is written for initial operation (I am down to implementing the limited draw number and Y offset for the follower function) the difficult part does begin... creating a back test to see how this program plays out across the entire draw history...

I have the requisite building blocks in place, but the most important variable will be the back test offset. This will determine the number of draws to repeat the existing script at different points in time. I feel that the largest data set size, the number of short term followers to collect, will be the deciding factor. Then there is collecting this data in a csv file for further analysis. Coding will proceed anyway, but here are the burning unanswered questions left to answer....

How many draws are required to get an accurate measure of followers? 100, 1,000?

How many draws will it take to get a clear picture of repeatable hot and cold trends?

Is it possible to calculate the statistical significance of the distribution and assign the thresholds of hot and cold based on standard deviation? << actively working on this one.

I think I will be off of playing for quite some time while figuring this one out... 

In other news, while calculating the "most frequent" combo for the PA mid day, it was determined to be 198, that number came out straight yesterday and of course I was NOT playing it... such is life...

Entry #382