Original Blog Entry: The thought behind choosing draw samples

hypersoniq — Tue, 08 Apr 2025 18:44:05 GMT

One fact about the discrete uniform distribution formed by lottery results in a pick 3 (or pick 2/4/5) is that each digit has a 10% chance of being selected in each position.

150 draws is the sample size I use for classification of raw frequency data. Why? Because given the fact above, each digit is expected to be drawn 15 times in each position (150 x 10%). That is only the expectancy; the deviation is easy to see by how far above or below 15 any digit has actually been drawn.
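As a rough illustration of that count-versus-15 comparison, here is a minimal sketch. It is not the actual script: the function names `position_counts` and `deviations` are made up for this example, and the draws are simulated with a seeded random generator rather than taken from real results.

```python
import random
from collections import Counter

def position_counts(draws, position):
    """Return 10 counts: how often each digit 0-9 appeared in one position."""
    seen = Counter(draw[position] for draw in draws)
    return [seen[d] for d in range(10)]

def deviations(counts, expected):
    """Signed distance of each digit's count from the expected count."""
    return [c - expected for c in counts]

# Simulated draws stand in for real history (assumption, not actual results).
rng = random.Random(0)
sample = [tuple(rng.randrange(10) for _ in range(3)) for _ in range(150)]

expected = len(sample) / 10  # 150 draws at 10% per digit = 15.0

for pos in range(3):
    counts = position_counts(sample, pos)
    print(f"position {pos}: counts={counts}")
    print(f"            vs {expected:.0f}: {deviations(counts, expected)}")
```

The deviations in each position always sum to zero, since every draw adds exactly one count to some digit.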

1,500 is the sample size for follower frequency data. Why? Because followers are only based on the last draw. Any given digit is the last-drawn digit in about 150 of the 1,500 draws, so each follower digit is still expected to appear 15 times. The deviation is still easy to spot.
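A minimal sketch of how follower counts could be tallied is below. It assumes a "follower" means the digit drawn in the same position on the very next draw, and that the draw list is ordered oldest first. Both are assumptions for illustration, and `follower_table` is a hypothetical name, not the real function.

```python
import random

def follower_table(draws, position):
    """table[prev][nxt] = times digit nxt followed digit prev in this position."""
    table = [[0] * 10 for _ in range(10)]
    # Pair each draw with the one after it (draws assumed oldest first).
    for prev, nxt in zip(draws, draws[1:]):
        table[prev[position]][nxt[position]] += 1
    return table

# Simulated draws stand in for real history (assumption, not actual results).
rng = random.Random(1)
history = [tuple(rng.randrange(10) for _ in range(3)) for _ in range(1500)]

table = follower_table(history, position=0)
last_digit = history[-1][0]   # only the last draw decides which row matters
row = table[last_digit]       # each entry is expected to sit near 15

print(f"last digit in position 0: {last_digit}")
print(f"follower counts: {row}")
```

With 1,500 draws there are 1,499 draw-to-draw pairs, so each row holds roughly 150 of them spread over 10 possible followers.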

Since each set, though counting a different metric, is expected to show 15 draws per digit, the two can be compared on the same footing: 15 expected appearances on the same 10% expectancy.

If I wanted to observe more or less, the variables would have to be adjusted. For 10 expected appearances they would be set to 100 and 1,000 and for 30 they would be set to 300 and 3,000.
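The relationship between the two sample sizes can be written as a small helper. This is only a sketch of the arithmetic described above; `sample_sizes` is a hypothetical name.

```python
def sample_sizes(expected_hits, digits=10):
    """Return (raw_draws, follower_draws) for a target expected count per digit."""
    raw = expected_hits * digits                # each digit has a 1-in-10 chance
    follower = expected_hits * digits * digits  # 1-in-10 again, given the last digit
    return raw, follower

for target in (10, 15, 30):
    raw, follower = sample_sizes(target)
    print(f"expect {target} per digit: raw={raw}, follower={follower}")
```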

The number of classification draws is set to see recurring patterns in as short a period as possible. In the pick 3 evening draw in PA, that works out to 7 draws, which also coincides with how many draws they allow for advanced play.

In order to back test, I simply increase this variable, but only consider the first 7 classification results. This is the key to automation should I want to profile the entire draw history.

The mid day game has an observed period of 8 draws. While most NNN draws are found within 7, that does not always hold the way it does for the evening draw.

Modifying the follower function to produce output similar to the raw frequency function is nearly done.

Since neutrals are usually 70% of the observed frequency, the reduction in possible pick 3 combos is 7x7x7 = 343, thereby safely eliminating 657 straight combos... if extended to the pick 5, using the average of 70%, 7x7x7x7x7 = 16,807, eliminating 83,193 of the possible 100,000 combos... but is it done safely?
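The combo arithmetic is easy to check with a couple of lines. This sketch assumes exactly 7 of the 10 digits are kept in every position, which is the 70% average rather than a guarantee; `remaining_and_eliminated` is a hypothetical name.

```python
def remaining_and_eliminated(kept_per_position, positions, digits=10):
    """Return (combos left, combos eliminated) for straight plays."""
    total = digits ** positions
    remaining = kept_per_position ** positions
    return remaining, total - remaining

for positions in (3, 5):
    left, gone = remaining_and_eliminated(7, positions)
    print(f"pick {positions}: {left:,} remain, {gone:,} eliminated")
```

For the pick 3 this gives 343 remaining and 657 eliminated; for the pick 5 it gives 16,807 remaining and 83,193 eliminated.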

It depends on the NNNNN period in the classification... if it is longer than 7 to 14 days, then not so much. I am not in full development of the pick 5 application of this script, only focusing on the pick 3 until something works.

Up next is finishing the modification of the follower function to give output similar to the raw function. I already have the standard deviation and quartile output.
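For anyone curious what standard deviation and quartile output might look like for one position, here is a minimal sketch using Python's standard `statistics` module. It is not the script's actual output: the digits are simulated, and `binomial_sd` is a hypothetical helper that gives the textbook spread for a single digit's count, sqrt(n x p x (1 - p)), as a yardstick.

```python
import math
import random
import statistics
from collections import Counter

def binomial_sd(n_draws, p=0.10):
    """Theoretical standard deviation of one digit's count over n_draws."""
    return math.sqrt(n_draws * p * (1 - p))

# Simulated single-position digits (assumption, not actual results).
rng = random.Random(2)
digits = [rng.randrange(10) for _ in range(150)]
seen = Counter(digits)
counts = [seen[d] for d in range(10)]

spread = statistics.pstdev(counts)              # spread of the 10 digit counts
q1, q2, q3 = statistics.quantiles(counts, n=4)  # three quartile cut points

print(f"counts: {counts}")
print(f"observed SD: {spread:.2f}, theoretical SD: {binomial_sd(150):.2f}")
print(f"quartiles: Q1={q1}, median={q2}, Q3={q3}")
```

For 150 draws the theoretical figure is about 3.67, so a digit sitting 3 or 4 away from 15 is well within ordinary chance.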

Then the real fun begins, finding ANY correlations between the 2 data sets... with the full knowledge that there may not be any at all...