The real target is "most of the time"


The problem with predictive statistics is that your analysis holds "most" of the time, but not always. If it always holds, it is usually too general; if it rarely holds, it is most likely overfit to too narrow a scope.

The follower script currently stands here: I have noticed that, most of the time, the next number falls within one standard deviation of the center of the distribution list (up or down, for a full range of two standard deviations). That does not eliminate the fringe cases where the most frequent or least frequent numbers are selected. Maybe "most" is the best we can hope for...
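To make the one-standard-deviation idea concrete, here is a rough sketch of how such a filter might look. It is not the follower script itself. It assumes the "distribution list" is a count of how often each digit has appeared in one column, and that the "center" is the mean of those counts. The names `candidates_within_one_sd` and `hit_rate` and the `warmup` parameter are made up for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def candidates_within_one_sd(history, digits=range(10)):
    """Return digits whose frequency is within one SD of the mean frequency.

    Assumption: 'history' is the list of past draws for a single column.
    """
    counts = Counter(history)
    freqs = {d: counts.get(d, 0) for d in digits}
    values = list(freqs.values())
    center = mean(values)      # center of the frequency distribution
    spread = pstdev(values)    # population standard deviation of the counts
    return [d for d in digits if abs(freqs[d] - center) <= spread]


def hit_rate(history, warmup=20, digits=range(10)):
    """Walk forward through the history and measure how often the next
    draw landed inside the candidate set built from the draws before it."""
    hits = trials = 0
    for i in range(warmup, len(history)):
        if history[i] in candidates_within_one_sd(history[:i], digits):
            hits += 1
        trials += 1
    return hits / trials if trials else 0.0
```

The filter drops the most and least frequent digits by construction, which is exactly where the fringe cases live. Running `hit_rate` over real draw history would put a number on "most of the time".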

Still looking for other clues in the statistics, but it did not take long to find a starting point.

I will eventually have to find a way to automate the analysis, as the slow manual process now involves poring over 16 separate data sets, but that is where the discoveries happen.

As the current reduction results in about 3 numbers per column, a 3x3 matrix of picks (3 candidates in each of 3 columns, so 3 x 3 x 3 = 27 combinations) would cost $27 to play. If it guaranteed a hit, it would be a no-brainer to stop here, but that goes against the original design goals, so the studying continues...
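The $27 figure comes from playing every combination of the reduced picks. A small sketch of that arithmetic follows, assuming three columns and $1 per play (inferred from 27 combinations costing $27). The function name `enumerate_plays` and the digits in the usage example are placeholders, not real picks.

```python
from itertools import product


def enumerate_plays(column_candidates, cost_per_play=1.00):
    """List every combination of one candidate per column, plus total cost.

    Assumption: each combination is one play at 'cost_per_play' dollars.
    """
    plays = list(product(*column_candidates))
    return plays, len(plays) * cost_per_play


# Placeholder candidates: 3 numbers for each of 3 columns.
plays, cost = enumerate_plays([[1, 4, 7], [0, 2, 9], [3, 5, 8]])
print(len(plays), cost)
```

The cost grows as the product of the candidate counts, so trimming even one column from 3 numbers to 2 cuts the ticket count from 27 to 18.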

