hypersoniq's Blog

Next dev task, adding followers to the hot/cold script.

This next project will simply add the entire follower function to the existing hot/cold script, with the new ability to specify how many draws to check for followers AND an offset that matches Y, so results can be checked. The same behavior for 0 (an offset of 0 gives current, playable data) will apply as well.

I feel capturing follower data in a shorter term would be a useful addition, as the hot and cold data could help pinpoint the next follower.

The other nice feature is that it will work with any integer data I pass it, so that includes jackpot game histories as well... only possible because of the "per column" design pattern I implemented.

That may be the long range goal, mix in all of the old scripts and combine the information they provide.

This update will be a difficult task comparatively because the follower script is far more complex than the hot/cold script. The money saved by not playing while in development is worth the effort, even with how cheap I play!

Entry #381

How the hot/cold data is verified correct...

It is actually quite simple with a spreadsheet.

1. Were the last Y draws correct and in the right order? They should match exactly the last Y draws on the spreadsheet history.

2. Were the distribution counts accurate? Select the X draws above the last Y draws on the spreadsheet and manually check the counts. They were equal.

3. Were the hot/cold thresholds applied correctly? The equation is 

(D/X)*100

Where D is the distribution count of the digit in question and X is the value of X, which is the total number of draws involved in the count.

For example, in the previous post the first column picked a number that was drawn 8 times as a HOT (criteria being the number of times appearing is >= 12%) so that looks like 

(8/60)*100 = 13.33333 %

Which is greater than 12% so a valid HOT by the criteria.
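The same check can be scripted against the published output. The following is a minimal sketch, not the actual script: the counts and labels are copied from the column A distribution in the previous post, and the helper name `classify` is hypothetical. It compares `count * 100` against `threshold * X` with integers, which is the same test as (D/X)*100 but avoids floating point rounding right at a boundary.

```python
# Column A over 60 draws, copied from the published output: digit -> (count, label).
published = {
    1: (8, "H"), 4: (8, "H"), 0: (7, "N"), 2: (7, "N"), 6: (7, "N"),
    8: (7, "N"), 3: (5, "N"), 5: (4, "C"), 9: (4, "C"), 7: (3, "C"),
}

def classify(count, x, hot_pct=12, cold_pct=8):
    """Hypothetical helper: H if (count/x)*100 >= hot_pct, C if <= cold_pct, else N."""
    if count * 100 >= hot_pct * x:
        return "H"
    if count * 100 <= cold_pct * x:
        return "C"
    return "N"

x = 60
total = sum(count for count, _ in published.values())  # should equal X
mismatches = [d for d, (count, label) in published.items()
              if classify(count, x) != label]

print("counts add up to X:", total == x)
print("labels that disagree with the thresholds:", mismatches)
```

An empty mismatch list means every printed H, N and C agrees with the 12% and 8% thresholds.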

That is one thing that makes coding difficult: validating that what the program is actually doing matches what you expected it to do. If the program ran, it was free of syntax errors, but only through testing and validation can you be sure it is free of semantic errors... those are a prime source of "bugs" that produce unexpected and erroneous output.

I am definitely a fan of the Pandas library for Python, it is a huge time saver to use a data frame to hold data. It is definitely suited to help in the work those of us do trying to solve impossible problems by manipulating data trying to win a game that has a massive house advantage... maybe one day...

Entry #380

The output of the Hot/Cold analysis script.

When running the program on the current Pennsylvania pick 3 evening data, here is the output of the current script, with the launch settings as ProfileHotCold("rawP3E.csv", 60, 10, 12, 8) ...

Distribution for column A over 60 draws:
1 8 H
4 8 H
0 7 N
2 7 N
6 7 N
8 7 N
3 5 N
5 4 C
9 4 C
7 3 C


Distribution for column B over 60 draws:
9 9 H
0 8 H
4 8 H
2 6 N
5 6 N
7 6 N
1 5 N
8 5 N
6 4 C
3 3 C


Distribution for column C over 60 draws:
1 8 H
3 8 H
2 7 N
5 7 N
8 7 N
9 7 N
6 5 N
0 4 C
4 4 C
7 3 C


Final classifier count summary:
A: 2 H - 5 N - 3 C
B: 3 H - 5 N - 2 C
C: 2 H - 5 N - 3 C

Classifications for the last 10 rows (side by side):
5 C    5 N    8 N
6 N    5 N    6 N
3 N    7 N    6 N
6 N    6 C    2 N
2 N    1 N    6 N
9 C    4 H    0 C
5 C    2 N    3 H
3 N    3 C    6 N
2 N    8 N    5 N
1 H    5 N    7 C

 

A few things of note... this was not a script for general use, hence the lack of headers in the distribution counts, but here is how that works...

example: 1 8 H 

1 is the digit, 8 is the number of times it came out (frequency, the distribution is sorted by frequency, descending), H is the classifier because it was >= 12% of the 60 draws in X (13.33% actually)

The classifier count is per column, with A being the first column of results. It is a quick visual summary of the 3 distribution columns above it.

Also notice the frequency of patterns in the last 10 draws... N being the most common overall, with 4 draws with an N N N profile and 2 with the pattern N C N... it's as if the hot ones are less of a factor.
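For anyone who wants to tally those patterns rather than eyeball them, here is a small sketch using only the classifier letters from the 10 rows printed above (the digits are dropped, and the variable names are made up for the example).

```python
from collections import Counter

# Classifier letters for the last 10 rows, read left to right from the output above.
rows = ["CNN", "NNN", "NNN", "NCN", "NNN", "CHC", "CNH", "NCN", "NNN", "HNC"]

profile_counts = Counter(rows)           # how often each 3-letter profile appears
letter_counts = Counter("".join(rows))   # how often each single classifier appears

for profile, n in profile_counts.most_common():
    print(profile, n)
print(dict(letter_counts))
```

Across the 30 positions the tally comes to 20 N, 7 C and 3 H.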

I wrote it so that if I enter 0 for Y, it skips classifying any draws, because in that case the goal is to have the most current data profiled to make a pick.

Not sure what good it will do, but it was another idea turned into code, so on that front, it was already a win.

Entry #379

First impressions of the new Hot/Cold Python script...

As I cleaned up the output to give the needed info in a readable format, I had noticed a few things...

1. Using a percentage of total draws means the digit suffix (H1, H2 etc...) is not needed, as the numbers that meet either the hot or cold threshold can change. For instance, in the evening pick 3, when running with 35 draws for X, there was a column with 0 hots and only one cold. That means the numbers tended to fall within a tight tolerance of their 10% expectations. 12% or greater for Hot and 8% or less for Cold may be too much... I made it so that I could pass those in when calling the function.

2. Running on the pick 5 mid day with 50 for X showed an all Hot draw halfway through the Y draws (10)

3. The most common classifier is Neutral, that is, a frequency less than 12% and greater than 8% of the draws. Perhaps a run should be made with 11% and 9%?

4. I wanted a raw count of H N and C for each row, so I wrote a few more lines of code to display that. This is how you can quickly see that the Ns were dominant.

5. Trends change, the H gives way to N and C the larger Y is set... this could be the way to determine ideal X range, however. I am going to run a set with X=50 and Y=50 to see what patterns emerge (if any)

6. This is only an aggregator, like odd/even or high/low... still need a way to narrow down to a single pick.

Although I was thrilled to go from a blank screen to working code in a few hours, I did invest much time over the last week or so thinking about what I wanted the program to do and planning it out.

The addition of follower counting in a shorter term might be the next add in to this... if a recurring pattern is found, it would be nice to cross reference the follower distribution... if a pattern of HHH emerges, the counting script gives the HNC numbers that can be selected from the follower script distribution lists... such that if 2 is a hot number, and 2 is not very high on the follower list, it might be the indicator needed to pick that number.

That follower script is already functional, and probably not worth integrating into this code since I can simply run both and compare the output screens side by side. I just need to add a passable offset parameter to the follower script so it does not look at ALL of the draws... pandas iloc[] was made for such a task... probably less than 5 lines of code.
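A rough sketch of what such an offset could look like. A plain list stands in for the draw history so the example runs without a CSV, and the function name `recent_window` is hypothetical. A positional slice with iloc[] on a data frame takes the same start and stop values.

```python
def recent_window(draws, count, offset=0):
    """Return `count` consecutive draws that end `offset` draws before the newest one.

    Hypothetical helper: `draws` is ordered oldest to newest.
    """
    stop = len(draws) - offset
    start = stop - count
    if offset < 0 or count < 0 or start < 0:
        raise ValueError("not enough draws for this count and offset")
    # With a pandas data frame the equivalent would be draws.iloc[start:stop]
    return draws[start:stop]

history = list(range(100))               # stand-in for 100 draws, oldest first
print(recent_window(history, 10, 5))     # 10 draws, skipping the newest 5
print(recent_window(history, 10))        # offset 0: the 10 newest draws
```

The stop position is computed explicitly because a negative slice such as draws[-15:-0] is the same as draws[-15:0], which comes back empty when the offset is 0.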

Maybe Thursday will be a good day to make an attempt at that theory... a paper play to test the concept...

Happy Coding!

Entry #378

Planning for edge cases when coding.

As I am getting ready to start coding the hot/cold script, I realize that after seeing these distributions before, there is a real possibility of numbers at a boundary (such as the 3rd hot number and the first neutral number) having the same frequency. For example, in X draws a 4 is drawn in a column 6 times, but a 7 is also drawn 6 times... if the 4 is H3 and the 7 is N1 then the hot/neutral designators don't really apply... 

In this situation, I wonder if it would be best to "grow" the neutral zone for that situation, which would result in H1, H2, N1 ... N5, C1, C2, C3. Likewise for the barrier of N and C... 

The other solution would be to calculate the percentage, such that a number needs to be greater than its expectancy of 10% of the draws to be classified as HOT, and below its expectancy of 10% to be counted as COLD.

Maybe H would be >= 12%, C would be <=8% and all others default to neutral... This solution could also completely cut a category if none of the hots or colds reach their respective thresholds.

This is definitely a programming life cycle thing, spend weeks planning so the relatively small amount of time spent coding has the best chance of success... programs can complete because they are free of syntax errors, but the results may not be useful if there were any semantic errors that you fail to plan for...

Entry #377

The end goal of the hot/cold script

The glaring omission in the original follower script was that there is no taking into account any trends.

I would look at the distribution list for each column and have zero clue which one in the list would be the next out.

The goal here, outside of learning about short term trends, is to eventually integrate this into the follower concept, adding the Hot/Neutral/Cold to the distribution output, and perhaps use a shorter term for the follower count to also capture the more recent followers rather than looking at the entire game history.

The ultimate goal being to put together many of the ideas into one program to give the best guess possible. And sticking to column-at-a-time allows the flexibility to analyze any of the pick N games by simply pointing to a different csv file.

But one thing at a time... finally get a day off tomorrow to move this hot cold idea into executable code.

Entry #376

Variable term hot/cold analysis script, concept notes

The script itself is nothing I have not used in other scripts, so I don't expect any protracted length of time in coding. Following the similar format of reading from a csv file... the planned work flow...

Add X and Y as passed-in parameters to determine how many draws to use.

Read csv input into a pandas data frame because it is incredibly powerful to calculate offsets.

Take the last Y draws and put them into a list, in order. 

Jump X + Y draws (a variable I will label "depth") back and count the frequency distribution of all digits 0 to 9 for X draws.

Assign rank by requesting the distribution from the default pandas statistics functions, adding the labels to the output (H1 through C3)

Using that distribution construct, assign the Hx/Nx/Cx patterns to the last Y draws.

Perform this as a function that can loop through any pick N game, like I made for followers... one script can run on pick 2 to pick 5...

Print the results to the display...
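The planned work flow can be sketched roughly as follows. This is an illustration under assumptions, not the finished script: it uses a plain list of tuples instead of a pandas data frame so it runs without a CSV, it labels digits with the percentage thresholds from the later entries (>= 12% hot, <= 8% cold) rather than the H1 through C3 ranks, and the function name `profile_hot_cold` and its return values are made up for the example.

```python
from collections import Counter

def profile_hot_cold(draws, x, y, hot_pct=12, cold_pct=8):
    """Label the last y draws using the distribution of the x draws before them.

    Hypothetical sketch: `draws` is a list of digit tuples, oldest first.
    """
    depth = x + y                          # how far back to jump
    if depth > len(draws):
        raise ValueError("not enough draws for x + y")
    start = len(draws) - depth
    window = draws[start:start + x]        # the X draws that get counted
    recent = draws[start + x:]             # the last Y draws (empty when y is 0)

    labels = []                            # one digit -> letter map per column
    for col in range(len(draws[0])):
        counts = Counter(row[col] for row in window)
        column_labels = {}
        for digit in range(10):
            # Integer form of (count / x) * 100 compared to the thresholds
            if counts[digit] * 100 >= hot_pct * x:
                column_labels[digit] = "H"
            elif counts[digit] * 100 <= cold_pct * x:
                column_labels[digit] = "C"
            else:
                column_labels[digit] = "N"
        labels.append(column_labels)

    profiles = ["".join(labels[col][digit] for col, digit in enumerate(row))
                for row in recent]
    return labels, profiles

# Tiny two-column example: 3 older rows, 10 counted rows, 2 rows to classify.
col_a = [1, 1, 1, 2, 2, 3, 4, 5, 6, 7]
sample = [(9, 9)] * 3 + list(zip(col_a, reversed(col_a))) + [(1, 9), (0, 3)]
labels, profiles = profile_hot_cold(sample, x=10, y=2)
print(profiles)
```

Because each column is handled on its own, the same function would accept rows of 2, 3, 4 or 5 digits without changes.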

The trick here will be to find out how many draws to collect the X data from, as the most likely starting candidate for Y will be 7, as it is the max advanced play on daily games.

If 30 draws is the right amount, then a simple run with X=30 and Y=7 should hopefully explain how hot and cold numbers tend to distribute in the short term... but the program will only need a weekly run, unlike the daily requirement of the follower script.

Entry #375

What about tweaking parameters?

I am considering a python script for the pick 3 type games that can take variable parameter input to adjust the settings. 

Something like a scan of X number of draws to determine hot, neutral and cold, and then look out Y future draws to determine the composition from the hot/neutral/cold group.

Such that calling the function

displayComposition(30,10)

Would take the last X+Y draws (in this case 40), grab the frequency distribution of the X (30) draws, split them into hot/neutral/cold, then display the next Y draws (in this case 10) with a composition such as HNH or CCN, to help determine the composition of hot, neutral and cold numbers that were drawn.

The point of the variables is simple, I don't know the ideal number of draws to do a recent history on, so this allows for some experimentation.

Because it will be modular, it will be able to be called several times in one run with different parameters, such as 

displayComposition(30,10)

displayComposition(250,20)

displayComposition(1000,7)

Since the composition would change with each change in X, we would be searching for some general guideline in the Y output, such as a higher amount of HNN draws when using X history...

I can do a great number of tasks with Python, but I am still sure I am not asking the right questions... after over 20 years of ideas, mostly in excel, I am losing motivation. Therefore I need some different avenues to explore, and one which I have neglected is the analysis of shorter term trends. Everything up to now has been done with entire game histories.

So a grouping of the top 3 hot, the middle 4 neutral and the last 3 cold seems like a fair split.

Output looking like

H1 = 7

H2 = 4

H3 = 2

N1 = 6

...

C3 = 1

Would be the result of the analysis, and the output of the Y draws would look like

761 - H1, N1, C3

...

442 - H2, H2, H3

The generalization, which group it comes from, such as H, can be further refined with the digit that represents WHICH H it was, such as H1 being the hottest of the hots and C3 being the coldest of the colds.
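Here is a small sketch of that ranking idea. The counts are invented so that the ranking reproduces the sample output above (H1 = 7 ... C3 = 1), and the function name `rank_labels` is hypothetical. Ties are broken by the lower digit, which is purely an assumption made for the example.

```python
def rank_labels(counts):
    """Map each digit to H1-H3, N1-N4 or C1-C3 by frequency, highest first.

    Hypothetical helper: ties are broken by the lower digit (an assumption).
    """
    order = sorted(range(10), key=lambda d: (-counts.get(d, 0), d))
    names = (["H1", "H2", "H3"] + ["N1", "N2", "N3", "N4"]
             + ["C1", "C2", "C3"])
    return dict(zip(order, names))

# Invented counts, chosen only so the ranking matches the sample output above.
counts = {7: 9, 4: 8, 2: 7, 6: 6, 0: 5, 3: 4, 5: 3, 8: 2, 9: 1, 1: 0}
labels = rank_labels(counts)

for draw in ("761", "442"):
    print(draw, "-", ", ".join(labels[int(ch)] for ch in draw))
```

With these counts the two sample draws come out as H1, N1, C3 and H2, H2, H3.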

Being able to change the number of draws out with Y can help to determine just how long the trends can extend, and also open the door to a sliding back test by partitioning the history into chunks of size X+Y.
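The partitioning for such a back test could be sketched like this. The function name `backtest_chunks` is hypothetical, and the choice to line the chunks up with the newest draw and drop any leftover oldest draws is an assumption.

```python
def backtest_chunks(history, x, y):
    """Yield (count_window, test_draws) pairs from chunks of size x + y.

    Hypothetical helper: `history` is ordered oldest first. Chunks are aligned
    to the newest draw, so leftover draws at the old end are skipped.
    """
    size = x + y
    if size <= 0:
        raise ValueError("x + y must be positive")
    leftover = len(history) % size
    for start in range(leftover, len(history), size):
        chunk = history[start:start + size]
        yield chunk[:x], chunk[x:]          # first X to count, next Y to check

history = list(range(25))                   # stand-in for 25 draws
pairs = list(backtest_chunks(history, x=7, y=3))
for window, test in pairs:
    print(window, "->", test)
```

Each pair can then be profiled on its own, with the first X draws building the distribution and the following Y draws being classified against it.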

There are still plenty of unknowns such as optimal values for X and Y, but it seems like a fitting start to begin short term trend analysis.

Entry #374

Embracing the coincidence of it all, a 6/49 strategy

So, we will be looking at a 6/49 game, the PA Match 6. Needing 2 number sets to start with...

1. The most frequent by position

2. The most frequent by any position

3. (The dumb part). For each position, select the number that is between them... for example: IF the most frequent by position is a 1 in the first column, and the most frequent overall (after sorting) is a 5, then select a 3 because it is the same distance from both. BUT... what if the numbers do not have a middle? Such as 1 and 4? Easy... for the first 3 positions, take the one higher... e.g. for 1, 2, 3, 4 it would be 3. For the highest 3 positions, take the one lower, such that for 45, 44, 43, 42 you play the 43. IF the numbers are the same, play that number... also, if they are one apart, such as 39 and 40 in the 4th position, play the lower number (39).
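The selection rule can be written out as a short function. This is a sketch under assumptions: the function name `pick_between` is made up, "the highest 3" is read as positions 4 through 6, and numbers that are one apart always give the lower number regardless of position.

```python
def pick_between(a, b, position):
    """Pick the number between a and b for a 1-based position (1 to 6).

    Hypothetical helper following the rule described above.
    """
    low, high = sorted((a, b))
    gap = high - low
    if gap <= 1:
        return low                      # same number, or one apart: play the lower
    if gap % 2 == 0:
        return low + gap // 2           # a true middle exists
    lower_middle = low + gap // 2       # two middle candidates, no true middle
    # First 3 positions take the higher candidate, last 3 take the lower one
    return lower_middle + 1 if position <= 3 else lower_middle

print(pick_between(1, 5, 1))     # true middle
print(pick_between(1, 4, 1))     # no middle, low position
print(pick_between(45, 42, 6))   # no middle, high position
print(pick_between(39, 40, 4))   # one apart
```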

Sounds too simple, and it surely has no scientific basis, you could use any 2 lines, such as the last 2 draws...

In my case, I am using numbers derived from frequency, but since we are forgetting the math and making up a coincidental system from scratch... why not?

I saw something on a Netflix spy show recently (Night Agent) where they were planning a break in and the advice given was to start with the stupidest plan and refine from there...

It is no more or less valid than any other workout system, and could be applied to any game with 6 drawn balls, such as the PB or MM.

Let's hope for dumb luck!

Entry #373

I can track my "off time" by how long it has been since updating draw histories

Updated the Match 6 file, last entry was October 10th, 2024...

Mid and eve PA pick 3, last updated Dec. 9th 2024...

Even though I am not actively working with these files, every now and again I will update them.

Working today with the graphic part of the QP idea, plotting a grid, putting numbers on that grid and making the numbers selected plot a point in the center of their respective grid spaces. After that, will draw the triangle (or line, in the case of the same number twice), plot the center point of the triangle formed, and connect it to the actual draw result grid space. Thinking blue for the triangle and green for the connection line.

Once I have gathered as much data as possible (line lengths, angles, etc) in a summary report printout, then I will begin a brainstorming session with AI... only with actual example data in hand.

Entry #372

Plan B if QP data fails to give insight.

The data entered could easily come from game history as the last 3 draws. Not only would no changes be required, but it could then be back tested... minimal code redesign to read from a .csv rather than enter manually.

I believe that I had mentioned looking at trigonometry earlier... this is the implementation of that.

Entry #371

Getting started, an idea for visualizing the QP data

The first idea that pops into my mind when trying to imagine how to display the collected data is to draw a grid to mark the 3 quick pick values in each game.

When deciding a grid layout, the 7 x 10 grid from the PA lottery kiosk seems sufficient.

The bonus balls could be contained in a 7 x 4 grid, also as used on the PA lottery kiosk, but it might be more useful as a 6 x 5 grid.

Here is the initial goal in Python...

1. Draw the grid containing all of the game numbers, lowest in the top left, highest in the bottom right. The grid target values are the dead center of the squares, and the dimensions are 1 x 1.

2. Place a visible dot in the center of the 3 quick pick values.

3. Calculate and draw a triangle (or a line if two of the same number are given) from the center of the grid squares containing the QP values... line 1 is value 1 to value 2, line 2 is value 1 to value 3, and line 3 is value 2 to value 3.

For this triangle we want to record angles and line lengths, and the point representing the dead center of the formed triangle

4. Plot the point in a different color for the winning number in that position. Grabbing any relevant measurements, such as line length from the center of the winning grid to the center of the triangle.

5. With this data collected, begin the conversation with generative AI to begin constructing a formula to describe both the relationship of each triangle to the winning point, and also the process of creating a generalized formula to solve all data points simultaneously. This will be the time soak...

6. Once there is something to work with (good or bad), grab 3 QPs for a draw and attempt to apply the generalized solution to obtaining the values of the next draw.

Certainly not ideal, as step 5 may never be truly completed, but at least it is a starting point!
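The measurements in steps 1 through 4 can be sketched without any drawing code. The following assumes the 7 x 10 grid means 7 rows of 10 numbers, numbered left to right and top to bottom, with 1 x 1 squares. The function names are hypothetical, and "dead center" of the triangle is taken to be the centroid (the average of the 3 points).

```python
import math

def cell_center(number, columns=10):
    """Center (x, y) of a number's 1 x 1 square; number 1 is top left."""
    row, col = divmod(number - 1, columns)
    return (col + 0.5, row + 0.5)

def angle_at(p, q, r):
    """Angle in degrees at point p between the lines p-q and p-r."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.degrees(math.atan2(abs(cross), dot))

def triangle_summary(qp_values, winner, columns=10):
    """Line lengths, angles, centroid and distance to the winning square."""
    a, b, c = (cell_center(v, columns) for v in qp_values)
    centroid = ((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3)
    return {
        # line 1: value 1 to 2, line 2: value 1 to 3, line 3: value 2 to 3
        "lengths": (math.dist(a, b), math.dist(a, c), math.dist(b, c)),
        "angles": (angle_at(a, b, c), angle_at(b, a, c), angle_at(c, a, b)),
        "centroid": centroid,
        "to_winner": math.dist(centroid, cell_center(winner, columns)),
    }

summary = triangle_summary((1, 3, 21), winner=12)   # made-up example values
print(summary)
```

When two of the three values are the same number, the "triangle" collapses into a line and one of the lengths is 0, so the angles should not be trusted in that case.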

Entry #370

The idea... solve simultaneous equations.

Where to begin to use 3 QPs to reverse engineer the winning combo?

The first idea is to solve simultaneous equations. There are 3 well established methods for this

Elimination

Substitution

Graphing.

We will start with graphing, as this gives a visual as well. With 3 data points, we can plot these in 3d space for each set, then where lines intersect we should find a solution... IF lines intersect...

Our sample set is limited to one week of data for both games, so we have 5 sets. It is with this small sample size and huge possible outcome list where we start.

What is known is the 3 QPs and the result of each of the draws.

Remember, there may be absolutely NOTHING here, but it seems like a good exercise in trying to solve an impossible problem.

Entry #369

QP data is collected, now for the difficult part...

After collecting QP data for a week's worth of Powerball (Monday/Wednesday/Saturday) and Mega Millions (Tuesday/Friday) from 3 different locations, I can see how truly bad QPs are. Not one set of 3 QPs managed to pull in more than one white ball match, and there were zero bonus ball matches...

Obviously one Arizona player got lucky in the MM draw last night, and looking at the payout, only 9 PA players managed a 4 out of 5 without the bonus ball...

And so it begins, the analysis part...

Step one, use the 3 QPs for each draw to "reverse engineer" matching the winning numbers for that draw... whatever that looks like. Repeat the process for each of the 5 draws for which I have collected data.

Step 2 will be to look at the ways in which the data points were manipulated to produce the winners and look for any common factors so that these steps might be applied to a more general process or formula. Here is where generative AI will be used, both to explore the concept and to generate starter code for a program that takes as input 3 QP lines and applies processing to generate one "best guess" pick.

Step 3 will be to test... getting 3 cheap QPs for the draw early enough to apply the process and then play a 4th ticket for the same draw with the processed result.

I do not have high hopes for this method, and it falls primarily under the category of "no stone unturned". This could, in theory, provide a seed for the vertical horizon system that works better than using the most frequent overall numbers... or not.

The up side is that all other play is suspended, so there is a small benefit financially.

Happy Coding!

Entry #368

MM QPs are just as bad as PB QPs...

First draw with 3 QPs on the Mega Millions and the result was just as bad. Only one white ball matched, #49, but it was on 2 of the 3 tickets.

Last gathering of QPs for Friday's MM draw then the work begins.

Because this is the first dive into the concept that the QPs are less than fair, expectations are low.

Because the end result system would require purchasing 3 QPs before selecting one line, this already violates my long held belief that one ticket is enough... it is however somewhat cheaper because either game still has an option to NOT play a multiplier, therefore either game can be played for less than $10 (3× $2 QPs and one $3 "loaded" ticket).

It may take a long time to develop something that results in a playable line.

Entry #367