hypersoniq's Blog

Features for the app, a more complete idea

It will start with a splash screen, giving buttons to update or process a game.

The game selection screen will have a button for each of the 14 games.

When selected it will process the game chosen with the default settings of

Window = 0 (how many draws after a snapshot to classify)

Offset = 0 (how far back in history to start, in days)

Sample size = 15 (how many fair chances to give each ball to appear. Formula is high ball × Sample size)

The data view will be selected by default, with column data presented.

There will be a "settings" button which allows adjustment of the window, offset and sample size. This window will have buttons for recalculate and exit, or cancel and return.

The ball view toggle switch is also on the main data screen, as is a button to show column statistics (more on that later).

Also the generate all neutral QP button and the choose another game button.

I can almost see it!

Entry #561

Since the scripts for update and classify run without errors...

It is time to get real about learning the Kivy framework for the app.

One behavior I had to test was launching each update script from within a new script, making sure there are no glitches in execution. This test went very well: it updated all 14 games with a single run command! I spot checked a few files and everything went as planned, even with the jackpot games that have a bonus ball!

One change I will need to make is to replace the console outputs that show which files are done with a number that breaks the run down into shares of 100% for all updates. This will be read by a Kivy progress meter! It will start at 0% and reach 100% when done. When a csv file is updated, the output will be

progress = the percentage added by each successful game update, roughly 7.14% (100% / 14 games)
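A minimal sketch of that progress number, assuming each of the 14 games counts for an equal share. The function name and the rounding are illustrative choices, not the actual updater code.

```python
TOTAL_GAMES = 14  # number of games the updater handles


def progress_percent(games_done, total_games=TOTAL_GAMES):
    """Return overall update progress as a percentage (0 to 100)."""
    return round(100 * games_done / total_games, 2)


# Value that would be reported after each csv file finishes updating
steps = [progress_percent(n) for n in range(TOTAL_GAMES + 1)]
print(steps[1], steps[-1])
```

A value like this could be assigned to the value property of a Kivy ProgressBar, whose max defaults to 100.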

The classifier script is called with arguments specific to each game type, so a list of buttons representing all games will be presented. The call with arguments will be coded in the onButtonClick() function. I tested the Power Ball game to execute the 5 white ball file with a ball range of 1 to 69, and the power ball file to execute with a ball range of 0 to 26. The test had to run them sequentially and it did! Within this button click logic will also be the grid layout for the data, which will vary by game type as well.

In the final output will be the data I currently produce, which includes the number, the frequency, the classification (C, N or H), the percentage of the frequency to the sample size and the number of draws since its last appearance.
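One way the per-game arguments could be organized is a lookup table that the button handler reads. This is only a sketch: the csv file names, run_game() and fake_classify() are hypothetical, and only the two ball ranges come from the Power Ball test described above.

```python
# Hypothetical csv names; the ball ranges are the ones from the Power Ball test.
GAME_RUNS = {
    "Power Ball": [
        ("PowerBallWhite.csv", 1, 69),  # 5 white balls
        ("PowerBallBonus.csv", 0, 26),  # power ball
    ],
}


def run_game(game, classify):
    """Call classify(csv_file, low_ball, high_ball) once per file, in order."""
    return [classify(csv_file, low, high) for csv_file, low, high in GAME_RUNS[game]]


def fake_classify(csv_file, low_ball, high_ball):
    """Stand-in for the real classifier so the sketch runs on its own."""
    return f"{csv_file}: balls {low_ball} to {high_ball}"


results = run_game("Power Ball", fake_classify)
print(results)
```

Keeping the settings in data rather than in the handler means a new game is one more dictionary entry instead of another branch of button logic.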

Here is where it gets fun...

There will be 2 views set to a toggle switch. The data above will be in the "Data View", the other side of the switch will present the "Ball View", where the columns will contain graphics of the balls (just like on the PA website) with different color backgrounds representing Hot (Red), Neutral (Gray) and Cold (Blue). This ball view will be the quick visual summary, while the data view shows the breakdown of results.

At the bottom will be a button that lets me choose another game, and a button that will exit the program. The navigation buttons will change via screen context, but exit will be present on all screens.

So that is the basic vision for the app in this phase. If I can make all of that work, then it will be a success (even if it never produces a winning pick). I have a huge head start because all of the scripts that do the work are already written and tested.

The end user story for the Android implementation will be

- I have the ability to update draws for all games anywhere at any time, no laptop needed

- I have the ability to run the classification script from anywhere at any time for any game, no laptop required

- reasonable run times will allow going from app launch to data in minutes. <3 minutes is the current observation.

Obviously other ideas will pop up, and I will then be holding an entire framework on my phone! If I want to put in follower distributions, it will be a simple addition to the functionality, but the fully updated data will already be there... and unlike most other android apps... no ads!

Entry #560

The spreadsheet went together in about 10 minutes

Starting as always with the pick 3 evening data. Interpretation will take MUCH longer. Having the plan in the previous post was helpful.

The sum of the vertical sums compared to the sum of the lead in sums is a simple subtraction of the horizontal sum from the sum of the vertical sums, but it was kept in to see where the numbers come from.

Ranges:

Hsum is 0 to 27

Vsum in each column is also 0 to 27

Sum of vsums is 0 to 81

Lead in vsums are 0 to 18

Sum of lead in vsums is 0 to 54
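These ranges all follow from the highest digit (9), the number of columns and the number of draws stacked vertically. A small sketch, with the function and key names chosen only for illustration:

```python
def sum_ranges(digits=3, depth=3, max_digit=9):
    """Return (low, high) for each sum type; every low is 0 because 0 is a valid digit."""
    return {
        "hsum": (0, max_digit * digits),
        "vsum": (0, max_digit * depth),
        "sum_of_vsums": (0, max_digit * depth * digits),
        "lead_in_vsum": (0, max_digit * (depth - 1)),
        "sum_of_lead_in_vsums": (0, max_digit * (depth - 1) * digits),
    }


pick3 = sum_ranges()
for name, (low, high) in pick3.items():
    print(f"{name}: {low} to {high}")
```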

There is much more to do, such as distribution and figuring it all out... but the concept for the current data was mere minutes to put together.

I am going to try some software called LextEdit, which allows running straight up SQL queries on data sources such as Excel sheets and csv files... that could prove interesting at creating good views of the data with the full power of SQL.

It is definitely more of a "back burner" project as I continue to navigate the Python GUI world. The basic start at seeing if it is useful will be to grab random sets of 4 consecutive draws and see if the data exists in the first 3 to arrive at the fourth. It may or may not... the randomness is still there.

Entry #559

Sum sheet plan

I do not have a day off until Wednesday, so that is when I hope to put a pick 3 sheet together for the PA mid and eve games.

So, from the updater csv files I will import PAPickThreeMid.csv and PAPick3Eve.csv into their own sheets. This will give the first 4 columns of Date, ball 1, ball 2 and ball 3.

Will have to skip the first 2 draws so the formulas will have a full vertical set to measure. But the plan is (starting from the E column)...

Col E will be the standard horizontal sum.

Col F, G and H will be the vertical sums

Col I will be the sum of the vertical sums

And finally Col J, K and L will be the "lead in" vertical sums, such that if the last 3 draws were:

247

103

785

The lead in vertical sums are the sums of the oldest 2 draws, so from the above example 3, 4, 10.

This will hopefully make it easier to tell which sums might help with prediction, as the lead in v sum + your pick will give the 3 draws needed to have a v sum and an h sum.
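The same three draws can be worked through in a few lines of Python as a check on the spreadsheet columns. This is a sketch that assumes the draws are listed oldest first; the variable names are illustrative.

```python
draws = ["247", "103", "785"]  # oldest first, newest last

# Split each draw into its digits, one list per draw
rows = [[int(ch) for ch in draw] for draw in draws]

hsum = sum(rows[-1])                             # horizontal sum of the newest draw
vsums = [sum(col) for col in zip(*rows)]         # vertical sums over all 3 draws
lead_in = [sum(col) for col in zip(*rows[:-1])]  # vertical sums of the oldest 2 draws

print(hsum, vsums, lead_in)
# The sum of the vertical sums minus the horizontal sum gives the sum of the lead in sums
print(sum(vsums) - hsum == sum(lead_in))
```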

While I will be just getting a feel for the data by looking at it, and profiling the bell curve for each, this may just be the perfect application of my old follower script, as it will display the distribution of sums that tend to follow each of the 28 possibilities (or the 19 possibilities in the lead in sums)

Because of the modular column by column approach of the follower script, it needs zero modifications to be applied to this new data. It will also scale from the pick 2 through the pick 5 and beyond. The time invested tweaking scripts for systems that did not directly work proves useful because the code is still reusable! Framework building blocks...
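For anyone unfamiliar with the follower idea, here is a minimal sketch of counting which values tend to follow each value in one column. It is not the original follower script, just an illustration, and the sample sums are made up.

```python
from collections import Counter, defaultdict


def follower_distribution(values):
    """Map each value to a Counter of the values that came right after it."""
    followers = defaultdict(Counter)
    for current, following in zip(values, values[1:]):
        followers[current][following] += 1
    return followers


# Made-up column of horizontal sums, oldest first
sample_sums = [13, 10, 13, 14, 13, 10]
result = follower_distribution(sample_sums)
print(dict(result[13]))
```

Because the function only sees a list of numbers, the same call works on a column of digits, horizontal sums or lead in sums.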

Also have a few days to think and imagine before writing a single formula. Like adding an M column to get a sum of the lead in sums (J, K and L are already taken by the lead in sums themselves)... going all in on the sum theme!

Entry #558

More accurate terminology for the Sum idea

The concept of sums is not new, but it is something I have not worked with. The idea comes from the world of machine learning, in particular feature detection and feature extraction.

The way sums seem to be done for lottery is across one row. Such that a pick 3 number of 1 3 9 has a sum of 13. This provides spatial context to the draw combo. Collecting this data and graphing it will show the distribution of draw sums, compacting the range of all 1,000 possibilities into 28 sums (0 through 27).

From time series analysis, I will be adding a vertical sum of the last 3 draws (for pick 3). This will have the exact same range and provide a temporal context for each column.

What I am looking for is some way to correlate the vertical sums and the horizontal sum to find patterns that emerge that are just not visible when looking at the raw history results.

If you look at the last 3 draws, the horizontal component is just as random as the numbers themselves, however, you already know 2 of the 3 components for the next vertical sum...

The spreadsheet should be super simple to create. Summary statistics on the distribution will take a bit longer. Nothing too difficult. Then the part where actionable intel can be gleaned from the data may take quite some time.

The vertical, or "rolling" sums seem to be more useful than the rolling averages also found in time series data analysis, particularly for the nature of lottery data.

Wouldn't it be interesting if getting a pick was essentially like solving the bottom row of a sudoku puzzle?

Entry #557

Sums in pick N games

Outside of dividing the draw history into a division of sum ranges, what else can be done?

In a pick 3 it is easy, all sums fall between 0 and 27, pick 4 between 0 and 36, and pick 5 between 0 and 45.

Looking at a plot of the sum ranges should form a bell curve around the middle sums. Playing within some range of sums around the middle should present some percentage of all outcome possibilities.
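The shape of that curve can be previewed before touching any draw history by counting every possible combo. This sketch counts possibilities, not actual results, and the function names are illustrative.

```python
from collections import Counter
from itertools import product


def sum_distribution(digits=3):
    """Count how many straight combos produce each horizontal sum."""
    return Counter(sum(combo) for combo in product(range(10), repeat=digits))


def coverage(dist, low, high):
    """Fraction of all combos whose sum falls in low..high inclusive."""
    total = sum(dist.values())
    return sum(count for s, count in dist.items() if low <= s <= high) / total


pick3 = sum_distribution(3)
middle = coverage(pick3, 10, 17)
print(pick3[13], pick3[14], middle)
```

For pick 3 the peak is shared by sums 13 and 14 with 75 combos each, and sums 10 through 17 cover 560 of the 1,000 possible combos.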

What about adding another direction... previous draw sums? See where they might intersect.

The number of vertical sums would be the same as the number of balls in the combo... 3 for pick 3. 5 for pick 5.

So if there is a Gaussian curve created by the horizontal sums, there might be another one created by the vertical sums, in the same exact range, only in each position.

Only unlike horizontal sums, you already know 4 of the 5 digits for the next sum...

Got to fill the free time created with update automation somehow...

Entry #556

The classification script project is much more difficult than it seems

The passed variables work as expected. I even discovered an unused variable that remained from when I used expectancy +/- 1 standard deviation instead of the interquartile range.

The universal version of the script had worked for the Match 6, Cash 5, and Treasure Hunt data sets, but it is throwing some strange errors when trying to run the pick N game data...

I have a feeling that tracking down these errors will take some stretch of time. Fix one, another, different one appears... good thing my version control has all of the previous working scripts intact!

So far...

Index Out Of Range... that was fixed by passing in the low and high balls for each game, and by creating a new variable that adds one to the high ball. Python's range works as range(includes this number, excludes this number), so for pick N, range(0,9) only includes 0 to 8; the correct approach for 0 to 9 is range(0,10).
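A quick illustration of that off-by-one behavior, with variable names picked just for this sketch:

```python
low_ball, high_ball = 0, 9  # pick N digits

too_short = list(range(low_ball, high_ball))  # stops at 8, misses the 9
stop = high_ball + 1                          # the extra variable: high ball plus one
full = list(range(low_ball, stop))            # 0 through 9

print(too_short[-1], full[-1], len(full))
```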

Error processing the frequency list... this worked for the jackpot games, because you ignore zero values to get the distribution of the present numbers... there are no zero values in the pick N games. That was fixed with a peek inside the frequency list with an if statement... if zeroes exist, it is a jackpot game, if not, it is a pick N game.
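The "peek inside the frequency list" could look something like the sketch below. The function name and the exact handling are assumptions; the real script may treat the two cases differently.

```python
def present_frequencies(frequencies):
    """Drop zero counts only when the list actually contains them."""
    if 0 in frequencies:
        # Zeroes present: treated as a jackpot game, so ignore them
        return [f for f in frequencies if f != 0]
    # No zeroes: treated as a pick N game, so use the list as it is
    return list(frequencies)


print(present_frequencies([0, 3, 0, 5]))
print(present_frequencies([4, 6, 5]))
```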

Had to rework the flowchart to make sure the execution was still following the original design. It is.

So this is another time killer. I have to print the contents of the frequency data ahead of the line that is crashing execution for the pick N games. I need to see that the passed variables are passing and that the frequencies appear as expected. The comment for sections like this is always "TEST PRINTS - REMOVE WHEN RESOLVED" so I do not forget to remove the troubleshooting code.

This is the type of project where resolving the operation to just one function is required because I want the app to run with as few functions as possible. I know it is possible to reduce it to 2 scripts, because that is how the updater works... 1 script for 11 games and 1 script for the 3 with bonus balls as they update an extra bonus ball csv.

I may end up having to do the same with this part as well, one script for the jackpots and one script for the pick N games.

That is the interesting part of Kivy, a button click can set up and launch an entire script... so when a game is chosen from the menu, it will choose both the correct script AND the correct settings.

I could be overthinking this...

Never mind... figured it out!

So far tested with 2 distinct game types, pick 4 mid and Match 6... will test the others.

Summarizing the project so far...

Updater... 2 scripts, ALL games.

Classifier... 1 script, ANY game.

Crushed the goals remaining for this year. Onward to GUI development!

Entry #555

Got Kivy installed, almost ready for app development!

With the successful multi game updater and making measurable progress with the parameterization of the classification function, the development process for the GUI app which unifies these scripts will probably start in a week.

Seeing the demo app for Kivy shows many interesting options, such as a progress bar and being able to add a heat map display under the data, so cold, neutral and hot can have different attributes such as background color.

2026 is also the year to work on a formalized process for phase 2. There will be ongoing changes, such as consistent code commenting and generating documentation for the project.

This is more about the completion of a framework than a real effort to find a working system at this point... development and learning are my favorite parts of this hobby anyway, a hit is just icing... phase 2 development is where the next round of ideas will be created and explored.

I can always pull the plug on actually playing at any point, but I do want to have a go at playing a solid year, 2026 is that year. Whether I end up playing out of pocket with zero wins and end up losing $728 or I catch a win or 2 between now and 12/31 to have them fund the ride... it is ON !

Entry #554

Updater file test worked wonderfully!

I let it go a few days to see if it would capture just what was added... and it worked! Updated all 14 games of interest to their respective csv files, including bonus balls for the jackpot games that have them... no extra or missing data!

It seems surreal having to run one script and have it replace what would have been at least an hour of work...

The updates complete in seconds for each game, less than 2 minutes total run time!

The rest of the work I am doing seems to pale in comparison... THIS was my 2025 coding goal and it has been met !!!!!!

Both the builder AND the updater have been built and tested!

There are some considerations in the pick N games, such as now the double draw promotional results are in the data, as well as the infamous 666 fix draw... but since I am not currently using full history, it seems that will not be an issue.

The builder allowed me to capture data from matrix changes by entering the start date, or by just entering a start year for the pick N's

The updater only returns new draws by reading the last date in each csv file and starting from there... it does not matter if it is run daily or 2 or 3 times in a week.
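A minimal sketch of the "start from the last date" idea. It assumes the date is the first column in M/D/Y form with no header row, and the sample rows are made up; none of this is the actual updater code.

```python
import csv
import io
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # assumed date layout in the csv


def last_date(csv_text):
    """Return the date in the first column of the final csv row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return datetime.strptime(rows[-1][0], DATE_FORMAT).date()


def new_draws(csv_text, fetched_rows):
    """Keep only fetched rows dated after the last row already stored."""
    cutoff = last_date(csv_text)
    return [row for row in fetched_rows
            if datetime.strptime(row[0], DATE_FORMAT).date() > cutoff]


# Made-up data: two stored draws, three freshly fetched rows
existing = "12/29/2025,2,4,7\n12/30/2025,1,0,3\n"
fetched = [
    ["12/30/2025", "1", "0", "3"],
    ["12/31/2025", "7", "8", "5"],
    ["01/01/2026", "4", "4", "1"],
]
added = new_draws(existing, fetched)
print(added)
```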

I also no longer need to maintain a separate spreadsheet file for each game, as the data from a csv is easily imported into a sheet if needed.

Today I am busy identifying and converting variables in the classifier script to be passed into the function call... almost there with one function for any game!

These are the key challenges in the GUI transformation, and the updater part is now fully functional! The universal function for classification is not far behind... with testing maybe a few hours to a few days tops...

Whether or not the classification idea goes anywhere, I have the building blocks for my own software development framework, which can be tweaked for ANY idea!

Like Ice Cube said, it was a good day!

Entry #553

There are many things left to learn with classification

Although I have been recently obsessed with turning this classification system into a mobile app, that does not take away the fact that this is a data presentation tool, and still needs to be explored for interpretation.

The simple process of sorting the results by frequency took the emphasis off of the digits and put it on the frequency patterns directly. These tables should be saved and compared with each cycle. One such analysis tool could be to capture the classification table, then overlay the 21 cycle draws to determine if trends are present or absent. For the next cycle, though only playing 1 combo, I will record and track the performance of the 10 combos generated by reading across the table, and also the floor and ceiling of the neutrals, if not contained in those 10 combos. Simple as recording the combos and with each, a simple x in the next 21 cells across to indicate a hit. Paper play on a larger scale to help find the best performing area of the result grid. I do not have to wait for the GUI to start capturing data, and paper play pick 3 can be run regardless of which game is in the current cycle.

Once the steps are known, then a back test can be figured out. So there will still be plenty of experimentation ahead...

Entry #552

And the updater works!

Testing it on the old csv files was the right idea, because everything updated as expected!

Pointing the script to the new csv files was quick, but just watching it run... less than 2 minutes to update 14 games... THAT is why I do this stuff! Even if I have no intention of playing all of these games, it was the sense of accomplishment... having a clear vision and coding it into reality that makes the coding sessions so enjoyable.

So the trick was to put calls to the update functions in the main section, where it would iterate the first 11 games, and then call the second function iteratively for the bonus ball games. I even deleted the old pick 3 to June of 2024 to see if it wrapped the year change correctly, and it did!

That was enough for one day! Gotta work for the weekend and get back to the list on my next day off... I might even start coding the GUI this year.

Entry #551

Planning today's coding session

Got some definitive goals today!

1. Finish up the update script, test it out and validate the results. I made new csv files with a specific naming convention just for this project, but in true pack rat style, I kept the old ones and did not update them intentionally, so I could first target the old csv files to test it out. If it updates the old ones properly, then it will update the new ones properly.

2. List the differences between the classifier scripts for the pick N games and the jackpot games, and make sure the common items are reworked as variables that can be passed via the function call. Make the coding changes.

3. Test the changes by running the new script against the old scripts to see that the data output is identical.

I do not know how far I will get on this list today, but these are the short term goals until completed. Once these goals are met, I am finally ready to start turning it into an app! So I must also install Kivy into the Python set up.

Based on my 2026 plan, I really only needed to update the pick 3, pick 5 and Match 6, but grabbing all the data adds some options... a cycle of Match 6 could be replaced with a cycle of Cash 5 for the same cost, or a cycle of the treasure hunt at HALF of the cost! Could also swap a cycle of pick 5 for pick 4 for the same cost as well.

Right now, updates and classification require being tethered to the laptop. The ultimate end goal is to be able to update and run the analysis on the go, and the pathway is becoming clearer!

Entry #550

Afterburners today! Coding goals are being met!

I figured out how to create a script just for bonus ball games! That means I have entire (current matrix) histories for Cash 4 Life, Power Ball and Mega Millions!

As a kicker, the bonus balls are diverted to a second csv file, because the expectancy differs between a 5 in 60 draw and a 1 in 4 draw (Cash 4 Life example).

I have even figured out and started the update version of the scripts. They handle the edge case of last updating in December of one year and not running it again until January of the next year! Done by reading the last date in the csv and comparing it to the date you run the update.
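The year change edge case comes down to asking for every year from the last stored date through the run date. A small sketch, with a hypothetical function name and made-up dates:

```python
from datetime import date


def years_to_fetch(last_csv_date, run_date):
    """List every calendar year from the last stored draw through the run date."""
    return list(range(last_csv_date.year, run_date.year + 1))


# Last update in late December, next run in early January
wrap = years_to_fetch(date(2025, 12, 28), date(2026, 1, 4))
# Last update and run in the same year
same = years_to_fetch(date(2025, 6, 1), date(2025, 6, 8))
print(wrap, same)
```

Fetching the older year again costs one extra page, and the date comparison against the csv keeps duplicate draws out.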

14 games, full history... current to today! About an hour of coding... zero errors to debug!

In case anyone is playing along at home, here is when each game (current matrix, if applicable) began...

PA Pick 3 Evening: 1977

PA Pick 4 Evening: 1980

PA Pick 3 and Pick 4 Mid Day: 2003

PA Match 6: 2004

PA Treasure Hunt: 2007

PA Cash 5 (current matrix): 2/1/2008

PA Pick 2 (mid & eve), Pick 5 (mid & eve) and Cash 4 Life: 2015

Power Ball (current matrix): 10/17/2015

Mega Millions (current white ball matrix): 10/31/2017

Time to finish the update scripts and begin working on the classifier script so it can be modified to run all games instead of separate versions for dailies and jackpot games. That will also need to be modified further when moving to the GUI to remove my side by side output logic, as each column's data will be assigned to a display widget dynamically based on the number of columns in the csv file.

From hours to minutes... can't believe the decades of manual updates have finally been automated! That was always the least fun aspect of this hobby.

Entry #549

PA lottery uses an aspx page to present tables of draw history.

ASP.NET (the successor to Microsoft Active Server Pages, and where the .aspx extension comes from) is the technology used by the PA lottery to present 1 year of draw history for any of their games.

There are 2 main variables passed via query string parameters: id, which tells it which game to display, and year, which selects the year.

I can fetch a requested page by passing in variables for the task at hand.

For example, this is the full url to display the PA pick 3 evening results for 2025...

https://www.palottery.state.pa.us/Games/Print-Past-Winning-Numbers.aspx?id=2&year=2025&print=1

I forget exactly how I found it, it was originally buried somewhere on their results pages. Through experimentation, I found the following game ID codes...

1. Pick 3 Mid Day

2. Pick 3 Evening (the example above)

3. Pick 4 Mid Day

4. Pick 4 Evening

5. Pick 5 Mid Day

6. Pick 5 Evening

7. Treasure Hunt

8. Cash 5

11. Match 6

12. Power Ball

15. Mega Millions

27. Pick 2 Evening

31. Pick 2 Mid Day

35. Cash 4 Life

(There are codes for games that have ended like 10 for Mix & Match and 14 for Super 7, but if the game ended... why bother?)

By replacing the id value with a number from above and the year value desired, you can see any full calendar year of PA lottery game history.
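A short sketch of building that URL from a game name and a year. The ID codes are the ones listed above; the dictionary and function names are only for illustration, and nothing here actually requests the page.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.palottery.state.pa.us/Games/Print-Past-Winning-Numbers.aspx"

# Game ID codes found through experimentation
GAME_IDS = {
    "Pick 3 Mid Day": 1, "Pick 3 Evening": 2,
    "Pick 4 Mid Day": 3, "Pick 4 Evening": 4,
    "Pick 5 Mid Day": 5, "Pick 5 Evening": 6,
    "Treasure Hunt": 7, "Cash 5": 8, "Match 6": 11,
    "Power Ball": 12, "Mega Millions": 15,
    "Pick 2 Evening": 27, "Pick 2 Mid Day": 31, "Cash 4 Life": 35,
}


def history_url(game, year):
    """Build the print-friendly history URL for one game and one calendar year."""
    query = urlencode({"id": GAME_IDS[game], "year": year, "print": 1})
    return f"{BASE_URL}?{query}"


print(history_url("Pick 3 Evening", 2025))
```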

The Pick 3 evening is the longest running PA game and goes back to 1977 !

In my build script, I loop each game ID through its years of history and create a csv file for each game.

The update version that I plan on running once a week will go through ALL of the game IDs for the current year, updating all 14 games in one shot.

I am still stuck on 3 games... all have a bonus ball. Since the other 11 games all work, I am going to create a separate script that can handle bonus balls... working on that today... the idea is to only grab numeric data and only use the first 6 in the list, this will effectively capture all of the 3 games and ignore the power play and double draw data. Also will skip the megaplier pre $5 era.
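The "numeric only, first 6" idea might look like the sketch below. The row layout is invented for the example, since the real page structure is not shown here, and the function name is hypothetical.

```python
def first_six_numbers(cells):
    """Keep purely numeric cells and return only the first six as integers."""
    numbers = [int(cell) for cell in cells if cell.strip().isdigit()]
    return numbers[:6]


# Made-up row: date, 5 white balls, bonus ball, then extra multiplier data
row = ["10/17/2015", "5", "12", "33", "41", "60", "3", "Power Play", "2"]
print(first_six_numbers(row))
```

The date is skipped because it contains slashes, and anything numeric after the sixth value (such as a multiplier) is dropped by the slice.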

Also note that if importing via spreadsheet, the date is in a text date format. I handle that conversion into an actual date type in Python, but it can also be done in a spreadsheet by using "paste special" and choosing unformatted text, then for the date column, select "Date M/D/Y"... problem solved... text dates don't sort properly, particularly if mixed with standard dates!
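Here is a small demonstration of why the conversion matters, assuming the text dates are in M/D/Y form. The sample dates are made up.

```python
from datetime import datetime

text_dates = ["12/31/2024", "1/2/2025", "2/1/2024"]

# Sorting as plain text compares character by character
as_text = sorted(text_dates)

# Sorting with a real date as the key gives true calendar order
as_dates = sorted(text_dates, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

print(as_text)
print(as_dates)
```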

So that is where I am at with the coding project, just thought I would share the URL I found to make gathering history easier for any other PA players out there.

Entry #548

Tomorrow ends the first week of the pick 3 cycle

No win yet. This cycle it was the highest all neutral line used. Since each cycle needs to complete before picking another phase 2 plan, this rides for 2 more weeks. Next cycle will see a use of the ceiling, or the highest neutral in each column, regardless of whether they line up.

May run 2 weeks of a cycle to finish the year using the floor of the Neutrals (lowest).

Starting January 1 will be the full year plan, the 10 week cycle starting with 4 weeks of Match 6. Whichever data interpretation scheme does the best on Pick 3 will be chosen for the Match 6.

Going into 2026 not counting on a hit for the rest of this year... the Cycle play and the $14/week budget is going well.

I refreshed my entire set of game histories so I know there are zero transcription errors on my part... well except for bonus ball games, but I feel confident that tomorrow's coding marathon will result in solving that issue and also see the implementation of the update script, so I will save an hour a week updating draw histories!

I did some good research into potential permission issues on both Windows and Android. With those in mind, the final design decision for a GUI that can run on both Windows and Android is to develop with Python's Kivy framework from the start. Looks like I have a ton of documentation to read over before that gets started.

Making one function provide the classification data is going well also. Passing the game variables along with the function call does the trick. This way, when a game is chosen in the game function, it passes the correct csv file, the correct sample size, the expectancy and the proper loop variables for display for each game. The bonus ball games have a call for a second run of the function with their particular settings, and their output will be dynamic as well, appearing beside the white ball data as if it was all processed at once.

Then I will have to use it for a while to see where improvements and upgrades will fit into the development cycle.

2026 is looking like an interesting year for this hobby...

Entry #547