hypersoniq's Blog

Won't 280 billion of any operation take too long to realistically execute?

If the core algorithm took even 1 second per iteration, the process would take almost 8,879 years to run... fortunately the algorithm, both checking for a match and updating the top 10, can be performed thousands of times in 1 second.

An accurate estimate still needs to be made with a timer on a limited set of maybe 10,000 iterations.

The speed of the algorithm is of utmost importance, so the time-consuming initial operations, like loading a csv file into a pandas data frame, are only done once per run... the actual match and sort are the heartbeat of the operation and are being optimized to run thousands of times per second.
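That 10,000-iteration timing test could be sketched roughly as below. Here `core_step`, the column data, and the candidate list are all hypothetical stand-ins for the real match-and-count step, not the actual algorithm; the point is only the extrapolation from a timed sample to the full 280 billion:

```python
import time

def core_step(candidate, column):
    # Hypothetical stand-in of roughly the real cost: one pass over a
    # column, comparing each replacement to the digit drawn next.
    return sum(candidate[p] == n for p, n in zip(column, column[1:]))

column = [3, 8, 1, 6, 4, 9, 2, 7, 0, 5] * 100   # made-up single-column history
candidate = [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]      # the basic mirror table

n = 10_000
start = time.perf_counter()
for _ in range(n):
    core_step(candidate, column)
elapsed = time.perf_counter() - start

per_call = elapsed / n
total_days = per_call * 280_000_000_000 / 86_400
print(f"{per_call * 1e6:.1f} microseconds per call -> ~{total_days:,.0f} days total")
```

Running the stand-in on target hardware would give the first honest estimate of whether the run is measured in days, months, or worse.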

Even with the optimizers in place, the estimate is now measured in months. How many months depends on my ability to make the code as streamlined as possible.

Entry #278

Productive coding session this morning!

Testing the puzzle pieces before assembling the final script.

Verified that the counter list mechanism works. Did that by printing out the first 1,000.

Verified that the draw history file reading does what is expected by using a column of test data with a known number of hits. Also verified it appends the correct hit count at the end. This was crucial as it is the back test part. Without this there could be no forward progress!

Currently working with a test script to create a heap data structure that returns the top 10 lists by hit count. This part has to go smoothly, as it will refine 10,000,000,000 tests down to 10 lists. It is how such a large project can stay within memory range to run on a Raspberry Pi 5.
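A sketch of that heap step using Python's heapq (names and hit counts here are made up). heapq keeps the smallest entry at the root, so `heappushpop` evicts the current lowest hit count whenever a better list arrives, and memory never grows past 10 entries:

```python
import heapq

def update_top10(heap, hit_count, replacement_list):
    """Keep only the 10 best lists; the heap root is always the current lowest."""
    entry = (hit_count, replacement_list)
    if len(heap) < 10:
        heapq.heappush(heap, entry)
    else:
        # Push the new entry and pop the minimum in one O(log n) step.
        heapq.heappushpop(heap, entry)

heap = []
for count in [3, 17, 5, 42, 8, 1, 29, 11, 50, 2, 36, 7]:
    update_top10(heap, count, [0] * 10)

top10 = sorted(heap, reverse=True)
print([c for c, _ in top10])  # -> [50, 42, 36, 29, 17, 11, 8, 7, 5, 3]
```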

After that, it will be a simple matter of putting the pieces together.

Entry #277

Raspberry Pi will arrive Thursday!

Getting geared up for the mother of all frequency analysis programs, the hardware is on the way!

Going to add a timer to the head and tail of the program, calibrated for days. A few weeks of tweaking and testing, then the giant program can be launched. Going to run the entire series of games at once, with each top 10 list of lists recorded to its own .csv file.

The timer will record the time stamp of the program start and program end.

I always end my scripts the same way, printing out the elapsed time and the phrase "SoniQ BOOM!"
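A minimal sketch of that head-and-tail timer, with the sleep standing in for the multi-day analysis run:

```python
import time
from datetime import datetime, timedelta

start = time.time()
print("Started: ", datetime.fromtimestamp(start))

time.sleep(0.1)  # stand-in workload for the multi-day run

end = time.time()
print("Finished:", datetime.fromtimestamp(end))

# timedelta formats the gap naturally, with .days available for long runs.
elapsed = timedelta(seconds=end - start)
print(f"Elapsed: {elapsed.days} days ({elapsed})")
print("SoniQ BOOM!")
```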

8 files of column lists will be small, smaller than even my current set of history files. The home network will enable upload and download via secure FTP, and monitoring of the headless Pi via PuTTY.

Also included is functionality to send me a text message when the run completes, as this should run for days if not weeks.

All of the code is optimized for the Pi device limitations, with dual recursion being what keeps the project within memory specs. Not that memory management in Python is as big an issue as memory management in C or C++. I have even solved the screen buffer issue by just outputting what game is being analyzed and the current list, using the functionality of the print statement that can wipe and reuse the current line.

Excited for the big show, probably less so with the expected results. It is definitely about the journey more so than the destination on this run... 280 BILLION iterations of an algorithm of my design. The ultimate back test of the entire genre of replacement tables all at once.

It will be mildly disappointing when the first weeks of testing continue to output losing numbers, but the journey...

Entry #276

Building blocks in Python

Toward the new experiment, pieces of the puzzle have been unlocked!

A loop to increment a list as if it were a counter has been tested. It works!
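The counter loop can be as simple as an odometer-style increment, shown here with 3 positions for a short demo (the real run would use 10):

```python
def increment(counter):
    """Treat the list like a base-10 odometer; return False on rollover."""
    i = len(counter) - 1
    while i >= 0:
        if counter[i] < 9:
            counter[i] += 1
            return True
        counter[i] = 0  # carry into the next position to the left
        i -= 1
    return False  # wrapped past [9, 9, ..., 9]

c = [0, 0, 0]  # 3 positions: 1,000 states instead of 10 billion
seen = []
while True:
    seen.append(list(c))
    if not increment(c):
        break
print(len(seen))  # -> 1000, from [0, 0, 0] through [9, 9, 9]
```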

A passing of a list from one function to the other, along with a data frame column has also worked.

Appending the hit count to the end of the list has also worked.

The current challenge is to create the running top 10 list of lists and have it populate and pop off the low list (by hit count). Going to work with the Python heapq (heap queue) library to enable this functionality.

Using a loop to increment the list has the advantage of being able to run the top 10 comparison within the loop, then print the top 10 list to both the screen and to a .csv file when done iterating.

The next challenge will be to figure out the right comparison such that it looks at the row before, finds the number in the list at that index, and compares it to the next row to see if it was a match.
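That row-before comparison might look like the following sketch. `count_hits` is a hypothetical name and the column is made-up data; the basic mirror table serves as the example replacement list:

```python
def count_hits(replacement, column):
    """For each draw, look up the previous draw's digit in the replacement
    list and check whether it matches the digit actually drawn next."""
    hits = 0
    for prev_digit, next_digit in zip(column, column[1:]):
        if replacement[prev_digit] == next_digit:
            hits += 1
    return hits

mirror = [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
column = [3, 8, 1, 6, 0, 5, 5, 0, 9, 4]  # made-up single-column history
print(count_hits(mirror, column))  # -> 5
```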

Then the final challenge is to put all of these parts together and solve the puzzle.

Then the hardware...

The Raspberry Pi 5 has an 8GB version that ships with a case that has a cooling fan, a 128 GB micro SD card, a power supply and a giant heat sink. These single board computers can run constantly for decades... it is what they were designed to do! The board itself is $80, but the kit with all the parts runs $170. I already have a Raspberry Pi 3B, which only has 1 GB of RAM. Both have a tiny form factor, about the size of a double deck of cards.

When the experiment is finished, I can use this new Pi to act as a better server for learning web technologies, so the investment has use beyond the lottery number crunching hobby. I can also put the GPIO board on the older Pi and learn some IoT electronics.

The use of a loop to iterate the lists has one more advantage... I can set the start and end points, therefore a limited test can be run to check the validity of the software BEFORE getting into the 10 billion loops. 

Did I mention that the Pi can run headless? That means no monitor, keyboard or mouse required. Once set up I can log in remotely with a program called PuTTY to check progress and download results.

At the end of this, the largest coding project I have ever undertaken, if it fails to produce hits it will still have been worth the effort since the puzzle pieces learned here can be used to solve other problems. I sort of look at this as an exhaustive backtest. A brute force effort to run through 280 billion possibilities (all possible iterations of the replacement scenario of which mirrors are but one) and I can truly say I have tried everything in the attempt at a single pick straight shooter system.

Entry #275

Considering a break in play while the new program is built and tested.

After some thought, I think it is best to hit the pause button on the dailies until the next phase is complete.

The spreadsheet has already proven that it can pick a winner, and knowing there is more work to do which will require funding outside of playing tickets makes it seem like a good time to put daily play on the shelf while development progresses.

The Raspberry Pi setup will cost $170. At $8 a day, it would take 22 days of play to cover it. Will continue when the coding and program runs are done.

The draw histories will need to be maintained, the code needs writing and testing, and I will still continue with the match 6.

The spreadsheets are already done and the dual recursion framework will be applied.

Not sure how long it will take, but not cutting any corners... full on software engineer mode!

Entry #274

Best List Of Ten Billion Observations Tested: aka BLOT BOT

Gotta give a new system a name, with that the acronym BLOT BOT jumped out. That is 10 billion observations per column of 28 columns across entire draw histories, and the core developments have already begun. 

The biggest decision now is whether to pick up a Raspberry Pi 5 or to run it in the cloud.

I have what I call a $5 server on a cloud plan that costs around $5 a month. Mostly used for testing different Python scripts. I wrote a script that took samples every minute of the network fees on the Stellar cryptocurrency blockchain for an entire week. The usage cost was about $3. The results gave an interesting set of data about fee spikes, so if you were to do transactions, you could see when the cheapest times of day were.

I may go either way, but I kind of want to get a Raspberry Pi 5 anyway, to practice web development on the back end as well. I have a Pi 3 that I used as a server for some Ruby on Rails tutorials.

Actually, the divide and conquer approach could be employed... run the day games on the Pi and the night games on the server! That would cut the run time in half!

Currently working on the list iteration part, the use of pandas data frames from the follower system will handle the data storage for the history files.

One internal recursive function will handle the iterations and top 5 lists, and another will process each column for the zero counts.

Running from the command line will allow the print function to overwrite and stay on one line. The status will show the currently processed column and list, so when I monitor the output I can gauge the progress.
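The same-line status trick relies on the carriage return: `\r` moves the cursor back to the start of the line, and `end=""` suppresses the newline, so each print overwrites the last instead of filling the screen buffer. A small sketch (the game name and loop are made up):

```python
import time

def status_line(game, current_list):
    # '\r' returns to the start of the line so the next print
    # overwrites the previous status in place.
    return f"\rgame={game}  list={current_list}"

for i in range(5):  # tiny stand-in for the billions of iterations
    print(status_line("pick3_mid", i), end="", flush=True)
    time.sleep(0.01)
print()  # move to a fresh line when finished
```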

Long road ahead, but the wheels are currently turning!

As my favorite sidekick ChatGPT says at the end of every session... Happy Coding!

Entry #273

Planning the ultimate search for the top replacement values.

I had started with raw follower data, moved to a sequentially indexed "mirror" system, then moved to followers within that indexed system.

The basic premise, given the digit in the last draw, which one has the highest history of making a match?

With follower data plugged in, so far the zero count (where there is a match) is the highest, but there is yet to be a hit.

The next thing to do would be to check ALL possibilities of replacement values for each digit to find the best one for each column of each game.

For the basic mirror system, the replacement scheme is

0=5

1=6

2=7

3=8

4=9

5=0

6=1

7=2

8=3

9=4

On one side is the index for the last drawn number, on the other, the number to replace it with.
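As a Python dict, that table and the substitution step could look like this (a minimal sketch; `apply_mirror` is a hypothetical name):

```python
# Key = digit from the last draw, value = its replacement.
mirror = {0: 5, 1: 6, 2: 7, 3: 8, 4: 9, 5: 0, 6: 1, 7: 2, 8: 3, 9: 4}

def apply_mirror(last_draw):
    """Replace each digit of the last draw using the lookup table."""
    return [mirror[d] for d in last_draw]

print(apply_mirror([3, 7, 1]))  # -> [8, 2, 6]
```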

The total possible combinations of replacement values comes out to ten billion!

So, why not test them all?

For each column, that means recording the number of "hits", and that will result in running through 28 columns for the mid and eve PA pick 2 through pick 5 data, 280 billion iterations of a recursive algorithm in total... 

How do we do that without running out of memory? By only recording the top 5 sets for each column.

How to do it? Lists!

Starting with [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] and ending with [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

Each iteration will append a zero count, so one of the top 5 output lists might look like [5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 750]. Each of these lists will be stored in a "list of lists" sorted by the appended zero count. When a new list is compared against the current top 5, if it has a higher zero count it goes into the list and the lowest count gets popped off.

The output will be the sorted list of lists, containing the 5 highest zero counts. 140 total lists. The zero counts will allow sorting into the top 5 without needing to store them all.
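A plain sort-and-pop version of that running top 5 might look like the sketch below (hypothetical names, made-up zero counts). The zero count rides along as the last element of each list, exactly as described above:

```python
def update_top5(top_lists, candidate_with_count):
    """Insert the candidate, sort by the appended zero count (last element),
    and pop the lowest so only the best 5 survive."""
    top_lists.append(candidate_with_count)
    top_lists.sort(key=lambda lst: lst[-1], reverse=True)
    if len(top_lists) > 5:
        top_lists.pop()  # drop the lowest zero count
    return top_lists

top = []
for count in [120, 750, 40, 300, 510, 95, 640]:
    # Same replacement list each time here, just to exercise the ranking.
    update_top5(top, [5, 6, 7, 8, 9, 0, 1, 2, 3, 4, count])

print([lst[-1] for lst in top])  # -> [750, 640, 510, 300, 120]
```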

The main design challenges...

Writing a list generation loop to generate the lists.

Looping through the history files and counting the "hits", and appending that number to the current list.

Sorting and ranking logic to maintain a running top 5 list of lists.

Perhaps writing output to a csv file in case the program crashes.

Setting up a system to run continuously. (Raspberry Pi version 5 should do the trick)

Fitting the whole system into my current recursive framework.

Will take some time and small tests to get the logic right. But that is the ultimate rabbit hole dive for any replacement system.

Happy Coding!

Entry #272

How this new system actually operates

The system has 2 components, a Python script and spreadsheets. There is a spreadsheet for each of the 8 games (P2, P3, P4, P5 mid and eve).

There are also 8 .csv files containing a copy of all of the draw histories. This is what feeds the Python script with data. This script reads in the data and processes each column to find out what the follower distribution is for each digit.

There are 28 columns to process across the 8 games. The output is a distribution frequency for each column for each of the 10 digits. The inner functions run a total of 280 times over a total run time of 45 seconds. The output is used in the spreadsheet lookup tables.

Moving to the spreadsheets, the draw dates and draw history runs down the far left. Then it is the estimation function, which is the "guess" that is done by looking at the last draw and replacing the numbers with the corresponding values in the lookup table.

Moving over, there is a hit counter that checks to see if the guess matched all digits in the next draw. This is followed by the error function, which tells me how far off each guess was, and in what direction. A -1 means you guessed one too high, and a zero means a match.

Finally is the lookup tables for each column, populated with the 10 digits and the 10 replacements. Entering the replacement values allows me to see how the zero count changes.

In order to play the system, I have a picture of each of the lookup tables so picks are as easy as opening up the lottery website and processing the last draw of each game.

The values in the lookup column are composed of the most frequent follower for each digit based on the output of the python script.

Since follower data changes slowly, it is not required to run an update (roughly an hour process) before every draw. 

I was going to post an image of the pick 2 mid table, but the blog won't take it (probably too big).

This system should suffice until I graduate in July and get some free time back. I probably learned more about actual coding on lottery projects than I did in classes, with the exception of the intro to programming class where I discovered Python was my favorite language.

The areas I am looking to improve are:

1. Updating the draw histories by parsing the PA lottery RSS feed or scraping their results pages. Shame they can't just have an API like some other states; JSON data is much easier to handle in Python.

2. Making the python script auto populate the lookup tables on the spreadsheets.

3. Coding an Android app where I can set up all of the tables, pull in the last draws and generate picks for all games with one click.

In a nutshell my system takes follower data and reduces it to a simple substitution of the last draw digits as easy as one would apply a mirror system.

Happy Coding!

Entry #271

Plugging follower data for all numbers into a "mirror" type replacement strategy

It only took a short time to modify the existing python script to loop through all digits for all pick n games and gather the top followers.

Took a bit longer to run through the process of counting the zeros (hits) in each column, but because the follower data created so many extra zeros compared to the modified mirror system it looks like a valid upgrade of the existing software.

The pick 2 was the obvious start point; by the time the first game (mid day) was done I had established an updated workflow that reduced the previous update process by 80%. Script run to finished spreadsheets for all games in less than 1 hour. The longest process was draw history updates and making pick 2 .csv files for the Python part.

The year recap so far...

January saw the follower script in its original form that had to be run daily to get picks. This system was winless for the majority of the month.

End of January through yesterday saw the new "mirror" system in play. This greatly reduced the need for updates and also picked a winner straight on the pick 3.

Today... the merging of the 2 systems. Keeping with the machine learning concept of seeking a global minimum for each column (highest number of zeros) and the ease of use of the mirror system, while integrating the observed frequency of follower data. This will be tested throughout the rest of this month and March.

That pick 3 hit basically returned all of my money spent on both previous systems and will still fund the first 10 attempts of this new system.

Still looking for that elusive pick 5 straight hit. I believe that this progression will get me closer to that goal.

The ultimate refinement of this software structure would be to go through all 10 billion iterations of possible lookup table values but I realize my coding ability is not quite there yet. One day...

Happy Coding!

Entry #270

First "hot week" passes with no hits...

No wins, but reevaluation of the parameters leads to a shorter duration (4 days rather than 7) and only going 5x on the pick 5, 2x on the others.

Hoping to get the sheets updated this weekend and re run the zero counts. 

Follower table python script should be ready to mix in the testing.

Ever forward...

Entry #269

System first hit! And it was not the pick 2...

Mid day draw in PA was 0 2 2, my pick for that draw was also 0 2 2. This is the first hit of the system! $1 pays $500.

This will trigger (and fund) a "hot week" where the plays are 5x for the next 7 plays (not days as I don't play 7 days a week).

I can also take back what I spent so far and finish up this month on house money.

The reality check is that this may be the only win produced by this system, but it also could have produced zero, so the work put in was justified outside of just learning experience.

$42 a week for 4 weeks (counting the Match 6) means $168 spent so far. A "hot week" has a price tag of $280, leaving $52 of house money. I shouldn't really count the match 6 in this system because that is played on a ticket good for 26 days that was already paid for and is good until the first week of March, but I am counting out of pocket expenses to include all tickets. So that $52 will go further as it will just buy daily games.

The easiest way to play is to keep the win on a voucher and use that to play until I get down to the amount spent then cash it in.

Here is hoping the other games heat up during the hot week!

Happy Coding!

Entry #268

The plan for incorporating follower data into the current system.

There are many steps, so may as well make myself a checklist...

1. Update ALL of the draw files, from pick 2 mid to pick 5 eve

2. Rewrite the main section of a copy of the python program (cheapest revision control ever) to input the "last draw" as all zeros through all 9s. This is the way to see all follower data at once. With a proper loop, this can be done in one big run. Output reduced to a distribution list, with the top being the most frequent. This will require updating CSV files for all games, as well as adding the pick 2. Long step!

3. Modify the spreadsheets where I track the zero counts of each possible lookup table. Currently there are 10 that can be applied to day or night. This mod would add 1 follower table for day and one follower table for night.

4. Re-run the zero count tests incorporating the follower data. If a follower group outperforms the other groups on a particular column in a particular game, then it becomes the new lookup for that column. Zero count is the goal, so if they are not the highest, they get rejected and the work was for nothing but coding practice.
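Step 2's follower gathering, looping over every digit at once, can be sketched with one Counter per digit (hypothetical names, made-up data):

```python
from collections import Counter

def follower_distribution(column):
    """For each digit 0-9, count which digit came next in the history."""
    followers = {d: Counter() for d in range(10)}
    for prev, nxt in zip(column, column[1:]):
        followers[prev][nxt] += 1
    return followers

column = [3, 8, 3, 1, 3, 8, 0, 3, 8, 8]  # made-up single-column history
dist = follower_distribution(column)
print(dist[3].most_common(1))  # most frequent follower of digit 3 -> [(8, 3)]
```

`most_common` then gives the top follower for each digit, which is what feeds the lookup tables.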

To date there have been exactly zero wins, but I have stuck with it so far. Starting for this week tonight.

Yesterday was the last for the most recent match 6 ticket, so playing a new one tonight also (for 26 draws).

Any big jackpot games will be a la carte, as they will be infrequent in terms of the year.

The current system maps the digits 0 through 9 based on the last draw. I am currently cycling through 10 combinations to dial in the zero count, with 2 more planned. For that map, there are actually 10 billion combos! What I should do is write code that can capture the data I am gathering manually and run a full fledged zero count on all 10 billion possible values for that side of the lookup table... not sure how to approach that one YET, but it is definitely on the back burner!

How do I get 10 billion? Just like the odds for the pick 3... 3 numbers that can each be from 0 to 9 (10 digits). So the possible combos equal the odds. 10 x 10 x 10 = 1,000 (10 to the third power).

For 10 positions each having 10 possible values, that becomes 10 to the 10th power = 10,000,000,000.
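The same arithmetic, checked in Python:

```python
# Pick 3: three positions, ten possible digits each.
pick3_combos = 10 ** 3
print(pick3_combos)  # -> 1000

# Lookup table: ten positions (digits 0-9), ten possible replacements each.
total_tables = 10 ** 10
print(total_tables)  # -> 10000000000

# 28 columns across all games multiplies that out to the full back test.
print(28 * total_tables)  # -> 280000000000
```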

Granted some will be impractical, like 0 0 0 0 0 0 0 0 0 0 or similar ones that eliminate too many numbers, but that would reflect in the zero count.

I am thinking of a loop which increments each position in a list, performs a substitution, counts the zeros, and places the list with the current high zero count into a variable.

The running output of the program could be the number of the loop iterator, using an "overwrite the same line" trick in Python to get an accurate running count without running into screen buffer overflow issues, with a final output of the list(s) that scored highest. After that, take the top 10 and plug them into the existing system and run the manual tests as before.

No idea is ever done; there are always tweaks, improvements, and what-ifs to keep one occupied. My concept of modular coding, with even the spreadsheets being flexible enough to model basic machine learning algorithms, will keep me busy for some time... until the next "big idea". Except for wins, I got what I wanted in a lottery system:

1. Easy estimation function

2. Implementation of an error function

3. Use of the data in the error function to alter the estimation function with a goal to column zeroes instead of past wins.

4. Super easy to play on the go. No need to pull out the laptop before every draw (a limiting factor in the original follower script).

For years I have tried and scrapped many system ideas, some were just bad and some failed to be fully realized because I did not really know how to write programs (or decent macros). No more! The modular (reusable) code base grows with each new experiment. By the time I finish my last three classes, I should be able to do rapid prototyping of new ideas because I will get TIME back.

20 years later and still searching...

Happy Coding!

Entry #267

What would success look like?

Systems and strategies come and go, I have worked on several over the years.

With any system, what does success look like?

On a new addition to my system, the simple pick 2 is a 1 in 100 chance of being right. One hit could be merely coincidence. How many hits would be needed, in what time frame, to consider the system a success?

It is different for jackpot games as that only needs to work once.

This is where I fall back to profitability.

I am currently playing all PA daily games both mid and eve draws, roughly 4 days a week. Since I can use that to calculate total possible loss, I can also use that to calculate success scenarios.

Breaking it down to individual games, the success scenario would be as follows...

Pick 2

To cover its own cost would need 11 wins this calendar year.

To "carry" the whole system it would need 77.

Pick 3

To cover its own cost would need 2 wins this calendar year.

To carry the whole system would require 8.

The pick 4 would require but 1 to carry the whole system, as would the pick 5.

The match 6 would require 4 [5 of 6] hits to carry the whole system, though that game also has the n of 18 3 line count, which could pay up to $2,500. A jackpot win on that would be life changing of course, but also cover the system.

The first week brought zero wins, so to date the system has not been effective, but it is only the beginning of February.

So I think I would define success to be any combination of wins across all games that cover the expense of playing to be good, and any combination that also generates a profit to be great.

Seems overly simplistic, considering the odds of all games are as posted for 1 pick per game.

Pick 2 is 1:100

Pick 3 is 1:1,000

Pick 4 is 1:10,000

Pick 5 is 1: 100,000

Match 6 jackpot is 1:4,661,272 (according to the PA lottery website) but is actually 1:13,983,816. They adjust the odds because you get 2 extra quick pick lines for the one you play, making those odds a more realistic 3:13,983,816.
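Assuming Match 6 draws 6 numbers from a field of 49, which is consistent with the 13,983,816 figure above, the true combination count falls straight out of `math.comb`:

```python
import math

# Number of distinct 6-number combinations from 49 balls.
combos = math.comb(49, 6)
print(combos)  # -> 13983816
```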

Each of the above odds are present in each game no matter how many times I play the system.

So, got a way to pick, got a betting strategy and budget, got an idea of how it might be measured as successful, all that is left to do is win...

Now if only PA would stop picking the wrong numbers!

Entry #266

The purpose of each game in the wagering system

The lottery "plan" for 2024 is covering a total of 9 games...

PA Pick 2 mid and eve ($50 on $1 straight) - the purpose is to increase exposure to playing on their money. No box plays.

PA Pick 3 mid and eve ($500 on $1 straight) - while also a great opportunity to play on house money longer, it will also trigger a "hot week", where the wagers go from $1 straight to $5 straight across the pick n games. No box plays.

PA Pick 4 mid and eve ($5,000 on $1 straight) - just 1 hit covers the entire cost of the system for the entire year AND leaves a decent profit. Allowing for $1 box play if and only if the pick is 2 pair. $800 ($400 on each 50 cent ticket)

PA Pick 5 mid and eve ($50,000 on $1 straight) - the ultimate goal! Allowing for box play if and only if the pick is 5 unique digits ($425 on a $1 box)

PA Match 6 (minimum $500,000 jackpot if 1 winner) - a daily chance at life changing money. This one is played in a 26 day rotation, as that is as far as PA lets you play this one in advance. The ticket cost is $2, so that is $52 per ticket a little over 14 times per year. Most of the time the small winners either defray the cost of the next ticket or cover it entirely.

Part of the plan, including box tickets under strict conditions, is in place because the amount is high enough to trigger a hot week, but under the $600 "claim form" requirement.

With a plan in place, a total budget can be made.

The Match 6 will cost $730 in this leap year. $14 per week.

The daily games estimate is based on a worst case scenario of all picks triggering box plays for a total daily cost of $12, which on a worst case scenario 5 days of play per week instead of the usual 4 places the potential cost at $60 per week.

Therefore the high end of the potential budget will be $3,850 for the year.

This can be eliminated by just 1 straight hit on the pick 4 or pick 5 games, one jackpot hit on the match 6 (or four 5 out of 6 matches), or 8 hits on the pick 3.

If I were to eliminate all but the goal of the pick 5, that cost would drop significantly to $520 for the year at 5 days a week, but what fun is that?

Since I already know the total potential cost, I only need to track wins this time and not expenses as with past systems. Also, without a clear plan, I never stuck with any of the older systems more than a few weeks.

This is my comprehensive plan for 2024. A new picking system (error corrected mirrors) and a new wagering system with a play strategy and a known potential loss. Keeping in mind I usually only play 4 days a week and most pick 4 and pick 5 selections do not meet the box criteria, I know I will not reach that maximum -$3,850 loss potential even with zero wins. (which, unfortunately is the expected scenario in a one pick straight shooter system where a rare hit is coincidental at best)

Luck will still be the most elusive independent variable. 

Happy Coding, and best of luck in your picks!

Entry #265

Thinking about budgets when it comes to the lottery

The only constant I had last year was playing the Match 6. I spent $728 on tickets last year and won about $1,200, with $1,000 coming from a 5 out of 6 hit. I will pay ~$250 in tax for that win, so the net profit was $222.

This year, the same strategy for the match 6 will cost $730 (leap year). The cost of the system I have developed, extended to a high scenario of 5 days a week would project to an additional $2,912 for a total planned budget of $3,642 (high end). That would require a pick 4 straight hit or 8 pick 3 straights to cover.

Looking at annual expenditures is an eye opener for sure. But I think that amount is not that bad to have a daily chance at no less than $500,000 on the match 6 and at least 4 chances a week at $55,550. I don't know how to make it cheaper.

What does your budget look like?

Does that seem excessive or am I in the ball park with most other players?

Entry #264