hypersoniq's Blog

Suspending play until the next system is worked out.

The odds versus what the followers actually eliminate create too many losses between hits. Even sticking to the pick 2 only is yielding nothing.

The next 4 days will be the last test stretch of the follower system in any form: 8 picks for 8 games (4 mid, 4 eve... total cost $8).

The system under development may take quite some time to prepare, so I suppose I am heading back to the dormant phase for play for the foreseeable future.

3 weeks to go in the current class, then a week off, then my final class before graduating with a BSCS on August 28th... an 8-year journey at part time, but that was the only way to pursue a degree while working full time.

What makes the upcoming vector system interesting is that the actual draw numbers are not directly used...

From draw to draw there exist 10 possible outcomes when dealing with digits. When you pick a follower, for instance, you immediately eliminate 90% of your choices.

The draw was a 2, the follower is a 6... you pick 6, the next draw is a 5... you lost. You can learn that it is not always the most frequent number that ends up following, but you are left with little more than an error number.

What we are looking for with vectors are such things as information from the previous draw that could be used to predict the next vector.

Is there a connection with the angles?

Is there a connection with the magnitudes?

Can an equation be built that takes the last few vectors as input?

It will no longer rely exclusively on the numbers; they are just used as start and end points on the grid.

If you imagine the grid as the digits 0 to 9 on the +Y axis and the draw dates on the +X axis, the whole problem space is confined to the first quadrant. The horizontal distance between draws is always 1. A repeat would have an angle of 0 degrees and a magnitude of 1. A 4 to a 5 would rotate counterclockwise from the x axis and result in a positive angle. A 6 to a 5 would rotate clockwise and have a negative angle. If the last draw was a 9, the grid constraints prevent a positive angle to the next draw. If the last draw was a 0, the grid constraints prevent a negative angle to the next draw.

This can all be easily visualized by plotting the graph.

There are 100 possible digit-to-digit transitions from any previous draw to the next, but many produce the same vector: a 5 to a 4 is the exact same vector as a 1 to a 0 or a 3 to a 2, since only the difference between the digits matters (leaving 19 distinct vectors, one for each difference from -9 through +9).
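For what it's worth, the geometry described above is easy to sketch in Python (the function name here is just for illustration):

```python
import math

def draw_vector(prev_digit: int, next_digit: int) -> tuple[float, float]:
    """Return (angle_degrees, magnitude) for the move from one draw to the next.

    The x-distance between consecutive draws is always 1; the y-distance is
    the difference between the digits, so only that difference matters.
    """
    dy = next_digit - prev_digit              # -9 .. +9
    angle = math.degrees(math.atan2(dy, 1))   # positive when the digit rises
    magnitude = math.hypot(1, dy)             # a repeat -> angle 0, magnitude 1
    return angle, magnitude

# A repeat: angle 0, magnitude 1
print(draw_vector(5, 5))  # (0.0, 1.0)
# 5 -> 4 is the exact same vector as 1 -> 0 or 3 -> 2
print(draw_vector(5, 4) == draw_vector(1, 0) == draw_vector(3, 2))  # True
```

Plotting those tuples over a draw history would be the visualization mentioned above.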

The end goal is the same as for every other system I have worked on (and subsequently abandoned): one single best guess.

Designing systems is fun, trying out systems is also fun... constantly seeing no positive results is not.

Back to the sidelines...

Entry #303

Thanks to LP member Caspernina for the angle idea!

Caspernina's analog pen-and-paper work is inspiring this next project.

The draw-to-draw angle is one component of a vector, which is made up of an angle and a magnitude (length).

This should be a fun exercise because of the tools that can be used... trigonometry, physics, maybe even a bit of calculus in the mix.

Sometimes a bit of inspiration is all that is needed, so thanks again!

Entry #302

Artificial Neural Networks

The more I learn in class, the less likely it appears that Artificial Neural Networks will be applicable to picking winning lottery numbers.

I still have 2 weeks left and then my final class in AI, but under the hood it is a bunch of summations and activation functions, biases and averages. Good for solving certain problems, but not ours.

Big game matrix changes are designed to keep us data starved, and small games with large histories don't fit the proper input format because of the near uniform distribution.

If someone wins with a neural net, it will be as coincidental as any other system.

It is disappointing to look inside black-box tech, but it will save tons of time otherwise spent going down fruitless rabbit holes.

Speaking of fruitless rabbit holes, I am making progress in the "thought" phase of working with draw-to-draw angles... more accurately, draw-to-draw vectors. Will soon be drafting a flow chart...

Entry #301

What exactly are we missing in prediction?

We have experimented with frequency, highest/"hot" and lowest/"cold". Yet somehow most of the results come from the numbers between them on a distribution chart.

Computing can only take us so far if we are not asking the right questions... but what are those questions?

For maybe 10 draws I have tried both the most frequent AND least frequent numbers on the PA pick 2 with no wins. (Least frequent were paper plays, but still...) One number here and there.

Still at the flowchart stage of developing a program to look at draw to draw vectors... there is a free program called Dia that is like a CAD program for flowcharts, decision trees and entity relationship (ER) diagrams that has helped with planning a project before coding.

It is currently beyond me to sort out why a near uniform distribution like a pick N lottery picks from the "middle" (and not always the middle) on a daily basis... but I am trying to figure that piece of the puzzle out.

System play stays at the pick 2 until a win, which seems like it will never get to the next level.

1:100 should not be this difficult, right?

Entry #300

RStudio, a free Integrated Development Environment

R is an older programming language that treats mathematics, and in particular statistics, as a first-class feature (no need to import a ton of libraries as in Python).

RStudio is a free IDE for R.

Is it easy to learn? Not really, but anything worthwhile has a learning curve.

My class in Data Mining and Machine Learning is done in R.

If you are curious about statistics, I highly recommend this free tool. It even reads the same csv files that Python does!

More to come on this...

Entry #299

An idea... angles!

If we were to create a graph with all 49 numbers of the PA Match 6 on the Y axis and the draws separated by a fixed distance on the X axis... and this were put into a grid-like formation... then angles could be determined from draw to draw.

I am just trying to formulate some sort of direction for working on jackpot-style games, where multiple numbers are drawn without replacement...

From the 6 numbers in the last draw to the 6 numbers in the next draw, there would be 6 angles from each number, for a total of 36 angles per draw.

If each number drawn had its own color, then we would have 6 distinct visible pathways from draw to draw. Each line segment would be a vector, having an angle and a length. This could allow trigonometry to be added to the mix!

Not sure yet where this will lead, but for now the objective is to create this data and then graph it, so that is a starting point. Since the angle and distance are fixed once computed, a tuple seems like the right data structure to hold the data.

Written to a csv file would be the date and the 6 numbers of each draw, followed by the 36 tuples of vector information between that draw and the previous draw. This should allow for graphing all or just part of the data, such as a date range or just the low or high numbers...

So the needs would be

1. A script to batch encode the data from the history file

2. A script for graphing data from the resulting csv file. Thinking a GUI for this.

3. A script to append next draws to the data.
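A minimal sketch of the batch-encoding idea in step 1 (the function name is hypothetical, and it assumes each draw is a list of its 6 numbers, plotted with a fixed x-spacing of 1 between draws):

```python
import math

def draw_to_draw_vectors(prev_draw, next_draw, spacing=1):
    """Return the 36 (angle_degrees, magnitude) tuples between two Match 6 draws.

    Each of the 6 numbers in the previous draw gets a vector to each of the
    6 numbers in the next draw: 6 x 6 = 36 tuples.
    """
    vectors = []
    for a in prev_draw:
        for b in next_draw:
            dy = b - a
            vectors.append((math.degrees(math.atan2(dy, spacing)),
                            math.hypot(spacing, dy)))
    return vectors

# Hypothetical pair of consecutive draws, just for illustration
vecs = draw_to_draw_vectors([3, 7, 19, 24, 38, 46], [1, 7, 12, 30, 41, 49])
print(len(vecs))  # 36
```

Each draw's row in the csv would then be the date, the 6 numbers, and these 36 tuples flattened out.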

That seems like enough to work on for now, while I figure out the analysis portion as I go...

The possibilities... vector analysis, triangular patterns... same goal, one best guess.

Entry #298

Now what?

So, since my big idea fizzled out, I think I will return to Python and finally figure out how to read PA's pathetic RSS feed and turn it into a way to automatically update all my draw files. I think I will also include the current Cash 5 matrix just to have the data in case a jackpot game idea comes along.

So, I guess play-wise I will start back with the followers, because out of the 10,000,000,000 possible replacement strategies... they were #1.

Using the pick 2 system limits out-of-pocket expenses to $2 per day played; the other games have to wait until a hit to get action... how can hitting a 1:100 game prove so difficult?

I firmly believe they could have a pick 1 and I would be wrong 90% of the time...

Parsing arcane data structures from the PA lottery and making them usable in python to write output to 12 different csv files... good times 🙄

Happy Coding!

Entry #297

What was learned...

After analyzing the top 3 lists in the one pick 2 game that ran, not only was the follower list the top list, but lists 2 and 3 were already sitting in 2nd and 3rd place in the follower output's distribution anyway.

I learned that the math behind the followers was already the answer to the question of how to generate the most column matches in a game.

The logic behind followers, i.e. which number most often follows any given number drawn, was the only choice, as it gave the most frequent follower for each digit. The top performing list of the ten billion was therefore the one with the most matches, and that is all based on frequency.

So I basically saved enough money by not playing to cover the cost of the Raspberry Pi 5. The downside is that I suspected this outcome all along. Follower frequency analysis was no magic wand... it only takes you from the expected 10% chance on a number to maybe 14% best case.
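The core follower logic, the most frequent follower for each digit, can be sketched in a few lines of Python (the sample history here is made up for illustration):

```python
from collections import Counter, defaultdict

def follower_table(digits):
    """For each digit, find the digit that most often follows it in the history."""
    followers = defaultdict(Counter)
    for prev, nxt in zip(digits, digits[1:]):
        followers[prev][nxt] += 1
    return {d: counts.most_common(1)[0][0] for d, counts in followers.items()}

# Hypothetical single-column history (one digit per draw, oldest first)
history = [2, 6, 5, 2, 6, 1, 2, 6, 0, 3]
print(follower_table(history))  # in this sample, 2 is most often followed by 6
```

Running one table per column of a game's history gives the per-digit replacement list.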

Today I will start with a clean slate on the pick 2 per the revised betting strategy. Not expecting much, as so far I have not won the pick 2.

What did I learn outside of the answer I had a feeling that I already had?

1. Code optimization. I learned plenty about what kind of overhead python can have, particularly with variable typing.

2. Code conversion. I had a successful conversion from Python to C.

3. Connecting remotely to another device, transferring files and controlling from a remote SSH connection.

4. How to use, debug and tweak the GNU C compiler. I only had one segmentation fault, caused by a missing memory cleanup at the end of the program.

5. How to recreate the functionality of a pandas data frame in C.

It was an interesting few months for sure. Who would have thought that the original program I wrote for followers already held the answers. I do not have to run the full script on all of the games, because the program I wrote back in January, which runs in Python in about 90 seconds, already gave me visibility into all 3 of the top lists.

Because I play around 4 days a week, this will actually be cheaper than the match 6 game which I play every day (by playing a ticket for 26 draws at a time)

It was fun, but now I need ideas for the jackpot games... I can write fast python programs and now can convert those to C if they require heavy calculation... I am just out of ideas!!!

Gladly entertaining all wild speculations and ideas!

Entry #296

Rebooting the follower replacement system with a newer, cheaper strategy.

So, as mentioned earlier, the whole out of pocket expense comes from the pick 2... that's it. Any other plays will be based on a pick 2 hit.

After working out some scenarios, it seems like the better bet is to completely eliminate the pick 4, even on their money.

Also, I want to mix in the pick 5 faster, so the 4 draws after a pick 2 win would be:

Eve... 1 pick 2, 4 pick 3's, 1 pick 5 ($6)

Mid... 1 pick 2, 4 pick 3's, 1 pick 5 ($6)

4 draws at that rate is $48, so the last $2 from a pick 2 win would go to the next "only pick 2" play.

That puts the whole system as the cheapest yet, well under the estimated $730 match six ticket throughout the year.

That's all I have for now.... still no clue where to start to analyze the BIG games... deflated yet again...

Entry #295

C speeds things up by roughly 4,000x... the verdict is in.

After the C conversion, the speed boost was roughly 4,000x over Python, and guess what the top list was... it exactly matched the list I generated from follower data...

So for the first hypothesis we can say that you might be able to bump your luck up from 10% to 14%, which means we probably fail to reject the null hypothesis that number systems do not increase your odds of winning in any meaningful way.

We can, however, reject the second null hypothesis, that the best results do NOT come from follower data: of the 10,000,000,000 candidate lists run against the winning numbers, the best list matched the follower data exactly... I mostly went through this exercise for nothing but the education.

It took 28 hours without overclocking, and that was only analyzing the pick 2 evening data. The same result in Python would have taken 13 years...

On to the next idea... awaiting inspiration.

Happy coding!

Entry #294

Getting a first look at the C conversion of the Python program today.

Having gotten ahead on homework for the class, I am able to carve out a few hours before work to start setting up for the conversion to C.

I will need to start with the include path for the Pandas C headers, which I found in a local folder. I already have an idea for the csv read mechanism, still have to figure out csv writing in C, and also work out the malloc() calls to allocate just the right amount of memory.

I have to explicitly declare the variable types; nothing requires more than a short int. I would go with a char, but the appended match count can be as high as 1,800 (or more).

Whatever manual memory cleanup I need to implement, and whatever pointers I need to iterate through each column, will be the focal points today.

Compiling and clearing errors will probably be the workflow until a successful run of the test program happens. Only then can I take the part of the program that counts to 100, modify it to count to 10 billion, and remove the timers. The outputs will be active as well: I disabled the csv writes for the last timer test but re-enabled them to make sure the tests are complete.

This is one of those times where I am glad I first did a flow chart, so the logic can be converted into another language more easily.

Not going to be a quick journey like developing in python, but I am ready for that.

Entry #293

Pandas can be used in C!!

There are C extensions for the Pandas library! This means no need to reinvent the wheel (or the data frame that holds the draw histories).

That was found in the pandas documentation, so I just have to download the pandas C extensions and put the directory into the include path when compiling.

This is huge as the initial algorithm can stay intact.

There are also csv file write functions that are functionally equivalent to their Python counterparts.

I am researching the operations to have a better shot at first-time success with the translation from Python to C. Not as scary as first imagined.

Entry #292

The Python to C roadmap...

It looks daunting...

Memory management is the first part that has me concerned. The fix looks to be divide and conquer...

Garbage collection... that should be fun.

Pointers... this will most likely be the mechanism for column traversal.

The solution for the test program is to run individual tests with data of the same length. We would start with the pick 2, as it has the fewest operations, splitting into eve and mid as well.

Production software will be 8 programs with the memory allocations to match the data size of each game.

Runs will most likely be sequential rather than parallel, since that eliminates the possibility of threads vying for the same memory blocks and corrupting the data.

These can be set to execute in sequence using the built-in Linux cron job scheduler.

The order will consider the play strategy which has been modified to represent the least possible expense.

Pick 2 Mid

Pick 2 Eve

Pick 3 Mid

Pick 3 Eve

Pick 5 Mid (the goal)

Pick 5 Eve (the other goal)

Pick 4 Mid

Pick 4 Eve

The new play strategy...

Pick 2 only until a hit.

On a pick 2 hit, next 4 plays are 

Pick 2 x 1

Pick 3 x 5

That is $12 per mid/eve cycle, leaving $2 to get back to just the pick 2

On a pick 3 win we deal in the pick 4 and 5

Pick 2 x 1

Pick 3 x 5

Pick 4 x 1

Pick 5 x 20

That is $54 per mid/eve cycle for the next 4 played cycles... $216 on house money. Then it drops down to 4 cycles of the pick 2 win strategy, for a total cost of $266; taken from the pick 3 hit profit of $2,500, that leaves over $2k.
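A quick Python check of the cycle costs above (assuming $1 per play):

```python
# $1 per play assumed throughout
pick3_win_draw = 1 + 5 + 1 + 20       # 1 pick 2, 5 pick 3s, 1 pick 4, 20 pick 5s = $27
pick3_win_cycle = pick3_win_draw * 2  # mid + eve = $54
pick2_win_cycle = (1 + 5) * 2         # 1 pick 2 + 5 pick 3s, mid + eve = $12

# 4 house-money cycles, then 4 pick-2-win cycles, plus $2 to seed the next pick 2 run
total = 4 * pick3_win_cycle + 4 * pick2_win_cycle + 2
print(total)  # 266
```

The arithmetic holds: $216 + $48 + $2 = $266, all funded by the $2,500 pick 3 hit.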

The ONLY out of pocket expenses will be $2 for the pick 2 cycle! That is a 75% expense reduction in regular play.

It greatly reduces the exposure of the plays on the pick 5, which is the target, but part of the exercise was to develop the system into an entire play strategy that minimizes out-of-pocket expenses while still having the potential for decent profit. If it can't beat 1:100 odds, then there is not much point in going after 1:100,000.

Entry #291

Double checking the math...

I do not have a way of knowing how many operations to expect, but I can compute clock cycles based on the CPU speed and the run time of the test.

As written in Python, the elapsed time of the run was 190 seconds. In one second, a 2.4GHz processor goes through 2,400,000,000 cycles.

So, 190 x 2.4 Billion = 456 billion cycles.

To extrapolate that into the full run, since the test only did the first 100 iterations of 10 billion, we need to multiply that answer by 100,000,000.

The answer is a staggering 45.6 quintillion cycles... again, as written in pure Python.

That is 45,600,000,000,000,000,000 !

That is why the run time estimate is just over 600 years.
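Those figures can be sanity checked in a couple of lines of Python:

```python
test_seconds = 190             # timed Python run of the first 100 iterations
cpu_hz = 2.4e9                 # 2.4 GHz processor
scale = 10_000_000_000 // 100  # the test covered 100 of 10 billion iterations

cycles = test_seconds * cpu_hz * scale
years = test_seconds * scale / (60 * 60 * 24 * 365)

print(f"{cycles:.3g} cycles")  # ~4.56e+19, about 45.6 quintillion
print(f"{years:.0f} years")    # ~600 years
```

Same inputs, same conclusion: the full run is hopeless in pure Python.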

That is why Python, with its seemingly minuscule overhead when running short scripts, is the WRONG tool for the job.

Why?

1. It is interpreted. To have a chance at a full run, this needs a language compiled to native machine code.

2. Dynamic typing... Python infers the data type, which makes it very flexible, but that comes at a compute cycle cost. What this project needs is a static type system where the data types are set explicitly. That will save a ton of overhead versus constant re-evaluation of the same variables.

3. The algorithm is tested and optimized as far as I can take it and still get the desired results.

 

The two leading candidates for a new language for this project are C and Rust. However, C seems like the most likely candidate for the job.

What are the drawbacks of using C?

1. I am not very familiar with C, outside of a few programming exercises in school and using an Arduino, whose sketch programs are C-like.

2. Memory management, including allocation and release, will now be on me rather than an interpreter.

3. I also have to deal with pointers and manual cleanup, since C has no garbage collector.

4. I have to manually create a data frame structure, since I will not have access to the pandas library.

5. I still won't know the run time until the test program is run in C; this could all be for nothing.

There are many challenges ahead, but many have already been met: a working algorithm exists, and the flow chart will be of great use in converting to C. The rest of the system, the spreadsheets for validation and implementation, already exists.

All I can do now is move toward the next solution. I have no idea how long it will take, but giving up when faced with that ludicrous cycle count is not an option... this is fascinating stuff!

The kicker is that even with a successful run, it will still probably not help pick winning numbers. One plus is that the memory situation was already reduced to the bare minimum when deciding to use the Raspberry Pi 5: the generated csv files are small, and the main loop does not hold data from each pass, only a single incrementing integer variable to count matches. When the next column is scanned, the variables reset.

The decision to completely move from Python to C was not taken lightly, and would not have been made if the algorithm did not work.

Might be on that borderline where hobby meets obsession...

Happy Coding!

Entry #290

A formal statement of the problem the project aims to address.

The BLOTBOT project is a thorough attempt to analyze per-digit replacement systems for selecting lottery numbers. One popular variant is the mirror system.

The null hypothesis: there is no statistical advantage to be gained by studying past draws to predict future results using per digit replacement in a pick N lottery game.

The alternative hypothesis is that there IS a statistical advantage to be gained by studying past draws to predict future results using per digit replacement in a pick N lottery game.

Where we are planning to deviate from the scientific method is by exhaustively testing all possible variations rather than sampling.

There is a second hypothesis also being tested.

For this part...

The null hypothesis is that direct follower data will not be the highest performing list in the 10 billion lists possible for each column of each game.

The alternative hypothesis is that direct follower data WILL be the highest performing list in the 10 billion lists possible for each column of each game.

So, with one massive test, we can make an honest attempt at answering both hypotheses.

If the first null hypothesis is not rejected, then there really is no point in continuing on the current path. That would be the indicator to maybe back away from the daily games for good. Not sure yet. It may be the indicator that I am just not smart enough to beat the lottery at their game, and should employ some other techniques like unsupervised machine learning to help find patterns that I fail to see.

On the other hand, working to get to this point has allowed me the opportunity to put some of the theory I learned in classes like algorithm design and software engineering into direct use. I am not the type to refuse to admit I was wrong; I have been studying the lottery for decades and have spent more time thinking about the problem than actually playing.

It is that burning desire to solve problems that kept the chase alive so far. I want to know, even if it means confirming that the chase is a waste of time and I should move on to something else as a hobby. As always, time will tell.

Entry #289