hypersoniq's Blog

Day one of a week long pick 5 test.

Had to tweak the spreadsheets to do a frequency count to know which of the 120 permutations has the highest overall value of hits per digit in place, as that is how the straight combo is picked. So it totals them all and then displays a MAX value of the set to highlight the highest valued combo. A little conditional formatting makes it even quicker to spot.

Pick generation...

1. Run the python script that counts both straight and permutation hits.

2. Open the csv file that holds the numbers, the count of straight hits and the count of permutation hits. Use sorting to first sort by the boxed hits, then by the straight hits. The first number appearing within the results that has 5 unique digits is the pick for the week.

3. Pop the pick into the spreadsheet to both validate the program results and to find the highest scoring permutation based on per column frequency

That then becomes the pick for the week, one for mid day pick 5 and one for evening pick 5.
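Steps 2 and 3 could be sketched in Python roughly like this. It is only a sketch under assumptions: the function names, the (number, straight hits, box hits) row layout and the tiny sample data are made up for illustration, and the real work here is done by sorting the csv and by the spreadsheet.

```python
from collections import Counter
from itertools import permutations

def pick_box_combo(rows):
    """rows: (number, straight_hits, box_hits) tuples, number kept as a string.
    Sort by boxed hits, then straight hits (both descending) and return the
    first number whose digits are all different."""
    ranked = sorted(rows, key=lambda r: (r[2], r[1]), reverse=True)
    for number, _straight, _box in ranked:
        if len(set(number)) == len(number):
            return number
    return None

def best_straight_order(combo, history):
    """Score every permutation of combo by adding up, position by position,
    how often each digit was drawn in that position. Return the best one."""
    width = len(combo)
    column_counts = [Counter(draw[i] for draw in history) for i in range(width)]

    def score(perm):
        return sum(column_counts[i][d] for i, d in enumerate(perm))

    best = max(permutations(combo), key=score)
    return "".join(best), score(best)

# Made-up sample data, only to show the call pattern.
sample_rows = [("11234", 2, 20), ("01234", 1, 14), ("56789", 3, 14)]
sample_history = ["12345", "12354", "21345"]
print(pick_box_combo(sample_rows))
print(best_straight_order("54321", sample_history))
```

The max() over all 120 permutations plays the same role as the MAX cell in the spreadsheet.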

Play each game (mid and eve) with its resultant pick, $1 straight and $1 boxed, for 7 days (the limit of advance daily game bets on the PA self serve kiosks).

The next big thing? Doubtful... but it is a starting point. Why not just focus on the pick 3? Because the pick 5 is the goal! Anything learned during this experiment series can be directly applied to the pick 3 later...

At $28 per week I will probably not go on long with zero results, so here is hoping this system catches an early hit... day one was not so lucky, 2 numbers in the mid day and 3 in the evening, but hey, it wasn't zero!

There are not high hopes because the event of 5 unique digits only covers a little over 30% of the possible outcomes. I do think that this is the cheapest way possible to have a proper go at it though.

Entry #351

The lure of the PA pick 5

Here in PA, they started the pick 5 (aka Quinto) for both day and evening on the same date.

Same game, same rules, DIFFERENT pick mechanisms... day is computer based while the night still uses the ping pong ball mechanical method.

Pick 2 was done the same way, on the same day. The pick 3 and pick 4 have years of history on the evening games over the later introduction of the computer generated day game counterparts.

When looking at the particular case of unique digits...

Pick 2 has 90 of the 100 possible combos made of 2 unique digits, and since each unique combo has only 2 permutations, the set of possibilities can be set at 45. See how the house advantage works... if you play all 45 combos at $1 boxed, the cost is $45 with a 90% chance of winning $25.

Pick 3 has 720 unique combos, when considering the 6 permutations, this comes down to 120. Playing all 120 at 50 cents costs $60 for a 72% chance of winning $40.

Pick 4 has 5,040 unique digit combos, and when considering the 24 permutations reduces thus to 210 combos for roughly a 50% chance at winning $100 by spending $105 on 50 cent box tickets.

Finally, the pick 5 has 30,240 unique combos, permutations reduce this list to 252. Playing all of these would cost $252 boxed to return $425, however... unique digit draws only cover 30% of the outcomes... always a house advantage.
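The numbers in the last four paragraphs can be reproduced with a few lines of Python. This is a sketch, assuming the ticket prices and box payouts are exactly the ones quoted above; the GAMES table and the function name are only for illustration.

```python
from math import comb, perm

# (digits, ticket price, box payout) as quoted in the post.
GAMES = [(2, 1.00, 25), (3, 0.50, 40), (4, 0.50, 100), (5, 1.00, 425)]

def unique_digit_summary(n, price, payout):
    ordered = perm(10, n)         # straight combos with n different digits
    box_sets = comb(10, n)        # the same combos once order is ignored
    coverage = ordered / 10 ** n  # share of all draws with n different digits
    cost = box_sets * price       # cost of boxing every set once
    expected = coverage * payout  # average amount returned for that outlay
    return ordered, box_sets, coverage, cost, expected

for n, price, payout in GAMES:
    ordered, box_sets, coverage, cost, expected = unique_digit_summary(n, price, payout)
    print(f"pick {n}: {ordered:>6,} straight, {box_sets:>3} boxed, "
          f"covers {coverage:.1%}, cost ${cost:.2f}, average return ${expected:.2f}")
```

In every game the average return comes out lower than the cost of covering all the unique digit sets, which is the house advantage in one line.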

So, obviously this indicates that there is no sure fire way to get an angle on the game without eliminating combinations.

On the surface, there is no way to tell if the system is genuine or if they gravitate to paying out the low bets. On one hand they limit the sale of combos... it has happened many times where a pick 3 like 777 was "sold out". They will never publish the sales data.

We have the history, but the history of random numbers is not supposed to yield any actionable intel because it is random.

Every lottery system is based on assumptions, from the simple workouts to the big game wheels. So here is where the hypothesis starts... the system in PA could be set up to mostly pay out the low played numbers because it is run by humans and humans are greedy... If you remember the infamous 666 drawing where latex paint was injected into the balls, PA does not have a stellar track record of integrity.

So now we turn our eyes back to the history files, only using it to see winning draws as the result of this manipulation. Following the money instead of just the raw data... they DO publish the payouts for the daily games... this could be used to take a guess at the low played numbers. Look at the difference in a payout for a 777 vs just about any non triple... this is where I completely missed the boat by focusing on individual numbers vs combos.

So now I have developed sheets and scripts to specifically count hits for combos AND their permutations, which covers not only looking at straight hit trends but boxed as well.

Pick 5 day and pick 5 evening, though drawn by different methods, have similarities...

Each one has a 5 unique digit combo that was drawn 3 times straight... each one has several 5 unique digit numbers that were drawn 14 times boxed. NOT the same combos.

The betting system for this would be as straightforward as the pick 3... pick one combo and run it straight and boxed for a week... cost $28 (for both day and night). Instead of focusing on the top straight combo, playing the combo that has the most boxed hits for each game. The straight ticket will be decided by the overall frequencies of each digit in each position. Hoping to play that as soon as the data is put together. Sure it is twice as expensive as a pick 3 round, but a box hit pays out over 10x vs pick 3 box.

After this I will be sapped of any new ideas, so here is hoping that some good results emerge.

Entry #350

Getting the steps down for a pick 5 attempt.

The list of 5 unique digits can be boiled down to 252 combos. I can count the hits with permutations. This gives a handful of combos that have hit more than others. Each combo has itself and 119 permutations... so, if picking one, which is the right way to play it straight?... the one out of 120...

Perhaps super simple is the way to go... get a digit count for each column and play the combo in question that has the most probable chance of coming out in that order... would not have to leave the spreadsheet for that one...

Might be worth a shot. Looking to update the sheets and give it a test next week. Dropping the pick 3 after 10/31's draw.

Entry #349

Let's see if my math is "mathing"...

The pick 5 in PA is a combination of numbers from 0 to 9, that is 10 possible digits in each of the 5 places...

Fact 1... there are 100,000 possible combos. Based on 10x10x10x10x10 = 100,000. They can be from 00000 to 99999

Fact 2... IF we are only interested in the combinations made from 5 unique digits, our choices grow smaller with each successive position... 10x9x8x7x6 = 30,240 combinations of 5 unique digits.

Fact 3... Since each combination of 5 unique digits contains itself and 119 permutations then that leads to there being a key set of combinations that would cover each eventuality, boxed... that would be 30,240 / 120 = 252

Does that sound right?
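One way to check the three facts without trusting the formulas is to brute force it: list every combo from 00000 to 99999, keep the ones with 5 different digits, and group those by their sorted digits. A small sketch (the variable names are just for illustration):

```python
from itertools import product

# Every possible draw, kept as strings so 00042 stays 5 characters long.
all_combos = ["".join(p) for p in product("0123456789", repeat=5)]

# Draws where all 5 digits are different.
unique = [c for c in all_combos if len(set(c)) == 5]

# Sorting the digits gives one label per boxed set.
box_sets = {"".join(sorted(c)) for c in unique}

print(len(all_combos), len(unique), len(box_sets), len(unique) // len(box_sets))
```

The three counts match Facts 1 to 3, and the last number is the 120 orderings (the combo itself plus its 119 permutations).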

It is definitely not a revelation, as there still remain 69,760 combos that could fall on any given night... but it does seem to indicate that if you reduce the pick pool down to 252, you would have a better chance at catching a box hit... of course these need to be counted and I am stuck on exactly that program where all the permutations will be accurately counted (they count now but not accurately).

Does that sound like a reasonable assessment of the pick 5 situation?

I created a section in a spreadsheet where I can enter any 5 digit combo and it will count the number of straight hits, plus sum the hits on the 119 permutations. I am thinking that maybe the top 3 to 5 performers in this abbreviated list of 252 might be worth a shot at playing... but I need to get those accurate counts. Need to work on that program some more today.

I am relatively new to dealing with direct combos, and I have this program working with the pick 3, but not so with the pick 5... straight hits count accurately but it counts too many permutation hits that fail when the number is analyzed on the spreadsheet (for example, one such combo says it has 120 box hits, but when validated it only has 5).
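For comparison, here is one way the counting could be done that avoids generating permutations at all: two numbers are permutations of each other exactly when their sorted digits match. This is a sketch, not the program described above, and it does not claim to explain where that program goes wrong; the function names and the sample history are made up.

```python
from collections import Counter

def box_key(number):
    """All permutations of a number share the same sorted digits."""
    return "".join(sorted(number))

def count_hits(history):
    """history: list of draws as strings, leading zeros kept."""
    straight = Counter(history)
    boxed = Counter(box_key(draw) for draw in history)
    return straight, boxed

def hits_for(number, straight, boxed):
    """Return (straight hits, box hits in any order, straight included)."""
    return straight[number], boxed[box_key(number)]

# Made-up history, only to show the call pattern.
history = ["01234", "43210", "01234", "11234", "12341", "99999"]
straight, boxed = count_hits(history)
print(hits_for("01234", straight, boxed))
```

Subtracting the first number from the second gives the hits on the other permutations only, which is the figure the spreadsheet validation sums up.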

Even the 252 list will only account for around 30% of the draws, but there is a gut feeling that there may be something worthwhile pursuing on this combo track.

Entry #348

Pick 3 permutation counter works, pick 5 version not so much...

The pick 3 permutation counter ran successfully and validating the box hit was easy. The pick 5 version is correct in the straight hit counter, but the box or permutation counts are not passing validation. The program looks like it worked, but the boxed counts do not match the number observed during validation... now to figure out why... I think it has to do with the uniqueness cases... there are 3 possibilities on the pick 3:

3 unique digits 

One pair

Triples

The pick 5 adds to that possibility matrix...

5 unique digits

One pair

Two pair

Three of a kind

"Full house" (3 of the same digit and a different pair)

Four of a kind

5 of a kind.

That is 7 potential outcome patterns vs 3. I wrote the pick 3 version with 3 separate conditional statements to handle each case, and I wrote the pick 5 version with all 7 of the above conditional patterns in mind, so it must be in that section where the disconnect is happening...
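As an alternative to 7 separate conditionals, the pattern can be read off from how many times each digit repeats. A minimal sketch, assuming numbers are handled as 5 character strings; the PATTERNS table and the classify name are only for illustration.

```python
from collections import Counter

# Repeat counts, largest first, mapped to the 7 patterns listed above.
PATTERNS = {
    (1, 1, 1, 1, 1): "5 unique digits",
    (2, 1, 1, 1): "one pair",
    (2, 2, 1): "two pair",
    (3, 1, 1): "three of a kind",
    (3, 2): "full house",
    (4, 1): "four of a kind",
    (5,): "five of a kind",
}

def classify(number):
    """Return the pattern name for a 5 digit number given as a string."""
    shape = tuple(sorted(Counter(number).values(), reverse=True))
    return PATTERNS[shape]

for sample in ["01234", "00123", "00112", "00012", "00011", "00001", "00000"]:
    print(sample, classify(sample))
```

Running every combo through something like this would also give a quick count of how many numbers fall under each pattern, which is handy when validating box counts case by case.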

So will be skipping a planned test of the pick 5 and instead rolling with 095 and 667 again in the pick 3 through Oct. 31

Entry #347

Spreadsheets are super useful at validating program output.

I find myself working on a spreadsheet modification that will allow me to type in any 5 digit number and automatically show the permutation hits as well... this is ONLY for the 5 unique digit case, as this has 119 permutations of the number of interest. This is needed to verify the permutation counts in the python program. I have validated accuracy in the pick 3 version, but the data from the pick 5, though correct on the straight hit count, seems a little too symmetrical to be accurate. The feature may only be used one or two times, but it is extremely important to get accurate data from a program, as this hobby is difficult enough.

Entry #346

I know what I need to learn next... User Interfaces

All of the scripts and spreadsheets sit in a folder on the desktop (and a backup on a flash drive). Each file is either a spreadsheet, a .csv file, or a Python script.

They all require specific knowledge of how they were created and how to use them... there are no slick GUI apps...

I think it is time to learn how to turn a script into a finished stand alone desktop application.

Also, unrelated, I plan on evaluating on line courses to find a thorough crash course in full stack web development. Wanting to take the skills learned in school and apply them to something worth putting on a resume... I don't think 25 lottery specific algorithms are suitable for a portfolio...

Entry #345

While the permutation counter is being built, here are predictions for PA pick 3 for the next week

Based solely on the highest number of hits, here are the next 7 days guess for PA pick 3...

Mid day = 095

Evening = 667

Running these 0.50 straight and 0.50 box.

Still working out the details of interpreting the permutation counts, but these are the most frequent straight combos across the draw history... almost 17,000 night draws and almost 8,000 day draws.

Entry #344

Permutation counter design considerations

As this project moves through the planning stages, it is important to have an idea of the following:

1. Input. This will be the data structure that will be read into the script and stored in a certain configuration within the program that can be used as intended. Here it might be advantageous to use the "list of tuples" structure I have created for the vertical horizon project. There is a specific plus of already dealing with leading zero numbers, because 001 is not the same as 1.

2. Processing. What do we want to do with the data? 

3. Output. This will be the list of ALL combos (1,000 for the pick 3 and 100,000 for the pick 5) followed by their straight hit count, their permutation hit count and their combined total past winnings, a dollar value based on the combined straight and box hits. It will print to the screen so I can be sure it is working, but will also write to a .csv file for further sorting and filtering using spreadsheet tools.
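The input and output ends could look something like the sketch below. The file layout is an assumption (one draw per row, with the digits in the last columns and an optional header), and read_history and write_report are made-up names. The key point is that the digits stay as strings, so 001 never turns into 1.

```python
import csv
import io

def read_history(handle, width):
    """Read draws stored one digit per column and return a list of tuples
    of digit strings, skipping any row (such as a header) that does not
    end in `width` single digits."""
    draws = []
    for row in csv.reader(handle):
        digits = tuple(cell.strip() for cell in row[-width:])
        if len(digits) == width and all(len(d) == 1 and d.isdigit() for d in digits):
            draws.append(digits)
    return draws

def write_report(handle, rows):
    """rows: (combo, straight_hits, permutation_hits) tuples."""
    writer = csv.writer(handle)
    writer.writerow(["combo", "straight_hits", "permutation_hits"])
    writer.writerows(rows)

# In-memory stand-ins for real files, only to show the call pattern.
sample = io.StringIO("date,d1,d2,d3\n2024-01-01,0,0,1\n2024-01-02,9,5,0\n")
print(read_history(sample, 3))

report = io.StringIO()
write_report(report, [("001", 2, 5)])
print(report.getvalue())
```

With real files, open them with newline="" as the csv module documentation recommends. The winnings column is left out here because it depends on the payout table used.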

Logic dictates that the highest dollar value is the best paying combo, and the top combos will change over time, so a betting strategy will need to be in place. I think this lends itself better to playing the combo for a week rather than a day, so the strategy is

Pick 3 .50 straight / .50 box, total cost for a week is $14. Re run the program once per week to get a pick for the next week. Continue until a hit, then add in the pick 5, though here we will be looking specifically at the combos that have 5 unique digits so a box hit stays under the taxable claim form level. That cost at $1 straight and $1 box would be $28. Since 1 box hit on the pick 3 would be $40 or $80 (depending on if there was a pair in the top combo) the pick 5 play will wait until the pick 3 has a box hit.

On a $40 box hit, the 5 will replace the 3, keeping the cost at $28 and holding the $12 to offset the cost of the next week's pick 3. 

A box hit on the pick 5 with 5 unique numbers pays $425, which is $75 less than a $500 pick 3 straight, and no claim form is required as it is less than $600. A win there opens up the possibility to play both the 3 and the 5 for a month and still have some profit left.

So now that the input data structure is decided and I have an idea of what the output should be, I can focus on processing to get the project underway. This will entail counting directly and counting permutations. The process is far from being up and running, so if it turns out that the permutation functions in pandas work better than the functions in python, the data structure may also change a bit, but reading in rows across columns will be the input method, as this means my history .csv files do not require any additional modifications.

Happy Coding!

Entry #343

Next up, pick 3 permutation counter.

Going to create a counter for permutations of the pick 3 combos.

It stands to reason that the combo that won the most times straight is not necessarily the one with the most appearances boxed as well.

This one will count all of the combos for straight hits, but then also for hits on any permutations. Of course the triples will be kept out, so that is going to count the rest of the possible combos, all 990 from 001 to 998.

This time I am aiming to generate a report rather than just save to a csv file for further processing. (Though that will also be done).

Output format will be sorted by historic winnings...

NNN.   # straight hits.   # permutation hits (boxed)   $calculated winnings based on history

There are libraries in Python that make dealing with permutations orders of magnitude easier than the long and complex spreadsheet formulas.

Why all the extra work? Because this process gets a great deal more difficult when scaling up to the pick 5! 

Pick 3, 3 unique digits has 1 combo with 5 additional permutations... pick 5, 5 unique digits has 1 combo with 119 additional permutations!
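The standard library's itertools is one such library. A short sketch (distinct_permutations is a made-up helper name) that lists the different orderings of a number, including the case where a repeated digit cuts the list down:

```python
from itertools import permutations

def distinct_permutations(number):
    """Return the different orderings of a number given as a string.
    The set removes duplicates caused by repeated digits."""
    return sorted({"".join(p) for p in permutations(number)})

print(distinct_permutations("095"))         # 3 unique digits
print(distinct_permutations("667"))         # one pair
print(len(distinct_permutations("01234")))  # 5 unique digits
```

A number with 3 unique digits gives 6 orderings (itself plus 5), one with a pair gives only 3, and a pick 5 number with 5 unique digits gives 120 (itself plus 119).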

Plus, leaving out the 5 of a kind combos, there remain 99,990 combos to sift through.

Same work flow as prior attempts, perfect the design on the pick 3 then just scale up...

Should be relatively easy, right?

Entry #342

The Pick 3 combo plan.

So, Today I will start looking at the pick 3 with the whole combo rather than by individual columns.

Pick 3 is easy to start with.

1. Count the most frequent number to have appeared straight.

2. Count all of the permutations of the combo fetched from step one, which would be box hits of that straight number.

3. Try it out on 0.50/0.50 tickets for maybe a month.

Different number combo expected from the day and night histories.

I can probably hammer that out in under an hour. There are at most 5 additional permutations to cover when looking at a pick 3 straight number comprised of all unique digits.

This balloons to 119 additional permutations if looking at a pick 5 number, so that will happen later.

The difference between the most frequent combo and the next most frequent combo will determine how frequently the process must be run again.

Next step, write a python program to search the database for the most frequent combo when all permutations are summed. This will obviously take longer, and will auto scale between pick 3 files and pick 5 files.

The only thing new will be testing the viability of box play. I can even create new CSV files to check combo followers distribution... nothing in the code base so far has been truly abandoned.

Entry #341

The big show, setting up the seed test for MM and PB.

After watching the results in the ongoing seed test on the PA Match 6 favoring the 6 highest numbers over all vs the most frequent per column, there are some differences that will need to be taken into account for the next round of seed tests...

1. There is no "bonus" ball in the match 6. All 6 columns were used together to generate the top 6. In the big games, it will be reduced to a field of the first 5 columns, with the bonus ball handled using straight up columnar frequency.

2. There are 2 free QP lines given on each Match 6 ticket that will not be present in the big games. These QP lines factor into the prizes won so far, which is why the seed test must be re run for the mega and PB.

3. There is not as much history under the current big game matrices... this could lead to a data starved situation when using the seeds later to generate lines. Should this condition occur, the decision will be to use a QP number for any missing numbers in a given line. On the kiosks in PA, if you enter too few numbers on a ticket, it will offer to fill in the rest of the ticket with QP numbers.

The tickets are also a greater expense, so the seed test will be limited to the same 10 draws as the Match 6. The one that has the most numbers appearing will move on to become the input seed for a vertical horizon test for that game.

The VH test will be a one off for each game because of the expense of buying 7 tickets vs the usual 1 per draw.

Still struggling to make auto updates happen, but still working on it. It is not just the parsing of the RSS feed, but also the selection of the first draw NOT in history and encoding the data correctly. I realize I have no choice but to manually update the PB and MM databases from where I gave up on them last year, but hopefully I get a breakthrough on the update automation process. I have been reading a free online book titled "Automate the Boring Stuff with Python" that provides some solid starting points for such a task.

With the dailies I am looking into gathering some stats on the whole combos rather than individual numbers in columns. This should be able to open up pathways to not just count winning straight combos, but also to count their permutations to get a top list of number combos that perform the highest, that might help the synchronization disconnect when just looking at individual digits.

Rather than give up again and wait for inspiration, I am going to just push through the idea drought and expand on more coding techniques to extract more info from the code base I have already created.

Happy Coding!

Entry #340

What if we have been taking the wrong approach this whole time?

Attempt after attempt has been made trying to make order from chaos. If anyone has figured it out, odds are that they take that secret to their grave.

What if, instead, a better random number generator were built? Not a pseudo random number generator (PRNG), but a complete source of randomness used to generate the "picks". API calls to random.org, or even taking those and applying irreversible hashing similar to how the Bitcoin block chain uses double applications of the SHA-256 one way hash algorithm...

Set up the parameters for each run, such as 3 digits ranging from 0 to 9 in a pick n game to 6 digits from 1 to 49 in a 6/49 game...
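A rough sketch of what such a generator could look like, using Python's secrets module as a stand-in. That is a swap: secrets draws from the operating system's randomness rather than from random.org, and no hashing step is applied. The function names are made up for illustration.

```python
import secrets

# SystemRandom pulls from the operating system's entropy source
# instead of Python's default pseudo random generator.
_rng = secrets.SystemRandom()

def pick_n(digits):
    """Pick n style game: each digit 0 to 9, repeats allowed."""
    return "".join(str(secrets.randbelow(10)) for _ in range(digits))

def lotto_line(count, highest):
    """Lotto style game: `count` different numbers from 1 to highest, sorted."""
    return sorted(_rng.sample(range(1, highest + 1), count))

print(pick_n(3))          # a pick 3 number
print(pick_n(5))          # a pick 5 number
print(lotto_line(6, 49))  # a 6/49 line
```

Swapping in an outside source such as random.org would only change where the raw numbers come from; the two parameter styles (digits with repeats, or a line of distinct numbers) stay the same.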

Instead of the frustration of dead ends, it is like playing a quick pick, but without the lottery computers taking part in the process...

This is intriguing...

This is achievable...

This might be the next move instead of giving up again!

Entry #339

Seed vs seed in Match 6, day 2

The seed only test (also $28) over 7 days is already showing a bias in favor of picking the top 6 numbers regardless of position vs. the top number in each column.

The first day had 0 hits, the second day saw a $5 hit on the newer method (top 6 over all). 5 days to go.

I will not have a day off until Wednesday, so that will probably be my first opportunity to update the history files of the PB and MM.

After the Match 6 test expires, might be worth a short run head to head seed test on the big games, maybe 5 draws of each.

I wonder if something similar could be applied at smaller scale to the daily games? Now that I have finally expanded from being a columnar isolationist... 

Also cooking up an idea about using hidden Markov models, but that is still far away from any type of test implementation.

Entry #338

Tonight will be the second test on the PA Match 6.

Since the last test ended without a clear cut winner, going to try again tonight. The winnings from the last attempt will make this one cost $10.

Same setup...

Generate seed draws using the spreadsheet

One using the most frequent number by column and one using the 6 most frequent numbers regardless of position.

Run each seed through the vertical horizon script, which will generate a line of the most frequent numbers to appear with each seed digit in its position... this adds 6 lines to each.

Result is 14 separate tickets.
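The two seed methods differ only in how the counting is grouped. Here is a sketch in Python rather than the spreadsheet actually used; the function names are made up, the sample history is invented and uses 3 number lines to keep it short, and the vertical horizon step is not shown.

```python
from collections import Counter

def seed_by_column(history):
    """Most frequent number in each column, taken column by column."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*history)]

def seed_overall(history, size=6):
    """The `size` most frequent numbers regardless of position, sorted."""
    counts = Counter(n for draw in history for n in draw)
    return sorted(n for n, _ in counts.most_common(size))

# Invented 3 number draws, only to show how the two seeds can differ.
history = [
    (1, 4, 9), (1, 4, 9), (1, 4, 9),
    (4, 7, 9), (4, 7, 8), (2, 7, 8), (2, 7, 9),
]
print(seed_by_column(history))
print(seed_overall(history, size=3))
```

One thing to watch with the column method is that the same number could in principle win two neighboring columns, which would leave the line one number short.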

Winner determined by which system of picked lines has the most matching numbers. The QPs (2 per ticket) are not factored into measuring the result.

If this ends up being another test without a clear cut winner, then we will just move into playing just the seed numbers for a week on the jackpot games and see which has the most matches in 2 (MM) or 3 (PB) plays.

Still have some calculating to do regarding the bonus ball concept, but it hopefully provides a direction in which to move forward.

Entry #337