hypersoniq's Blog

The data is the difference

Even in noisy real world bioinformatics data sets there is a structure of biological rules present in the data, even if separated by millions of years of mutations and evolution... lottery data has no guardrails... no underlying truths to build upon.

After seeing it work right in front of me on DNA nucleotides, I know that IF the lottery data could be solved, Markov transition chains would have found it...

Entry #666

$8 on the last PB (1+1)

That has been the only hit so far after dropping out of everything else. The $12/wk budget is the best part... not winning anything is not so interesting. Not wasting my time on dreaming up some system that might help solve the puzzle is super dull... but when you finally see the truth you can't unsee it. My app now is basically a heavily over-engineered QP generator.

Finally got to use a Markov chain in bioinformatics to calculate transition probabilities for Mendelian inheritance. In that domain it is a wonderful tool to have that gets genuine results. I have 39 problems to go in the bio algorithm challenges, and they should be introducing hidden Markov models soon. I never quite got past Markov chains with lottery data, so this will be an interesting learning opportunity.
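For anyone curious what that looks like, here is a minimal sketch of Mendelian inheritance framed as a three-state Markov chain. It is not the code from the challenge. It assumes a single gene with alleles A and a, and that every generation mates with a heterozygous (Aa) partner; the names STATES, P, step and genotype_distribution are made up for this illustration.

```python
from fractions import Fraction as F

# Genotype states for one gene with alleles A and a.
STATES = ("AA", "Aa", "aa")

# Transition matrix: row = parent genotype, column = child genotype.
# ASSUMPTION: the other parent is always heterozygous (Aa).
P = [
    [F(1, 2), F(1, 2), F(0)],     # AA x Aa
    [F(1, 4), F(1, 2), F(1, 4)],  # Aa x Aa
    [F(0), F(1, 2), F(1, 2)],     # aa x Aa
]


def step(dist, matrix):
    """Advance a genotype distribution by one generation."""
    size = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]


def genotype_distribution(start, generations):
    """Distribution over STATES after the given number of generations."""
    dist = [F(1) if state == start else F(0) for state in STATES]
    for _ in range(generations):
        dist = step(dist, P)
    return dict(zip(STATES, dist))


if __name__ == "__main__":
    print(genotype_distribution("AA", 2))
```

Exact fractions are used instead of floats so the probabilities can be checked by hand against a Punnett square.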

Also, bioinformatics will lead to biostatistics, where I will be getting some good foundational practice using the R language.

Sort of hoping to be inspired to bring something back to this domain, but so far I have seen nothing applicable.

I do stand by my decision to cut back from $14/wk to $12/wk... instantly shaves $104/year off the budget and it is somehow less discouraging to have the chance and miss millions vs missing $500 repeatedly... and it only needs to work once.

Still here, still playing, still playing responsibly.

Entry #665

Goofy number properties

Taking any number from 2 to n digits, 

1. A sample with 37621, sum the digits (19)

2. Subtract the sum from the original number (37621-19) to get 37602.

3. Sum the digits in the result (18); the sum will ALWAYS be a multiple of 9!

Tried this with up to 9 digits and it works every single time.
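Here is a quick sketch for checking this in Python instead of by hand. The reason it always works is that every power of 10 leaves a remainder of 1 when divided by 9, so a number and its digit sum leave the same remainder, and their difference is divisible by 9. The function names digit_sum and nine_check are just picked for this example.

```python
def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(d) for d in str(n))


def nine_check(n):
    """Return (n minus its digit sum, digit sum of that result)."""
    reduced = n - digit_sum(n)
    return reduced, digit_sum(reduced)


if __name__ == "__main__":
    print(nine_check(37621))  # the sample from the steps above
    # Brute force over every 2 to 5 digit number.
    print(all(nine_check(n)[1] % 9 == 0 for n in range(10, 100000)))
```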

But what, if anything, can this be used for?

Entry #664

So what may be the next frontier for lottery prediction?

Straight statistical analysis does not work because the history is random. Bias, if it exists, is so slight it avoids repeatable detection.

Most of the systems in the systems thread are variations on the same type of theme... even the "new idea" is 5 years old...

There is not much going on in the math forum.

Outside of a few interesting posts, it is as if there is a lull in the truly novel approaches...

Which leads to the original title... what could possibly be next?

Entry #663

Quick win on a rosalind.info tree challenge

Crazy how the strange tree theorems I learned in college have an actual applied use outside of homework...

So the problem sets up this long-winded explanation of unrooted tree theory and wants you to determine the total number of internal nodes in the tree... for an unrooted binary tree with n leaves, this is n-2... that's it! So basically I downloaded their challenge number, subtracted 2 then resubmitted... this was one of those moments where the CS degree came in handy. I could not imagine having to come at this from the other side, Bio to CS... I am picking up the biology on an as-needed basis.

I also learned that the Python pow() function accepts a modulus as a third argument... sometimes these biological problems would quickly flood your RAM if you are not thinking of efficiency from the start. Many of the combinatorial solutions want you to use modulo 1,000,000 to keep from freezing your CPU.
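A small sketch of the three-argument pow() in action. The example counts the subsets of an n-element set (2 to the power n) modulo 1,000,000; that particular use, and the names MOD and subsets_mod, are just an illustration and not taken from a specific challenge.

```python
MOD = 1_000_000  # the modulus mentioned above


def subsets_mod(n, mod=MOD):
    """Number of subsets of an n-element set (2**n), reduced modulo mod.

    pow(base, exp, mod) reduces at every step, so the intermediate
    values never grow beyond the size of the modulus.
    """
    return pow(2, n, mod)


if __name__ == "__main__":
    # Same answer as (2 ** 1000) % MOD, without building the huge number first.
    print(subsets_mod(1000) == (2 ** 1000) % MOD)
```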

Entry #662

Found a great biology text!

"Molecular Biology of the Cell" 7th edition by Alberts et al.

Normally $265 for this 1,552-page hardcover text considered the "bible" of cellular biology, Amazon had it at 35% off for some reason... may as well grab it! The research on the field of bioinformatics frequently references this book, as do the complementary fields of biostatistics and computational biology.

It has been amazing moving from the statistics of useless lottery data to a field where the stats have meaning. I am still sticking with Python for the first 103 challenges (33 done now), but the next area, the armory, is where switching to R for proper statistical analysis makes sense.

Combining Python for extract/transform/load and R for statistics and visuals is how they do it in real bioinformatics labs.

Even the lottery app was not a waste as the lessons learned are already making an impact on understanding why these problem sets have biologic importance and how to process and move data through a pipeline from ingestion to final product.

Who knows, once a real exposure to powerful analysis in R moves along, there may be better solutions to this lottery problem that come to light.

Still carrying on with the power ball, one ticket 3x per week, with no good result over the last month... but indeed it only needs to work once!

Book will be delivered tomorrow, $93 cheaper thanks to the "sale"... strange to see a book like this on sale but I am not arguing...

Entry #661

A new way to group pick N digits

We have even/odd, high/low, so why not try something different.

Set one ("open" numbers) = 1, 2, 3, 5 and 7

Set two ("Closed" numbers) = 0, 4, 6, 8 and 9

Why? Why not?

The open numbers have no enclosed loops, the closed numbers each have 1 enclosed loop, but 8 has 2.

Mixes up vs. High/Low and Even/Odd... could even pair them up like mirrors etc...

Inverted...

1   9

2   8

3   6

5   4

7   0
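If you want to play with this grouping in code, here is a minimal sketch. The set contents and the mirror pairs come straight from the lists above; the names OPEN, CLOSED, MIRROR, pattern and mirror are invented for this example.

```python
OPEN = {1, 2, 3, 5, 7}    # digits drawn with no enclosed loop
CLOSED = {0, 4, 6, 8, 9}  # digits with at least one enclosed loop

# Mirror pairs from the table above, made to work in both directions.
MIRROR = {1: 9, 2: 8, 3: 6, 5: 4, 7: 0}
MIRROR.update({closed: open_ for open_, closed in MIRROR.items()})


def pattern(combo):
    """Open/closed pattern of a pick N combo, e.g. (1, 4, 7) gives 'OCO'."""
    return "".join("O" if digit in OPEN else "C" for digit in combo)


def mirror(combo):
    """Swap every digit for its partner in the other set."""
    return tuple(MIRROR[digit] for digit in combo)


if __name__ == "__main__":
    print(pattern((1, 4, 7)), mirror((1, 4, 7)))
```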

Have fun...

Entry #660

2 draws of 3 using home made QP generator

The last 2 draws, Saturday and Monday, I managed to match exactly one white ball in the double draw... nothing on the main game.

On the 24th attempted challenge at rosalind.info I submitted an answer that did not pass. That was the first fail, and there was much to learn from it. Last night I took another read of the problem and finally figured out what they were expecting, a recursive depth first search of the space provided... then I passed it! I have not dealt with recursion or DFS since my data structures and algorithms classes. Another win for choosing to learn bioinformatics/biostatistics is getting the chance to apply literally everything I had learned in school but did not use since.

Entry #659

Power ball control test

Going to give my Python powerball QP generator a whirl for this week (Sat, Mon, Wed). It has the same chance as letting the state fill it in...

Entry #658

The current state of play

Since deciding to go with only the power ball from here on, there have been 0 wins (not even the red ball).

But the value is there... for $4 I get a multiplier I opted into and a second chance on the double draw (which I also opted into). So for the price of $12 a week, I get the license to daydream.

Still using the app I made to try different picks, but the plan has been simple, pick a line through the followers and compare their positions on the classifier page.

The slide rule test did not result in anything different, only it was much more work on the spreadsheet side, so back to the app. Super easy, since followers do not work on their own, get 1 pick and play the same combo for 3 draws... then pick a new combo.

The concept of prediction is out the window, the app is more of a statistics displayer anyway... the total pick process is done on the Android version of the app and takes less than 10 minutes (per week!). The only thing that would be quicker would be to add a QP generator to the app. But for the same near zero chance of winning, I will go with what I have.

Lost, however, is the dream of "cracking the code" and "beating the system"... leaving my app as my "magnum opus" of my involvement with trying to impose order on chaos. I have converted all of my former lottery development time into this bioinformatics pursuit. I have solved 18 of 103 coding challenges in the "Bioinformatics Stronghold" at rosalind.info, and the work I had done in python trying to "solve" the lottery problems has been directly applied on several challenges.

It figures that I reach the end of the search just after solving how to automate draw updates... so much for reverse engineering randomness...

Entry #657

The skills built in this hobby were not wasted time

Even though I have seen the writing on the wall about trying to predict lottery numbers as an impossible task, I DID try for myself rather than just giving up. As it turns out, many of the coding skills developed over these last 20 years are directly applicable to this new quest to learn bioinformatics.

The methods used in scraping the PA website for draw updates made working with actual APIs much easier. I was able to develop a FASTA file parser to read bio data in a few minutes and create it as a reusable module. There are a few other common bio data file formats yet to come, and I will be able to incorporate these into a similar solution.
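For reference, a bare-bones FASTA reader can be sketched in a few lines. This is not the module described above, just an illustration under simple assumptions: each record starts with a ">" header line, the sequence may wrap across several lines, and everything fits in memory. The name parse_fasta and the labels in the demo are hypothetical.

```python
def parse_fasta(lines):
    """Map each FASTA header (without the '>') to its full sequence.

    `lines` can be an open file or any iterable of strings.
    """
    records = {}
    label = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if label is not None:
                records[label] = "".join(chunks)  # close the previous record
            label = line[1:]
            chunks = []
        else:
            chunks.append(line)  # sequence data, possibly wrapped
    if label is not None:
        records[label] = "".join(chunks)  # close the last record
    return records


if __name__ == "__main__":
    demo = [">seq1", "ACGT", "TTGA", "", ">seq2", "GGCC"]
    print(parse_fasta(demo))
```

Passing an open file handle works the same way as passing a list, since both yield one line at a time.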

Searching for patterns that do not exist, and using NumPy arrays because they are faster, directly contributes to searching for k-mers and motifs in strings of nucleotides.

While Markov chains were of no use in the lottery, they are literally everywhere in bioinformatics algorithms, from finding the right regions of unknown DNA to predicting how proteins will fold after being built from mRNA.

Also some probability goes into such things as determining the chance of gene expression in the n-th generation of offspring using Mendel's laws.

Path traversals outside of just a CS homework assignment, finding the worst case run scenarios (big O) and planning for memory management when those DNA strings have millions of nucleotides...

Not going to sugar coat it, I have a TON of domain information to learn before I can create anything useful. This will be a challenging time learning this subject and associated knowledge required... thinking I need to brush up on physics, chemistry AND biology along the way... but it feels like a worthwhile pursuit.

And I will still play an occasional ticket, it just won't be prefaced by months or years of seemingly useless development to tell me what I already knew... it is just not possible! One ticket in a game is all it takes to have the license to daydream, and the methods used to choose numbers can get much simpler.

Entry #656

First Picks from "Slide Rule"... PA Pick 2

So, the system is easy to follow in the spreadsheet. The 1/1/2026 back test had no winners (as expected) for the evening, but did manage a straight hit in the mid day... remember, purely coincidental!

Without further buildup, here are the picks for the next 7 days (including today) for the Pennsylvania Pick 2, both mid day and evening... let's see how they do...

MID
04/09/26 4 6
04/10/26 4 4
04/11/26 8 7
04/12/26 4 2
04/13/26 4 8
04/14/26 6 9
04/15/26 1 0

 

EVE
04/09/26 6 1
04/10/26 1 6
04/11/26 5 2
04/12/26 3 0
04/13/26 2 7
04/14/26 5 8
04/15/26 2 7

Entry #655

Working on the "slide rule" system...

So the goal is to develop a sequence of steps, simple to follow, and NOT based on any math or statistics foundations... it will be PURELY coincidental, and cheap by design.

So, step 1 would involve an offset... you need to go back a certain number of draws in each column BEFORE you start counting specific digits. Some initial stagger would be helpful, and could be introduced here. The first option that comes to mind is going back 11 draws in the first column of the game then adding 1 for each subsequent column... this would look like (counting the most recent draw)

Pick 2, go back 11 draws in column 1, 12 draws in column 2

Pick 3, also go 13 draws back in column 3

Pick 4, also go 14 draws back in column 4

Pick 5, also go 15 draws back in column 5

11 as a start is arbitrary, but establishing some kind of order early on makes it easier to generate picks.

In addition to the most recent draw, you will also need to look at the draw before, as this will determine our counts for step 2...

Step 2, starting from your offset position, use the draw before the most recent one to decide how many occurrences to count, and the most recent draw to decide which digit to look for... like so

If the last 2 draws were

6 4

7 1

From the offset, find the 7th time a 7 shows up, counting up. Wait, why the 7th time if the previous number was 6? Because we need to handle the edge case of a zero where the 6 is! A zero would leave nothing to count, so we get around that with zero based counting, where a zero indicates the first occurrence. In practice, add one to each digit of the previous draw:

6 4 (plus one on each) = 7 5

7 1

So counting up from each offset, we find the 7th 7 in column one and the 5th 1 in column two.

The most extreme edge case, the draw before last is 0 9...

That becomes

0 9 (add 1 to each) = 1 10

7 1

So you find the first 7 above the column one offset, and the tenth 1 above the column 2 offset.

Step 3, copy the cell you ended up on plus the 7 cells below it (8 total cells) for each target and paste them on a different sheet, side by side. The top number will be an exact match for the most recent draw.

Step 4. Put the date of the most recent draw beside the top row of your newly formed data. If it was last night's pick 2, then that is where you start. Drag the date down to the next 7 rows and boom, there are your picks for the next week...

Sample (not real data)

Last draw was 4/8 so

4/9 = 3 4

4/10 = 6 6

...

4/15 = 8 7

You end up with a list of combos and WHEN to play each one. No statistics needed, no math needed, no grid assembly that generates way too many combos, no -ology. Also, no guarantees, no wheeling, no added expense.

There is genuinely ZERO math involved, it is just counting and following directions.

Of course this assumes you have past draw data in a spreadsheet and that the oldest draws are on the top (so it is date ascending as you scroll down)
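The whole point is that no coding is needed, but for anyone who would rather check their spreadsheet work, the four steps can be sketched in Python. This is one reading of the steps, not an official implementation. It assumes the history is a list of draws with the oldest first, that "going back 11 draws" counts the most recent draw as number 1, and that the offset row itself is included when counting occurrences. The function name slide_rule_picks and its parameters are made up for this sketch.

```python
def slide_rule_picks(history, start_offset=11, horizon=7):
    """Return `horizon` future picks from a draw history (oldest draw first).

    Each draw is a tuple of digits, one per column.
    """
    latest, previous = history[-1], history[-2]
    columns = []
    for col in range(len(latest)):
        target = latest[col]        # digit to look for
        needed = previous[col] + 1  # zero based counting: 0 means the 1st hit
        # Step 1: the offset is 11 draws back for column 1, 12 for column 2...
        row = len(history) - (start_offset + col)
        # Step 2: count occurrences of the target while moving up (older).
        seen = 0
        while row >= 0:
            if history[row][col] == target:
                seen += 1
                if seen == needed:
                    break
            row -= 1
        else:
            raise ValueError("not enough history for column %d" % (col + 1))
        # Step 3: the matched cell equals the latest draw; keep the cells below.
        columns.append([history[r][col] for r in range(row + 1, row + 1 + horizon)])
    # Step 4: put the columns side by side, one combo per upcoming draw.
    return [tuple(column[i] for column in columns) for i in range(horizon)]


if __name__ == "__main__":
    # Made-up history for demonstration only, not real draw data.
    fake_history = [(i % 10, (i * 3) % 10) for i in range(200)]
    for combo in slide_rule_picks(fake_history):
        print(combo)
```

The first combo in the returned list goes with the draw right after the most recent one, the second with the draw after that, and so on.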

Each combo has a 1 in 100 chance (on pick 2; add a zero for each additional column), so the expectation is that you do NOT win, and even if you do, it can be chalked up to coincidence.

So, the play is to bet the combo on the date indicated, NOT playing all of them each draw! On most kiosks they let you play up to a week in advance, one day at a time, so this CAN be done in one trip to the kiosk. If you stick to a $1 bet for each game, the total for a week would be $14 ($7 for mid day, $7 for evening) or, just pick 1 game, then it is even cheaper!

I am going to run exactly one back test per game (mid and eve pick 2 through pick 5), using the draw on 1/1/2026 as the starting point. Then I will run the pick 2 mid and eve for the current draw and post the list here. I will NOT be actually playing... I am sticking to PB for $12/week. And yes, I will be applying this system to the next 3 PB picks (sat, mon, wed). The only difference being that there is no zero in the PB, so zero based counting will not be needed. So technically, it can be used for any game with enough history.

No coding, no fancy formulas, no VBA, no stats... it generates exactly one combo per play. Keeps positions independent, only uses history to grab numbers with zero bias, can be applied to ANY game... checks a good number of boxes!

Be back with the pick 2 test picks in a bit...

Entry #654

First 4 challenges complete on the bioinformatics "stronghold".

This is a fun challenge series.

While there are simple solutions to these problems given their relatively small size, bioinformatics can scale up past your RAM quickly.

I have given myself the extra challenge of making my solutions scalable and mindful of big O (the worst case run scenario). They let you use any language to solve the challenges, and Python is my main one, but I have passed the first 4 challenges using the power of the C language subroutines in the NumPy package.

My answers were the same as the simpler solutions, but they can handle scaling. If I had to deal with 1,000,000 nucleotides instead of 1,000, my solution would not break.

Going to keep that momentum going throughout the challenges.

The stronghold section requires you to create the algorithms to solve the problems... in the following section, the "armory", the challenges are to be solved with existing industry wide software packages like Biopython in python and Bioconductor in the R language.

They briefly touched on the power of NumPy to make big reductions in big O problems, but the majority of what I learned about the practical application of NumPy and efficient coding came directly from the lottery hobby!

Even though this lottery problem was impossible to solve, the skills learned are literally directly transferred to other domains! Markov models are all over the bioinformatics domain.

Who knows, maybe I will pick up something in this pursuit that can be brought back to the lottery domain...

Entry #653

Since picking from the classifiers turned up nothing

This week for the power ball, it will be strictly the top line of the Markov chain followers.

This is also where we see if it is best to skip the MM completely and just go with PB... mostly because the draws of the MM were nowhere near the numbers from the system, while the PB brought a few matches in the white balls (just not any PAYING ones).

The budget will then be set at $12 per week for 3 shots at the jackpot AND 3 shots at the double draw $10M prize... the MM and its $5 for almost a guaranteed 2x multiplier is just not as interesting.

Slide rule tests on the pick 2 will be ready shortly, but this week has been about non lottery coding.

Entry #652