Double checking the math...


I have no way of knowing how many operations the full run will take, but I can estimate clock cycles from the CPU speed and the run time of the test.

As written in Python, the elapsed time of the run was 190 seconds. In one second, a 2.4GHz processor goes through 2,400,000,000 cycles.

So, 190 x 2.4 Billion = 456 billion cycles.

To extrapolate that into the full run, since the test only did the first 100 iterations of 10 billion, we need to multiply that answer by 100,000,000.

The answer is a staggering 45.6 quintillion cycles... again, as written in pure Python.

That is 45,600,000,000,000,000,000!

That is why the run time estimate is 554 years.
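The arithmetic above can be rerun in a few lines of Python as a cross-check. This is only a minimal sketch: the clock speed, elapsed time and iteration counts come from this post, while the variable names and the 365.25-day year are assumptions made for illustration. Python integers do not overflow, so the full cycle count comes out exact.

```python
# Figures from the post: a 190-second test on a 2.4 GHz CPU that covered
# the first 100 of 10 billion iterations.
CLOCK_HZ = 2_400_000_000
TEST_SECONDS = 190
TEST_ITERATIONS = 100
TOTAL_ITERATIONS = 10_000_000_000

# Assumption: a 365.25-day year.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# Clock ticks that elapsed during the test run.
test_cycles = CLOCK_HZ * TEST_SECONDS

# How many times larger the full run is than the test.
scale = TOTAL_ITERATIONS // TEST_ITERATIONS

# Linear extrapolation to the full run.
full_cycles = test_cycles * scale
full_seconds = TEST_SECONDS * scale
full_years = full_seconds / SECONDS_PER_YEAR

print(f"test cycles  : {test_cycles:,}")
print(f"scale factor : {scale:,}")
print(f"full cycles  : {full_cycles:,}")
print(f"full seconds : {full_seconds:,}")
print(f"full years   : {full_years:.0f}")
```

Scaling the 190-second test linearly gives 19 billion seconds, a little over 600 years, which is in the same range as the 554-year estimate. Keep in mind that these are clock ticks over wall time, not a count of instructions actually executed.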

That is why Python, whose overhead seems minuscule when running short scripts, is the WRONG tool for this job.

Why?

1. It is interpreted. To have a chance at finishing a run, this needs a language that is compiled to native machine code.

2. Dynamic typing. Python works out the type of every value at run time. This makes it very flexible, but it comes at a cost in compute cycles. What this project needs is a static type system where the data types are set explicitly. That will save a ton of overhead compared with constantly re-evaluating the same variables.

3. The algorithm is tested and optimized as far as I can take it and still get the desired results.
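The cost of dynamic typing can be seen from inside Python itself. The snippet below is a rough illustration, not a benchmark: it compares the size of an ordinary Python integer object with a fixed-width 64-bit slot from the standard `array` module, and shows a name changing type between two lines. The exact object size is specific to CPython and the platform, and the sample values are arbitrary.

```python
import sys
from array import array

# A plain Python integer is a full object that carries its type and a
# reference count along with the value.
py_int_bytes = sys.getsizeof(12345)

# array("q") stores signed 64-bit integers back to back, much as a
# statically typed array would, with no per-element object header.
fixed = array("q", [12345])
fixed_int_bytes = fixed.itemsize

print(f"Python int object : {py_int_bytes} bytes")
print(f"fixed-width int64 : {fixed_int_bytes} bytes")

# The type behind a Python name can change from one line to the next, so
# the interpreter has to look it up again on every operation.
value = 7
first_type = type(value).__name__
value = value / 2
second_type = type(value).__name__
print(first_type, "->", second_type)
```

In a statically typed language the compiler knows up front that a counter is a 64-bit integer, so none of that bookkeeping happens inside the loop.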

 

The two leading candidates for a new language for this project are C and Rust. However, C seems like the most likely choice for the job.

What are the drawbacks of using C?

1. I am not very familiar with C, outside of a few programming exercises in school and using an Arduino, whose sketch programs are C-like.

2. Memory management, including allocation and release, will now be on me rather than on the Python runtime.

3. I also have to deal with pointers, and there is no garbage collector to clean up after me.

4. I have to manually create a data frame structure, since I will not have access to the pandas library.

5. I still won't know the run time until the test program is run in C. This could all be for nothing.

There are many challenges ahead, but many have already been met. A working algorithm already exists, and the flow chart will be of great use in converting it to C. The rest of the system, the spreadsheets for validation and implementation, already exists.

All I can do now is move toward the next solution. I have no idea how long it will take, but giving up when faced with that ludicrous cycle count is not an option... this is fascinating stuff!

The kicker is that even with a successful run, it will still probably not help pick winning numbers. One plus is that the memory situation was already reduced to the bare minimum when I decided to use the Raspberry Pi 5: the generated CSV files are small, and the main loop does not hold data from each pass, only a single integer variable that is incremented to count matches. When the next column is scanned, the variables reset.
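The single-counter pattern described above might look roughly like the sketch below. Everything specific in it is hypothetical: the sample CSV, the column names, the function `count_matches_per_column`, and the match rule (a cell equals a target value) are stand-ins, since the real algorithm is not shown in this post. The point is only that each column pass keeps one integer and nothing else.

```python
import csv
import io

# Hypothetical sample standing in for one of the generated CSV files.
SAMPLE_CSV = "a,b,c\n1,5,3\n2,5,3\n1,4,3\n"


def count_matches_per_column(csv_text, targets):
    """Scan one column at a time, keeping only a single integer counter."""
    header = next(csv.reader(io.StringIO(csv_text)))
    results = {}
    for col_index, name in enumerate(header):
        matches = 0  # the counter resets when the next column is scanned
        reader = csv.reader(io.StringIO(csv_text))
        next(reader)  # skip the header row
        for row in reader:
            # Hypothetical match rule: the cell equals this column's target.
            if int(row[col_index]) == targets[name]:
                matches += 1
        results[name] = matches  # only the final count is kept
    return results


counts = count_matches_per_column(SAMPLE_CSV, {"a": 1, "b": 5, "c": 3})
print(counts)
```

Because no rows are stored between passes, memory use stays flat no matter how many iterations the full run needs.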

The decision to move completely from Python to C was not taken lightly, and would not have been made if the algorithm did not work.

Might be on that borderline where hobby meets obsession...

Happy Coding!

Entry #290
