Testing data for randomness / predictabilityPrev Topic Next Topic

New Topic New Poll

RL-RANDOMLOGIC

United States
Member #59,352
March 13, 2008
5,626 Posts
Offline

Jan 21, 2017, 10:59 pm

Hi all

In the past I have tested many secondary data sets to measure it's predictability using my own tools

but a few weeks ago I ran across a battery of tools that may work better than what I am using. Here

is a link to the NIST webpage http://csrc.nist.gov/groups/ST/toolkit/rng/stats_tests.html The National

Institute of Standards and Technology.

First I would like to know if anyone has a link to a free C compiler that will compile the tools available for

download. I have turbo C++ which is capible of compiling C code but I get a number of errors when I try

to run makefile. I would like to find something that will run the code without making any changes except

the two lines in makefile that need to be linked ti the compiler being used. The setup instructions are

located in section 5 of the PDF which is very well documented. The PDF can be downloaded by clicking the

top link on the download page.

The software can accept a string of ascii such as 111010101001010011110000101010 and evaluate it

for it's randomness. The tool can run all 15 test or just a few depending on ones needs. The 15 test are

shown on the main page with a brief explanation.

I always try to test the data I analyze to see if it's somewhat predictable and these test should be very useful

as I am working on a new predictor but want to know up front what I am up against. Below is a sample of the

data I am attempting to predict/forecast the next value. The two columns of data below will be formatted into

strings, top to bottom for 14 total strings. Each string will then be analyzed using the 15 test to help decide

which prediction algorithm would work best. The date here is far less random then the actual numbers they

represent. What I am trying to do is calculate which type of analysis would work best. I am thinking that some

of the test will indicate the data is random while others will find it's not. Lets say that runs or (1) or runs of (0)

show the data to be random but another test shows some weakness. The predictor code should be built around

the test showing the weakest which should improve effectiveness of the predictor used.

I don't know if others have used this type of analysis before but would very much like to know your thoughts or

results on this method. I guess one could sum it up to making predictions based on the weakest random elements

of the data being analyzed.

RL

0101101 0001111
0011001 0011011
0110010 0101100
1100001 0100100
1000110 0001110
1111111 0111111
0001101 0110000
0000001 1001100
1100101 0011100
0010110 0100010
1010011 0000011
0101100 1010011
0100101 1000001
0111100 0100010
0111010 1001011
1001011 0011011
1011110 0111101
1110000 0001001
1011110 0101100
1011010 0010001
1010100 0001000
0001111 0101101
0010001 0011001
0111010 0111011
0110100 0000011
0000100 0011110
0000101 1001100
1111011 0001101
0110001 0100000
1000011 0001100
0101111 1000111
0011011 0101111
1100001 0000011
0010001 0110010
1100111 0101101
1011101 0000011
0110111 0111100
1000111 0101001
0011000 0111110
1001010 0000100
0011011 0010011
0111010 0100000
0011001 1011101
0011111 0000101
0111000 0110000

....
jimjwright

Crested Butte, CO
United States
Member #69,862
January 18, 2009
1,394 Posts
Online

Jan 21, 2017, 11:40 pm

It looks like the makefile is referencing gcc which is the gnu c compiler so you can try the following link.

https://gcc.gnu.org/

In the past it was a pain to set up gcc on a windows computer but maybe its simpler now. I think you had to install Cygwin first but its been a long time since I installed gcc. I exclusively develop on Mac now for last 3 years. Go to you tube and search for installing gcc on windows if you want to go this route.

I was able to compile as is without any modifications on my Mac computer which has gcc as part of Xcode. A linux installation would also typically have gcc installed.

I am sure you could download the latest Microsoft community edition and enable C++ support and you can tweak the makefile to compile.

https://www.visualstudio.com/vs/community/

Jimmy
garyo1954

Texas
United States
Member #4,549
May 2, 2004
4,228 Posts
Offline

Jan 21, 2017, 11:43 pm

Quote: Originally posted by jimjwright on Jan 21, 2017
It looks like the makefile is referencing gcc which is the gnu c compiler so you can try the following link.

https://gcc.gnu.org/

In the past it was a pain to set up gcc on a windows computer but maybe its simpler now. I think you had to install Cygwin first but its been a long time since I installed gcc. I exclusively develop on Mac now for last 3 years. Go to you tube and search for installing gcc on windows if you want to go this route.

I was able to compile as is without any modifications on my Mac computer which has gcc as part of Xcode. A linux installation would also typically have gcc installed.

I am sure you could download the latest Microsoft community edition and enable C++ support and you can tweak the makefile to compile.

https://www.visualstudio.com/vs/community/

Jimmy

Agree with jim!

I'm probably here unless I'm not.
Dreaming would be a perfectly useless function if it's only purpose was to entertain.
RL-RANDOMLOGIC

United States
Member #59,352
March 13, 2008
5,626 Posts
Offline

Jan 22, 2017, 12:25 am

JW

It's been awhile, I tried downloading the express version of studio 2005 but the link seemed to be dead.

I was hoping to stay away from the gcc but looks like that may be the only way. My borland turbo C++

is older than the hills and think it's too dated. I was sure it would do the job but after editing the makefile

it reported 25 errors. I also tried to compile each test one at a time but not to be. I am thinking of porting

the code to qb64 but need to study it a little more. It would make a nice addon to the CF-251 program and

allow me to test on the fly. I will give the gcc a try but not crossing my fingers, I have spend a couple days

trying to make what I have work without any success. Try to drop in more often, miss your post. Be very

interested in what you think about the overall idea.

RL

....
garyo1954

Texas
United States
Member #4,549
May 2, 2004
4,228 Posts
Offline

Jan 22, 2017, 3:11 am

Quote: Originally posted by RL-RANDOMLOGIC on Jan 22, 2017
JW

It's been awhile, I tried downloading the express version of studio 2005 but the link seemed to be dead.

I was hoping to stay away from the gcc but looks like that may be the only way. My borland turbo C++

is older than the hills and think it's too dated. I was sure it would do the job but after editing the makefile

it reported 25 errors. I also tried to compile each test one at a time but not to be. I am thinking of porting

the code to qb64 but need to study it a little more. It would make a nice addon to the CF-251 program and

allow me to test on the fly. I will give the gcc a try but not crossing my fingers, I have spend a couple days

trying to make what I have work without any success. Try to drop in more often, miss your post. Be very

interested in what you think about the overall idea.

RL

Went ahead and tried Visual Studio Community. Defaults to C#.

After it downloads, installs, does the initial setup, and the initial start up you have the option to install other modules. MO, more headache than any one person should go through.

Go to the Microsoft Download Center. They have Visual Studio 2012 Pro and 2012 Ultimate.

Download Center

I'm probably here unless I'm not.
Dreaming would be a perfectly useless function if it's only purpose was to entertain.
jimjwright

Crested Butte, CO
United States
Member #69,862
January 18, 2009
1,394 Posts
Online

Jan 22, 2017, 4:39 am

Quote: Originally posted by RL-RANDOMLOGIC on Jan 22, 2017
JW

It's been awhile, I tried downloading the express version of studio 2005 but the link seemed to be dead.

I was hoping to stay away from the gcc but looks like that may be the only way. My borland turbo C++

is older than the hills and think it's too dated. I was sure it would do the job but after editing the makefile

it reported 25 errors. I also tried to compile each test one at a time but not to be. I am thinking of porting

the code to qb64 but need to study it a little more. It would make a nice addon to the CF-251 program and

allow me to test on the fly. I will give the gcc a try but not crossing my fingers, I have spend a couple days

trying to make what I have work without any success. Try to drop in more often, miss your post. Be very

interested in what you think about the overall idea.

RL

I modified the makefile to work with Visual Studio 2015 Professional, I assume it will work with Community Edition also. I can share it if anyone needs it.

Visual Studio probably requires at a minimum Windows 7 or later.

As Garyo said once you install Visual Studio Community Edition, you also have to do the following to install C++

https://www.youtube.com/watch?v=U1MRUiJAOq0

I am not familiar with any of the other older C compilers.

As far as your overall goal in theory it is probably worth the research to see if there is any bias/weakness.

Good luck !!!

Jimmy
RL-RANDOMLOGIC

United States
Member #59,352
March 13, 2008
5,626 Posts
Offline

Jan 22, 2017, 7:21 am

JW

Can you send me a link in a PM?. I not sure how much more I can gain using this vs the stuff I wrote but anything

is worth a look. I think the reason I could not download the file was because I was working on a XP machine.

RL

....
Sunglasses

Belgium
Member #173,925
March 26, 2016
1,396 Posts
Offline

Jan 22, 2017, 10:08 am

Quote: Originally posted by RL-RANDOMLOGIC on Jan 21, 2017
Hi all

In the past I have tested many secondary data sets to measure it's predictability using my own tools

but a few weeks ago I ran across a battery of tools that may work better than what I am using. Here

is a link to the NIST webpage http://csrc.nist.gov/groups/ST/toolkit/rng/stats_tests.html The National

Institute of Standards and Technology.

First I would like to know if anyone has a link to a free C compiler that will compile the tools available for

download. I have turbo C++ which is capible of compiling C code but I get a number of errors when I try

to run makefile. I would like to find something that will run the code without making any changes except

the two lines in makefile that need to be linked ti the compiler being used. The setup instructions are

located in section 5 of the PDF which is very well documented. The PDF can be downloaded by clicking the

top link on the download page.

The software can accept a string of ascii such as 111010101001010011110000101010 and evaluate it

for it's randomness. The tool can run all 15 test or just a few depending on ones needs. The 15 test are

shown on the main page with a brief explanation.

I always try to test the data I analyze to see if it's somewhat predictable and these test should be very useful

as I am working on a new predictor but want to know up front what I am up against. Below is a sample of the

data I am attempting to predict/forecast the next value. The two columns of data below will be formatted into

strings, top to bottom for 14 total strings. Each string will then be analyzed using the 15 test to help decide

which prediction algorithm would work best. The date here is far less random then the actual numbers they

represent. What I am trying to do is calculate which type of analysis would work best. I am thinking that some

of the test will indicate the data is random while others will find it's not. Lets say that runs or (1) or runs of (0)

show the data to be random but another test shows some weakness. The predictor code should be built around

the test showing the weakest which should improve effectiveness of the predictor used.

I don't know if others have used this type of analysis before but would very much like to know your thoughts or

results on this method. I guess one could sum it up to making predictions based on the weakest random elements

of the data being analyzed.

RL

0101101 0001111
0011001 0011011
0110010 0101100
1100001 0100100
1000110 0001110
1111111 0111111
0001101 0110000
0000001 1001100
1100101 0011100
0010110 0100010
1010011 0000011
0101100 1010011
0100101 1000001
0111100 0100010
0111010 1001011
1001011 0011011
1011110 0111101
1110000 0001001
1011110 0101100
1011010 0010001
1010100 0001000
0001111 0101101
0010001 0011001
0111010 0111011
0110100 0000011
0000100 0011110
0000101 1001100
1111011 0001101
0110001 0100000
1000011 0001100
0101111 1000111
0011011 0101111
1100001 0000011
0010001 0110010
1100111 0101101
1011101 0000011
0110111 0111100
1000111 0101001
0011000 0111110
1001010 0000100
0011011 0010011
0111010 0100000
0011001 1011101
0011111 0000101
0111000 0110000

You can find c-compilers on Internet, get a free student version that counts for non professional use. I don't use C as I didn't learn it, but it seems to be less complicated than j. What is, where is section 5, which pdf, which link to where? Evaluate binary string for randomness can only be returning its probability. There you are flipping coins, but when the 1111111111111111111111111 happens, don't complain that it isn't in the big middle chunk. Of course you can change the base too, then you get chunks of binary strings. There is an ASCI-table, just which one, what is in it today? 64-bit or 128-bit? Or bigger? Eventually decimal?
lottoswe

Sweden
Member #163,016
January 17, 2015
156 Posts
Offline

Jan 22, 2017, 5:20 pm

Hi RL,

i found something called branch predictor but i don't know to much about it. I try to understand text on wiki. There is two level adaptive predictor sound interesting. If you know something about it, can you explain to me in layman terms how it working?

thank you
RL-RANDOMLOGIC

United States
Member #59,352
March 13, 2008
5,626 Posts
Offline

Jan 22, 2017, 8:58 pm

lottswe

I have played around with many prediction tools and IMHO none of them are useful for predicting lottery numbers.

Random by definition is not predictable and the problem is this. Most prediction tools can work out the the best

solution to play based on the games history but we should not expect random to follow the algorithms output. In

other words the best has nothing to with what will happen next. Yes, some predictors do hit now and then but

even broken clocks show the correct time twice each day.

With all that said I do think ones play can be enhanced and there are many things that will help us play smarter

lines but don't expect any of them to put you in the winners circle anytime soon.

What I have came to conclude is that to make a good prediction we need to work on the the less than random parts

of the data. All data can be broken down into parts and the converted data may contain information that is not random.

It's these parts that can be predicted and while any one may not give much of a reduction in the final output, combining

many of these can have a big effect on the overall possible.

Any data that falls somewhere short of random, predicting it becomes somewhat possible. Lets say we have 60 subsets

of converted data and 20% fall into the weak or less than random category. It's those values that we should be looking

to predict. Here we have another problem, 20% won't reduce the lines to a single prediction so we need many such sub

sets of data. The problem is not insurmountable because we can build more and more subsets to reduce the lines to a

playable amount.

After 25+ years analyzing random this is the sum of my findings. Don't try to predict the unpredictable, work on what is.

All prediction algorithms are useful if matched with the correct data provided the data is not random. To put this in simple

terms is about all I can do as I am not a mathematician. I often find my math skills lacking for many projects and have to hit

the books. I don't use mathematical terminology in my post. I learn what I need to understand the concept but just enough

so that I can code it. Teaching it is beyond my level of understanding.

RL

....
RL-RANDOMLOGIC

United States
Member #59,352
March 13, 2008
5,626 Posts
Offline

Jan 23, 2017, 7:52 am

After a series of test I am thinking the method might work better as a filter. The number of correct

predictions needed to reach the average budget would remove any advantage as any one miss will

remove the top level prize.

RL

....
RL-RANDOMLOGIC

United States
Member #59,352
March 13, 2008
5,626 Posts
Offline

Jan 23, 2017, 3:16 pm

Quote: Originally posted by lottoswe on Jan 22, 2017
Hi RL,

i found something called branch predictor but i don't know to much about it. I try to understand text on wiki. There is two level adaptive predictor sound interesting. If you know something about it, can you explain to me in layman terms how it working?

thank you

lottoswe

I took some time to read the article on the branch predictor and it's not the type of predictor one might experiment

with for lottery predictions. I am sure it could be modified to make some sort of lottery prediction tool but there are

many others that would be better suited to do the job. IMHO the best suited for lottery prediction is the "Association

rule learning" types. The best prediction tool I ever wrote used many association rules that were hard coded which

were then feed into a type of bayesian naive algorithm. Very simple stuff but it worked better than anything else I

ever tried. It stopped working at some point but think the problem was caused by a coding error that I never really

took the time to find. For years I planned to rewrite it but never did.

RL

....
lottoswe

Sweden
Member #163,016
January 17, 2015
156 Posts
Offline

Jan 24, 2017, 12:59 pm

Quote: Originally posted by RL-RANDOMLOGIC on Jan 23, 2017
lottoswe

I took some time to read the article on the branch predictor and it's not the type of predictor one might experiment

with for lottery predictions. I am sure it could be modified to make some sort of lottery prediction tool but there are

many others that would be better suited to do the job. IMHO the best suited for lottery prediction is the "Association

rule learning" types. The best prediction tool I ever wrote used many association rules that were hard coded which

were then feed into a type of bayesian naive algorithm. Very simple stuff but it worked better than anything else I

ever tried. It stopped working at some point but think the problem was caused by a coding error that I never really

took the time to find. For years I planned to rewrite it but never did.

RL

Hi,

as i understand branch predictor collecting history of 00,01,10,11 and have some counter then decide what to choose. In lottery ,in first position is ,let's say this sequence

5,5,4,9,8,2,5,6,1,1,3,2. If 0 = even and 1 = odd then sequence is 1 1 0 1 0 0 1 0 1 1 1 0. Last two are 10 , 0 is first digit in next . There is only 01 or 00 with first digit 0. History counter will decide between 01 and 00.

That's how i see this.

ARL types looks like "prediction professor" Allan Lichtman system with 13 true/false statements. I need to read more about it.

lottoswe
Sunglasses

Belgium
Member #173,925
March 26, 2016
1,396 Posts
Offline

Jan 24, 2017, 2:00 pm

Let's do a game. We play 1 or 0. You bet on the right outcome and you get your bet times 1.1 as win.

Five drawings can be one of those:

00000-00001-00010-00011-00100-00101-00110-00111-01000-01001-01010-01011-01100-01101-01110-01111-10000-10001-10010-10011-10100-10101-10110-10111-11000-11001-11010-11011-11100-11101-11110-11111

If you set up to play another five drawings, it is one of those above. If 00000 was drawn, and you are so pissed off to see no 1 coming up, that doesn't mean that you get a 1 next, no matter what is your imagined expectancy. So, what ever you invent, the outcome can be different.

Let's say, your first bet is 0. I say 1 comes out.
Next you bet 1. I say zero comes out.
... That's what is in your mind and that you get.
lottoswe

Sweden
Member #163,016
January 17, 2015
156 Posts
Offline

Jan 24, 2017, 5:07 pm

Quote: Originally posted by Sunglasses on Jan 24, 2017
Let's do a game. We play 1 or 0. You bet on the right outcome and you get your bet times 1.1 as win.

Five drawings can be one of those:

00000-00001-00010-00011-00100-00101-00110-00111-01000-01001-01010-01011-01100-01101-01110-01111-10000-10001-10010-10011-10100-10101-10110-10111-11000-11001-11010-11011-11100-11101-11110-11111

If you set up to play another five drawings, it is one of those above. If 00000 was drawn, and you are so pissed off to see no 1 coming up, that doesn't mean that you get a 1 next, no matter what is your imagined expectancy. So, what ever you invent, the outcome can be different.

Let's say, your first bet is 0. I say 1 comes out.
Next you bet 1. I say zero comes out.
... That's what is in your mind and that you get.

What?!

You sound like dr san ;-)

New Topic New Poll

Subscribe to this topic

Testing data for randomness / predictabilityPrev TopicNext Topic

Testing data for randomness / predictabilityPrev Topic Next Topic