You will need to replace Florida drawings in the tab labeled Draws.

Replace the dates and draws in columns D and H, then adjust the date formulas in column D.

The tab labeled DigitPositionAnalysis has the charts.

This Thanksgiving I want to take this opportunity to express my gratitude to Steve winsumloosesum for his generosity in sharing his professional wisdom & expertise with us so often on LP +

You always made a difference

Wishing you the Best of the Best of continued success!!!

Posted: November 27, 2014, 4:56 pm

Quote: Originally posted by eddessaknight on November 27, 2014

This Thanksgiving I want to take this opportunity to express my gratitude to Steve winsumloosesum for his generosity in sharing his professional wisdom & expertise with us so often on LP +

You always made a difference

Wishing you the Best of the Best of continued success!!!

If you've studied Statistics, one thing you know is that the laws of probability all start with statements like "In a perfect world" and "Given that all things are equal". These are amazing qualifiers. What they are saying is that in order to achieve the exact results we expect, there can be no variables between what we calculate and the conditions of the trials. In other words, we have to take into account every difference between the balls and each trial in order to calculate the exact probabilities of the outcomes.

This is an extremely important observation for the simple reason that we do not live in a perfect world and that not everything will be equal. No matter how diligently the balls are marked and weighed, each one is going to be a little different from the rest. Further, the conditions under which the trial (drawing) takes place will not be identical from one to the next.

There is something of a raging debate on this point, but it remains that every ball has a different weight and balance. Further, the physics of each ball will vary -- one will be more round than another, slightly different in size and have different frictional characteristics. Likewise, the barometric pressure, temperature and humidity will all vary from one draw to the next, etc.

There are, in short, a host of variables we have not taken into account when calculating the probabilities for the game. The bottom line is that what we expect is based on a perfect world we do not live in and, therefore, the results we expect will not exactly match the results we see. In fact, it is the very realization that we do not live in a perfect world that permits us to compare the calculated expectations with the actual events in order to spot trends and patterns.

For what it is worth, the calculated probabilities are called theoretical probabilities because we base them on the theory of a perfect world where all things are equal. The results of our observation of the game, on the other hand, are called empirical evidence (or probability). This raises the obvious question: "How do we find the empirical probabilities in the game?"

This question leads us back, again, to the definition of probability: The number of things we are looking for divided by the number of things we are looking among, or, the number of correct results divided by the number of possible results. Thus, to find the empirical probability of something happening, we go over the history of the game and count how many times the right result took place and then divide by the number of draws we looked through.

For example, suppose we wanted to know the empirical probability of a one ball appearing in the first position. We would go through the history of the game and every time we see the one ball in the first position we make a mark on a piece of paper. When we're done, we count the number of marks and divide by the number of draws in the history. Thus, if we have eighteen weeks of draws (126 draws) and the one ball appeared in the first position 13 times, the empirical probability of a one ball in the first position is 13:126, or roughly 1:10. This happens to be exactly the case over the last eighteen weeks of the New York Daily (the draws from 12/13/94 through 4/19/95).
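The tally described above can be sketched in a few lines of Python. This is only an illustration: the function name `empirical_probability` and the short `draws` list are made up here and are not the New York history.

```python
from fractions import Fraction

def empirical_probability(draws, digit, position=0):
    """Share of draws in which `digit` shows at `position` (0 = first ball)."""
    hits = sum(1 for draw in draws if draw[position] == digit)
    return Fraction(hits, len(draws))

# Invented three-digit draws, for illustration only.
draws = ["105", "532", "117", "940", "163"]

p = empirical_probability(draws, "1")
print(f"{p} (about 1 in {float(1 / p):.1f})")  # 3/5 (about 1 in 1.7)

# The article's own figures: 13 hits in 126 draws.
print(f"1 in {126 / 13:.1f}")  # 1 in 9.7, which the text rounds to 1:10
```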

However, the same does not hold true of all numbers. If we look at the last eighteen weeks in New York, we find that the zero ball has appeared in the first position nineteen times and the five ball has only appeared in the first position nine times. That is, at a minimum, a very significant difference. The empirical probability of a five ball in the first position is 1:14 while the same for the zero ball is 1:6. Over the same eighteen weeks, we find that the six ball has also appeared in the first position just eight times for an empirical probability of 1:15. It, too, is behaving strangely.

For whatever reason, it would appear that in the real world, the zero is a better bet than the five or six as the first ball. In fact, the probability of a zero in the first position is better than that of the five and six combined (1:6 -vs.- 1:7). We have no way of knowing why this is happening. We can only observe the results and make the best possible use of what we find. The difference might be nothing more than the result of the strange things that random events can do, or it may be an indication of a clear difference between the one ball's and the five ball's behaviors. Is there some way to differentiate between "the luck of the draw" and an actual physical difference?

Average and Median

Remember that when we talk about probability, we are talking about what happens "on average". Further, that average is just the number of times we see a particular result divided by the number of trials. Thus, we see the five and six balls in roughly one of every fifteen draws, on average. Likewise, we see the zero ball in one of every six draws.

The average, as we have already seen, is a simple division. It is just the number of times we see the result (frequency) divided by the number of trials. A very good example of average would be talking about the average family income in your neighborhood. You might go around the area asking all your neighbors how much they earn a year. When you are done, you add up all of their answers and divide that by the number of neighbors you asked (we'll say 20 neighbors). Perhaps the average family income in your neighborhood is $40,000 a year.

Suppose, however, that one of your neighbors recently hit the Super Lotto for $5 million. In this case, their income would be about $250,000 a year. Being down to earth, friendly folks, they choose to remain in their home and not let the money change their lifestyle. What does this do to the average income of the neighborhood? Since we had asked 20 people about their income, the lucky family's winnings would make the average income about $50,000 a year in place of the $40,000. Is the average correct and does it fairly report what is "normal" in the neighborhood?

Obviously, the average is misleading us into believing that the families in this area are better off than they really are. There is, however, another way we can look at the problem that may give a more honest answer. We could use the median income in place of the average income. By definition, the median is the value that sits in the middle of the results once they are put in order. Therefore, it offers what may be a clearer picture of "normal". Finding the median is, fortunately, not difficult. However, we cannot do it with a simple equation. Rather, we'll need paper and pencil. We start by writing down the incomes reported by the neighbors in order from the lowest income to the highest income. We then pick the income that is in the middle. In other words, since we asked 20 families their incomes, we take the one that is tenth or eleventh in the list (strictly speaking, the value halfway between those two). As a result, half the incomes reported will be less than (or equal to) the median and half will be more than (or equal to) the median. More often than not, the median will be closer to the truth.
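To see the neighborhood example in numbers, here is a minimal sketch using Python's standard `statistics` module. It assumes, purely for illustration, that the other 19 families each earn exactly $40,000; the article does not give the individual incomes.

```python
from statistics import mean, median

# Assumed incomes: 19 families at $40,000 plus the one lottery winner.
incomes = [40_000] * 19 + [250_000]

print(f"average: ${mean(incomes):,.0f}")    # average: $50,500
print(f"median:  ${median(incomes):,.0f}")  # median:  $40,000
```

The single large income pulls the average up by about $10,000, while the median stays where most of the neighborhood actually is.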

The important thing to realize about the use of the median is that it allows for the unusual or unexpected. In other words, in this neighborhood where one family has an income much higher than the rest, their income does not have a significant effect on the result.

We can also apply the median view to the lottery and, perhaps, answer the question of whether the scarcity of the five and six is just a freak event or if there is a real pattern in their unusual behavior. Likewise, we can use the median to see if the zero ball is really as hot as we think, or also a freak event.

In order to do this, we have to do a little more than just count how many times the zero, five or six appear. We must also keep track of how many draws take place between each appearance. We scan through the history of the game and look for the first five. Once we find it, we count how many draws until the five shows again. Then we start the count back at zero and count how many draws until the next appearance of the five, and so on until the end of the history. Once we've found all the occurrences of the five, we rearrange the results from the shortest number of draws between appearances to the longest. When we're done, we pick the number of draws exactly in the middle of the list. We then would do the same for the six and the zero.
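The counting procedure above might look like this in Python. Treat it as a sketch under assumptions: the helper name `gaps_between_hits` and the sample draws are invented, and a gap of 1 is taken to mean the digit showed in back-to-back draws, which is one reading of "how many draws until the five shows again".

```python
from statistics import median

def gaps_between_hits(draws, digit, position=0):
    """Draws elapsed from each appearance of `digit` to its next appearance."""
    hit_indexes = [i for i, draw in enumerate(draws) if draw[position] == digit]
    return [later - earlier
            for earlier, later in zip(hit_indexes, hit_indexes[1:])]

# Invented history, oldest draw first.
draws = ["512", "034", "577", "190", "268", "345", "501", "599"]

gaps = gaps_between_hits(draws, "5")
print(sorted(gaps))   # [1, 2, 4]
print(median(gaps))   # 2
```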

Remember that probability was talking about the average and tells us how often something happens. Thus, if the probability of a five appearing in the first position was 1:14, we expect to see the five in one out of every fourteen draws. On the other hand, when we worked out the median for the five, we actually counted the number of draws between appearances and picked the one in the middle. The important thing is that both the average and the median are talking about the number of draws between appearances.

If the five is behaving in a settled pattern, the average and the median will be nearly the same. If, on the other hand, the five ran in a streak and then stopped entirely (or vice versa), the average and median will be very different. Likewise, the zero might have turned up in a slew of consecutive draws then stopped drawing. This would account for the higher probability of the zero but mislead us into thinking it was more common than it really is. The key is this: If the average and median are close to one another, the results are likely to be normal behavior. If, on the other hand, they are widely different, the results indicate something unusual took place and the median is a better statement of what is normal.

Returning to the last eighteen weeks of the New York Daily, we find several interesting things about the zero, the five and the six. The zero has both an average and a median probability of 1:6. It is, simply, a hot number. Likewise, the six has an average probability of 1:15 and a median of 1:12. It appears to be a cold number. However, the five has an average probability of 1:14 and a median probability of 1:3! Had we not looked at the median, we might have been badly misled into thinking the five was a bad play while it appears to be one of the better ones in reality.

More than likely, what happened was the five went for a long stretch without appearing for one reason or another. However, according to the median, it does typically appear more often than the zero does. Therefore, the five is actually a better number to play than the zero. Similar revelations of the median clue us in to the fact that the hottest number in the game is not the zero (which occurred most often over the last eighteen weeks) but is actually the five, which is followed closely by the nine (which only appeared 13 times). The nine has an average probability of 1:8 (worse than the zero's 1:6) but also has a median of 1:4 (better than the zero's 1:6). We'll also find that the three (drawn 16 times) appears more frequently at 1:5 than the zero. Further, the seven has been drawn roughly half as often as the zero (at 11 times) yet has a median of 1:6, which is identical to the zero.
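A quick way to see the "settled versus streaky" contrast is to compare the average and median of two made-up lists of gaps. The numbers and the helper name `mean_and_median` below are invented for illustration, and the average gap used here is only a stand-in for the article's "average probability".

```python
from statistics import mean, median

def mean_and_median(gaps):
    """Return (average gap, median gap) for a list of draws-between-hits."""
    return mean(gaps), median(gaps)

settled = [9, 10, 11, 10, 10]  # the digit shows up at a steady pace
streaky = [1, 1, 2, 1, 40]     # a hot streak, then a long drought

print(mean_and_median(settled))  # (10, 10): average and median agree
print(mean_and_median(streaky))  # (9, 1): one long drought skews the average
```

Both lists have nearly the same average gap, yet the medians are far apart, which is the signal the paragraph above describes.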

Frequency and Age

What we've really been talking about, when discussing probability, average and median, is frequency. Again, we've been discussing how often things happen. In other words, the frequency of the zero is 1 in 6 draws, the five is 1:3, the three is 1:5, the seven is 1:6 and the six is 1:12. However, if the seven was drawn less often than the zero but both have the same median frequency, the seven must have gone for a spell without drawing. Likewise, the five has been drawn far less than the zero yet tends to appear more frequently. Was that in the beginning of the eighteen weeks, the end of the period or somewhere in the middle?

We find the answer to that question by determining the age of the seven and the five. In other words, how long has it been since the seven or five appeared in the first position? To do this, we simply start at the end of the eighteen weeks and count draws backwards until the seven appears in the first position and then do the same for the five. We find that the current age of the seven is twelve -- there have been twelve draws since the seven appeared. Likewise, the current age of the zero is eight and the current age of the five is zero (the five appeared in the most recent draw).
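Counting backwards from the latest draw can be sketched as follows. The function name `current_age` and the sample history are hypothetical; an age of 0 means the digit showed in the most recent draw, matching how the article describes the five.

```python
def current_age(draws, digit, position=0):
    """Draws since `digit` last showed at `position`; None if it never did."""
    for age, draw in enumerate(reversed(draws)):
        if draw[position] == digit:
            return age
    return None

# Invented history, oldest draw first.
draws = ["712", "034", "577", "190", "268", "345", "501", "599"]

print(current_age(draws, "5"))  # 0 (first ball of the latest draw)
print(current_age(draws, "7"))  # 7
print(current_age(draws, "9"))  # None (never first in this sample)
```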

Since the median frequency for both the seven and the zero is 1:6, it would appear that the zero is acting about normal (at an age of eight) while the seven is well overdue (its current age is twice the median). In fact, the seven has a current age that is greater than its average frequency of 1:10. From this we can conclude that the seven has recently grown cold. At the same time, we'll find that the five appeared in the most recent draw and has also appeared in the same position in the two draws before it. Thus, the five has recently been very hot in the first position, and the probability of a run of four fives in a row in that position is extremely low (1:10,000 at the theoretical odds). We can be fairly sure that the five is not a good play in the first position for the next draw. However, we might want to consider it again in the very near future.

The clue is found in a comparison of the current age and the median. As the current age grows closer to the median, the number is coming due. As the current age passes the median, the number becomes overdue. Finally, when the current age becomes a multiple of the median (e.g., twice the median or three times the median), the number is a particularly good bet. There are, of course, many other factors that come into play and style will vary from one player to the next.
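As a rough sketch, the age-versus-median rule of thumb could be written like this. The function name `due_status`, the labels, and the exact cutoffs are one possible reading of the paragraph above, not something the article spells out.

```python
def due_status(age, median_gap):
    """Label a digit by comparing its current age with its median gap."""
    if age >= 2 * median_gap:
        return "at least twice the median"
    if age > median_gap:
        return "overdue"
    return "below the median"

# Ages and medians quoted in the article for the first position.
print(due_status(12, 6))  # seven -> at least twice the median
print(due_status(8, 6))   # zero  -> overdue
print(due_status(0, 3))   # five  -> below the median
```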

Statistical Summary

What is important to keep in mind is that we are dealing with events that, while not actually random, are close enough for all practical purposes. All of the "facts" we collect about a game (age, average and median probabilities, etc.) are nothing more than generalizations that serve to describe typical behavior in "the long run". None of these permit us to say with certainty that the next draw will be one thing or another. However, if we make careful use of these statistics, we can find ourselves playing with greater accuracy and winning more often than randomly stabbing at the game.

Nice read, but I am perplexed and disappointed by the last paragraph, 'Statistical Summary'.

The author's preceding paragraphs epitomized 'randomness' (real vs. theoretical, prior and post trial conditions, etc.), so how can he/she make this statement: 'What is important to keep in mind is that we are dealing with events that, while not actually random, are close enough for all practical purposes'?

Randomness is not devoid of the PERCEPTION of PATTERNS; the human brain looks for patterns for the sake of orientation and survivability. Our daily activities are pretty much random within a routine. For example, you drive a certain route to work every day (routine). Do you think the wheels of your car will hit the same spot on the route, and is the pressure of the tires the same even if they hit the same spot? (Conditions differ each time.) We age and change every day (routine); our bodies are different with each passing second (hard to accept for most).

The author is not totally convinced of his own statement, 'dealing with events that, while not actually random'.

What events is he referring to? Real world (random) or theoretical assumptions?

Thanks for posting this winsumloosesum, and thanks to Scott Pie as well!

I appreciate this, because it's a stellar example of the desire and ability to communicate complex ideas effectively, especially for the uninitiated. My favorite college professors were those who could do that. To see a fellow student with an intimidated look in the eyes at the beginning of a semester, and then to see the same student with eyes ablaze at the end was a constant delight for me.

"Study Nature, love Nature, stay close to Nature. It will never fail you."