**Benford’s Law and The Lottery**

*“Benford's law, also called the first-digit law, states that in lists of numbers from many real-life sources of data, the leading digit is 1 almost one-third of the time, and larger numbers occur as the leading digit with less and less frequency as they grow in magnitude, to the point that 9 is the first digit less than one time in twenty. This counter-intuitive result applies to a wide variety of figures, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature). It is named after physicist Frank Benford, who stated it in 1938, although it had been previously stated by Simon Newcomb in 1881. The first rigorous formulation and proof appears to be due to Theodore P. Hill in 1988.”* … Wikipedia.org (an online encyclopedia)

According to Benford's Law, the leading digits should be distributed (in base 10) according to the following expression:

**LOG((D+1)/D)**

The “D” in the expression simply means a specific leading digit (1 through 9), and the logarithm is base 10. To see how often the digit “1” should lead, the formula would look like this: **log((1+1)/1)**. This equals 0.3010, which is 30.10%. For the digit “2”, the formula would be **log((2+1)/2)**, which equals 0.1761 or 17.61%.

When we use the formula to calculate the distributions for all of the digits 1 through 9, we get the following table:

| **Digit** | **Frequency** | **Percent** |
|---|---|---|
| 1 | 0.3010 | 30.10% |
| 2 | 0.1761 | 17.61% |
| 3 | 0.1249 | 12.49% |
| 4 | 0.0969 | 9.69% |
| 5 | 0.0792 | 7.92% |
| 6 | 0.0669 | 6.69% |
| 7 | 0.0580 | 5.80% |
| 8 | 0.0512 | 5.12% |
| 9 | 0.0458 | 4.58% |
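For anyone who wants to reproduce the table rather than take it on faith, here is a minimal Python sketch of the same formula. The function name `benford_expected` is just a label chosen for this example; the only thing it does is evaluate the base-10 log of (D+1)/D.

```python
import math


def benford_expected(digit: int) -> float:
    """Expected share of leading digit `digit` (1-9) under Benford's Law, base 10."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be 1 through 9")
    return math.log10((digit + 1) / digit)


# Print one row per digit in the same layout as the table above.
for d in range(1, 10):
    share = benford_expected(d)
    print(f"{d} | {share:.4f} | {share:.2%}")
```

The nine shares add up to exactly 1, because the nine ratios multiply out to 10 and log10(10) = 1.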

As the table implies, the leading digits found in a large sample of statistical data will be very unevenly distributed…with far more numbers starting with a digit one than with a digit nine! As to exactly why this occurs, it can be chalked up to *the probability of probability.*

Benford’s Law is intriguing in its many possible applications. One of its most interesting uses is fraud detection! This is done by comparing the leading digits of a large data series against the expected results as calculated by Benford’s Law. Read the following three links to get a better understanding of what Benford’s Law is and how it’s used. Pay particular attention to how these sites imply that Benford’s Law DOES NOT apply to the lottery. Then come back and read the rest of this post.

**http://plus.maths.org/issue9/features/benford/index.html**

**http://ddrive.cs.dal.ca:9999/page/lvl3/13**

**http://www.accountancyireland.ie/dsp_articles.cfm/goto/1101/page/Fraud_Detection_with_Benfords_Law.htm**

The articles basically insinuate that you shouldn’t be tempted to play combinations that start with the digit 1 just because Benford’s Law says that the digit 1 should occur much more often than the other digits: *“The outcome of the lottery is truly random, meaning that every possible lottery number has an equal chance of occurring. The leading-digit frequencies should therefore, in the long run, be in exact proportion to the number of lottery numbers starting with that digit.”* … **PLUS.MATHS.ORG**

If you look at the frequencies of digits based on the actual digit printed on the balls, then yes, they will appear almost evenly across the entire spectrum, in proportion to the number of combinations that start with that digit. This is a no-brainer to most lottery players. Of course each ball and/or combination will appear almost equally often over a long period of time! Duh!

However, when one realizes that the numbers on the balls are nothing more than simple pictures and that the time between the occurrences of the pictures is more important than the pictures themselves, then one can understand that the *probability of probability* is more controlling to the game than randomness itself is. With that in mind, Benford’s Law DOES apply to lottery games and here is how we can observe it:

First, you must disregard the actual numbers on the balls, at least to the extent that the frequency of the actual numbers is not what you are tracking. In Pick 3, we don’t expect the digit 1 to be drawn more often than the digit 8. We also don’t expect a combo like 137 to be drawn more than the combo 589 just because the combo 137 starts with a leading digit of 1. What we really want to track is the time between hits of the numbers on the balls, or of the combinations they make up! In other words, we want to track the skips, not the ball numbers themselves…this is where Benford’s Law of First-Digits exists within the lotto! It’s really all about hits and skips.

When applying Benford’s Law to skips, we first have to outline the possibilities for the skips. This is pretty simple, since what we are tracking is the leading digit of the game count at which a skip ends. According to the “Law”, we should see around 30.10% of all skips end with a leading digit value of 1 (if what I’m saying about “The Law” applying to the skips is true). So first we need to figure out how many possible skips there are and where they all end.

Let’s suppose that the last straight Pick 3 number drawn was 764. When is the next time it could be drawn? It could be drawn the very next game, which is one game later, or a back-to-back repeat. It could be drawn two games later. Actually, it may very well wait thousands of games or even longer! The point here is that there are certain ranges for skips to end that all contain the same leading digit. I should point out that zero is an impossible leading digit for a skip to end on in this environment. If a straight number hits and then repeats the very next game, it is not counted as a skip of zero. Rather, it is said to have hit exactly one game after its last hit, which is recorded as a 1.

So, for all intents and purposes, the skips will end with the leading digits of 1 through 9. Here is how the ranges for all skips break down in the Pick 3:

### Skips Ending in Leading Digit of 1

1 game later

10 through 19

100 through 199

1,000 through 1,999

Total Possibilities = 1,111

### Skips Ending in Leading Digit of 2

2 games later

20 through 29

200 through 299

2,000 through 2,999

Total Possibilities = 1,111

### Skips Ending in Leading Digit of 3

3 games later

30 through 39

300 through 399

3,000 through 3,999

Total Possibilities = 1,111

### Skips Ending in Leading Digit of 4

4 games later

40 through 49

400 through 499

4,000 through 4,999

Total Possibilities = 1,111

### Skips Ending in Leading Digit of 5

5 games later

50 through 59

500 through 599

5,000 through 5,999

Total Possibilities = 1,111

### Skips Ending in Leading Digit of 6

6 games later

60 through 69

600 through 699

6,000 through 6,999

Total Possibilities = 1,111

### Skips Ending in Leading Digit of 7

7 games later

70 through 79

700 through 799

7,000 through 7,999

Total Possibilities = 1,111

### Skips Ending in Leading Digit of 8

8 games later

80 through 89

800 through 899

8,000 through 8,999

Total Possibilities = 1,111

### Skips Ending in Leading Digit of 9

9 games later

90 through 99

900 through 999

9,000 through 9,999

Total Possibilities = 1,111

In this breakdown, we have a total of 9 leading digits, which gives us 9 distinct groups of 1,111 possibilities each. These 9 groups combine to give us a total of 9,999 possibilities for where any skip of a straight Pick 3 number can end. Realistically, we could extend the ranges for each of the leading digits even further by adding the range 10,000 through 19,999 to the leading digit 1 ranges, then 20,000 through 29,999 to the leading digit 2 ranges, and so on. This really isn’t necessary though because, as far as I know, a Pick 3 straight has made it out past 10,000 games only one time in Pick 3 history.
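The 1,111-per-digit count is easy to double-check. The short Python sketch below walks every possible skip from 1 through 9,999 and tallies the leading digits. The helper name `leading_digit` is made up for this example.

```python
from collections import Counter


def leading_digit(n: int) -> int:
    """Return the first (most significant) decimal digit of a positive integer."""
    if n < 1:
        raise ValueError("a skip is always at least 1")
    return int(str(n)[0])


# Tally the leading digit of every possible skip length from 1 to 9,999.
counts = Counter(leading_digit(skip) for skip in range(1, 10_000))

for digit in range(1, 10):
    print(digit, counts[digit])  # each digit shows 1111
```

Each digit gets 1 + 10 + 100 + 1,000 = 1,111 slots, so the ranges themselves do not favor any digit. Any lopsidedness in the results has to come from how often skips of each length actually occur.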

**Building the Sample**

Testing the Pick 3 for adherence to Benford’s Law is actually a pretty straightforward process. The first state I tested was Ohio. I started from the very first game ever held, which was on 12/03/79, and included all Pick 3 drawings through the date of testing (10/14/06). This gave me a total of 10,626 consecutive games (midday and evening combined). Each game on the list was given a drawing number, starting with the very first game as drawing #1 and the last or current drawing as #10,626. The results were listed in the order they took place. Next, the list was sorted by the numeric value of the straight combos in ascending order (…012, then 013, then 014…etc.) and then by their corresponding drawing number in descending order.

This allowed the skips of each straight to be calculated in another column by using the simple IF command in Excel. Supposing that the combos appear in column F and the drawing numbers in column G, the formula *=IF(F2=F3,G2-G3,G2)* looks to see if the combo in F2 is the same as the combo in F3. If it is the same combo, then it subtracts the drawing number in G3 from the one in G2, effectively giving you the skip of the combo in F2 since the time of its last hit. If the combo in F3 does not equal the combo in F2, then the drawing number for the combo in F2 (found in cell G2) is displayed, because that means it was the first time the combo hit. Note: because of the sort order, the only time the two combos being compared will not match is on the last row of a combo’s block, which holds that combo’s earliest hit. The Fill command is next used to apply the formula to every game on the sorted list. Below is a very small portion of the list and how it appears after it has been sorted and the formula has been applied:
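Separately from the spreadsheet, for anyone who would rather script this step, here is a rough Python sketch of the same skip logic. It is not the Excel method itself: instead of sorting, it remembers the drawing number of each combo's last hit in a dictionary. The function name `compute_skips` and the tiny list of draws are made up for illustration and are not real results.

```python
def compute_skips(draws):
    """Return the skip for each drawing, in order of occurrence.

    `draws` is a list of straight combos as strings, with drawing #1 first.
    A repeat hit gets (this drawing number - previous hit's drawing number).
    A first-ever hit gets its own drawing number, which matches the G2
    branch of the formula =IF(F2=F3,G2-G3,G2).
    """
    last_seen = {}  # combo -> drawing number of its most recent hit
    skips = []
    for number, combo in enumerate(draws, start=1):
        skips.append(number - last_seen.get(combo, 0))
        last_seen[combo] = number
    return skips


# Hypothetical mini-history, oldest drawing first (not real lottery data).
example = ["764", "012", "764", "764", "563", "012"]
print(compute_skips(example))  # prints [1, 2, 2, 1, 5, 4]
```

Note that the back-to-back repeat of 764 comes out as a skip of 1, not 0, which is the same convention described earlier.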

Once the formula was filled to the bottom of the list and all the skips were calculated, I copied the entire “Skip” column (column H) and pasted only the values back over themselves in order to remove the formulas from each of the cells while retaining the values they created. I did this so I could re-sort the list without losing the skip values. Re-sorting the list isn’t really necessary for the test, but I have a slight compulsion to keep things in the order that they occurred in. I re-sorted the list by selecting columns B through H and then selecting Data, Sort, and then choosing to sort by “Game” Ascending. This keeps the list sorted from drawing #1 to drawing #10,626 in order of occurrence and shows the number of games ago (skip) that each combo last hit. Below is a sample of the re-sorted list:

Reading off the list above, look at the evening drawing for 10/12/06 (evening draws are in the lighter blue font). The combo drawn was 563, which was drawing #10,623. In the skip column you can see that it was last drawn 11 games prior. You can verify this number simply by counting up 11 cells where you will see that combo 563 was just drawn during drawing #10,612. Now that the list of skips has been created, all we have to do is count the number of skips that fall into our specific skip ranges. I performed this task by creating a different worksheet that applied DCOUNT functions to the list. Once the counts were completed for each of the nine skip ranges, the totals of each were divided by the total games in the sample (10,626). This gives the percentage of total games (of the entire history) that each of the leading digits of the skips account for. The final step in all of this is to simply graph out the data to see just how closely the game follows Benford’s Law. So how closely does Ohio’s Pick 3 follow it? Look at the graph below and see the results for yourself.

As you can see, it’s pretty close! I must say that after WIN D made the first post regarding Benford’s Law, I knew it would apply to lottery games, especially Pick 3. I was already somewhat familiar with the bias, I just didn’t know exactly how to apply it. Ohio is not the only state whose Pick 3 follows Benford’s Law. Virtually every state that I’ve tested follows it just as closely. Here are a few more:

There was one state in particular that I was just itching to test. After reading how Benford’s Law can be used to detect fraud, I thought it would be interesting to see how closely Indiana’s lottery measured up. There seems to be a lot of suspicion here on LP that the Indiana Pick 3 and Pick 4 are rigged. While I haven’t yet tested their Pick 4 game, their Pick 3 holds up just as well as any of the other states I tested. This surprised me. I was actually hoping to discover (and prove) that it was rigged by using Benford’s Law, which, by the way, is something that’s not supposed to even exist in lottery games. LOL!

Aside from the four states shown above, I have also tested the Pick 3s of Michigan, Pennsylvania and New Hampshire (Tri-State). The graphs for each of these three states look just like those above. Based on these tests, I expect the law to hold in every state’s game, and I would bet that every lotto game, no matter its type or its odds, will also follow Benford’s Law when observing the skips.
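If you want to try the counting step on your own state's history, the tally described earlier (count the skips by leading digit, then divide by the total number of games) can be sketched in Python as follows. Everything here is an assumption made for illustration: the function names are invented, and the draws are randomly generated stand-ins with a fixed seed, NOT the Ohio data or any real results. Swap in your own list of straight combos, oldest drawing first.

```python
import math
import random
from collections import Counter


def compute_skips(draws):
    """Skip per drawing: games since the combo's last hit, or the drawing
    number itself if the combo has never hit before."""
    last_seen = {}
    skips = []
    for number, combo in enumerate(draws, start=1):
        skips.append(number - last_seen.get(combo, 0))
        last_seen[combo] = number
    return skips


def leading_digit_shares(skips):
    """Share of all skips whose leading digit is 1, 2, ..., 9."""
    counts = Counter(int(str(s)[0]) for s in skips)
    total = len(skips)
    return {d: counts[d] / total for d in range(1, 10)}


def benford_share(digit):
    """Expected share under Benford's Law: log10((D+1)/D)."""
    return math.log10((digit + 1) / digit)


# Synthetic stand-in history: 10,626 random straights (NOT real draws).
rng = random.Random(2006)
draws = [f"{rng.randrange(1000):03d}" for _ in range(10_626)]

skips = compute_skips(draws)
observed = leading_digit_shares(skips)
for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.2%}   Benford {benford_share(d):.2%}")
```

The simulated percentages will not necessarily match the graphs in this post or the Benford column exactly. The point of the sketch is only to show the mechanics of the comparison so it can be repeated on real drawing histories.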

There is much more that can be said about Benford’s Law and how it applies to the lottery. I will soon be adding more to this post that will go into more detail explaining exactly why it exists and how it can be used as a possible strategy. I will also be testing how it applies to the individual digits, the pairs and boxed combinations.

**~Probability=Odds in Motion~**