A little while back I scratched two scratch-off tickets that won me $20 and $10. The $10 win came from a Bingo-style game with hidden symbols under a call area. The game grids actually show the same symbol both on top of and beneath the latex, so the latex is simply a convenient way to mark the symbols that match. I started wondering what I could tell about the card before scratching.
There's the obvious: the $75,000 top prize is won by completing a diagonal in a 5x5 grid, so you know that the five symbols in that line cannot also complete another line on any of the game grids; otherwise the payout would exceed the top prize. There are many other such relationships. Multiple wins are possible, but usually only as $10 + $10 or $5 + $5, so some combinations would simply produce totals that don't correspond to any entry on the prize table, e.g. no card will win $9 even though there are both $5 and $4 prize lines. Hence there are a number of constraints that any ticket must satisfy. Of course, it would be very unusual for the symbols to fall in such a way that you could rule out the ticket as a winner of various prizes, so this certainly isn't a reliable method for detecting anything.
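To make the prize-table constraint concrete, here is a minimal Python sketch. The line layout, the symbol names and the set of valid totals are made-up illustrations, not the real ticket's prize table; the only facts borrowed from the game are that a $5 + $5 win exists while a $9 total does not.

```python
from typing import FrozenSet, List, Set, Tuple

# Hypothetical layout: each prize line is (symbols on the line, prize in dollars).
Line = Tuple[FrozenSet[str], int]


def total_prize(lines: List[Line], called: Set[str]) -> int:
    """Sum the prizes of every line whose symbols were all called."""
    return sum(prize for symbols, prize in lines if symbols <= called)


def is_plausible(lines: List[Line], called: Set[str], valid_totals: Set[int]) -> bool:
    """A call set is plausible only if its total is a payout the game offers."""
    return total_prize(lines, called) in valid_totals


# Illustrative data only: three lines and a tiny prize table (0 = losing ticket).
LINES: List[Line] = [
    (frozenset("ABC"), 5),
    (frozenset("CDE"), 4),
    (frozenset("FGH"), 5),
]
VALID_TOTALS = {0, 4, 5, 10}  # 10 stands for the $5 + $5 double win

print(is_plausible(LINES, set("ABCDE"), VALID_TOTALS))   # $5 + $4 = $9 -> False
print(is_plausible(LINES, set("ABCFGH"), VALID_TOTALS))  # $5 + $5 = $10 -> True
```

Any guess at the call symbols that fails this check can be thrown out without knowing anything else about the ticket.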
However, I then realized that this is really a simple substitution code: of the 34 symbols, 22 are in the call area, 1 is in the bonus area, and the remaining 11 are essentially losing symbols that disqualify every line they appear in. When decrypting a substitution code, one uses the frequency of a symbol to assign probabilities as to which symbol it really stands for. Obviously, since the game grids are not a language, there isn't an equivalent method of assigning probabilities. Or is there? To win that $75,000 prize you must get five symbols, and the probability of a ticket carrying that prize is published, in that the total number of tickets printed and the number printed with a $75,000 prize are both known. Writing P(x) for the probability that symbol x is among the call symbols, that diagonal line gives you the equation:
P(a) * P(b) * P(c) * P(d) * P(e) = P($75k prize)
Each of the lines presents such an equation, and such equations can be solved by first taking the logarithm of both sides, which turns each product into a sum:
log( P(a) ) + log( P(b) ) + log( P(c) ) + log( P(d) ) + log( P(e) ) = log( P($75k prize) )
There are 48 such lines that can net a prize, hence 48 equations, and you need 34 equations to solve for 34 symbols. The idea is to rate the symbols by the probability that they are in the call area, take the 12 or 13 least likely to be under the latex, and evaluate the plausibility of the ticket for each of the 12-choose-11 or 13-choose-11 possibilities. The system of equations may not be fully determined, and hence may not be exactly solvable, but you only need to order the symbols by priority, so approximate solutions are acceptable. That means non-linear trial-and-error techniques can be employed, though a 34-dimensional gradient is difficult to envision.
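As a rough illustration of the log-space idea, the sketch below fits log P(symbol) to a handful of line equations by plain gradient descent on the squared error. It is only one possible "trial and error" technique, not a worked solution for the real card: the symbols, lines and probabilities are invented, and the step size (one over the number of symbol entries) is my own conservative choice to keep the iteration stable.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical input: each line is (distinct symbols on the line, P(prize for that line)).
LineEq = Tuple[Tuple[str, ...], float]


def solve_log_probabilities(lines: List[LineEq], iterations: int = 5000) -> Dict[str, float]:
    """Least-squares fit of sum(log P(s) for s in line) = log P(prize)."""
    symbols = sorted({s for syms, _ in lines for s in syms})
    x = {s: 0.0 for s in symbols}  # x[s] = log P(s); start from P = 1
    targets = [math.log(p) for _, p in lines]
    # Conservative step: 1 / (number of symbol entries) keeps the descent stable.
    step = 1.0 / sum(len(syms) for syms, _ in lines)
    for _ in range(iterations):
        grad = {s: 0.0 for s in symbols}
        for (syms, _), target in zip(lines, targets):
            residual = sum(x[s] for s in syms) - target
            for s in syms:
                grad[s] += 2.0 * residual
        for s in symbols:
            x[s] -= step * grad[s]
    return {s: math.exp(x[s]) for s in symbols}


# Invented example: four symbols, five two-symbol lines (more equations than unknowns).
TRUE_P = {"a": 0.5, "b": 0.4, "c": 0.8, "d": 0.25}
PAIRS = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d"), ("a", "c")]
EXAMPLE_LINES: List[LineEq] = [((s, t), TRUE_P[s] * TRUE_P[t]) for s, t in PAIRS]

ESTIMATE = solve_log_probabilities(EXAMPLE_LINES)
for symbol in sorted(ESTIMATE):
    print(symbol, round(ESTIMATE[symbol], 4))
```

Because only the ordering of the symbols matters for the next step, an approximate fit like this would be enough; sorting the result by value gives the ranking.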
The first problem is that although the $75k top prize is available from only one line, that's not necessarily true of the lesser prizes. Inquiries to various lottery commissions indicate that some of them, if contacted directly, will tell you how many of each prize were printed for each game area, and even how many tickets were printed with multiple wins and what those wins would be, e.g. $5 + $5 or $10 + $10. A quick look at the card shows that this would give one line per prize probability, with the exception of the $500 prize, where some grids still have two ways to win it. So there is a need to handle an equation of the following form:
P(a) * P(b) * P(c) + P(b) * P(c) * P(d) = P(prize)
This is simplified to represent a case where two lines in a 3x3 grid share the same prize level and have two symbols in common. In reality, the 3x3 grids have no such shared prize level, but the 4x4 and 5x5 grids do, and the lines may or may not share some symbols.
The way to handle this is to factor out as many terms as you can and treat the remainder as a new symbol to solve for, i.e.:
P(b) * P(c) * ( P(a) + P(d) ) = P(prize)
giving
log( P(b) ) + log( P(c) ) + log( P(a) + P(d) ) = log( P(prize) )
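A quick numeric check of that factoring, as a small Python sketch. The probabilities are arbitrary illustrative values, and the name `p_ad` for the new composite unknown P(a) + P(d) is my own.

```python
import math


def two_line_probability(p_a: float, p_b: float, p_c: float, p_d: float) -> float:
    """P(a)*P(b)*P(c) + P(b)*P(c)*P(d): two lines sharing symbols b and c."""
    return p_a * p_b * p_c + p_b * p_c * p_d


def factored_log_form(p_b: float, p_c: float, p_ad: float) -> float:
    """log P(b) + log P(c) + log(p_ad), where p_ad stands for P(a) + P(d)."""
    return math.log(p_b) + math.log(p_c) + math.log(p_ad)


# Arbitrary illustrative values.
p_a, p_b, p_c, p_d = 0.3, 0.6, 0.7, 0.2

lhs = math.log(two_line_probability(p_a, p_b, p_c, p_d))
rhs = factored_log_form(p_b, p_c, p_a + p_d)
print(math.isclose(lhs, rhs))
```

The composite term is then solved for like any other symbol, at the cost of one extra unknown per factored pair of lines.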
Hence, in theory, it's possible to assign each symbol a probability of being under the latex in the call area, and then to test the most probable combinations against heuristic conditions that rule out impossible prize levels. If you look at a game card after you've scratched all the call symbols, you often notice that had one more symbol been a call symbol, you would have had a prize. Look more closely, though, and often that one extra call symbol would make the game grids describe more than one prize, adding up to a non-existent prize level such as $1,106. So it's reasonable to believe that combinations can be ruled out until the most probable solution that is still plausible is found.
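The search itself could be sketched roughly as follows in Python. Everything here is an assumption for illustration: the function name, the toy probabilities, the stand-in plausibility rule, and the scoring heuristic (sum of the call probabilities of the symbols guessed to be losers, lower being better). For simplicity, every symbol not guessed as a loser is treated as called, which ignores the separate bonus symbol. On the real card the shortlist would hold 12 or 13 symbols and 11 of them would be losers.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Tuple


def rank_candidates(
    call_probability: Dict[str, float],
    shortlist_size: int,
    losing_count: int,
    is_plausible: Callable[[FrozenSet[str]], bool],
) -> List[Tuple[float, Tuple[str, ...]]]:
    """Try every choice of losing symbols from the least-likely shortlist.

    Returns (score, losing symbols) pairs for plausible tickets, best first.
    """
    all_symbols = frozenset(call_probability)
    # Least likely to be call symbols first; ties broken by name for determinism.
    shortlist = sorted(all_symbols, key=lambda s: (call_probability[s], s))[:shortlist_size]
    ranked = []
    for losing in combinations(shortlist, losing_count):
        called = all_symbols - frozenset(losing)
        if is_plausible(called):
            # Assumed heuristic: the less "callable" the losers, the better the guess.
            score = sum(call_probability[s] for s in losing)
            ranked.append((score, losing))
    ranked.sort(key=lambda item: item[0])
    return ranked


# Toy data: five symbols, shortlist of 3, exactly 2 losers.
PROBS = {"A": 0.9, "B": 0.8, "C": 0.2, "D": 0.1, "E": 0.3}
# Stand-in rule: pretend any ticket where E is called yields an impossible total.
RANKED = rank_candidates(PROBS, 3, 2, lambda called: "E" not in called)
for score, losing in RANKED:
    print(losing, round(score, 2))
```

The first entry of the result is the most probable combination that survives the plausibility rule, which is the "solution" the paragraph above describes.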
This would seem to indicate that it may be possible to write a program where you enter the symbols of the game grids, let it crunch some numbers, and have it report how probable it is that the card is a loser or wins $4, $5, $10, $20, etc. That seems perfect for cherry-picking in jurisdictions where the tickets are separated and sold from a display case. Granted, the system of equations may be time-consuming to solve (possibly hours), so you would need the cooperation of the store clerk to allow you the time to analyze the tickets, and the laptop would need a good new battery or a power outlet nearby.
However, while cogitating on this decryption concept, I purchased three more tickets to think about, and two of them turned out to have spot-for-spot identical symbols in the game grids. The symbols in the call area differed, so one won $4 while the other was a loser. This means that there's probably a limited number of patterns for the game grids, and the call symbols under the latex are arranged according to whether the ticket is meant to be a winner or a loser. The causality is in the wrong direction! The call symbols are determined by the game grid symbols and by whether the ticket was intended to be a winner, hence the probabilities the program would produce for the ticket being a loser or a winner of some prize are meaningless. Oh well, so much for dreams of becoming a millionaire through $20 wins. Of course, it may be worthwhile to save this method for a book to sell to the unwary.