<quote>I'm looking for help on the following problem (it's similar to the lottery): I'm following a phenomenon and I'm taking readings 15 times a day. Each reading can be classed A, B or C. So, every day I'm getting a string of 15 readings. The readings are evenly spread during the day and they are independent. They don't influence each other in any way. So far, I have n days and n readings. I need to predict the string (the whole row) on day n+1</quote>
Hey Martor854, I like this problem and started brainstorming some ideas. I have to admit that about half an hour in I had more questions than answers, lol. I'm no genius and I still have a lot to learn, but I wanted to ask you about some of what you said, quoted above and below. First off, speaking strictly mathematics here, I was under the impression that independent events have no influence on each other, so using past data to predict future data isn't really possible. (Tongue in cheek here, since we are on Lottery Post, which is exactly what most people are trying to do, including myself.) That being said, my question is: what is the preferred method, in the general sense, for predicting future outcomes of independent events from past independent events? And I'm not being a smart a**, I mean literally, what method seems to be preferred? Also, I don't really know what a matrix is. Other than basically a list of numbers in a box or tabular format (as you have listed your results), does "treating the numbers as a matrix" mean something specific? I'm unfamiliar with that idea.
Now for a couple of thoughts... I think columns and rows are each independent in your example and can (and should) be treated as such. In other words, if I wanted to track the frequency of a particular reading, I would track it both per day and per "time" of day (or position). But I would also try to track every other kind of frequency I could. I'm not sure whether that is possible or practical, but if we can assume certain things, such as each value having to occur at least once every day, then there will never be more than 13 of any one value on a given day (15 readings minus one each for the other two values), which is a finite number to work with. So I would track the frequency of A followed by B, A followed by C, and A followed by A (and likewise for B and C), and I would also track occurrences of triples and longer runs (any time A follows AA or more). I would track these occurrences both throughout a day and at a given position across days (or simply, rows and columns). As for how many lines you should have in order to be able to make a prediction? Well, technically I think that with a perfect algorithm you would need about 3 days, depending on how much of the previous data is integral to your algorithm. Perhaps with a perfect algorithm 1 would suffice.
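To make the counting idea above a bit more concrete, here's a rough Python sketch (the same logic should port to QBasic). The `days` list is made-up sample data, and the function names `position_counts`, `transition_counts` and `run_counts` are just my own labels, not anything standard. It only tallies what happened; it doesn't predict anything.

```python
from collections import Counter

# Hypothetical sample data: one 15-character string per day (invented for illustration).
days = [
    "ABCABCABCABCABC",
    "AABBCCAABBCCAAB",
    "CBACBACBACBACBA",
]

def position_counts(days):
    """For each of the positions (columns), count how often each value appeared."""
    n_positions = len(days[0])
    return [Counter(day[i] for day in days) for i in range(n_positions)]

def transition_counts(days):
    """Count how often each value is followed by each value within a day (rows)."""
    counts = Counter()
    for day in days:
        for prev, nxt in zip(day, day[1:]):
            counts[(prev, nxt)] += 1
    return counts

def run_counts(days, min_len=3):
    """Count runs of one repeated value that are at least min_len long (triples and up)."""
    counts = Counter()
    for day in days:
        run_len = 1
        # The "\0" sentinel never matches a reading, so it closes the final run.
        for prev, nxt in zip(day, day[1:] + "\0"):
            if nxt == prev:
                run_len += 1
            else:
                if run_len >= min_len:
                    counts[prev] += 1
                run_len = 1
    return counts

if __name__ == "__main__":
    print(position_counts(days)[0])       # counts for the first reading of the day
    print(transition_counts(days))        # e.g. how often A was followed by B
    print(run_counts(["AAABBC"]))         # one run of A that is 3 long
```

If the readings really are independent, I'd expect the "A followed by B" counts to look like nothing more than the overall A and B frequencies multiplied together, so these tallies are mostly a way to check that assumption.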
My initial thought was to track occurrences of each value (in rows and columns, including the occurrences of "following", not just "showing") and track them across time (daily and weekly) using an XmR chart. Perhaps a compound XmR chart, with each value overlaid on the next in a different color, to more easily see if there is any distinction. Since the results are obtained from some mechanism, and mechanisms are not always 100% perfect, there is a good chance that each value has a higher rate of occurrence at a specific time or interval. A compound XmR chart should show this clearly if it is the case. With that data in hand, I would assign a score from 0.0 to 0.99 to each combination of day, time slot and value. A score at (or close to) 0.99 would indicate a high likelihood of that value occurring in that spot. Determining how to increment the score for A, B and C in each day and time slot will be a little tricky, but I would base my increments predominantly on the frequencies found using multiple XmR charts. (*** It should be noted that this method is really only going to be effective if there are characteristics of the events that determine the "readings". If these events are, strictly speaking, random, then chances are this approach is not going to be very effective. If these events aren't random, only unknown because there is too much data to crunch, then this would be my preferred method.)
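For anyone who hasn't used an XmR chart, the numbers behind it are simple enough to compute by hand or in a few lines of code. Here's a minimal Python sketch, assuming you track one number per day (say, how many A's showed up). The `daily_a_counts` list is invented sample data and the function names are my own. The constants 2.66 and 3.267 are the usual XmR scaling factors for a moving range of two consecutive points.

```python
def xmr_limits(values):
    """Return the center line and natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute a moving range")
    # Moving range = absolute difference between consecutive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": center,
        "mr_bar": mr_bar,
        "lower": center - 2.66 * mr_bar,   # lower natural process limit
        "upper": center + 2.66 * mr_bar,   # upper natural process limit
        "mr_upper": 3.267 * mr_bar,        # upper limit for the moving ranges
    }

def out_of_limits(values):
    """Return the indexes of values that fall outside the natural process limits."""
    lim = xmr_limits(values)
    return [i for i, x in enumerate(values)
            if x < lim["lower"] or x > lim["upper"]]

# Hypothetical data: number of A readings on each of 10 days (made up).
daily_a_counts = [5, 6, 4, 5, 7, 5, 4, 6, 5, 13]

if __name__ == "__main__":
    print(xmr_limits(daily_a_counts))
    print(out_of_limits(daily_a_counts))   # the last day (index 9) is flagged
```

Points outside the limits are the "something of significance" I'd look for. If nothing ever lands outside them, that's a hint the readings are behaving like plain random noise.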
That's how I would START the problem. Not saying it would SOLVE the problem. Essentially, the charts will indicate either nothing of significance, a little something of significance, or a great deal of significance. From there I would try to work out an algorithm for the occurrence of each value in and of itself. With all that data, maybe a solution would begin to show itself. That's my two cents. :)
<quote>Shall I treat this as a matrix? Shall I treat each column independently bearing in mind that the readings are independent? How many lines should I have in order to be able to make a prediction? Any ideas would be appreciated. Or even a nudge in the right direction would do me so I don't waste any time. I can do a little bit of programming in QBasic (a bit obsolete, no graphics, but it works fine). I'm ready to help in exchange for good suggestions.</quote>