All times shown are Eastern Time (GMT-5:00)
Help with Prediction Needed
Edinburgh United Kingdom Member #97862 September 24, 2010 20 Posts Offline | | Posted: July 3, 2012, 2:11 am - IP Logged | |
Hi, I'm looking for help on the following problem (it's similar to the lottery): I'm following a phenomenon and I'm taking readings 15 times a day. Each reading can be classed A, B or C. So, every day I'm getting a string of 15 readings. The readings are evenly spread during the day and they are independent. They don't influence each other in any way. So far, I have n days and n readings. I need to predict the string (the whole row) on day n+1. Here's an example of what I'm getting:

Y1  Y2  Y3  Y4  Y5  Y6  Y7  Y8  Y9  Y10 Y11 Y12 Y13 Y14 Y15
===========================================================
B   C   A   B   A   B   B   C   A   C   B   C   C   B   C
C   C   A   B   B   C   B   C   C   A   A   C   B   A   A
C   A   B   B   A   C   B   C   C   B   C   A   C   A   C
C   B   B   C   B   A   B   A   C   A   C   C   B   A   B
B   A   C   C   B   C   C   B   A   C   A   A   C   B   A
A   A   B   A   B   B   A   C   B   A   B   C   A   A   A
B   A   A   C   B   A   B   B   A   A   B   A   A   B   A
C   B   A   B   C   C   C   C   B   B   C   B   A   A   A
A   A   C   B   A   B   B   C   C   C   A   C   B   A   B
A   B   A   B   C   B   C   B   C   A   A   B   C   C   B
.   .   .   .   .   .   .   .   .   .   .   .   .   .   .
.   .   .   .   .   .   .   .   .   .   .   .   .   .   .
.   .   .   .   .   .   .   .   .   .   .   .   .   .   .
B   C   C   C   A   A   B   C   C   B   C   A   C   B   B
C   B   C   C   B   A   C   B   A   C   B   A   C   A   A
B   C   C   C   A   C   C   A   A   B   A   B   B   C   A
A   B   C   A   B   A   C   A   A   C   A   B   C   A   A
C   B   B   B   B   C   C   B   C   B   A   A   B   B   A
C   B   B   A   B   C   B   A   C   B   C   C   A   B   B
A   B   B   C   C   B   B   C   C   A   B   C   C   C   B
Shall I treat this as a matrix? Shall I treat each column independently, bearing in mind that the readings are independent? How many lines should I have in order to be able to make a prediction? Any ideas would be appreciated. Or even a nudge in the right direction would do me, so I don't waste any time. I can do a little bit of programming in QBasic (a bit obsolete, no graphics, but it works fine). I'm ready to help in exchange for good suggestions. Thanks. martor854
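As a rough illustration of "treating this as a matrix" and looking at each column on its own, here is a minimal Python sketch (Python rather than QBasic, and not something posted in the thread). It stores the 17 rows visible in the example as strings, transposes them into columns, and tallies A, B and C per position. The names `ROWS`, `column_counts` and `most_common_row` are made up for this example. Note the visible rows are not all consecutive days, because some days were elided in the post.

```python
from collections import Counter

# The 17 rows visible in the example (10 before the elided days, 7 after).
ROWS = [
    "BCABABBCACBCCBC",
    "CCABBCBCCAACBAA",
    "CABBACBCCBCACAC",
    "CBBCBABACACCBAB",
    "BACCBCCBACAACBA",
    "AABABBACBABCAAA",
    "BAACBABBAABAABA",
    "CBABCCCCBBCBAAA",
    "AACBABBCCCACBAB",
    "ABABCBCBCAABCCB",
    "BCCCAABCCBCACBB",
    "CBCCBACBACBACAA",
    "BCCCACCAABABBCA",
    "ABCABACAACABCAA",
    "CBBBBCCBCBAABBA",
    "CBBABCBACBCCABB",
    "ABBCCBBCCABCCCB",
]


def column_counts(rows):
    """Return one Counter per reading position (Y1..Y15)."""
    # zip(*rows) transposes the matrix: each tuple is one column.
    return [Counter(column) for column in zip(*rows)]


def most_common_row(rows):
    """Build a row from the most frequent value in each column.

    Ties are broken by whichever value was seen first in that column.
    """
    return "".join(c.most_common(1)[0][0] for c in column_counts(rows))


if __name__ == "__main__":
    for i, c in enumerate(column_counts(ROWS), start=1):
        print(f"Y{i:<2} A={c['A']:2d} B={c['B']:2d} C={c['C']:2d}")
    print("Most frequent value per column:", most_common_row(ROWS))
    # 15 positions with 3 possible values each give 3**15 distinct rows.
    print("Distinct possible rows:", 3 ** 15)
```

If the readings really are independent and A, B and C are equally likely, any fixed guess for a whole row matches with probability 1 in 3^15 = 14,348,907, however many past rows are collected. The per-column tallies only help if some value turns out to be favoured at some position.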
Aruba Member #123712 February 27, 2012 1799 Posts Offline | | Posted: July 3, 2012, 12:17 pm - IP Logged | |
What does Y stand for? What do A, B and C stand for? What does the word day stand for?
Edinburgh United Kingdom Member #97862 September 24, 2010 20 Posts Offline | | Posted: July 3, 2012, 1:11 pm - IP Logged | |
Hi SergeM, Thanks for taking the time to have a look at my problem. Here are your answers: Y1, Y2,... = Column headers, instead of ordinal numerals (I, II,... or #1, #2,...). A, B, and C are readings. Imagine a meter that shows A at the moment of the first reading, C for the second, B for the third, and so on. This meter can only show A, B or C. The readings are independent of each other. It's like a random generator problem. "day" is a day (24 hours or a working day). I don't think it's relevant to solving the problem. The point is that the readings always start at the same moment (your choice) in the day and the 15 readings are evenly spaced (0.5 hour, 1 hour, and so on; again, your choice). In other words, 1 day = 1 row in our table/matrix. Since readings are independent of each other, I would treat each column separately, like a time series. I wonder whether Fourier transforms can be applied to linear time series? Or, perhaps, some software? Or any other method. Ideas welcome! Regards, martor854
Aruba Member #123712 February 27, 2012 1799 Posts Offline | | Posted: July 3, 2012, 1:25 pm - IP Logged | |
A, B and C have to show the readings. What did you measure with A, B and C?
mid-Ohio United States Member #9 March 24, 2001 16130 Posts Offline | | Posted: July 3, 2012, 3:59 pm - IP Logged | |
Sounds like something I did years ago when working as an industrial engineering technician taking work samples. You need to chart your data to see whether any reading occurs most often or can be associated with a particular time. You may come up with even more ideas once you start charting the data.
* The fundamentals of winning a lottery jackpot * * play a lottery you can win *
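One way to put a number on "does any reading occur most often at a particular time" is a chi-square goodness-of-fit check against equal frequencies. The Python sketch below is only an illustration under the assumption that A, B and C should be equally likely; the function name `chi_square_uniform` is invented here, and the sample string is the Y1 column of the 17 rows visible in the original post. The constant 5.991 is the standard 5% critical value for 2 degrees of freedom.

```python
from collections import Counter

# Chi-square critical value for 2 degrees of freedom at the 5% level.
CRITICAL_5PCT_DF2 = 5.991


def chi_square_uniform(readings, values="ABC"):
    """Chi-square statistic of the observed counts against equal frequencies."""
    counts = Counter(readings)
    expected = len(readings) / len(values)
    return sum((counts[v] - expected) ** 2 / expected for v in values)


if __name__ == "__main__":
    # Y1 readings of the 17 visible rows, top to bottom.
    y1 = "BCCCBABCAABCBACCA"
    stat = chi_square_uniform(y1)
    print(f"chi-square = {stat:.3f}")
    if stat > CRITICAL_5PCT_DF2:
        print("Y1 departs from equal frequencies at the 5% level")
    else:
        print("Y1 is consistent with equal frequencies")
```

With only 17 readings per column the expected count per value is below 6, so the approximation is crude; many more days would be needed before a result like this could be trusted.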
Edinburgh United Kingdom Member #97862 September 24, 2010 20 Posts Offline | | Posted: July 4, 2012, 1:12 am - IP Logged | |
Hi RJOh, Sounds interesting. Can you suggest anything? Any direction I can start searching? Thanks. Regards, martor854
United Kingdom Member #71134 February 7, 2009 734 Posts Offline | | Posted: July 5, 2012, 12:30 pm - IP Logged | |
Hi martor854, I have sent you a sample worksheet; if that is not what you are looking for, please advise. Regards, billybouy... Sometimes we can't see the woods for tree's, " so we have to clear a path "
Edinburgh United Kingdom Member #97862 September 24, 2010 20 Posts Offline | | Posted: July 6, 2012, 2:16 pm - IP Logged | |
Thanks billybouy, got it. I'll have a look and get back to you. Keep well and keep up the good work. Best regards, martor854
mid-Ohio United States Member #9 March 24, 2001 16130 Posts Offline | | Posted: July 7, 2012, 6:26 am - IP Logged | |
Without knowing more about the phenomenon that's giving you those readings, it would be hard to say.
Edinburgh United Kingdom Member #97862 September 24, 2010 20 Posts Offline | | Posted: July 7, 2012, 12:38 pm - IP Logged | |
Hi RJOh, It all started from a real phenomenon but I'm after a generally applicable algorithm. Regards, martor854
mid-Ohio United States Member #9 March 24, 2001 16130 Posts Offline | | Posted: July 13, 2012, 10:20 pm - IP Logged | |
Then your best bet is to reverse engineer each result and see if one algorithm comes up more often than any other. This won't be 100% accurate, but I think if it concerns lotteries then even 10% would be excellent.
Edinburgh United Kingdom Member #97862 September 24, 2010 20 Posts Offline | | Posted: July 14, 2012, 2:24 am - IP Logged | |
Hi RJOh, That is the best answer yet. It should be applicable to almost anything of the sort, including the lottery. Could you give me an idea of how I should start? Some indication of what to read, maybe? Also, we can work on it together. I can do a bit of programming in QBasic. It may not be much but it's simple and fast. Thank you. martor854
Pittsburgh, PA United States Member #130606 July 20, 2012 37 Posts Offline | | Posted: July 30, 2012, 12:09 am - IP Logged | |
<quote>I'm looking for help on the following problem (it's similar to the lottery): I'm following a phenomenon and I'm taking readings 15 times a day. Each reading can be classed A, B or C. So, every day I'm getting a string of 15 readings. The readings are evenly spread during the day and they are independent. They don't influence each other in any way. So far, I have n days and n readings. I need to predict the string (the whole row) on day n+1</quote>

Hey Martor854, I like this problem and started brainstorming some ideas. I have to admit that about half an hour in I had more questions than answers. lol. I'm no genius and I still have a lot to learn. But I wanted to ask you about some of what you said, quoted above and below.

First off, speaking strictly mathematics here, I was under the impression that independent events have no influence on each other, so using past data to predict future data isn't really possible. (Tongue in cheek here, since we are on Lottery Post, which is exactly what most people are trying to do, including myself.) That being said, my question is: what is a preferred method, in the general sense, to predict future outcomes of independent events from past independent events? And I'm not being a smart a**, I mean literally, what method seems to be preferred?

I don't know what a matrix is. Other than basically a list of numbers in a box or tabular format (as you have listed your results), does "treating the numbers as a matrix" mean something specific? I'm unfamiliar with that idea.

Now for a couple of thoughts... I think columns and rows are each independent in your example and can (and should) be treated as such. In other words, if I wanted to track the frequency of occurrence of a particular reading, I would track it both for a day and for a "time" of day (or position). But I would also try to track every iteration of possible frequency.
I'm not sure whether that is possible or practical, but suppose we can assume certain things, such as that each value has to occur at least once every day. Then we can safely assume that there will never be more than 13 of a value on a given day (15 readings minus at least one each of the other two values), which is a finite number to work with. So I would track the frequency of A followed by B, A followed by C, and A followed by A (and likewise for B and C), and I would also track occurrences of Triples+ (any occurrence where A follows AA or more), and I would track these occurrences both throughout a day and by a day's position (or simply rows and columns).

As for how many lines you should have in order to be able to make a prediction: technically, I think if you had a perfect algorithm you would essentially need 3 days, depending on how much of the previous data is integral to your algorithm. Perhaps with a perfect algorithm 1 would suffice.

My initial thought was to track occurrences (in rows and columns, including the occurrences of "following", not just "showing") of each value and track them across time (daily and weekly) using an XmR chart. Perhaps a compound XmR chart with each value overlaid on the next in different colors, to see more readily whether there is any distinction. Since the results are obtained from some mechanism and mechanisms are not always 100%, there is a good chance that each value has a higher rate of occurrence at a specific time or interval. A compound XmR chart will clearly show this if it is the case.

With that data in hand, I would tie a value ranging from .0 to .99 to each time, day and value. A value at (or close to) .99 would indicate a high likelihood of that value occurring in that spot. Determining how to increment the value of A, B, and C on each day and time slot will be a little tricky, but I would base my increments predominantly on the data obtained from the frequencies found using multiple XmR charts.
(*** It should be noted that this method is really only going to be effective if there are characteristics of events that determine the "readings". If these events are, strictly speaking, random, then chances are this approach is not going to be very effective. If these events aren't random, only unknown because there is too much data to crunch, then this would be my preferred method.)

That's how I would START the problem. Not saying it would SOLVE the problem. Essentially the charts will either indicate nothing of significance, a little something of significance or a great deal of significance. From there I would try to extrapolate an algorithm for the occurrence of each value in and of itself. With all that data maybe a solution would begin to show itself. That's my two cents. :)

<quote>Shall I treat this as a matrix? Shall I treat each column independently bearing in mind that the readings are independent? How many lines should I have in order to be able to make a prediction? Any ideas would be appreciated. Or even a nudge in the right direction would do me so I don't waste any time. I can do a little bit of programming in QBasic (a bit obsolete, no graphics, but it works fine). I'm ready to help in exchange for good suggestions.</quote>
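The "A followed by B" and "Triples+" tallies described in this post can be sketched in a few lines of Python. This is only one possible reading of the idea, not the poster's own code; `follow_counts` and `longest_run` are names invented for the example, and the sample row is the eighth row of the table in the original post.

```python
from collections import Counter


def follow_counts(seq):
    """Count how often each value is immediately followed by each value."""
    # zip(seq, seq[1:]) yields every adjacent pair, e.g. ('C', 'B').
    return Counter(zip(seq, seq[1:]))


def longest_run(seq):
    """Length of the longest streak of one repeated value (0 for empty input)."""
    if not seq:
        return 0
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


if __name__ == "__main__":
    day = "CBABCCCCBBCBAAA"  # eighth row of the example table
    for (first, second), n in sorted(follow_counts(day).items()):
        print(f"{first} followed by {second}: {n}")
    print("Longest run:", longest_run(day))
```

The same two functions work down a column (one position across days) by passing the column as a string instead of a row. If the readings are independent, each "followed by" count should simply track the product of the individual frequencies, which is one way to check the independence claim.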
Edinburgh United Kingdom Member #97862 September 24, 2010 20 Posts Offline | | Posted: July 30, 2012, 2:59 am - IP Logged | |
Hi AlgorithmGuru, Thank you for your contribution. I find it very fair and sensible. I have my doubts, too, that past, independent occurrences can help with predicting new ones. So, all that remains is probability. But, maybe I’m wrong. Your view is interesting but I don’t know what an XmR chart is. I’d appreciate it if you could give an example or indicate where I can get one. Intuitively, I’m inclined to go for Markov chains/model (like somebody else suggested) as I think our problem is similar to predicting the weather. I’m only scratching the surface. My lack of training in maths will make it difficult, though. I must find a “hands-on” example. If I make any progress, I'll keep everybody posted.
Best regards, martor854
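For a hands-on feel of the Markov chain idea, the following Python sketch estimates first-order transition probabilities from one sequence of readings and picks the most likely next value. It is a minimal illustration under stated assumptions: add-one smoothing (the `alpha` parameter) is a choice made here, the function names are invented, and the sample sequence is the Y1 column of the first 10 rows of the example table.

```python
from collections import Counter, defaultdict

STATES = "ABC"


def transition_probs(seq, states=STATES, alpha=1.0):
    """Estimate P(next | current) from adjacent pairs, with additive smoothing.

    alpha=1.0 is add-one smoothing, an assumption of this sketch so that
    unseen transitions do not get probability zero.
    """
    counts = defaultdict(Counter)
    for prev, cur in zip(seq, seq[1:]):
        counts[prev][cur] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values()) + alpha * len(states)
        probs[s] = {t: (counts[s][t] + alpha) / total for t in states}
    return probs


def most_likely_next(probs, current):
    """Return the value with the highest estimated probability after `current`."""
    return max(probs[current], key=probs[current].get)


if __name__ == "__main__":
    y1_first_ten_days = "BCCCBABCAA"
    probs = transition_probs(y1_first_ten_days)
    for s in STATES:
        row = "  ".join(f"{t}: {probs[s][t]:.2f}" for t in STATES)
        print(f"after {s} -> {row}")
    print("Most likely after B:", most_likely_next(probs, "B"))
```

If the readings are truly independent, every row of the transition table drifts towards the same overall frequencies as more data comes in, and the chain offers no advantage over simply picking the most frequent value.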
Pittsburgh, PA United States Member #130606 July 20, 2012 37 Posts Offline | | Posted: July 31, 2012, 12:43 am - IP Logged | |
Hey martor. There is a wealth of information online about XmR charts, or "individuals and moving range" charts. It basically means you are charting data that has one source (for instance daily sales, or in your case, the daily frequency of an event, and NOT an averaged reading such as today's average temperature). It's a great method because it tracks not only the individual readings but also the movement between readings, which is what can help to show patterns. If all the numbers plot within the limits of the XmR chart, then the "process" is considered stable. There are specific formulas for plotting the limits on an XmR chart. I encourage you to look them up online.

I'm not able to post links to websites yet, but I can tell you that if you google "XmR on Excel" there is a great explanation of how to set up an XmR chart in Excel (it's the first hit), and also information on how to understand the data and further methods of examining the chart once you have it constructed.

As a side note, I was first introduced to XmR charts by a book called "Understanding Variation - The Key to Managing Chaos", written by Donald J. Wheeler. The book is in fact nothing more than a practical introduction to XmR charts and why other charts fail in the same setting. It gives a lot of insight into understanding the data. It was published in 2000 by SPC Press. At any rate, good luck. :)
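For reference, the usual XmR limits are the mean of the individual values plus or minus 2.66 times the average moving range, with an upper limit for the moving ranges of 3.27 times the average moving range. The Python sketch below applies them to the daily count of A readings in the first 10 rows of the example table. The function name `xmr_limits` and the choice of "number of A's per day" as the charted quantity are assumptions made for illustration only.

```python
def xmr_limits(values):
    """Return the centre line and natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    mean = sum(values) / len(values)
    # Moving range: absolute difference between consecutive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "mean": mean,
        "mr_bar": mr_bar,
        "upper": mean + 2.66 * mr_bar,   # upper natural process limit
        "lower": mean - 2.66 * mr_bar,   # lower natural process limit
        "range_upper": 3.27 * mr_bar,    # upper limit for the moving ranges
    }


if __name__ == "__main__":
    first_ten_days = [
        "BCABABBCACBCCBC", "CCABBCBCCAACBAA", "CABBACBCCBCACAC",
        "CBBCBABACACCBAB", "BACCBCCBACAACBA", "AABABBACBABCAAA",
        "BAACBABBAABAABA", "CBABCCCCBBCBAAA", "AACBABBCCCACBAB",
        "ABABCBCBCAABCCB",
    ]
    a_per_day = [row.count("A") for row in first_ten_days]
    limits = xmr_limits(a_per_day)
    print("A readings per day:", a_per_day)
    print({k: round(v, 2) for k, v in limits.items()})
    outside = [x for x in a_per_day
               if x > limits["upper"] or x < limits["lower"]]
    print("Days outside the limits:", outside)
```

Ten days is a very short baseline, so limits computed from it are only a starting point. Points that fall inside the limits are what the chart treats as routine variation, which is what independent random readings would be expected to produce.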