Help with Prediction Needed

Topic closed. 18 replies. Last post 4 years ago by martor854.

Avatar
Edinburgh
United Kingdom
Member #97833
September 24, 2010
41 Posts
Offline
Posted: July 3, 2012, 2:11 am - IP Logged

Hi,

I'm looking for help on the following problem (it's similar to the lottery): I'm following a phenomenon and I'm taking readings 15 times a day. Each reading can be classed A, B or C. So, every day I'm getting a string of 15 readings. The readings are evenly spread during the day and they are independent. They don't influence each other in any way. So far, I have n days and n readings. I need to predict the string (the whole row) on day n+1.

Here's an example of what I'm getting:

Y1   Y2   Y3   Y4   Y5   Y6   Y7   Y8   Y9   Y10  Y11  Y12  Y13  Y14  Y15
==========================================================================
B    C    A    B    A    B    B    C    A    C    B    C    C    B    C
C    C    A    B    B    C    B    C    C    A    A    C    B    A    A
C    A    B    B    A    C    B    C    C    B    C    A    C    A    C
C    B    B    C    B    A    B    A    C    A    C    C    B    A    B
B    A    C    C    B    C    C    B    A    C    A    A    C    B    A
A    A    B    A    B    B    A    C    B    A    B    C    A    A    A
B    A    A    C    B    A    B    B    A    A    B    A    A    B    A
C    B    A    B    C    C    C    C    B    B    C    B    A    A    A
A    A    C    B    A    B    B    C    C    C    A    C    B    A    B
A    B    A    B    C    B    C    B    C    A    A    B    C    C    B
.    .    .    .    .    .    .    .    .    .    .    .    .    .    .
.    .    .    .    .    .    .    .    .    .    .    .    .    .    .
.    .    .    .    .    .    .    .    .    .    .    .    .    .    .
B    C    C    C    A    A    B    C    C    B    C    A    C    B    B
C    B    C    C    B    A    C    B    A    C    B    A    C    A    A
B    C    C    C    A    C    C    A    A    B    A    B    B    C    A
A    B    C    A    B    A    C    A    A    C    A    B    C    A    A
C    B    B    B    B    C    C    B    C    B    A    A    B    B    A
C    B    B    A    B    C    B    A    C    B    C    C    A    B    B
A    B    B    C    C    B    B    C    C    A    B    C    C    C    B

Shall I treat this as a matrix? Shall I treat each column independently, bearing in mind that the readings are independent? How many lines should I have in order to be able to make a prediction? Any ideas would be appreciated. Even a nudge in the right direction would do, so I don't waste any time. I can do a little bit of programming in QBasic (a bit obsolete, no graphics, but it works fine). I'm ready to help in exchange for good suggestions.

Thanks.

martor854

    SergeM's avatar - slow icon.png
    Economy class
    Belgium
    Member #123700
    February 27, 2012
    4035 Posts
    Offline
    Posted: July 3, 2012, 12:17 pm - IP Logged


    What does Y stand for?
    What do A, B and C stand for?
    What does the word day stand for?

      Avatar
      Edinburgh
      United Kingdom
      Member #97833
      September 24, 2010
      41 Posts
      Offline
      Posted: July 3, 2012, 1:11 pm - IP Logged

      Hi SergeM,

      Thanks for taking the time to have a look at my problem. Here are your answers:

      Y1, Y2,... = Column headers, instead of ordinal numerals (I, II,... or #1, #2,...).

      A, B, and C are readings. Imagine a meter that shows A at the moment of the first reading, C for the second, B for the third, and so on. This meter can only show A, B or C. The readings are independent of each other. It's like a random generator problem.

      "day" is a day (24 hours or a working day). I don't think it's relevant to solving the problem. The point is that the readings always start at the same moment (your choice) in the day and the 15 readings are evenly spaced (0.5 hour, 1 hour, and so on; again, your choice). In other words, 1 day = 1 row in our table/matrix.

      Since readings are independent of each other, I would treat each column separately, like a time series. I wonder whether Fourier transforms can be applied to linear time series? Or, perhaps, some software? Or any other method. Ideas, welcome!

      Regards,

      martor854

        SergeM's avatar - slow icon.png
        Economy class
        Belgium
        Member #123700
        February 27, 2012
        4035 Posts
        Offline
        Posted: July 3, 2012, 1:25 pm - IP Logged


        A, B and C have to show the readings.

        What did you measure with A, B and C?

          RJOh's avatar - chipmunk
          mid-Ohio
          United States
          Member #9
          March 24, 2001
          19903 Posts
          Offline
          Posted: July 3, 2012, 3:59 pm - IP Logged

          Sounds like something I did years ago when working as an industrial engineering technician taking work samples.  You need to chart your data to see if any occurs most often or can be associated with a particular time.  You may come up with even more ideas once you start charting the data.
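
A minimal sketch of the tabulation step suggested above, written in Python rather than QBasic. It counts how often A, B and C appear in each column (time slot). Only the first three rows of the sample table are typed in here; the function names are illustrative and not part of anyone's actual worksheet.

```python
from collections import Counter

# First three rows of the sample table from the opening post, one string per day.
rows = [
    "BCABABBCACBCCBC",
    "CCABBCBCCAACBAA",
    "CABBACBCCBCACAC",
]

def column_counts(rows):
    """Return one Counter per column (time slot) with the A/B/C tallies."""
    return [Counter(col) for col in zip(*rows)]

def most_common_per_column(rows):
    """Return the most frequent reading in each column (ties resolved arbitrarily)."""
    return [c.most_common(1)[0][0] for c in column_counts(rows)]

counts = column_counts(rows)
for slot, c in enumerate(counts, start=1):
    print(f"Y{slot}: A={c['A']} B={c['B']} C={c['C']}")
```

With many more rows, a column whose tallies stay far from an even three-way split would be the first thing worth charting.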

           * you don't need to buy more tickets, just buy a winning ticket * 
             

            Avatar
            Edinburgh
            United Kingdom
            Member #97833
            September 24, 2010
            41 Posts
            Offline
            Posted: July 4, 2012, 1:12 am - IP Logged

            Hi RJOh,

            Sounds interesting. Can you suggest anything? Any direction I can start searching?

            Thanks. Regards,

            martor854

              Avatar

              United Kingdom
              Member #70630
              February 7, 2009
              734 Posts
              Offline
              Posted: July 5, 2012, 12:30 pm - IP Logged


              Hi martor854,

              I have sent you a sample worksheet, if that is not what you are looking for please advise.

              Regards,

              billybouy...

              Sometimes we can't see the woods for tree's, " so we have to clear a path "

                Avatar
                Edinburgh
                United Kingdom
                Member #97833
                September 24, 2010
                41 Posts
                Offline
                Posted: July 6, 2012, 2:16 pm - IP Logged

                Thanks billybouy,

                Got it. I'll have a look and get back to you. Keep well and keep up the good work.

                Best regards,

                martor854

                  RJOh's avatar - chipmunk
                  mid-Ohio
                  United States
                  Member #9
                  March 24, 2001
                  19903 Posts
                  Offline
                  Posted: July 7, 2012, 6:26 am - IP Logged


                  Without knowing more about the phenomenon that's giving you those readings, it would be hard to say.


                    Avatar
                    Edinburgh
                    United Kingdom
                    Member #97833
                    September 24, 2010
                    41 Posts
                    Offline
                    Posted: July 7, 2012, 12:38 pm - IP Logged

                    Hi RJOh,

                    It all started from a real phenomenon but I'm after a generally applicable algorithm.

                    Regards,

                    martor854

                      RJOh's avatar - chipmunk
                      mid-Ohio
                      United States
                      Member #9
                      March 24, 2001
                      19903 Posts
                      Offline
                      Posted: July 13, 2012, 10:20 pm - IP Logged


                      Then your best bet is to reverse-engineer each result and see whether one algorithm comes up more often than any other. This won't be 100% accurate, but if it concerns lotteries then I think even 10% would be excellent.


                        Avatar
                        Edinburgh
                        United Kingdom
                        Member #97833
                        September 24, 2010
                        41 Posts
                        Offline
                        Posted: July 14, 2012, 2:24 am - IP Logged

                        Hi RJOh,

                        That is the best answer yet. It should be applicable to almost anything of the sort, including the lottery. Could you give me an idea of how I should start? Some indication of what to read, maybe? Also, we can work on it together. I can do a bit of programming in QBasic. It may not be much but it's simple and fast.

                        Thank you.

                        martor854

                          AlgorithmGuru's avatar - avatar
                          Pittsburgh, PA
                          United States
                          Member #130598
                          July 20, 2012
                          37 Posts
                          Offline
                          Posted: July 30, 2012, 12:09 am - IP Logged

                          <quote>I'm looking for help on the following problem (it's similar to the lottery): I'm following a phenomenon and I'm taking readings 15 times a day. Each reading can be classed A, B or C. So, every day I'm getting a string of 15 readings. The readings are evenly spread during the day and they are independent. They don't influence each other in any way. So far, I have n days and n readings. I need to predict the string (the whole row) on day n+1.</quote>

                          Hey Martor854, I like this problem and started brainstorming some ideas.  I have to admit that about a half hour in I had more questions than answers.  lol.  I'm no genius and I still have a lot to learn.  But I wanted to ask you about some of what you said, quoted above and below.  First off, speaking strictly mathematics here, I was under the impression that independent events have no influence on each other, so using past data to predict future data isn't really possible.  (Tongue in cheek here, since we are on Lottery Post, which is exactly what most people are trying to do, including myself.)  That being said, my question is: what is a preferred method, in the general sense, to predict future outcomes of independent events from past independent events?  And I'm not being a smart a**, I mean literally what method seems to be preferred?  I don't know what a matrix is.  Other than basically a list of numbers in a box or table-style format (as you have listed your results), does "treating the numbers as a matrix" mean something specific?  I'm unfamiliar with that idea.

                          Now for a couple of thoughts... I think columns and rows are each independent in your example and can (and should) be treated as such.  In other words, if I wanted to track the frequency of occurrence of a particular reading, I would track it both for a day and for a "time" of day (or position).  But I would also try to track every possible kind of frequency.  I'm not sure whether that is possible or practical, but suppose we can assume certain things, such as that each value has to occur at least once every day.  Then there can never be more than 13 of a value on a given day (15 readings minus at least one each of the other two values), which is a finite number to work with.  So I would track the frequency of A followed by B, A followed by C, and A followed by A (and likewise for B and C), and I would also track occurrences of Triples+ (any occurrence where A follows AA or more), and I would track these occurrences both throughout a day and by a day's position (or simply rows and columns).  As for how many lines you should have in order to be able to make a prediction?  Well, technically I think if you had a perfect algorithm you would essentially need 3 days, depending on how much of the previous data is integral to your algorithm.  Perhaps with a perfect algorithm 1 would suffice.
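
The "A followed by B" and "Triples+" tallies described above can be sketched as follows in Python. This is only an illustration under assumptions: the three rows are the first three days of the sample table, and `pair_counts` and `run_lengths` are hypothetical helper names, not an established method.

```python
from collections import Counter

# First three rows of the sample table from the opening post, one string per day.
rows = [
    "BCABABBCACBCCBC",
    "CCABBCBCCAACBAA",
    "CABBACBCCBCACAC",
]

def pair_counts(sequence):
    """Count each ordered pair (previous reading, next reading) of neighbours."""
    return Counter(zip(sequence, sequence[1:]))

def run_lengths(sequence):
    """Return (value, length) for every maximal run of identical readings."""
    runs = []
    for value in sequence:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, length) for value, length in runs]

# "Followed by" tallies within a day (along each row).
row_pairs = sum((pair_counts(r) for r in rows), Counter())

# The same tallies across days for each time slot (down each column).
columns = ["".join(col) for col in zip(*rows)]
col_pairs = sum((pair_counts(c) for c in columns), Counter())

# Runs of three or more identical readings within a day ("Triples+").
triples_plus = [run for r in rows for run in run_lengths(r) if run[1] >= 3]

print(row_pairs)
print(col_pairs)
print(triples_plus)
```

If the readings really are independent, each of the nine ordered pairs should settle near the product of the two individual frequencies as more days are added; a persistent departure from that would be the interesting signal.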

                          My initial thought was to track occurrences (in rows and columns, including the occurrences of "following", not just "showing") of each value and track them across time (daily and weekly) using an XmR chart.  Perhaps a compound XmR chart, with each value overlaid on the next in a different color, to see more readily whether there is any distinction.  Since the results are obtained from some mechanism, and mechanisms are not always 100%, there is a good chance that each value has a higher rate of occurrence at a specific time or interval.  A compound XmR chart will clearly show this if it is the case.  With that data in hand, I would tie a value ranging from 0.0 to 0.99 to each combination of time, day and reading.  A value at or near 0.99 would indicate a high likelihood of that reading occurring in that spot.  Determining how to increment the value of A, B, and C for each day and time slot will be a little tricky, but I would base my increments predominantly on the frequencies found using multiple XmR charts.  (*** It should be noted that this method is only going to be effective if there are characteristics of events that determine the "readings".  If these events are, strictly speaking, random, then chances are this approach is not going to be very effective.  If these events aren't random, only unknown because there is too much data to crunch, then this would be my preferred method.)

                          That's how I would START the problem.  Not saying it would SOLVE the problem.  Essentially the charts will indicate either nothing of significance, a little something of significance, or a great deal of significance.  From there I would try to extrapolate an algorithm for the occurrence of each value in and of itself.  With all that data maybe a solution would begin to show itself.  That's my two cents.  :)

                           

                          <quote>Shall I treat this as a matrix? Shall I treat each column independently bearing in mind that the readings are independent? How many lines should I have in order to be able to make a prediction? Any ideas would be appreciated. Or even a nudge in the right direction would do me so I don't waste any time. I can do a little bit of programming in QBasic (a bit obsolete, no graphics, but it works fine). I'm ready to help in exchange for good suggestions.</quote>

                            Avatar
                            Edinburgh
                            United Kingdom
                            Member #97833
                            September 24, 2010
                            41 Posts
                            Offline
                            Posted: July 30, 2012, 2:59 am - IP Logged

                            Hi AlgorithmGuru,

                            Thank you for your contribution. I find it very fair and sensible. I have my doubts, too, that past, independent occurrences can help with predicting new ones. So, all that remains is probability. But, maybe I’m wrong.
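
To put a rough number on that doubt, here is a small worked calculation. It assumes (this is an assumption, not something measured in the thread) that time slot j shows reading x with a fixed probability p_{j,x} that does not depend on earlier readings. The best available guess for each slot is then its most probable reading, and the chance of getting the whole row right is the product of those 15 probabilities. If all three readings are equally likely, that chance is about 1 in 14.3 million.

```latex
P(\text{whole row correct})
  = \prod_{j=1}^{15} \max_{x \in \{A,B,C\}} p_{j,x},
\qquad
p_{j,x} = \tfrac{1}{3}
  \;\Rightarrow\;
P = \left(\tfrac{1}{3}\right)^{15}
  = \frac{1}{14\,348\,907}
  \approx 7.0 \times 10^{-8}.
```

Any skew in the per-slot frequencies raises this figure, which is why tabulating the columns is a sensible first step even under independence.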

                            Your view is interesting but I don’t know what an XmR chart is. I’d appreciate it if you could give an example or indicate where I can get one.

                            Intuitively, I’m inclined to go for Markov chains/model (like somebody else suggested) as I think our problem is similar to predicting the weather. I’m only scratching the surface. My lack of training in maths will make it difficult, though. I must find a “hands-on” example. If I make any progress, I'll keep everybody posted.
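
As a possible hands-on starting point, here is a minimal first-order Markov chain sketch in Python for a single time slot. It assumes the reading on one day depends only on the reading in the same slot the day before, which may well not hold for this phenomenon. The `history` string is made-up data, and the function names are illustrative.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequence, states="ABC"):
    """Estimate P(next reading | previous reading) from one sequence of readings."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[prev][nxt] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values())
        if total:  # skip states that never appear as a "previous" reading
            probs[s] = {t: counts[s][t] / total for t in states}
    return probs

def predict_next(sequence, states="ABC"):
    """Return the most probable next reading, or None if there is no evidence."""
    row = transition_probabilities(sequence, states).get(sequence[-1])
    if row is None:
        return None
    return max(states, key=lambda t: row[t])

# Hypothetical history for one time slot, oldest day first (made-up data).
history = "ABABABCAB"
print(transition_probabilities(history))
print(predict_next(history))
```

Running the same estimate on each of the 15 columns gives 15 separate predictions. If the readings are truly independent, every row of the estimated table should look roughly the same, and the prediction reduces to "pick the most frequent reading".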

                            Best regards,

                            martor854

                              AlgorithmGuru's avatar - avatar
                              Pittsburgh, PA
                              United States
                              Member #130598
                              July 20, 2012
                              37 Posts
                              Offline
                              Posted: July 31, 2012, 12:43 am - IP Logged

                              Hey martor.  There is a wealth of information online about XmR charts, or "individuals and moving range" charts.  It basically means you are charting data that comes from one source as individual values (for instance daily sales, or in your case, the daily frequency of an event) and NOT an averaged recording, such as today's average temperature.  It's a great method because it tracks not only the individual readings but also the movement between readings, which is what can help to show patterns.  If all the numbers plot within the limits of the XmR chart, then the "process" is considered stable.  There are specific formulas for plotting the limits on an XmR chart.  I encourage you to look them up online.  I'm not able to post links to websites yet, but I can tell you that if you google "XmR on Excel" there is a great explanation of how to set up an XmR chart in Excel (it's the first hit), along with information on how to understand the data and further methods of examining the chart once you have it constructed.  As a side note, I was first introduced to XmR charts by a book called "Understanding Variation - The Key to Managing Chaos", written by Donald J. Wheeler.  The book is in fact nothing more than a practical introduction to XmR charts and why other charts fail in the same setting.  It gives a lot of insight on understanding the data.  It was published in 2000 by SPC Press.  At any rate, good luck.  :)
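
For reference, the limit formulas mentioned above can be sketched in a few lines of Python. The scaling constants 2.66 and 3.268 are the standard ones for an XmR chart built on two-point moving ranges. The daily counts below are made-up illustration data, and `xmr_limits` and `points_outside` are hypothetical names.

```python
def xmr_limits(values):
    """Compute the centre lines and limits of an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    # Moving range: absolute difference between consecutive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": x_bar,
        "upper_limit": x_bar + 2.66 * mr_bar,
        "lower_limit": x_bar - 2.66 * mr_bar,
        "mr_center": mr_bar,
        "mr_upper_limit": 3.268 * mr_bar,
    }

def points_outside(values):
    """Return (index, value) for every point outside the X chart limits."""
    lim = xmr_limits(values)
    return [(i, v) for i, v in enumerate(values)
            if v > lim["upper_limit"] or v < lim["lower_limit"]]

# Hypothetical number of "A" readings per day (out of 15), made-up data.
daily_a_counts = [5, 6, 4, 5, 7, 5, 4, 6]
print(xmr_limits(daily_a_counts))
print(points_outside(daily_a_counts))
```

For counts that cannot go below zero, a negative lower limit can simply be read as zero. An empty list from `points_outside` corresponds to the "stable process" case described above.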