zartan2k@comcast.net science forum beginner
Joined: 23 Jun 2006
Posts: 3

Posted: Fri Jun 23, 2006 10:49 pm Post subject:
Sequence prediction problem



I would like some advice on how to solve the following problem. I
observe a system that typically sends a series of data to me as
follows.
1,2,3,4,5,6,1,2,3,4,5,6,1,2, ...
Only the values 1, 2, 3, 4, 5 and 6 can show up in the series, and in the
absence of any disturbances the numbers are always sequential. However,
on occasion one or more of these numbers can drop out of the series.
For example:
1,2,3,5,6,1,2,3, ... (the 4 was dropped)
Another form of disturbance that can occur is that a random number
(with values from 1 to 6) can also be interjected into the series. For
example:
1,2,3,4,1,5,6,1,2,3, ... (a 1 was randomly inserted)
Note that both forms of disturbance can occur at the same time:
1,2,3,4,1,6,1,2,3, ... (a 5 was dropped and a 1 was randomly inserted)
What I would like to do is label each data point in a series with an
approximate probability that the given number was randomly inserted
into the data stream. Note that the length of these sequences will grow
over time and that the probabilities can be updated as new numbers are
observed. The starting point of a sequence doesn't have to be 1. There
is no additional information available about the system (e.g.
probablilities of drop outs or insertions).
Thanks,
zartan2k 
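
The disturbance model described in this post can be sketched as a small simulator. This is only an illustration under stated assumptions: the function name simulate_stream and the rates p_drop and p_insert are hypothetical (the post says the real rates are unknown), each genuine value is assumed to drop out independently, and at most one random value is assumed to be interjected before each genuine value.

```python
import random


def simulate_stream(n_steps, p_drop=0.05, p_insert=0.05, start=1, seed=None):
    """Simulate a noisy cyclic 1..6 stream.

    Returns (values, inserted), where inserted[i] is True when values[i]
    was a randomly interjected number rather than a genuine one.
    All parameter names and default rates are assumptions for illustration.
    """
    rng = random.Random(seed)
    values, inserted = [], []
    current = start
    for _ in range(n_steps):
        # Possibly interject a uniformly random value from 1..6.
        if rng.random() < p_insert:
            values.append(rng.randint(1, 6))
            inserted.append(True)
        # Emit the genuine value unless it drops out of the series.
        if rng.random() >= p_drop:
            values.append(current)
            inserted.append(False)
        # The underlying cycle advances whether or not the value was seen.
        current = current % 6 + 1
    return values, inserted


values, inserted = simulate_stream(20, seed=0)
print(values)
print(inserted)
```

Because the simulator also returns the true insertion flags, any labelling method can be scored against them on synthetic data.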



Bruce Reistle science forum beginner
Joined: 25 Jun 2006
Posts: 1

Posted: Sun Jun 25, 2006 7:34 pm Post subject:
Re: Sequence prediction problem



This sounds like a fun problem. Is this a totally
fictitious problem? If not, could you send me some of the
data, or generate some for me?
Bruce R 



zartan2k@comcast.net science forum beginner
Joined: 23 Jun 2006
Posts: 3

Posted: Mon Jun 26, 2006 10:21 pm Post subject:
Re: Sequence prediction problem



This is a real-world problem. I don't have any actual data yet, but
I may simulate some in the near future. Have you thought about any
general approaches? It seems that as long as dropouts remain
relatively low, something can be done. It's not clear to me, however,
how one would detect when the data stream has become too degraded by
dropouts to allow reasonable prediction.
zartan2k 



A.G.McDowell science forum beginner
Joined: 17 Mar 2005
Posts: 4

Posted: Tue Jun 27, 2006 5:40 pm Post subject:
Re: Sequence prediction problem



If you just want a general approach, I would look up Hidden Markov
Models and the EM algorithm. There is a hidden model with 6 states, 1,
2, 3, 4, 5, 6. It usually goes from state n to state n+1 (wrapping from
6 back to 1), but sometimes skips a state and sometimes moves forward
two states. You usually observe the hidden state, but sometimes you get
junk instead.
See e.g. exercise (7) at
http://www.inference.phy.cam.ac.uk/mackay/itila/ExtraExercises.html or
http://www.cis.hut.fi/ahonkela/dippa/node36.html

A.G.McDowell 
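
The Hidden Markov Model idea above can be sketched with the forward-backward algorithm, which gives each observation a posterior probability of being junk. This is a minimal sketch and not the method from the linked exercises. It assumes a variant of the model in which an inserted value is an extra observation that leaves the hidden cycle position unchanged, dropouts are geometric with rate p_drop, and inserted values are uniform on 1..6. The name insertion_posteriors and both rates are hypothetical; the thread says the real rates are unknown, and estimating them (for example with EM) is not shown here.

```python
def insertion_posteriors(obs, p_drop=0.05, p_insert=0.05):
    """Posterior P(obs[t] was inserted) for every t, via forward-backward.

    Hidden state = (c, j): c is the cycle position of the latest genuine
    value, j = 1 if the current observation is junk. Assumes 0 < p_insert < 1
    and 0 <= p_drop < 1; both rates are illustrative assumptions.
    """
    if any(x not in range(1, 7) for x in obs):
        raise ValueError("observations must be integers from 1 to 6")
    n_vals = 6
    # adv[k]: probability that k genuine values dropped out before the next
    # genuine observation (geometric in k, folded onto one cycle of 6).
    adv = [(1 - p_drop) * p_drop ** k / (1 - p_drop ** n_vals)
           for k in range(n_vals)]
    states = [(c, j) for c in range(1, n_vals + 1) for j in (0, 1)]
    size = len(states)

    def trans(prev, cur):
        (c0, _), (c1, j1) = prev, cur
        if j1:  # junk: the cycle position does not advance
            return p_insert if c1 == c0 else 0.0
        return (1 - p_insert) * adv[(c1 - c0 - 1) % n_vals]

    def emit(state, x):
        c, j = state
        return 1.0 / n_vals if j else float(x == c)

    def normalise(vec):
        total = sum(vec)
        return [v / total for v in vec], total

    T = [[trans(a, b) for b in states] for a in states]
    # Unknown starting point: uniform over cycle positions.
    prior = [(p_insert if j else 1 - p_insert) / n_vals for _, j in states]

    # Forward pass with per-step scaling to avoid underflow.
    alphas, scales = [], []
    for t, x in enumerate(obs):
        if t == 0:
            pred = prior
        else:
            prev = alphas[-1]
            pred = [sum(prev[i] * T[i][k] for i in range(size))
                    for k in range(size)]
        alpha, scale = normalise([pred[k] * emit(states[k], x)
                                  for k in range(size)])
        alphas.append(alpha)
        scales.append(scale)

    # Backward pass, combining with alpha to get the posteriors.
    post = [0.0] * len(obs)
    beta = [1.0] * size
    for t in range(len(obs) - 1, -1, -1):
        gamma, _ = normalise([a * b for a, b in zip(alphas[t], beta)])
        post[t] = sum(g for g, (_, j) in zip(gamma, states) if j)
        if t > 0:
            x = obs[t]
            beta = [sum(T[i][k] * emit(states[k], x) * beta[k]
                        for k in range(size)) / scales[t]
                    for i in range(size)]
    return post


stream = [1, 2, 3, 4, 1, 5, 6, 1, 2, 3]
for value, p in zip(stream, insertion_posteriors(stream)):
    print(value, round(p, 3))
```

As the sequence grows, rerunning the function on the longer sequence updates every label, since the backward pass lets later observations revise earlier probabilities.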
