Sequence prediction problem
zartan2k@comcast.net
science forum beginner


Joined: 23 Jun 2006
Posts: 3

Posted: Fri Jun 23, 2006 10:38 pm    Post subject: Sequence prediction problem

I would like some advice on how to solve the following problem. I
observe a system that typically sends a series of data to me as
follows.

1,2,3,4,5,6,1,2,3,4,5,6,1,2, ...

Only the values 1,2,3,4,5 and 6 can show up in the series. And in the
absence of any disturbances the numbers are always sequential. However
on occasion one or more of these numbers can drop out of the series.
For example:

1,2,3,5,6,1,2,3, ... (the 4 was dropped)

Another form of disturbance that can occur is that a random number
(with values from 1 to 6) can also be interjected into the series. For
example:

1,2,3,4,1,5,6,1,2,3, ... (a 1 was randomly inserted)

Note that both forms of disturbance can occur at the same time:

1,2,3,4,1,6,1,2,3, ... (a 5 was dropped and a 1 was randomly inserted)

What I would like to do is label each data point in a series with an
approximate probability that the given number was randomly inserted
into the data stream. Note that the length of these sequences will grow
over time and that the probabilities can be updated as new numbers are
observed. The starting point of a sequence doesn't have to be 1. There
is no additional information available about the system (e.g.
probabilities of dropouts or insertions).

Thanks,
zartan2k
dave@autobox.com
science forum beginner


Joined: 16 Feb 2006
Posts: 12

Posted: Thu Jul 06, 2006 3:47 pm    Post subject: Re: Sequence prediction problem

Z.

I would like to commend you on the clarity of your problem. I have
waited until now to see if there were any other posters, but it
appears not. The problem you have is pattern recognition in time
series.

We have been developing statistical application software that focuses
on model identification in the presence of anomalies. Please see
http://www.autobox.com and use the Google Search button for the term
"outlier".

The problem is that you can't catch an outlier without a model (at
least a mild one) for your data. Else how would you know that a point
violated that model? In fact, the process of growing understanding and
finding and examining outliers must be iterative. This isn't a new
thought. Bacon, writing in Novum Organum about 400 years ago, said:
"Errors of Nature, Sports and Monsters correct the understanding in
regard to ordinary things, and reveal general forms. For whoever knows
the ways of Nature will more easily notice her deviations; and, on the
other hand, whoever knows her deviations will more accurately describe
her ways."

Some analysts think that they can remove outliers based on abnormal
residuals from a simple fitted model, sometimes even an "eye model". If
a residual falls outside a particular probability limit (95% or 99%),
they then try to determine whether something is missing from the
model. If not, the point is gone. This deletion, or adjustment of the
value so that there is no outlier effect, is equivalent to augmenting
the model with a 0/1 variable that is 1 at the time point in question
and 0 elsewhere. This manual adjustment is normally supported by visual
or graphical analysis, which, as we will see below, often fails.
Additionally, this approach leaves open the question of "inliers",
whose effect is just as serious as that of "outliers". Inliers are "too
normal or too close to the mean" and, if ignored, will bias the
identification of the model and its parameters. Consider the time
series 1,9,1,9,1,9,5,9: a simple model might find nothing exceptional,
whereas a slightly less simple model would focus attention on the
exceptional value of 5 at time period seven.
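
To make the 1,9,1,9,1,9,5,9 example concrete, here is a minimal sketch.
The two "models" are illustrative choices of mine, not anything taken
from Autobox: the simple model is a constant mean with a 2-sigma limit,
and the slightly less simple model assumes a period-2 pattern and uses
the median of each phase as its level. The function names and the
2-sigma limit are assumptions.

```python
from statistics import mean, median, pstdev

series = [1, 9, 1, 9, 1, 9, 5, 9]

def mean_model_flags(xs, limit=2.0):
    """Simple model: constant mean. Flag 1-based time periods whose
    deviation from the mean exceeds `limit` standard deviations."""
    m, s = mean(xs), pstdev(xs)
    return [i + 1 for i, x in enumerate(xs) if abs(x - m) > limit * s]

def periodic_residuals(xs, period=2):
    """Less simple model: one level per phase of an assumed period,
    estimated by the median of that phase. Returns the residuals."""
    levels = [median(xs[p::period]) for p in range(period)]
    return [x - levels[i % period] for i, x in enumerate(xs)]

flags = mean_model_flags(series)
residuals = periodic_residuals(series)
# 1-based time period with the largest absolute residual
worst = max(range(len(series)), key=lambda i: abs(residuals[i])) + 1

print("flagged by mean model:", flags)
print("period-2 residuals:", residuals)
print("most exceptional time period:", worst)
```

The value 5 sits almost exactly on the overall mean (5.5), so the mean
model cannot see it; only a model that knows about the alternation
leaves a non-zero residual at that point.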

Your problem is IMHO in the same "ballpark". Whereas our software
identifies and treats the anomalies, you, it appears, want the
identified missing value to be inserted and assigned a very small (0.0)
probability of having been put there by the original system, i.e. the
observed values.
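
For the original question (a per-point probability of having been
randomly inserted), one possible "mild model" is a small hidden Markov
model solved with the forward-backward algorithm. This is a sketch
under stated assumptions and not a description of how Autobox works.
The hidden state is (last genuine value, was this point inserted?).
The rates p_ins and p_drop are made-up values, since the poster says
they are unknown; they would have to be guessed or estimated. Inserted
values are assumed uniform on 1..6, and dropouts are assumed
independent, so a jump of k steps gets weight proportional to
p_drop**(k-1).

```python
def insertion_posteriors(obs, p_ins=0.1, p_drop=0.1, n=6):
    """P(observation t was a random insertion | whole sequence).
    Assumes 0 < p_ins < 1; p_ins and p_drop are illustrative guesses."""
    # State (v, f): v = last genuine value (0-based), f = 1 if inserted.
    states = [(v, f) for v in range(n) for f in (0, 1)]
    # Advancing k places means k-1 values were dropped (k = 1..n).
    w = [p_drop ** (k - 1) for k in range(1, n + 1)]
    w = [x / sum(w) for x in w]

    def trans(s, t):
        (v, _), (v2, f2) = s, t
        if f2 == 1:                      # insertion: cycle does not move
            return p_ins if v2 == v else 0.0
        k = (v2 - v) % n or n            # genuine step of k places
        return (1 - p_ins) * w[k - 1]

    def emit(s, o):
        v, f = s
        if f == 1:                       # inserted values are uniform
            return 1.0 / n
        return 1.0 if o - 1 == v else 0.0

    def normalise(xs):
        z = sum(xs)
        return [x / z for x in xs]

    # Forward pass (rescaled at each step to avoid underflow).
    prior = [(p_ins if f else 1 - p_ins) / n for _, f in states]
    alphas = [normalise([p * emit(s, obs[0]) for p, s in zip(prior, states)])]
    for o in obs[1:]:
        prev = alphas[-1]
        alphas.append(normalise([
            emit(t, o) * sum(a * trans(s, t) for a, s in zip(prev, states))
            for t in states]))

    # Backward pass.
    betas = [[1.0] * len(states)]
    for o in reversed(obs[1:]):
        nxt = betas[0]
        betas.insert(0, normalise([
            sum(trans(s, t) * emit(t, o) * b for b, t in zip(nxt, states))
            for s in states]))

    # Combine and sum the posterior mass of the "inserted" states.
    out = []
    for a, b in zip(alphas, betas):
        gamma = normalise([x * y for x, y in zip(a, b)])
        out.append(sum(g for g, (_, f) in zip(gamma, states) if f == 1))
    return out

post = insertion_posteriors([1, 2, 3, 4, 1, 5, 6, 1, 2, 3])
for value, p in zip([1, 2, 3, 4, 1, 5, 6, 1, 2, 3], post):
    print(value, round(p, 3))
```

Because the backward pass uses later observations, rerunning the
function as the sequence grows updates the earlier probabilities, which
matches the requirement that labels can change as new numbers arrive.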

You can pursue threads like Intervention Detection, ARIMA, Box-Jenkins,
Signal Detection, Data Mining, etc., and all will find their solution at
http://www.autobox.com.

If you would like to chat, please call

Dave Reilly
Automatic Forecasting Systems
http://www.autobox.com
215-675-0652 in the US