thr3ads.net - R help - [R] Classification of Multivariate Time Series [May 2013]

Lorenzo Isella

2013-May-27 11:34 UTC

[R] Classification of Multivariate Time Series

Dear All,
Apologies for not posting a code snippet, but I really need a pointer about
a methodology to look at my data and possibly some R package which can ease
my task.
I am given a set consisting of several multivariate noisy time series,
let's call it {A}.
Each A_i in {A}, in turn, consists of several numerical time series.
Then I have another set of shorter time series {B}.
Now, for every B_j in {B}, I need to determine the time series A_i from which
B_j most likely comes (B_j is not just a subset of A_i).
In other words, I need to determine the distance between A_i and B_j.
I was thinking about the Mahalanobis distance described here.

http://en.wikipedia.org/wiki/Mahalanobis_distance

However, I have several questions:
1) With the Mahalanobis distance, do I lose the info about the time
structure of the data? I am not just comparing some distributions, but some
time series and the ordering of the data is important.
2) Even if the use of the Mahalanobis distance were appropriate, it involves
the calculation of a covariance matrix and a mean.
Should I average A_i or B_j (or a subset of A_i having the same length as
B_j)? And should I use a covariance matrix based on A_i or B_j?
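
Regarding question 1, a small numerical check may help. The sketch below is plain Python rather than R, uses made-up two-variable data, and all function names are hypothetical. It assumes the distance is taken from the mean of B_j to the mean and covariance estimated from A_i, which is only one possible way of setting the problem up. Under that assumption, shuffling the time steps of A_i leaves the distance unchanged, which suggests that the ordering information is indeed lost.

```python
import random

def mean_vector(rows):
    """Column means of a series stored as a list of time steps."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]

def covariance_2d(rows):
    """Sample covariance matrix of a two-variable series."""
    m = mean_vector(rows)
    n = len(rows)
    sxx = sum((r[0] - m[0]) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - m[1]) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def mahalanobis_2d(point, rows):
    """Mahalanobis distance from a point to the mean/covariance of rows."""
    m = mean_vector(rows)
    (a, b), (c, d) = covariance_2d(rows)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # 2x2 matrix inverse
    dx = [point[0] - m[0], point[1] - m[1]]
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return q ** 0.5

# Made-up data: A_i is a long series, B_j a shorter one (illustration only).
series_a = [[0.0, 1.0], [1.0, 3.0], [2.0, 2.0],
            [3.0, 5.0], [4.0, 4.0], [5.0, 7.0]]
series_b = [[1.5, 2.0], [2.5, 4.5], [3.5, 4.0]]

# Same time steps as series_a, but in a random order.
shuffled_a = list(series_a)
random.Random(0).shuffle(shuffled_a)

d_original = mahalanobis_2d(mean_vector(series_b), series_a)
d_shuffled = mahalanobis_2d(mean_vector(series_b), shuffled_a)

print(d_original, d_shuffled)  # equal up to floating-point rounding
```

The mean and the covariance matrix are sums over time steps, so any reordering of those steps gives the same estimates and hence the same distance.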

Any suggestion is welcome.

Lorenzo

Emre Sahin

2013-May-27 12:12 UTC

[R] Classification of Multivariate Time Series

Have you had a look at Dynamic Time Warping and the dtw package?

Best, E. 
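
For readers unfamiliar with Dynamic Time Warping, here is a minimal from-scratch sketch of the classic dynamic-programming recurrence, applied to multivariate series with a Euclidean local cost. It is written in plain Python for illustration and is not the interface of the dtw package; the function name, the series names and the data are all made up. It assigns a short query series to the candidate with the smallest warped distance.

```python
import math

def dtw_distance(series_a, series_b):
    """Classic DTW cost between two multivariate series.

    Each series is a list of time steps; each time step is a tuple of
    values. The local cost is the Euclidean distance between time steps.
    """
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning the first i steps of
    # series_a with the first j steps of series_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(series_a[i - 1], series_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in a only
                                     cost[i][j - 1],      # step in b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[n][m]

# Made-up candidates: one rising and one falling two-variable series.
candidates = {
    "A1": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)],
    "A2": [(4.0, 4.0), (3.0, 3.0), (2.0, 2.0), (1.0, 1.0), (0.0, 0.0)],
}
# A shorter, rising query series playing the role of B_j.
query = [(0.0, 0.0), (2.0, 2.0), (4.0, 4.0)]

distances = {name: dtw_distance(series, query)
             for name, series in candidates.items()}
best_match = min(distances, key=distances.get)
print(distances, best_match)
```

The two candidates contain exactly the same values in opposite order, so a measure based only on their mean and covariance could not tell them apart, whereas the warped distance can. This plain version forces both end points to match; whether that fits the case where B_j is only a fragment of A_i is something to check against the options the dtw package offers.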


Roy Mendelssohn - NOAA Federal

2013-May-27 13:39 UTC

[R] Classification of Multivariate Time Series

Look at:

"State-Space Discrimination and Clustering of Atmospheric Time Series Data
Based on Kullback Information Measures", Thomas Bengtsson

If you Google the topic, there is a host of other papers too, but this one
meshes with existing state-space methods.

-Roy

**********************
"The contents of this message do not reflect any position of the U.S.
Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097

e-mail: Roy.Mendelssohn at noaa.gov (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

