Hi, thanks everyone for any help in advance.

I found myself dealing with tabular time-series data in which each row is
formatted like [time stamp, ID, value]. I made a small example:

X = data.frame(t = c(1,1,1,2,2,2,2,3,3,3,4,4,4,5,5),
               id = c('a','b','c','c','b','d','e','b','a','e','a','b','d','b','c'))
X$x = rnorm(15)

't' is the time stamp, 'id' is the identifier, and 'x' holds the time series
values. The rows are not necessarily ordered and sometimes have missing
values. In order to do any analysis, I used to convert this type of data into
matrix form:

Y = matrix(NA, length(unique(X$id)), length(unique(X$t)))
rownames(Y) = sort(unique(X$id))
colnames(Y) = sort(unique(X$t))
for (i in 1:nrow(Y)) {
  xi = X[X$id == rownames(Y)[i], ]
  Y[i, match(xi$t, colnames(Y))] = xi$x
}

Then I run whatever R operations I need on Y. This conversion gets painfully
slow as my data gets substantially larger. I was wondering if there is a
better way to convert a table like 'X' into a matrix like 'Y', or an even
better way to re-format the data, not necessarily into matrix form.

Young
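One vectorized base-R route for exactly this conversion is to build the empty matrix once and fill it with a single two-column (row, column) index assignment instead of looping over ids. A minimal sketch, assuming the same X as above and that each (id, t) pair occurs at most once:

# Look up row and column positions for every record at once,
# then assign all values in one step; cells with no record stay NA.
ids <- sort(unique(X$id))
ts  <- sort(unique(X$t))
Y <- matrix(NA_real_, nrow = length(ids), ncol = length(ts),
            dimnames = list(ids, ts))
Y[cbind(match(X$id, ids), match(X$t, ts))] <- X$x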
Is this what you want:

> X = data.frame(t = c(1,1,1,2,2,2,2,3,3,3,4,4,4,5,5),
+                id = c('a','b','c','c','b','d','e','b','a','e','a','b','d','b','c'))
> X$x = rnorm(15)
> X
   t id          x
1  1  a -0.6264538
2  1  b  0.1836433
3  1  c -0.8356286
4  2  c  1.5952808
5  2  b  0.3295078
6  2  d -0.8204684
7  2  e  0.4874291
8  3  b  0.7383247
9  3  a  0.5757814
10 3  e -0.3053884
11 4  a  1.5117812
12 4  b  0.3898432
13 4  d -0.6212406
14 5  b -2.2146999
15 5  c  1.1249309
> require(reshape)
Loading required package: reshape
Loading required package: plyr
> X.m <- melt(X, measure = 'x')
> cast(X.m, id ~ t, sum)
  id          1          2          3          4         5
1  a -0.6264538  0.0000000  0.5757814  1.5117812  0.000000
2  b  0.1836433  0.3295078  0.7383247  0.3898432 -2.214700
3  c -0.8356286  1.5952808  0.0000000  0.0000000  1.124931
4  d  0.0000000 -0.8204684  0.0000000 -0.6212406  0.000000
5  e  0.0000000  0.4874291 -0.3053884  0.0000000  0.000000

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
This may be closer; I forgot about the NAs:

> cast(X.m, id ~ t, function(x) if (length(x) == 0) NA else sum(x))
  id          1          2          3          4         5
1  a -0.6264538         NA  0.5757814  1.5117812        NA
2  b  0.1836433  0.3295078  0.7383247  0.3898432 -2.214700
3  c -0.8356286  1.5952808         NA         NA  1.124931
4  d         NA -0.8204684         NA -0.6212406        NA
5  e         NA  0.4874291 -0.3053884         NA        NA

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
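The same wide, NA-filled layout can also be obtained without the reshape package. A small sketch using base R's reshape(), assuming the same X as above (W is just an illustrative name, and the x.<t> column labels are what reshape() produces by default):

# One row per id, one x.<t> column per time stamp; (id, t) combinations
# absent from X come out as NA automatically.
W <- reshape(X, idvar = "id", timevar = "t", v.names = "x",
             direction = "wide")
W <- W[order(W$id), ]   # put the rows in id order, as in the hand-built Y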
Jorge Ivan Velez
2009-Jul-01 20:59 UTC
[R] convert tabular time series into a matrix format
Dear Young,

Try this:

with(X, tapply(x, list(t, id), function(y) y))

HTH,
Jorge
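Note that list(t, id) puts the time stamps in the rows and the ids in the columns, i.e. the transpose of the Y built by the original loop. If the original orientation is wanted, swapping the grouping factors (or transposing the result) should do it; a small sketch, with Y2 just an illustrative name:

# ids as rows, time stamps as columns, NA where no observation exists
Y2 <- with(X, tapply(x, list(id, t), function(y) y))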
Gabor Grothendieck
2009-Jul-01 21:01 UTC
[R] convert tabular time series into a matrix format
Can't say whether it's any faster, but read.zoo in the devel version of zoo
can do this using the split= argument, where split = 2 in the example below
says to split the data into time series defined by the second column.

Lines <- '"t" "id" "x"
1 "a" -1.71941257904109
1 "b" 1.33629503083329
1 "c" 1.61337372076629
2 "c" 1.34439849784170
2 "b" 0.167475882421629
2 "d" -0.447868997085645
2 "e" -0.592497543309015
3 "b" 0.952268366091281
3 "a" -0.532804723204108
3 "e" -1.20064709102901
4 "a" 2.10892119828104
4 "b" -0.0550779510849278
4 "d" 1.98864434974374
5 "b" 1.59258204616364
5 "c" 0.65185104628371'

library(zoo)
source("http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/*checkout*/pkg/R/read.zoo.R?rev=588&root=zoo")

z <- read.zoo(textConnection(Lines), header = TRUE, split = 2)

See the three zoo vignettes and the zoo help pages for more on zoo.
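Once z exists, a plain id-by-time matrix like the original Y should be recoverable by dropping the zoo wrapper and transposing; a small sketch, assuming z was created as above (Y3 is just an illustrative name):

# coredata() returns the underlying matrix (rows = time stamps,
# columns = ids); t() flips it to ids in rows, times in columns.
Y3 <- t(coredata(z))
colnames(Y3) <- index(z)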
Henrique Dallazuanna
2009-Jul-01 21:56 UTC
[R] convert tabular time series into a matrix format
Try this:

xtabs(x ~ t + id, data = X)

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
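One caveat: xtabs() fills (t, id) combinations that never occur with 0 rather than NA, and the formula above puts t in the rows. A short sketch of how the NA-filled, id-by-time layout could be recovered, assuming each (id, t) pair occurs at most once (Y4 is just an illustrative name):

tab    <- xtabs(x ~ id + t, data = X)   # ids in rows, time stamps in columns
counts <- xtabs(~ id + t, data = X)     # number of records behind each cell
tab[counts == 0] <- NA                  # cells with no record were 0; make them NA
Y4 <- unclass(tab)                      # drop the xtabs class for a plain matrix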