Serguei Kaniovski
2009-Jun-26 23:56 UTC
[R] Compute correlation matrix for panel data with specific ordering
Hello All,

I have panel data; here is a small-scale example:

df <- data.frame(cbind(rep(c("AUT","BEL","DEN","GER"),4),cbind(rep(c(1999,2000,2001,2002),4)),sample(10,16,replace=T)))
names(df) <- c("country","year","x")

SORT <- c("GER","BEL","DEN","AUT")

I need to compute the correlation between countries in the variable "x" in such a way that the rows and columns of the resulting correlation matrix are not in alphabetical order but in the order of a given factor vector, here SORT. How can I do this?

Greatly appreciate any help!

Serguei Kaniovski
Dieter Menne
2009-Jun-27 10:34 UTC
[R] Compute correlation matrix for panel data with specific ordering
Serguei Kaniovski <Serguei.Kaniovski <at> wifo.ac.at> writes:

> df <- data.frame(cbind(rep(c("AUT","BEL","DEN","GER"),4),
>   cbind(rep(c(1999,2000,2001,2002),4)),sample(10,16,replace=T)))
> names(df) <- c("country","year","x")
>
> SORT <- c("GER","BEL","DEN","AUT")
>
> I need to compute the correlation between countries in the variable "x"
> in such a way that the rows & columns of the resulting correlation
> matrix are not in an alphabetical order but in the order of a given
> factor vector - here SORT.

This boils down to: how do I reorder a factor so that it does not use alphabetical order? There are several reorder functions around, and you could do it with base functions in R, but I find the following solution using package gdata the most readable:

library(gdata)
df <- data.frame(country = sample(c("AUT","BEL","DEN","GER"),10,TRUE))
str(df)
SORT <- c("GER","BEL","DEN","AUT")
df$countryS = reorder(df$country, new.order=SORT)

Dieter
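For comparison, the base R route mentioned above can be sketched as follows. This is only a minimal illustration, not the code from the reply: it assumes the goal is simply to fix the level order, and the seed and the vector name countryS are chosen here for illustration. Note that since R 4.0.0 data.frame() keeps strings as character rather than converting them to factors, so an explicit factor() call like this may be needed before any factor-specific reordering.

```r
# Base R sketch: set the level order explicitly instead of relying on
# the default alphabetical ordering of factor levels.
set.seed(1)  # arbitrary seed, only to make the sample reproducible
country <- sample(c("AUT", "BEL", "DEN", "GER"), 10, replace = TRUE)
SORT <- c("GER", "BEL", "DEN", "AUT")

# factor() with an explicit 'levels' argument keeps the values unchanged
# and only changes the order in which the levels are stored.
countryS <- factor(country, levels = SORT)

levels(countryS)  # level order follows SORT
table(countryS)   # tabulations are reported in the SORT order
```

The values themselves are untouched; only the order of the levels changes, which is what determines the order of rows and columns in tables and similar summaries.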
Serguei Kaniovski
2009-Jun-29 11:20 UTC
[R] Compute correlation matrix for panel data with specific ordering
I apologize for not being specific enough in my previous posting. Assume you have panel data in the form:

df <- data.frame( cbind( rep( c( "AUT" , "BEL" , "DEN" , "GER" ) , 4) , cbind( rep( c( 1999 , 2000 , 2001 , 2002 ) , 4 ) ), sample( 10 , 16 , replace=T) ) )
names(df) <- c( "country" , "year" , "x" )

1. I would like to compute the correlation matrix between countries based on the annual observations of the variable x. I tried the following:

library( combinat )

temp <- split( df$x, df$year )
apply( combn(4,2) , 2 , function(x) cor( temp[[1]] , temp[[2]] ) )

This gives the wrong answer. Why?

2. The pairwise correlations computed as above should be in the order:

GER with BEL, GER with DEN, GER with AUT, BEL with DEN, BEL with AUT, DEN with AUT.

That is, the correctly sorted vector of factors is:

SORT <- c( "GER" , "BEL" , "DEN" , "AUT" ) not c( "AUT" , "BEL" , "DEN" , "GER" )

Maybe there is an altogether better way of achieving what I want?

Serge
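Two things in the attempt above stand out: the anonymous function never uses its argument x, so every pair evaluates the same cor( temp[[1]] , temp[[2]] ), and the data are split by year rather than by country. The following is a rough sketch of one way to get the pairwise correlations in the requested order. The data values are made up for illustration, and the layout (every country observed in every year, numeric x) is an assumption about what was intended. It uses combn() from the utils package that ships with R rather than the combinat package.

```r
# Hypothetical example data: one numeric observation per country and year.
df <- data.frame(
  country = rep(c("AUT", "BEL", "DEN", "GER"), times = 4),
  year    = rep(1999:2002, each = 4),
  x       = c(3, 7, 1, 9, 5, 2, 8, 4, 6, 6, 3, 1, 2, 9, 4, 7),
  stringsAsFactors = FALSE
)
SORT <- c("GER", "BEL", "DEN", "AUT")

# Order by year so that each country's series is aligned in time, then
# split by country (not by year) and arrange the list in the SORT order.
df <- df[order(df$year), ]
by_country <- split(df$x, df$country)[SORT]

# combn() enumerates the pairs as (1,2), (1,3), (1,4), (2,3), (2,4), (3,4).
pairs <- combn(length(SORT), 2)
pairwise <- apply(pairs, 2, function(idx) {
  # Use the pair indices passed in, instead of fixed positions 1 and 2.
  cor(by_country[[idx[1]]], by_country[[idx[2]]])
})
names(pairwise) <- apply(pairs, 2, function(idx) paste(SORT[idx], collapse = "-"))
pairwise
```

Because the list is arranged in the SORT order before the pairs are enumerated, the results come out as GER-BEL, GER-DEN, GER-AUT, BEL-DEN, BEL-AUT, DEN-AUT.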
John Kane
2009-Jun-29 14:34 UTC
[R] Compute correlation matrix for panel data with specific ordering
Have a look at str(df). Those values are being interpreted as factors, not numbers. I don't think this is what you want.

--- On Mon, 6/29/09, Serguei Kaniovski <Serguei.Kaniovski at wifo.ac.at> wrote:

> I apologize for not being specific enough in my previous posting.
> Assume you have panel data in the form:
>
> df <- data.frame( cbind( rep( c( "AUT" , "BEL" , "DEN" , "GER" ) , 4) ,
>   cbind( rep( c( 1999 , 2000 , 2001 , 2002 ) , 4 ) ),
>   sample( 10 , 16 , replace=T) ) )
> names(df) <- c( "country" , "year" , "x" )
>
> 1. I would like to compute the correlation matrix between
> countries based on the annual observations of the variable
> x. I tried the following:
>
> library( combinat )
>
> temp <- split( df$x, df$year )
> apply( combn(4,2) , 2 , function(x) cor( temp[[1]] , temp[[2]] ) )
>
> This gives the wrong answer. Why?
>
> 2. The pairwise correlations computed as above should be in
> the order:
>
> GER with BEL, GER with DEN, GER with AUT, BEL with DEN, BEL
> with AUT, DEN with AUT.
>
> That is, the correctly sorted vector of factors is:
>
> SORT <- c( "GER" , "BEL" , "DEN" , "AUT" ) not c( "AUT" , "BEL" , "DEN" , "GER" )
>
> Maybe there is an altogether better way of achieving what
> I want?
>
> Serge
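The type problem comes from cbind(): binding character and numeric vectors produces a character matrix, so no column of the resulting data frame is numeric. A possible way around it, and one way to get the full correlation matrix in the SORT order, is sketched below. The data values are invented for illustration, and the year column is built with each = so that every country is observed in every year; that layout is an assumption about the intended panel, since the original rep() calls pair each country with a single year.

```r
# cbind() on mixed character/numeric vectors yields a character matrix,
# so no column of the resulting data frame is numeric.
bad <- data.frame(cbind(c("AUT", "BEL"), c(1999, 2000), c(3, 7)))
str(bad)

# Building the data frame column by column keeps the numeric types.
df <- data.frame(
  country = rep(c("AUT", "BEL", "DEN", "GER"), times = 4),
  year    = rep(1999:2002, each = 4),
  x       = c(3, 7, 1, 9, 5, 2, 8, 4, 6, 6, 3, 1, 2, 9, 4, 7),  # made-up values
  stringsAsFactors = FALSE
)
SORT <- c("GER", "BEL", "DEN", "AUT")

# Wide layout: one row per year, one column per country. Each cell holds
# a single observation, so mean() just returns that value.
wide <- tapply(df$x, list(year = df$year, country = df$country), mean)

# Reorder the columns by SORT before calling cor(), so the rows and
# columns of the correlation matrix follow SORT instead of the alphabet.
cmat <- cor(wide[, SORT])
cmat
```

Selecting the columns by name with wide[, SORT] is what fixes the order; cor() simply keeps whatever column order it is given.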