thr3ads.net - R help - [R] Data permutation [Dec 2010]

If this information is useful, please help other people find it:
Share via:

matteop

2010-Dec-13 16:37 UTC

[R] Data permutation

Hello R User,

I am new in R and trying to migrate from SAS. I have to convert a table that
look like this

YEAR	FIRM		ID_NAME		VALUE
1994	Microsoft	John Doe		5
1994	Microsoft	Mark Smith		3
1994	Microsoft	David Ring		2

In this: 
YEAR	FIRM		ID1		vALUE	ID2		VALUE
1994	Microsoft	John Doe	5		Mark Smith		3
1994	Microsoft	John Doe	5		David Ring		2
1994	Microsoft	Mark Smith	3		David Ring		2

I have to do it for all the possible pair combination of ID_Name linked to
the same firm for any given year in my sample. 
Do you have any suggestion?

Thank you very much,

Matteo
-- 
View this message in context:
http://r.789695.n4.nabble.com/Data-permutation-tp3085707p3085707.html
Sent from the R help mailing list archive at Nabble.com.

Gabor Grothendieck

2010-Dec-13 17:18 UTC

head link

[R] Data permutation

On Mon, Dec 13, 2010 at 11:37 AM, matteop <mprato at iese.edu>
wrote:>
> Hello R User,
>
> I am new in R and trying to migrate from SAS. I have to convert a table
that
> look like this
>
> YEAR ? ?FIRM ? ? ? ? ? ?ID_NAME ? ? ? ? VALUE
> 1994 ? ?Microsoft ? ? ? John Doe ? ? ? ? ? ? ? ?5
> 1994 ? ?Microsoft ? ? ? Mark Smith ? ? ? ? ? ? ?3
> 1994 ? ?Microsoft ? ? ? David Ring ? ? ? ? ? ? ?2
>
> In this:
> YEAR ? ?FIRM ? ? ? ? ? ?ID1 ? ? ? ? ? ? vALUE ? ID2 ? ? ? ? ? ? VALUE
> 1994 ? ?Microsoft ? ? ? John Doe ? ? ? ?5 ? ? ? ? ? ? ? Mark Smith ? ? ? ?
? ? ?3
> 1994 ? ?Microsoft ? ? ? John Doe ? ? ? ?5 ? ? ? ? ? ? ? David Ring ? ? ? ?
? ? ?2
> 1994 ? ?Microsoft ? ? ? Mark Smith ? ? ?3 ? ? ? ? ? ? ? David Ring ? ? ? ?
? ? ?2
>
> I have to do it for all the possible pair combination of ID_Name linked to
> the same firm for any given year in my sample.
> Do you have any suggestion?
>
Here are a few possibilities:

1. merge/subset

subset(merge(DF, DF, by = 1:2), as.character(ID_NAME.x) <
as.character(ID_NAME.y))

2. sqldf with default names

library(sqldf)
sqldf("select * from DF a join DF b using(YEAR, FIRM)
  where a.ID_NAME < b.ID_NAME", method = "raw")

Its important that you use method = "raw" to override the automatic
class assignment heuristic which in this case tries to assign factors
to the ID_NAME columns but gets the factor levels wrong.  If you use
method = "raw" it should work ok here.

3. sqldf with new names

This also works and does not need method = "raw":

sqldf("select YEAR, FIRM,
   a.ID_NAME ID_NAME1, a.VALUE VALUE1,
   b.ID_NAME ID_NAME2, b.VALUE VALUE2
   from DF a join DF b using(YEAR, FIRM)
   where a.ID_NAME < b.ID_NAME")

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

R help - Dec 2010 - Data permutation

[R] Data permutation

[R] Data permutation