ManuelS
2011-Oct-25 16:43 UTC
[R] merging to data.frames whose columns are different but follow a pattern.
Hi, I'm working with panel data from the Swiss Houshold Panel (SHP). The data i got came in the following way: 1.) 12 *different* /individual/ files - one for each year . 2.) 12 *different* /houshold/ files - again: one for each year Each file came in the SPSS format (.sav). I implemented all the files in R an managed (via rename, cbind, rbind, merge etc.) to get *two* files. The first file is a combination of all 12 /individual/ files orderd by IDPERS (unique individual identification) and T (=Time 1...12) The second file contains all houshold waves (1-12) and is orderd by IDHOUS (unique houshold identification) and T (=1...12). Now, to the problem: As the following except shows: the individual id used in the individual file (IDPERS) is derived from the houshold id (IDHOUS).>head(INDIVIDUAL,20)IDPERS T SEX PC02 GHS Single 1 4101 1 1 9 1 NA 2 4101 2 1 10 1 NA 3 4101 3 1 10 1 NA 4 4101 4 1 10 1 NA . . . 13 4102 1 2 0 0 NA 14 4102 2 2 5 1 NA 15 4102 3 2 3 0 NA 16 4102 4 2 9 1 NA . . . 26 5101 1 1 8 1 NA 27 5101 2 1 8 1 NA 28 5101 3 1 8 1 NA 29 5101 4 1 8 1 NA> head(HOUSHOLD,20)IDHOUS T COM2 NBPERS NBKID 1 41 1 2 5 3 2 41 2 2 5 3 3 41 3 2 5 3 4 41 4 2 5 3 . . . 13 51 1 2 1 0 14 51 2 2 1 0 15 51 3 2 1 0 16 51 4 2 1 0 I want to merge the two files (data.frames) according to their ID's and the time dimension T. Since IDHOUS and IDPERS are not the same I cannot use /merge() /. Doing the task "by hand" in a excel spread sheet or any equivalent is possible but would probably take ages. So my question is: Does anyone know a way, a command, a trick to merge two files whose columns are different but follow a pattern as described above? I am gratful for any kind of input. Manuel Steiner -- View this message in context: http://r.789695.n4.nabble.com/merging-to-data-frames-whose-columns-are-different-but-follow-a-pattern-tp3937376p3937376.html Sent from the R help mailing list archive at Nabble.com.
Dennis Murphy
2011-Oct-25 19:15 UTC
[R] merging to data.frames whose columns are different but follow a pattern.
Hi: Try this: INDIVIDUAL <- transform(INDIVIDUAL, IDHOUS = IDPERS %/% 100) merge(INDIVIDUAL, HOUSHOLD, by = c('IDHOUS', 'T')) IDHOUS T IDPERS SEX PC02 GHS Single COM2 NBPERS NBKID 1 41 1 4101 1 9 1 NA 2 5 3 2 41 1 4102 2 0 0 NA 2 5 3 3 41 2 4101 1 10 1 NA 2 5 3 4 41 2 4102 2 5 1 NA 2 5 3 5 41 3 4101 1 10 1 NA 2 5 3 6 41 3 4102 2 3 0 NA 2 5 3 7 41 4 4101 1 10 1 NA 2 5 3 8 41 4 4102 2 9 1 NA 2 5 3 9 51 1 5101 1 8 1 NA 2 1 0 10 51 2 5101 1 8 1 NA 2 1 0 11 51 3 5101 1 8 1 NA 2 1 0 12 51 4 5101 1 8 1 NA 2 1 0 HTH, Dennis On Tue, Oct 25, 2011 at 9:43 AM, ManuelS <steiner-manuel at web.de> wrote:> Hi, > > I'm working with panel data from the Swiss Houshold Panel (SHP). The data i > got came in the following way: > 1.) 12 *different* /individual/ files - one for each year . > 2.) 12 *different* /houshold/ files - again: one for each year > > Each file came in the SPSS format (.sav). I implemented all the files in R > an managed (via rename, cbind, rbind, merge etc.) to get *two* files. > The first file is a combination of all 12 /individual/ files orderd by > IDPERS (unique individual identification) and T (=Time 1...12) > The second file contains all houshold waves (1-12) and is orderd by IDHOUS > (unique houshold identification) and T (=1...12). > > Now, to the problem: > As the following except shows: the individual id used in the individual file > (IDPERS) is derived from the houshold id (IDHOUS). > >>head(INDIVIDUAL,20) > ? IDPERS ?T SEX PC02 GHS Single > 1 ? ?4101 ?1 ? 1 ? ?9 ? 1 ? ? NA > 2 ? ?4101 ?2 ? 1 ? 10 ? 1 ? ? NA > 3 ? ?4101 ?3 ? 1 ? 10 ? 1 ? ? NA > 4 ? ?4101 ?4 ? 1 ? 10 ? 1 ? ? NA > . > . > . > 13 ? 4102 ?1 ? 2 ? ?0 ? 0 ? ? NA > 14 ? 4102 ?2 ? 2 ? ?5 ? 1 ? ? NA > 15 ? 4102 ?3 ? 2 ? ?3 ? 0 ? ? NA > 16 ? 4102 ?4 ? 2 ? ?9 ? 1 ? ? NA > . > . > . > 26 ? 5101 ?1 ? 1 ? ?8 ? 1 ? ? NA > 27 ? 5101 ?2 ? 1 ? ?8 ? 1 ? ? NA > 28 ? 5101 ?3 ? 1 ? ?8 ? 1 ? ? NA > 29 ? 5101 ?4 ? 1 ? ?8 ? 1 ? ? NA > > >> head(HOUSHOLD,20) > ? IDHOUS ?T COM2 NBPERS NBKID > 1 ? ? ?41 ?1 ? ?2 ? ? ?5 ? ? 3 > 2 ? ? ?41 ?2 ? ?2 ? ? ?5 ? ? 3 > 3 ? ? ?41 ?3 ? ?2 ? ? ?5 ? ? 3 > 4 ? ? ?41 ?4 ? ?2 ? ? ?5 ? ? 3 > . > . > . > 13 ? ? 51 ?1 ? ?2 ? ? ?1 ? ? 0 > 14 ? ? 51 ?2 ? ?2 ? ? ?1 ? ? 0 > 15 ? ? 51 ?3 ? ?2 ? ? ?1 ? ? 0 > 16 ? ? 51 ?4 ? ?2 ? ? ?1 ? ? 0 > > I want to merge the two files (data.frames) according to their ID's and the > time dimension T. Since IDHOUS and IDPERS are not the same I cannot use > /merge() /. Doing the task "by hand" in a excel spread sheet or any > equivalent is possible but would probably take ages. > So my question is: > > Does anyone know a way, a command, a trick to merge two files whose columns > are different but follow a pattern as described above? > > I am gratful for any kind of input. > > Manuel Steiner > > > -- > View this message in context: http://r.789695.n4.nabble.com/merging-to-data-frames-whose-columns-are-different-but-follow-a-pattern-tp3937376p3937376.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >