I have a couple of data sets that I want to combine into one data frame. One set contains a number of records on individual observations and includes a geographic descriptor called dacode. The dacode is not unique in that table. The other table contains a number of socio-economic variables for each of the geographic areas identified in the first table. This second table also includes a variable called dacode, and I have also used dacode for the rownames. I want to pull a number of the variables from the socioeconomic status data into the first table, but I'm having no luck. Here's what I have tried so far:

#first attempt
grade6DA$ses1 <- sesdata$ses1[as.character(grade6DA$dacode)]

All this does is create a new column called ses1 in the grade6DA data set and fill it with NA.

#second attempt
grade6DA$ses1 <- sesdata$ses1[grade6DA$dacode]

Neither approach has worked, so I've ended up combining the tables in a MySQL database and then importing the result back into R. However, it would be much easier if I could just manipulate the variables in R rather than going through MySQL every time I want to try something new. I've looked in the R manuals but did not see anything about this, though I could have been looking in the wrong places. Any ideas on how to accomplish what I am trying to do, or advice on where to find the info, would be greatly appreciated.

Cheers,
Neil

================================================
Neil Hepburn, PhD Candidate
Department of Economics
University of Alberta
email nhepburn at ualberta.ca
URL http://www.ualberta.ca/~nhepburn
Marc Schwartz
2005-Nov-25 02:06 UTC
[R] adding variables to a data set/combining two data sets
On Wed, 2005-11-23 at 16:13 -0700, nhepburn wrote:
> I have a couple of data sets that I want to combine into one data frame.
> One set contains a number of records on individual observations and includes
> a geographic descriptor called dacode. The dacode is not unique in that
> table. The other table contains a number of socio-economic variables for
> each of the geographic areas identified in the other table. This second
> table also includes a variable called dacode and I have also used dacode for
> the rownames. I want to pull a number of the variables from the
> socioeconomic status data into the first table but I'm having no luck.
> Here's what I have tried so far:
>
> #first attempt
>
> grade6DA$ses1 <- sesdata$ses1[as.character(grade6DA$dacode)]
>
> All that this does is create a new column in the grade6DA data set and calls
> it ses1 but then fills it with NA.
>
>
> #second attempt
> grade6DA$ses1 <- sesdata$ses1[grade6DA$dacode]
>
>
> Neither approach has worked so I've ended up combining the tables in a MySQL
> database and then importing back into R. However, it would be much easier
> if I could just manipulate the variables in R rather than going through
> MySQL everytime I want to try something new. I've looked in the R-manuals
> but did not see anything about this - but I could have been looking in the
> wrong places. Any ideas on how to accomplish what I am trying to do or
> advice on where to find the info would be greatly appreciated.
>
> Cheers,
> Neil

Neil,

See ?merge, which performs a SQL join type operation.

HTH,

Marc Schwartz

P.S. To R Core: help.search("join") does not seem to return merge(), which would likely be helpful here.
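
For anyone reading this thread later, here is a minimal sketch of what the merge() suggestion might look like in practice. The toy values, the student column, and the ses2 column are made up for illustration; only the names grade6DA, sesdata, dacode and ses1 come from the original post. The first attempt probably returned NA because sesdata$ses1 is a plain vector that does not carry the data frame's rownames, so indexing it by character finds nothing. Two lookup alternatives that keep the original row order are shown alongside merge().

```r
# Hypothetical toy data standing in for the two tables in the question.
grade6DA <- data.frame(
  student = 1:5,                             # assumed column, for illustration
  dacode  = c(1003, 1001, 1003, 1002, 1001)  # not unique, as in the post
)
sesdata <- data.frame(
  dacode = c(1001, 1002, 1003),
  ses1   = c(0.25, 0.50, 0.75),
  ses2   = c(10, 20, 30)                     # assumed extra variable
)
rownames(sesdata) <- sesdata$dacode          # dacode also used as rownames

# Option 1: merge(), a SQL-style join on the shared column.
# all.x = TRUE keeps every row of grade6DA even when its dacode has no match.
# Note that merge() sorts the result by the key column by default.
merged <- merge(grade6DA, sesdata[, c("dacode", "ses1")],
                by = "dacode", all.x = TRUE)

# Option 2: find each dacode's position in sesdata with match(),
# then index the column by position. Row order of grade6DA is preserved.
grade6DA$ses1 <- sesdata$ses1[match(grade6DA$dacode, sesdata$dacode)]

# Option 3: index the data frame itself (not the bare column) by row name.
grade6DA$ses1_by_name <- sesdata[as.character(grade6DA$dacode), "ses1"]

print(merged)
print(grade6DA)
```

merge() is the more convenient choice when several variables are wanted at once, since the column selection on sesdata can simply be widened. The match() form is handy when only one column is needed and the row order of grade6DA must stay untouched.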