Dear R users,
I am interested in extracting columns from multiple data frames. The
problem is that the different data frames have different combinations
of the same variable names. Here's a simple example:
a<-rep(1:10)
b<-rep(1:10)
c<-rep(21:30)
d<-rep(31:40)
dat.a<-data.frame(a,b,c,d)
names(dat.a)<-c("a", "b", "c", "d")
dat.b<-data.frame(a,c,d)
names(dat.b)<-c("a", "c", "d")
I would like to first see if the names in the larger dataframe match
those of the smaller (they have the same variables)
names(dat.a)%in%names(dat.b)
Could anyone help with this problem? I would basically like to form a
subset of dat.a that matches the variable names in dat.b. If
there were only a few variables, this would be easier, but I have
between 4 and 5 thousand variables in each dataset.
Any help would be greatly appreciated.
Best,
Corey
Corey Sparks
Assistant Professor
Department of Demography and Organization Studies
University of Texas at San Antonio
College of Public Policy
501 West Durango Blvd
Monterey Building 2.270C
San Antonio, TX 78207
210 458 3166
corey.sparks 'at' utsa.edu
On Sep 22, 2009, at 5:58 PM, Corey Sparks wrote:
> Could anyone help with this problem, I would basically like to form
> a subset of the dat.a that matches the variable names in dat.b. If
> there were only a few variables, this would be easier, but I have
> between 4 and 5 thousand variables in each dataset

I have never tried this on the scale you propose, but on your toy example, here's what works:

> names(dat.a)%in%names(dat.b)   # your code, which returns a logical vector
[1]  TRUE FALSE  TRUE  TRUE
> subset(dat.a, select= names(dat.a)%in%names(dat.b) )
    a  c  d
1   1 21 31
2   2 22 32
3   3 23 33
4   4 24 34
5   5 25 35
6   6 26 36
7   7 27 37
8   8 28 38
9   9 29 39
10 10 30 40

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
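For anyone wanting to try the %in% approach at roughly the scale mentioned in the question, here is a minimal sketch. The data are simulated, and the names big.a, big.b, shared and the V1..V5000 column names are assumptions made up for illustration, not anything from the original datasets. It also shows plain bracket indexing with the same logical vector, which should select the same columns as subset().

```r
# Simulated stand-in for a wide dataset: 10 rows, 5000 columns (hypothetical data)
set.seed(1)
n_vars <- 5000
big.a <- as.data.frame(matrix(rnorm(10 * n_vars), nrow = 10))
names(big.a) <- paste0("V", seq_len(n_vars))

# A second data frame that shares a random 4000 of those variables
shared <- sample(names(big.a), 4000)
big.b <- big.a[shared]

# Logical vector: TRUE where a column of big.a also appears in big.b
keep <- names(big.a) %in% names(big.b)

sub1 <- subset(big.a, select = keep)   # subset() form, as in the reply above
sub2 <- big.a[, keep, drop = FALSE]    # bracket indexing with the same vector

dim(sub1)                              # rows and columns retained
identical(names(sub1), names(sub2))    # both forms pick the same columns
```

Note that the columns come back in the order they have in big.a, not in the order they have in big.b.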
Henrique Dallazuanna
2009-Sep-23 00:18 UTC
[R] Subsetting dataframes based on column names
You can use intersect also:

dat.a[intersect(names(dat.a), names(dat.b))]

On Tue, Sep 22, 2009 at 6:58 PM, Corey Sparks <corey.sparks at utsa.edu> wrote:
> Could anyone help with this problem, I would basically like to form a subset
> of the dat.a that matches the variable names in dat.b. If there were only a
> few variables, this would be easier, but I have between 4 and 5 thousand
> variables in each dataset

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
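A small sketch of how intersect() and setdiff() might be combined to check which names match before subsetting. It uses the toy data from the question, plus a hypothetical extra column "e" added to dat.b purely for illustration, to show the case where neither data frame's names are a subset of the other's.

```r
# Toy data from the original question
dat.a <- data.frame(a = 1:10, b = 1:10, c = 21:30, d = 31:40)
dat.b <- data.frame(a = 1:10, c = 21:30, d = 31:40)

# Hypothetical extra column that exists only in dat.b (not in the original post)
dat.b$e <- 41:50

common <- intersect(names(dat.a), names(dat.b))  # names present in both
only.a <- setdiff(names(dat.a), names(dat.b))    # names only in dat.a
only.b <- setdiff(names(dat.b), names(dat.a))    # names only in dat.b

# Subset both data frames to the shared variables, in the same column order
sub.a <- dat.a[common]
sub.b <- dat.b[common]

common   # "a" "c" "d"
only.a   # "b"
only.b   # "e"
```

Indexing both data frames by the same character vector puts their columns in the same order, which may be convenient if they are to be stacked with rbind() afterwards.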