Sarah Henderson
2010-Mar-04 06:51 UTC
[R] Sub-setting a data frame by partial column names?
Hi all --

I think my Python brain is missing something crucial about string operations in R, but I cannot figure this out.

I have a large data frame with several groups of similar variables. Similar variables are named according to their group, and I am now writing a function to check correlations within groups. I want to subset the data frame by partial variable name, something along the lines of this:

partialName <- "XXX"
subsetData <- bigData[, partialName in colnames(bigData)]

Where bigData might have 10 columns named "XXX1", "XXX2" etc.

Many thanks for any thoughts,

Sarah
On 03/04/2010 05:51 PM, Sarah Henderson wrote:
> Hi all --
>
> I think my Python brain is missing something crucial about string
> operations in R, but I cannot figure this out.
>
> I have a large data frame with several groups of similar variables.
> Similar variables are named according to their group, and I am now
> writing a function to check correlations within groups. I want to
> subset the data frame by partial variable name, something along the
> lines of this:
>
> partialName<- "XXX"
> subsetData<- bigData[, partialName in colnames(bigData)]
>
> Where bigData might have 10 columns named "XXX1", "XXX2" etc.
>

Hi Sarah,

Try this:

column.names<-paste(sample(c("X","Y","Z"),100,TRUE),
 sample(c("X","Y","Z"),100,TRUE),
 sample(c("X","Y","Z"),100,TRUE),
 sample(0:9,100,TRUE),sep="")
column.names[grep("XXX",column.names,fixed=TRUE)]

Jim
Sarah Henderson
2010-Mar-04 07:40 UTC
[R] Sub-setting a data frame by partial column names?
Hi Jim, and thanks for your solution. I figured one out for myself about a minute after I posted this, and here it is if anyone else can find it valuable:

subsetData<- bigData[,grep(partialName, colnames(bigData))]

This is smaller than your solution, but similar I think.

Cheers,

Sarah

On Thu, Mar 4, 2010 at 6:34 PM, Jim Lemon <jim at bitwrit.com.au> wrote:
> On 03/04/2010 05:51 PM, Sarah Henderson wrote:
>>
>> Hi all --
>>
>> I think my Python brain is missing something crucial about string
>> operations in R, but I cannot figure this out.
>>
>> I have a large data frame with several groups of similar variables.
>> Similar variables are named according to their group, and I am now
>> writing a function to check correlations within groups. I want to
>> subset the data frame by partial variable name, something along the
>> lines of this:
>>
>> partialName<- "XXX"
>> subsetData<- bigData[, partialName in colnames(bigData)]
>>
>> Where bigData might have 10 columns named "XXX1", "XXX2" etc.
>>
>
> Hi Sarah,
>
> Try this:
>
> column.names<-paste(sample(c("X","Y","Z"),100,TRUE),
>  sample(c("X","Y","Z"),100,TRUE),
>  sample(c("X","Y","Z"),100,TRUE),
>  sample(0:9,100,TRUE),sep="")
> column.names[grep("XXX",column.names,fixed=TRUE)]
>
> Jim
>
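The grep-based subset above works as posted when several columns match. A minimal sketch of two details that may be worth guarding against follows; the data frame and its column names are hypothetical stand-ins for bigData. By default, `[` on a data frame returns a plain vector when exactly one column is selected, and grep() treats the pattern as a regular expression unless fixed = TRUE is given.

```r
# Hypothetical stand-in for bigData; column names are made up for illustration.
bigData <- data.frame(XXX1 = 1:3, XXX2 = 4:6, YYY1 = 7:9, XYZ1 = 10:12)
partialName <- "XXX"

# grep() returns the integer positions of the matching column names.
# fixed = TRUE matches the text literally instead of as a regular expression.
cols <- grep(partialName, colnames(bigData), fixed = TRUE)

# drop = FALSE keeps the result a data frame whatever the number of matches.
subsetData <- bigData[, cols, drop = FALSE]

# With a single match, default indexing collapses to a plain vector ...
oneCol <- bigData[, grep("YYY", colnames(bigData))]

# ... while drop = FALSE still returns a one-column data frame.
oneColDf <- bigData[, grep("YYY", colnames(bigData)), drop = FALSE]
```

Keeping drop = FALSE is probably the safer choice inside a function that goes on to compute correlations, since the downstream code can then rely on always receiving a data frame.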
Sarah Henderson wrote:
>
> I think my Python brain is missing something crucial about string
> operations in R, but I cannot figure this out.
>
> I have a large data frame with several groups of similar variables.
> Similar variables are named according to their group, and I am now
> writing a function to check correlations within groups. I want to
> subset the data frame by partial variable name, something along the
> lines of this:
>

With thanks to Peter Dalgaard, who sent me this 10 years ago at my first posting. You can do it in one line though, and use <- to be a real fReak.

d = data.frame(xxx1=1:10,xx2=1:10,yy2=1:10,axx=1:10)
selcols = grep("^xx", names(d))
d[,selcols]

Dieter

--
View this message in context: http://n4.nabble.com/Sub-setting-a-data-frame-by-partial-column-names-tp1577672p1577699.html
Sent from the R help mailing list archive at Nabble.com.
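The "^" in Dieter's pattern is what restricts the match to names that begin with "xx". The sketch below reuses his example data frame to contrast the anchored and unanchored patterns, and shows one possible one-line form with <-; the variable names anywhere, prefixOnly and selected are introduced here for illustration only.

```r
d <- data.frame(xxx1 = 1:10, xx2 = 1:10, yy2 = 1:10, axx = 1:10)

# Unanchored: "xx" may occur anywhere in the name, so axx is selected as well.
anywhere <- names(d)[grep("xx", names(d))]

# Anchored with ^: only names that start with "xx" are selected.
prefixOnly <- names(d)[grep("^xx", names(d))]

# One-line form: grepl() gives a logical index with one entry per column.
selected <- d[, grepl("^xx", names(d)), drop = FALSE]
```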
Hi Sarah,

Thanks a lot for the suggestion. It is working for me.

subsetData<- bigData[,grep(partialName, colnames(bigData))]

But if I want to extract several other columns along with the columns with similar names, how do I do that? Say the other columns have similar values for all the columns with similar partial names. I tried the following:

subsetData<- bigData[,grep("partialName", "Colname1", "Colname2", colnames(bigData))]

It did not work. Do you have any suggestions?

Regards,
Pankaj Barah

--
View this message in context: http://r.789695.n4.nabble.com/Sub-setting-a-data-frame-by-partial-column-names-tp1577672p3057205.html
Sent from the R help mailing list archive at Nabble.com.
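The thread as archived contains no reply to this question, so what follows is only one possible approach, sketched under assumptions. grep() takes a single pattern followed by the vector to search, so extra column names passed positionally are not treated as additional patterns; quoting "partialName" also searches for that literal text rather than the value of the variable. The data frame, the column names ID and Site, and the helper variables extraCols, matched and keep are all hypothetical.

```r
# Hypothetical stand-in for bigData with two extra columns to carry along.
bigData <- data.frame(ID = 1:3, Site = c("a", "b", "c"),
                      XXX1 = 1:3, XXX2 = 4:6, YYY1 = 7:9)
partialName <- "XXX"
extraCols <- c("ID", "Site")  # exact names of the additional columns (assumed)

# value = TRUE makes grep() return the matching names instead of positions.
matched <- grep(partialName, colnames(bigData), value = TRUE)

# Combine the exact names with the pattern matches, without duplicates.
keep <- union(extraCols, matched)
subsetData <- bigData[, keep, drop = FALSE]

# Alternative: one regular expression with alternation. The exact names are
# anchored with ^ and $ so that they do not match as substrings.
viaRegex <- bigData[, grep("^(ID|Site)$|XXX", colnames(bigData)), drop = FALSE]
```

The union() form keeps the exact names and the partial match separate, which may be easier to maintain than a growing regular expression when the list of extra columns changes.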