Hello,

I have a data frame with > 5000 columns and I'd like to be able to make subsets of that data frame made up of certain columns by using part of the column names. I've had a surprisingly hard time finding something that works by searching online.

For example, let's say I have a data frame (df) of 2 obs. of 6 variables. The 6 variables are called "1940_tmax", "1940_ppt", "1940_tmin", "1941_tmax", "1941_ppt", "1941_tmin". I want to create a new data frame with only the variables that have "ppt" in the variable (column) name, so that it looks like this:

plot name   1940_ppt   1941_ppt
774-CL      231        344
778-RW      228        313

Thanks.

-- 
Christopher R. Dolanc
Post-doctoral Researcher
University of California, Davis & University of Montana

On 19.06.2014 23:50, Chris Dolanc wrote:
> [...]
> I want to create a new data frame
> with only the variables that have "ppt" in the variable (column) name,
> so that it looks like this:
>
> plot name   1940_ppt   1941_ppt
> 774-CL      231        344
> 778-RW      228        313

df[ , grepl("_ppt$", names(df))]

Best,
Uwe Ligges

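For anyone who wants to try the suggestion above, here is a minimal sketch on a toy data frame rebuilt from the question. The tmax and tmin values are made up, and treating the plot names as row names is an assumption about how the original data are stored. The drop = FALSE argument is an addition, not part of the reply: it keeps the result a data frame when only one column matches.

```r
# Toy data rebuilt from the question; tmax/tmin values are invented,
# and storing the plot names as row names is an assumption.
df <- data.frame(
  `1940_tmax` = c(15.2, 14.8),
  `1940_ppt`  = c(231, 228),
  `1940_tmin` = c(2.1, 1.9),
  `1941_tmax` = c(15.9, 15.1),
  `1941_ppt`  = c(344, 313),
  `1941_tmin` = c(2.4, 2.0),
  row.names   = c("774-CL", "778-RW"),
  check.names = FALSE  # keep column names that start with a digit unchanged
)

# Logical vector: TRUE for every column name that ends in "_ppt"
keep <- grepl("_ppt$", names(df))

# drop = FALSE keeps a data frame even if only a single column matches
ppt <- df[, keep, drop = FALSE]
print(ppt)
```

The pattern "_ppt$" is anchored to the end of the name, so a column such as "1940_ppt_adj" (hypothetical) would not be selected; an unanchored "ppt" would match it.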
On Thu, 19 Jun 2014 02:50:20 PM Chris Dolanc wrote:
> [...]
> I want to create a new data frame
> with only the variables that have "ppt" in the variable (column) name,
> so that it looks like this:
>
> plot name   1940_ppt   1941_ppt
> 774-CL      231        344
> 778-RW      228        313

Hi Chris,
One way is to get the column indices:

grep("ppt", names(df))
[1] 2 5

so,

newdf <- df[grep("ppt", names(df))]

and then you apparently want to add a column with some other information, so probably:

newdf <- cbind(<something_else>, df[grep("ppt", names(df))])

Jim
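
The index-based approach above can be sketched end to end as follows. This is only an illustration: the toy values other than the ppt columns are invented, and using the row names as the extra "plot_name" column is a guess at what the placeholder <something_else> might stand for.

```r
# Toy data rebuilt from the question; tmax/tmin values are invented.
df <- data.frame(
  `1940_tmax` = c(15.2, 14.8),
  `1940_ppt`  = c(231, 228),
  `1940_tmin` = c(2.1, 1.9),
  `1941_tmax` = c(15.9, 15.1),
  `1941_ppt`  = c(344, 313),
  `1941_tmin` = c(2.4, 2.0),
  row.names   = c("774-CL", "778-RW"),
  check.names = FALSE  # keep column names that start with a digit unchanged
)

# Integer positions of the columns whose names contain "ppt"
idx <- grep("ppt", names(df))
print(idx)
# [1] 2 5

# Hypothetical stand-in for <something_else>: the plot names as a column
newdf <- cbind(plot_name = rownames(df), df[idx])
print(names(newdf))
```

Single-bracket indexing with df[idx] always returns a data frame, so no drop argument is needed here.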