Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation With R and the R Book and can't find an answer.

Sample list of data frames looks as follows:

    .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 2007), 400)),
              df2<-data.frame(Var1=rep('Tennessee', 400), Var2=rep(c(2004,2005,2006,2007), 400)),
              df3<-data.frame(Var1=rep('Alaska', 400), Var2=rep(c(2004,2005,2006,2007), 400)) )

I would like to accomplish the following three tasks.

First, I'd like to go through and change the column names of each of the data frames within the list to be 'State' and 'Year'.

Second, I'd like to go through and add one year to each of the 'Var2' variables.

Third, I'd like to then delete those cases in the data frames that have Var2 (or Year) values of 2008.

I could do this manually, but my data are actually bigger than this, plus I'd really like to learn. I've been trying to use lapply, but I can't get my head around how it works:

    .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))

This just changes the list of data frames into a list of the character strings ('State' and 'Year'). How do I actually change the underlying variable names?

I'm grateful for your suggestions!
Simon Kiss

*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606
Steve Lianoglou
2012-Mar-12 18:49 UTC
[R] lapply to change variable names and variable values
Hi,

On Mon, Mar 12, 2012 at 2:37 PM, Simon Kiss <sjkiss at gmail.com> wrote:
> Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation With R and the R Book and can't find an answer.
>
> Sample list of data frames looks as follows:
>
> .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 2007), 400)),
>           df2<-data.frame(Var1=rep('Tennessee', 400), Var2=rep(c(2004,2005,2006,2007), 400)),
>           df3<-data.frame(Var1=rep('Alaska', 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
>
> I would like to accomplish the following three tasks.
> First, I'd like to go through and change the column names of each of the data frames within the list
> to be 'State' and 'Year'.
>
> Second, I'd like to go through and add one year to each of the 'Var2' variables.
>
> Third, I'd like to then delete those cases in the data frames that have Var2 (or Year) values of 2008.
>
> I could do this manually, but my data are actually bigger than this, plus I'd really like to learn. I've been trying to use lapply, but I can't get my head around how it works:
> .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))
> This just changes the list of data frames into a list of the character strings ('State' and 'Year'). How do I actually change the underlying variable names?

Almost there: you have to return the data.frame you've just changed, e.g.:

    xx <- lapply(.xx, function(x) {
      colnames(x) <- c('state', 'year')
      x
    })

If you want to remove the rows that correspond to 2008, you can do this:

    xx <- lapply(.xx, function(x) {
      colnames(x) <- c('state', 'year')
      subset(x, year != 2008)
    })

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
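For readers following along, here is a minimal sketch that combines all three tasks from the question in one lapply call. It is not taken from any of the replies: the small stand-in list (8 rows per data frame instead of the poster's larger ones) and the result name `yy` are assumptions made for illustration. Note that the rows only reach 2008 after the add-one step, so that step has to come before the filter.

```r
# Small stand-in for the poster's list (fewer rows, same structure).
xx <- list(
  df  = data.frame(Var1 = rep("Alabama", 8),   Var2 = rep(2004:2007, 2)),
  df2 = data.frame(Var1 = rep("Tennessee", 8), Var2 = rep(2004:2007, 2)),
  df3 = data.frame(Var1 = rep("Alaska", 8),    Var2 = rep(2004:2007, 2))
)

yy <- lapply(xx, function(x) {
  colnames(x) <- c("State", "Year")   # task 1: rename the columns
  x$Year <- x$Year + 1                # task 2: shift every year by one
  x[x$Year != 2008, , drop = FALSE]   # task 3: drop 2008 rows; last value is returned
})

# Row counts per data frame after filtering.
sapply(yy, nrow)
```

The last expression inside the function is the filtered data frame, so that is what lapply collects for each list element.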
Sarah Goslee
2012-Mar-12 18:52 UTC
[R] lapply to change variable names and variable values
Hi Simon,

On Mon, Mar 12, 2012 at 2:37 PM, Simon Kiss <sjkiss at gmail.com> wrote:
> Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation With R and the R Book and can't find an answer.
>
> Sample list of data frames looks as follows:
>
> .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 2007), 400)),
>           df2<-data.frame(Var1=rep('Tennessee', 400), Var2=rep(c(2004,2005,2006,2007), 400)),
>           df3<-data.frame(Var1=rep('Alaska', 400), Var2=rep(c(2004,2005,2006,2007), 400)) )

I tweaked this a bit so that it doesn't actually create df, df2, df3 as well as making a list of them, and so that xx doesn't begin with a . and shows up with ls(). I don't need invisible objects in my testing session.

    xx<-list(df=data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 2007), 400)),
             df2=data.frame(Var1=rep('Tennessee', 400), Var2=rep(c(2004,2005,2006,2007), 400)),
             df3=data.frame(Var1=rep('Alaska', 400), Var2=rep(c(2004,2005,2006,2007), 400)) )

> I would like to accomplish the following three tasks.
> First, I'd like to go through and change the column names of each of the data frames within the list
> to be 'State' and 'Year'.
>
> Second, I'd like to go through and add one year to each of the 'Var2' variables.
>
> Third, I'd like to then delete those cases in the data frames that have Var2 (or Year) values of 2008.
>
> I could do this manually, but my data are actually bigger than this, plus I'd really like to learn. I've been trying to use lapply, but I can't get my head around how it works:
> .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))
> This just changes the list of data frames into a list of the character strings ('State' and 'Year'). How do I actually change the underlying variable names?

Your function doesn't return the right thing. To see how it works, it's often a good idea to write a stand-alone function and see what it does.
For instance,

    rename <- function(x) {
      colnames(x)<-c('State', 'Year')
      x
    }

To me at least, as soon as it's written as a stand-alone it's obvious that you have to return x in the last line. You can either use rename() in your lapply statement:

    xx<- lapply(xx, rename)

or you can write the full function into the lapply statement:

    > xx<-list(df=data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 2007), 400)),
    +          df2=data.frame(Var1=rep('Tennessee', 400), Var2=rep(c(2004,2005,2006,2007), 400)),
    +          df3=data.frame(Var1=rep('Alaska', 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
    > xx <- lapply(xx, function(x){ colnames(x)<-c('State', 'Year'); x} )
    > colnames(xx[[1]])
    [1] "State" "Year"

The same strategy should work for your other needs as well.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
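As a rough illustration of applying the same stand-alone-function strategy to the other two tasks, the sketch below defines two small helpers, checks them on a single data frame, and only then hands them to lapply. The helper names `add_year` and `drop_year`, their default arguments, and the tiny test data are hypothetical choices made for this example, not part of the original reply.

```r
# Hypothetical helper: shift the Year column by `by` and return the data frame.
add_year <- function(x, by = 1) {
  x$Year <- x$Year + by
  x
}

# Hypothetical helper: keep only rows whose Year differs from `year`.
drop_year <- function(x, year = 2008) {
  x[x$Year != year, , drop = FALSE]
}

# Try the helpers on one already-renamed data frame first.
one <- data.frame(State = rep("Alabama", 4), Year = c(2004, 2005, 2006, 2007))
checked <- drop_year(add_year(one))
checked$Year

# Once they behave as expected, apply them across a list.
zz <- list(a = one,
           b = data.frame(State = rep("Alaska", 4), Year = c(2004, 2005, 2006, 2007)))
zz <- lapply(zz, add_year)
zz <- lapply(zz, drop_year)
```

Testing each helper on a single data frame makes it easy to see what the function returns before it is used inside lapply.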
R. Michael Weylandt
2012-Mar-12 19:02 UTC
[R] lapply to change variable names and variable values
Your function doesn't return the new data frame but rather the new names. Note, e.g.

    x <- 1:2
    names(x) <- letters[1:2]
    .Last.value # Not x!

Try this:

    .xx<- lapply(.xx, function(x) {colnames(x)<-c('State', 'Year'); x})

or more explicitly

    .xx<- lapply(.xx, function(x) {colnames(x)<-c('State', 'Year'); return(x)})

Michael

On Mon, Mar 12, 2012 at 2:37 PM, Simon Kiss <sjkiss at gmail.com> wrote:
> Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation With R and the R Book and can't find an answer.
>
> Sample list of data frames looks as follows:
>
> .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 2007), 400)),
>           df2<-data.frame(Var1=rep('Tennessee', 400), Var2=rep(c(2004,2005,2006,2007), 400)),
>           df3<-data.frame(Var1=rep('Alaska', 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
>
> I would like to accomplish the following three tasks.
> First, I'd like to go through and change the column names of each of the data frames within the list
> to be 'State' and 'Year'.
>
> Second, I'd like to go through and add one year to each of the 'Var2' variables.
>
> Third, I'd like to then delete those cases in the data frames that have Var2 (or Year) values of 2008.
>
> I could do this manually, but my data are actually bigger than this, plus I'd really like to learn. I've been trying to use lapply, but I can't get my head around how it works:
> .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))
> This just changes the list of data frames into a list of the character strings ('State' and 'Year'). How do I actually change the underlying variable names?
>
> I'm grateful for your suggestions!
> Simon Kiss
>
> *********************************
> Simon J. Kiss, PhD
> Assistant Professor, Wilfrid Laurier University
> 73 George Street
> Brantford, Ontario, Canada
> N3T 2C9
> Cell: +1 905 746 7606
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
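To make the point about return values concrete, here is a small sketch comparing the two function bodies side by side on one data frame. The names `bad`, `good`, `res_bad` and `res_good` and the two-row data frame are made up for this illustration; the behaviour shown relies only on the fact that the value of an assignment expression in R is the assigned value (the right-hand side), not the modified object.

```r
df <- data.frame(Var1 = c("Alabama", "Alabama"), Var2 = c(2004, 2005))

# Last expression is the assignment, so the function's value is the names vector.
bad <- function(x) colnames(x) <- c("State", "Year")

# Last expression is x, so the function's value is the renamed data frame.
good <- function(x) { colnames(x) <- c("State", "Year"); x }

res_bad  <- bad(df)
res_good <- good(df)

class(res_bad)      # a character vector, not a data frame
class(res_good)     # a data frame with the new column names
colnames(df)        # the original df is untouched; functions work on a copy
```

This is why lapply with the first form yields a list of character vectors: lapply simply collects whatever each call returns.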