Dear baseRs,

I recently made a mistake when renaming data frame columns, accidentally creating an NA column. I found the following strange behavior when negative indexes are used.

Can anyone explain what happens here? No "workarounds" required, just curious.

Dieter

Version: Windows, R version 2.6.1 (2007-11-26)

#-----------------------------
df = data.frame(a=0:10,b=10:20)
df[,-2] #ok
names(df)=c("A") # implicitly creates an NA column
df[,-2]
df[,-2,drop=FALSE] # has nothing to do with drop

df3 = data.frame(a=0:10,b=10:20,c=20:30)
df3[,-2] #ok
names(df3)=c("A","B") #creates an NA column
df3[,-2] # error
# Error in `[.data.frame`(df3, , -2) : undefined columns selected

names(df3)[3]="NaN" # another reserved word
df3[,-2] # no problem
I don't know why this is happening, but it has nothing to do with the negative index: df[,-2] has not changed df.

--- Dieter Menne <dieter.menne at menne-biomed.de> wrote:
> I recently made a mistake when renaming data frame columns, accidentally
> creating an NA column. I found the following strange behavior when negative
> indexes are used.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Assigning a names vector that is shorter than the number of columns to a data frame leaves the remaining columns with NA as their names.

"[.data.frame" has the following code in it:

    cols <- names(x)
    ...
    if (any(is.na(cols))) stop("undefined columns selected")

So, if a data frame x has NA values among its column names, you should expect x[...] to *sometimes* stop with that error (with a bit of reading and testing you could probably work out exactly when that error will occur).

-- Tony Plate
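
For anyone who wants to try this, here is a minimal sketch of the point above. It shows a too-short names vector being padded with NA and checks for that condition before subsetting. The helper has_na_names is a hypothetical name made up for this illustration, not part of base R. The exact conditions under which "[.data.frame" stops may differ between R versions, so the subsetting call is wrapped in tryCatch rather than assuming a particular outcome.

```r
# Three-column data frame, as in the original post
df3 <- data.frame(a = 0:10, b = 10:20, c = 20:30)

# Assign only two names; the missing third name is filled in as NA
names(df3) <- c("A", "B")
print(names(df3))

# Hypothetical helper (not in base R): TRUE if any column name is NA
has_na_names <- function(x) any(is.na(names(x)))
print(has_na_names(df3))

# Whether subsetting fails can depend on which columns are selected,
# so capture the outcome (a data frame or an error message) instead of assuming it
result <- tryCatch(df3[, -2], error = function(e) conditionMessage(e))
print(result)

# Giving the third column a real name removes the NA
names(df3)[3] <- "C"
print(has_na_names(df3))
```

Checking names() right after a rename is a cheap way to catch this kind of mistake before it surfaces as an "undefined columns selected" error somewhere else.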