Dear R-users,

I am new to R, and I am struggling with the following problem.

I am repeating the following operations hundreds of times, within a loop: I want to subset a data frame by columns. I am interested in the column names that are given by the rows of another data frame that was built in parallel. The solution I have so far works well as long as the elements of the second data frame are included in the column names of the first data frame, but if an element from the second object is not a column name of the first one, then it bugs.

More concretely, I have the following data frames d and v:

yyyymmdd<-c("19720601", "19720602", "19720605")
sret.10006<-c(1,2,3)
sret.10014<-c(5,9,7)
sret.10065<-c(10,2,11)

d<- data.frame(yyyymmdd=yyyymmdd, sret.10006=sret.10006,
               sret.10014=sret.10014, sret.10065=sret.10065)

v<- data.frame(V1="sret.10006", V2="sret.10090")
v<- sapply(v, function(x) levels(x)[x])

I want to do the following subsetting:

sub<- subset(d, select=c(v))

and I get the following error message:

Error in `[.data.frame`(x, r, vars, drop = drop) :
  undefined columns selected

Any help would be very much appreciated,

Best,
Aurelien

On 12/02/2011 07:20 AM, Aurélien PHILIPPOT wrote:
> I am repeating the following operations hundreds of times, within a loop:
> I want to subset a data frame by columns. I am interested in the column
> names that are given by the rows of another data frame that was built in
> parallel. The solution I have so far works well as long as the elements of
> the second data frame are included in the column names of the first data
> frame but if an element from the second object is not a column name of the
> first one, then it bugs.

Hi Aurelien,

I would call this a feature, not a bug. I think R does what it should do: you request a non-existent column and it throws an error. What kind of behavior are you looking for instead of this error?

regards,
Paul

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

?try

If you know that you might have a problem with undefined columns, or whatever, then trap the error with 'try' so your program can recover. You could also validate the data that you are going to use before entering the loop. This is standard defensive programming: errors are always going to happen, so guard against them.

On Dec 2, 2011, at 2:20, Aurélien PHILIPPOT <aurelien.philippot at gmail.com> wrote:
> I want to do the following subsetting:
> sub<- subset(d, select=c(v))
>
> and I get the following error message:
> Error in `[.data.frame`(x, r, vars, drop = drop) :
>   undefined columns selected
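
For what it's worth, the two suggestions in the reply above (validate the names first, or trap the error with 'try') might look roughly like the sketch below. It reuses d and v from the original post. The names wanted, present, absent, sub_valid and sub_try are only illustrative, and returning NULL for a failed subset is an assumed fallback, not something anyone in the thread prescribed. The column names are flattened with as.character() instead of levels(x)[x], on the assumption that this should work whether the columns of v are factors or plain character vectors (the data.frame() default changed in R 4.0.0).

```r
# Data from the original post.
d <- data.frame(
  yyyymmdd   = c("19720601", "19720602", "19720605"),
  sret.10006 = c(1, 2, 3),
  sret.10014 = c(5, 9, 7),
  sret.10065 = c(10, 2, 11)
)
v <- data.frame(V1 = "sret.10006", V2 = "sret.10090")

# Flatten v into a plain character vector of requested column names.
# as.character() is applied per column, so factor and character columns
# are both handled.
wanted <- unlist(lapply(v, as.character), use.names = FALSE)

# Option 1: validate first. Keep only the requested names that really are
# columns of d, and record the ones that are not.
present <- intersect(wanted, names(d))
absent  <- setdiff(wanted, names(d))
sub_valid <- d[, present, drop = FALSE]

# Option 2: trap the error with try() so that a surrounding loop can go on.
sub_try <- try(d[, wanted, drop = FALSE], silent = TRUE)
if (inherits(sub_try, "try-error")) {
  # Assumed fallback for illustration: mark this subset as unavailable.
  sub_try <- NULL
}
```

Which option fits depends on the behavior you want: Option 1 silently drops the unknown names (absent tells you which ones), while Option 2 treats the whole request as failed and leaves it to the loop to decide what to do next.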