Hello,

I am having problems indexing a subset dataframe, which was created as:

> waspsNoGV<-subset(wasps,site!="GV")

Fitting a linear model revealed some data points which had high leverage, so I attempted to redo the regression without these data points:

> wasps.lm<-lm(r~Nt,data=waspsNoGV[-c(61,69,142),])

which resulted in a "subscript out of bounds" error.

I'm pretty sure the problem is that the data points identified in the regression as having high leverage were the row names carried over from the original dataframe which had 150 rows, but when I try to remove data point #142 from the subset dataframe this tries to reference by a numerical index but there are only 130 data points in the subset dataframe hence the "subscript out of bounds" message. So I guess my question is how do I reference the data points to drop from the regression by name?

Thanks
Mandy

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
WARNING: This email and any attachments may be confidential ...{{dropped}}
Mandy Barron wrote:
> Hello
> I am having problems indexing a subset dataframe, which was created
> as:
>> waspsNoGV<-subset(wasps,site!="GV")
>
> Fitting a linear model revealed some data points which had high
> leverage, so I attempted to redo the regression without these data
> points:
>> wasps.lm<-lm(r~Nt,data=waspsNoGV[-c(61,69,142),])
> which resulted in a "subscript out of bounds" error.
>
> I'm pretty sure the problem is that the data points identified in the
> regression as having high leverage were the row names carried over from
> the original dataframe which had 150 rows, but when I try to remove data
> point #142 from the subset dataframe this tries to reference by a
> numerical index but there are only 130 data points in the subset
> dataframe hence the "subscript out of bounds" message. So I guess my
> question is how do I reference the data points to drop from the
> regression by name?

Does this do it?

wasps.lm <- lm(r ~ Nt, data = subset(wasps, site != "GV" & !(rownames(wasps) %in% c(61,69,142))))

> Thanks
> Mandy
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
Mandy Barron wrote:
> Hello
> I am having problems indexing a subset dataframe, which was created
> as:
>
>> waspsNoGV<-subset(wasps,site!="GV")
>
> Fitting a linear model revealed some data points which had high
> leverage, so I attempted to redo the regression without these data
> points:
>
>> wasps.lm<-lm(r~Nt,data=waspsNoGV[-c(61,69,142),])
>
> which resulted in a "subscript out of bounds" error.
>
> I'm pretty sure the problem is that the data points identified in the
> regression as having high leverage were the row names carried over from
> the original dataframe which had 150 rows, but when I try to remove data
> point #142 from the subset dataframe this tries to reference by a
> numerical index but there are only 130 data points in the subset
> dataframe hence the "subscript out of bounds" message. So I guess my
> question is how do I reference the data points to drop from the
> regression by name?
>

Hi Mandy,

You're correct that the old indices are no longer valid in the new dataframe. If you want to use the original indices (i.e. you can't just identify the new row indices in the new dataframe), you can do this:

waspsNoGV$oldindices<-which(wasps$site != "GV")
wasps.lm<-lm(r~Nt, data=waspsNoGV[!(waspsNoGV$oldindices %in% c(61,69,142)),])

Jim
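
For anyone reading this thread later, here is a small self-contained sketch of the "drop by row name" idea discussed above. The toy `wasps` data frame, its values, and the `dropRows` vector are made up for illustration (10 rows rather than 150); only the pattern of matching on `rownames()` is the point.

```r
# Toy stand-in for the wasps data (hypothetical values, 10 rows instead of 150)
set.seed(1)
wasps <- data.frame(
  site = rep(c("A", "GV", "B", "C", "D"), each = 2),
  Nt   = 1:10,
  r    = rnorm(10)
)

# subset() keeps the original row names, so they no longer match row positions
waspsNoGV <- subset(wasps, site != "GV")
rownames(waspsNoGV)

# Row names to exclude (hypothetical choice for this toy data)
dropRows <- c("5", "9")

# Select rows by name rather than by position, then refit
keep <- !(rownames(waspsNoGV) %in% dropRows)
wasps.lm <- lm(r ~ Nt, data = waspsNoGV[keep, ])
```

Matching on `rownames()` should behave the same whether or not the row names still line up with row positions, which is why it may be safer here than a negative numeric index.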