I have a data frame with 155,000 rows. One of the columns represents the user id (of which about 10,000 are unique). I am able to isolate 1000 of these user ids (stored in a list) that I want to eliminate from the data set, but I don't know of an efficient way to do this. Certainly this would be slow:

  newdf<-df
  for(i in listofbadusers) {
    newdf<-subset(tmp,uid!=i)
  }

Is there a better approach?

I guess I could use the opposite logic and use a list of good users and add their data to the new frame...

thanks,
pete
I can't see what would be wrong with

  newdf <- subset(df, uid %in% listofgoodusers)

or

  newdf <- subset(df, !(uid %in% listofbadusers))

Is this what you want? Please note that the code you supplied will not run at all, let alone slowly, so it is not easy to know exactly what you are trying to achieve.

> -----Original Message-----
> newdf<-df
> for(i in listofbadusers) {
> newdf<-subset(tmp,uid!=i)
> }

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 644449
Fax: +44 (0) 1379 644445
email: Simon.Fear at synequanon.com
web: http://www.synequanon.com
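A minimal sketch of the two subset() calls above, run on made-up data. The toy data frame, the column name value, and the particular ids are assumptions for illustration only; they are not from the original post.

```r
# Toy data standing in for the poster's data frame (all values are made up).
set.seed(1)
df <- data.frame(uid = sample(1:10, 50, replace = TRUE),
                 value = rnorm(50))

# Hypothetical ids to drop; a plain vector rather than a true R list.
listofbadusers <- c(2, 5, 7)

# Keep only the rows whose uid is NOT among the bad users.
newdf <- subset(df, !(uid %in% listofbadusers))

# The same result approached from the "good users" side.
listofgoodusers <- setdiff(unique(df$uid), listofbadusers)
newdf2 <- subset(df, uid %in% listofgoodusers)
```

Both forms test membership for the whole uid column in one vectorised call, so no loop over the bad ids is needed.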
Peter Whiting wrote:

> I have a data frame with 155,000 rows. One of the columns
> represents the user id (of which about 10,000 are unique). I am
> able to isolate 1000 of these user ids (stored in a list) that
> I want to eliminate from the data set, but I don't know of an
> efficient way to do this. Certainly this would be slow:
>
> newdf<-df
> for(i in listofbadusers) {
> newdf<-subset(tmp,uid!=i)
> }

What about subsetting? See help("["). One solution (not saying it is the optimal one):

  newdf <- df[!(df$uid %in% listofbadusers), ]

Uwe Ligges

> is there a better approach?
>
> I guess I could use the opposite logic and use a list of
> good users and add their data to the new frame...
>
> thanks,
> pete
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
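A short sketch of the bracket-indexing approach on made-up data, next to a repaired version of the loop for comparison. The loop as posted refers to tmp, which is never defined; the version below subsets newdf instead, which is an assumption about what was intended. The toy data, the column name value, and the names bad, keep and loopdf are hypothetical.

```r
# Toy data (made up); the real data frame has about 155,000 rows.
set.seed(42)
df <- data.frame(uid = sample(1:20, 200, replace = TRUE),
                 value = runif(200))

# The post says the ids are "stored in a list"; unlist() flattens a
# true R list to a plain vector and leaves a plain vector unchanged.
listofbadusers <- list(3, 8, 15)
bad <- unlist(listofbadusers)

# Logical index: TRUE for every row to keep.
keep <- !(df$uid %in% bad)
newdf <- df[keep, ]

# Repaired loop, assuming tmp was meant to be newdf.
# It makes one full pass over the data per bad id.
loopdf <- df
for (i in bad) {
  loopdf <- subset(loopdf, uid != i)
}
```

The loop subsets the data frame once per bad id (1000 times in the original question), while the %in% index is computed in a single vectorised step.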