Hi all,

I have a huge dataset (5000k observations) that contains the daily sales for each company. If I want to find a company by its unique company ID number, which is more efficient to use: match or "=="? For example,

use <- dataset[!is.na(match(dataset$companyID, 12345)), ]

or

use <- dataset[dataset$companyID == 12345, ]

Thank you very much.

--
View this message in context: http://n4.nabble.com/match-function-or-tp1754505p1754505.html
Sent from the R help mailing list archive at Nabble.com.
You might try ?system.time

> Hi all,
>
> I have a huge dataset (5000k observations) that contains the daily sales
> for each company.
> If I want to find a company by its unique company ID number, which
> is more efficient to use: match or "=="? For example,
>
> use <- dataset[!is.na(match(dataset$companyID, 12345)), ]
> or
> use <- dataset[dataset$companyID == 12345, ]
>
> Thank you very much.
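As a rough illustration of the system.time suggestion, the two approaches from the question can be timed side by side. This is only a sketch on synthetic data: the `sales` column, the ID range, and the row count are assumptions made for the example, not details of the original dataset.

```r
# Synthetic stand-in for the poster's data (assumed shape and size;
# raise n toward 5e6 to approximate the dataset described in the question).
set.seed(1)
n <- 1e6
dataset <- data.frame(
  companyID = sample(10000:20000, n, replace = TRUE),
  sales     = runif(n)
)

# Approach 1: match() gives NA for non-matching rows, so keep the non-NA ones.
t_match <- system.time(
  use1 <- dataset[!is.na(match(dataset$companyID, 12345)), ]
)

# Approach 2: direct elementwise comparison.
t_eq <- system.time(
  use2 <- dataset[dataset$companyID == 12345, ]
)

print(t_match)
print(t_eq)

# Both approaches should select the same rows when companyID has no NAs.
print(identical(use1, use2))
```

Timings will vary by machine and R version, so it is worth running this on the real data. One behavioural difference to keep in mind: if companyID contains NA values, the "==" form returns extra all-NA rows for them, while the match form drops them.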
You might want to investigate the 'data.table' package.

On 07/04/2010 16:15, bo wrote:
> Hi all,
>
> I have a huge dataset (5000k observations) that contains the daily sales
> for each company.
> If I want to find a company by its unique company ID number, which
> is more efficient to use: match or "=="? For example,
>
> use <- dataset[!is.na(match(dataset$companyID, 12345)), ]
> or
> use <- dataset[dataset$companyID == 12345, ]
>
> Thank you very much.

--
Patrick Burns
pburns at pburns.seanet.com
http://www.burns-stat.com
(home of 'Some hints for the R beginner' and 'The R Inferno')
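For readers unfamiliar with data.table, a keyed lookup might look roughly like the sketch below. It assumes a current data.table release (the `.()` lookup syntax is from later versions than the one available when this thread was written), and the synthetic data and `sales` column are made up for the example.

```r
library(data.table)

# Synthetic stand-in for the poster's data (assumed shape and size).
set.seed(1)
n <- 1e6
DT <- data.table(
  companyID = sample(10000:20000, n, replace = TRUE),
  sales     = runif(n)
)

# Sort DT by companyID and mark that column as the key.
setkey(DT, companyID)

# Keyed lookup: finds the rows for one company via binary search on the key.
use <- DT[.(12345L)]

# Ordinary filter form, shown for comparison.
use_scan <- DT[companyID == 12345L]

print(nrow(use))
```

The key is set once and reused, so this approach tends to pay off when many lookups are made against the same table.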
Thank you very much for the help. I installed the data.table package, but I keep getting the following warnings:

> setkey(DT,id,date)
Warning messages:
1: In `[.data.table`(deref(x), o) :
  This R session is < 2.4.0. Please upgrade to 2.4.0+.

I'm using R 2.10, so why do I keep getting warnings about upgrading?

Thanks again.