Liaw, Andy
2006-Jun-28 12:43 UTC
[R] Very slow read.table on Linux, compared to Win2000 [Broad cast]
From: Peter Dalgaard

> <davidek at zla-ryba.cz> writes:
> > Dear all,
> >
> > I read.table a 17MB tab-separated table with 483 variables (mostly
> > numeric) and 15000 observations into R. This takes a few seconds with
> > R 2.3.1 on Windows 2000, but it takes several minutes on my Linux
> > machine. The Linux machine is Ubuntu 6.06, 256 MB RAM, Athlon 1600
> > processor. The Windows hardware is better (Pentium 4, 512 MB RAM),
> > but it shouldn't make such a difference.
> >
> > The strange thing is that even doing something with the data (say a
> > histogram of a variable, or transforming integers into a factor)
> > takes a really long time on the Linux box, and the computer seems to
> > work extensively with the hard disk. Could this be caused by swapping?
> > Can I increase the memory allocated to R somehow? I have checked the
> > manual, but the memory options allowed for Linux don't seem to help
> > me (I may be doing it wrong, though ...)
> >
> > The code I run:
> >
> > TBO <- read.table(file="TBO.dat", sep="\t", header=TRUE, dec=",")  # this takes forever
> > TBO$sexe <- factor(TBO$sexe, labels=c("man","vrouw"))  # even this takes like 30 seconds,
> >                                                        # compared to nothing on Win2000
> >
> > I'd be grateful for any suggestions,
>
> Almost surely, the fix is to insert more RAM chips. 256 MB leaves you
> very little space for actual work these days, and a 17MB file will get
> expanded to several times the original size during reading and data
> manipulations. Using a lightweight window manager can help, but you
> usually regret the switch for other reasons.

Try running Windows on the 256MB box and you'll see why Peter recommended
the above. Consider yourself lucky that R actually still does something
useful under Ubuntu with so little RAM. If adding more RAM is not an
option, perhaps not running X at all would help.

Andy

> --
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark       Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
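[Editor's note: the thread never shows it, but the standard advice for speeding up `read.table` on a low-memory machine is to pre-declare `colClasses` and `nrows` so R can allocate once instead of guessing column types and re-growing its buffers. A minimal self-contained sketch (the file, sizes, and column layout here are made up for illustration; the poster's real file was "TBO.dat"):]

```r
# Sketch of the colClasses/nrows idiom, using a small throwaway file.
tf <- tempfile(fileext = ".dat")
write.table(data.frame(a = 1:10, b = rnorm(10)),
            file = tf, sep = "\t", row.names = FALSE, dec = ",")

# Peek at a few rows to learn the column classes cheaply ...
head5   <- read.table(tf, sep = "\t", header = TRUE, dec = ",", nrows = 5)
classes <- sapply(head5, class)

# ... then read the full file with classes and row count pre-declared,
# so read.table allocates once instead of guessing and re-growing.
dat <- read.table(tf, sep = "\t", header = TRUE, dec = ",",
                  colClasses = classes, nrows = 10, comment.char = "")
unlink(tf)
```

[On a 15000 x 483 file this mainly saves the type-guessing pass and repeated reallocation; it does not change the several-fold in-memory expansion Peter describes.]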
Alberto Murta
2006-Jun-28 23:43 UTC
[R] Very slow read.table on Linux, compared to Win2000 [Broad cast]
I have a Pentium 4 PC with 256 MB of RAM, so I made a text file, tab
separated, with column names and 15000 x 483 integers:

> system("ls -l teste.txt")
-rw-r--r--  1 amurta  amurta  16998702 Jun 28 16:08 teste.txt

The time it took to import it was around 15 secs:

> system.time(teste <- read.delim("teste.txt"))
[1] 15.349  0.244 16.597  0.000  0.000

so I think lack of RAM must not be the main problem.

Cheers

Alberto

> version
               _
platform       i386-unknown-freebsd6.1
arch           i386
os             freebsd6.1
system         i386, freebsd6.1
status
major          2
minor          3.1
year           2006
month          06
day            01
svn rev        38247
language       R
version.string Version 2.3.1 (2006-06-01)

On Wednesday 28 June 2006 05:43, Liaw, Andy wrote:
> [quoted text trimmed]
>
> Try running Windows on the 256MB box and you'll see why Peter
> recommended the above. Consider yourself lucky that R actually still
> does something useful under Ubuntu with so little RAM. If adding more
> RAM is not an option, perhaps not running X at all would help.
>
> Andy

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

--
Alberto G. Murta
IPIMAR - Institute of Fisheries and Marine Research
Avenida de Brasilia; 1449-006 Lisboa; Portugal
Tel: +351 213027120; Fax: +351 213015948
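[Editor's note: Alberto doesn't show how "teste.txt" was generated. A plausible reconstruction of the benchmark, scaled down (150 x 48 instead of 15000 x 483, and a hypothetical file name) so it runs in a blink:]

```r
# Build a small tab-separated file of random integers with column names,
# mimicking the structure of Alberto's test file at reduced scale.
m <- matrix(sample.int(1000L, 150 * 48, replace = TRUE),
            nrow = 150, ncol = 48)
colnames(m) <- paste0("V", seq_len(ncol(m)))
write.table(m, file = "teste_small.txt", sep = "\t",
            row.names = FALSE, quote = FALSE)

# Time the import the same way as in the post.
system.time(teste <- read.delim("teste_small.txt"))
dim(teste)
unlink("teste_small.txt")
```

[At full scale the writing step itself takes a while on 2006-era hardware, but the read timing is what the post compares.]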