Hi all, Is there a fast way to determine the number of lines in a file? I'm looking for something like count.lines analogous to count.fields. Hadley -- http://had.co.nz/
How about something like length(readLines(fname))?

Ken
Hi,

parser::nlines does it in C.

Romain Francois
Gabor Grothendieck
2010-Feb-08 14:52 UTC
[R] Fast way to determine number of lines in a file
If you are willing to use an external program, parse the result of:

    > system("wc -l small.dat")
    10 small.dat

On Windows there is a wc.exe program in the Rtools distribution.
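As a rough sketch of the "parse the result" step, the count can be pulled out of the wc output as shown here. The helper name count_lines_wc is made up for illustration, and the sketch assumes a wc program is on the PATH (on Windows, for example, the one shipped with Rtools).

```r
# Sketch only: `count_lines_wc` is a hypothetical helper, not an existing R function.
# Assumes an external `wc` program can be found on the PATH.
count_lines_wc <- function(path) {
  # system2() with stdout = TRUE returns the program's output as a character vector
  out <- system2("wc", c("-l", shQuote(path)), stdout = TRUE)
  # Output looks like "10 small.dat", possibly with leading spaces;
  # the first whitespace-separated token is the count
  as.integer(strsplit(trimws(out[1]), "[[:space:]]+")[[1]][1])
}

# Usage: write a small ten-line file and count its lines
tmp <- tempfile(fileext = ".dat")
writeLines(as.character(1:10), tmp)
n <- count_lines_wc(tmp)
```

Note that wc -l counts newline characters, so a file whose last line has no trailing newline is reported as one line shorter.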
It depends on the type of file and your system. 'count.fields()' is impractical for large files because it returns a vector with one element per line of the file. It would be easier to use scan() with the separator argument ('sep') set to the end-of-line marker, "\n" I believe, and the 'what' argument set to a null list, so nothing is actually stored. scan() will still report the number of lines read.

For flat files on Windows, the additional utilities installed with Rtools (only the Cygwin DLLs from the tools install are needed) are the fastest that I know of.

    if (.Platform$OS.type == "windows") {
      system.time({
        cmd <- system(paste("/RTools/bin/wc -l", "much_data.bin"), intern = TRUE)
        cmd <- strsplit(cmd, " ")[[1]][1]
      })
    }

Sincerely, KeithC.
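A minimal sketch of a scan()-based count follows. It is a simpler variant of the suggestion above, not the same thing: it reads each line into a character vector rather than using a null list for 'what', so it does hold the lines in memory. The helper name count_lines_scan is an assumption made for illustration.

```r
# Sketch only: `count_lines_scan` is a hypothetical helper, not part of base R.
count_lines_scan <- function(path) {
  # sep = "\n" makes each line a single item; quiet = TRUE suppresses
  # scan()'s "Read N items" message.
  # With the default blank.lines.skip = TRUE, blank lines are not counted.
  lines <- scan(path, what = "", sep = "\n", quiet = TRUE)
  length(lines)
}

# Usage: a three-line file with comma-separated fields
tmp <- tempfile(fileext = ".txt")
writeLines(c("a,b", "c,d", "e,f"), tmp)
n_scan <- count_lines_scan(tmp)
```

Because blank lines are skipped by default, this counts non-blank lines only, which may differ from what wc -l reports.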
Indian_R_Analyst
2010-Feb-10 11:08 UTC
[R] Fast way to determine number of lines in a file
Hi Hadley,

I hope this is what you are looking for. This approach counts the lines in a large bzip2 file by reading it in chunks.

    testconn <- file("xyzxyz.csv.bz2", open = "r")
    csize <- 10000
    nolines <- 0
    while ((readnlines <- length(readLines(testconn, csize))) > 0)
      nolines <- nolines + readnlines
    close(testconn)
    nolines

Regards, Indian_R_Analyst.
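The same chunking idea can be sketched at the byte level, counting newline bytes instead of building character vectors. This is only an illustration: the helper name count_newlines and the chunk size are assumptions, and the sketch is meant for uncompressed files.

```r
# Sketch only: `count_newlines` is a hypothetical helper, not part of base R.
# Intended for uncompressed files; counts "\n" bytes chunk by chunk.
count_newlines <- function(path, chunk_size = 65536L) {
  con <- file(path, open = "rb")   # binary mode, so bytes are read untranslated
  on.exit(close(con))
  newline <- as.raw(10L)           # the byte value of "\n"
  n <- 0
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_size)
    if (length(chunk) == 0L) break # end of file reached
    n <- n + sum(chunk == newline)
  }
  n
}

# Usage: a ten-line file
tmp <- tempfile(fileext = ".csv")
writeLines(as.character(1:10), tmp)
n_raw <- count_newlines(tmp)
```

Like wc -l, this counts newline characters, so a final line without a trailing newline is not included in the total.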