Thomas Nyberg
2014-Oct-29 15:23 UTC
[R] Using readLines on a file without resetting internal file offset parameter?
Hi everyone,

I would like to read a file line by line, but I would rather not load all lines into memory first. I've tried using readLines with n = 1, but that seems to reset the internal file descriptor's file offset after each call. I.e. this is the current behavior:

-------

bash $ echo 1 > testfile
bash $ echo 2 >> testfile
bash $ cat testfile
1
2

bash > R
R > f <- file('testfile')
R > readLines(f, n = 1)
[1] "1"
R > readLines(f, n = 1)
[1] "1"

-------

I would like the behavior to be:

-------

bash > R
R > f <- file('testfile')
R > readLines(f, n = 1)
[1] "1"
R > readLines(f, n = 1)
[1] "2"

-------

I'm coming to R from a python background, where the default behavior is exactly the opposite. I.e. when you read a line from a file it is your responsibility to use seek explicitly to get back to the original position in the file (this is rarely necessary though). Is there some flag to turn off the default behavior of resetting the file offset in R?

Cheers,
Thomas
William Dunlap
2014-Oct-29 16:22 UTC
[R] Using readLines on a file without resetting internal file offset parameter?
Open your file object before calling readLines and close it when you are done with a sequence of calls to readLines.

> tf <- tempfile()
> cat(sep="\n", letters[1:10], file=tf)
> f <- file(tf)
> open(f)
> # or f <- file(tf, "r") instead of previous 2 lines
> readLines(f, n=1)
[1] "a"
> readLines(f, n=1)
[1] "b"
> readLines(f, n=2)
[1] "c" "d"
> close(f)

I/O operations on an unopened connection generally open it, do the operation, then close it.

Bill Dunlap
TIBCO Software
wdunlap tibco.com
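
For reading a whole file one line at a time without loading it all into memory, the same idea can be wrapped in a loop. The following is only a sketch under stated assumptions: the sample file and the names con and lines_seen are made up for illustration, and collecting the lines into a vector stands in for whatever per-line processing is actually wanted. Because the connection is opened explicitly, each readLines call continues from where the previous one stopped, and a zero-length result signals the end of the file.

```r
# Write a small sample file (illustrative data, not from the original post).
tf <- tempfile()
writeLines(letters[1:5], tf)

# Open the connection explicitly so the read position persists between calls.
con <- file(tf, open = "r")

lines_seen <- character(0)
repeat {
  line <- readLines(con, n = 1)
  if (length(line) == 0) break        # zero-length result: end of file reached
  lines_seen <- c(lines_seen, line)   # stand-in for real per-line processing
}

# Closing the connection releases the file; the read position is then gone.
close(con)
unlink(tf)
```

Accumulating every line in lines_seen of course defeats the memory saving; it is there only so the result can be inspected. In real use the body of the loop would process each line and then discard it.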