I have been trying to read a random sample of lines from a file into a data frame using readLines(). The help indicates that readLines() will start from the current line if the connection is open, but presented with a closed connection it will open it, start from the beginning, and close it when finished.

In the code that follows I tried to open the file before reading but apparently without success, because the result was repeated copies of the first line:

    flines <- 107165
    slines <- 100
    selected <- sort(sample(flines,slines))
    strvec <- rep("",slines)
    file("c:/data/perry/data.csv",open="r")
    isel <- 0
    for (iline in 1:slines) {
      isel <- isel + 1
      cline <- readLines("c:/data/perry/data.csv",n=1)
      if (iline == selected[isel]) strvec[isel] <- cline else
        isel <- isel - 1
    }
    close("c:/data/perry/data.csv")
    sel.flows <- read.table(textConnection(strvec), header=FALSE, sep=",")

There was also an error "no applicable method" for close.

Comments gratefully received.

Murray Jorgensen
You are using the connection the wrong way. You need to do something like:

    fcon <- file("c:/data/perry/data.csv", open="r")
    for (iline in 1:slines) {
      isel <- isel + 1
      cline <- readLines(fcon, n=1)
      ...
    }
    close(fcon)

BTW, here's how I'd do it (not tested!):

    strvec <- rep("", slines)
    selected <- sort(sample(flines, slines))
    ## number of lines to skip before each selected line;
    ## the first gap is counted from the start of the file
    skip <- diff(c(0, selected)) - 1
    fcon <- file("c:/data/perry/data.csv", open="r")
    for (i in 1:length(skip)) {
      ## skip to the selected line
      readLines(fcon, n=skip[i])
      strvec[i] <- readLines(fcon, n=1)
    }
    close(fcon)

HTH,
Andy

> -----Original Message-----
> From: maj at stats.waikato.ac.nz [mailto:maj at stats.waikato.ac.nz]
> Sent: Wednesday, August 27, 2003 7:19 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Using files as connections
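A minimal sketch of the skip-ahead approach, wrapped in a function so it can be tried on a small temporary file rather than the real data. The name sample_lines, its arguments, and the 50-line test file are assumptions made for illustration and do not come from the thread. The first skip is counted from the start of the file, so the first selected line need not be line 1.

```r
# Hypothetical helper (not from the thread): read a random sample of lines
# through ONE open connection, skipping the unwanted lines in between.
sample_lines <- function(path, n_total, n_sample) {
  selected <- sort(sample(n_total, n_sample))
  # Lines to skip before each selected line; first gap starts at line 0
  skip <- diff(c(0, selected)) - 1
  con <- file(path, open = "r")
  on.exit(close(con))            # close the connection even on error
  out <- character(n_sample)
  for (i in seq_along(skip)) {
    if (skip[i] > 0) readLines(con, n = skip[i])  # discard skipped lines
    out[i] <- readLines(con, n = 1)               # keep the selected line
  }
  list(selected = selected, lines = out)
}

# Stand-in data: line k of the file is "k,k^2"
path <- tempfile(fileext = ".csv")
writeLines(paste(seq_len(50), seq_len(50)^2, sep = ","), path)

res <- sample_lines(path, n_total = 50, n_sample = 5)
sel.flows <- read.table(text = res$lines, header = FALSE, sep = ",")
```

Because line k of the stand-in file starts with k, the first column of sel.flows should match res$selected, which gives a quick check that the right lines were picked.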
You need to save the connection object returned by file() and then use that object in other functions. You need to change the appropriate lines to the following (at least):

    >con <- file("c:/data/perry/data.csv",open="r")
    >   cline <- readLines(con,n=1)
    >close(con)

(I don't know if more changes are needed to get it working.)

Note that using the connection object in other functions can have side effects on the connection object (which is how a connection "remembers" its point in the file). (Perhaps more accurately, the side effect is on the internal system data referred to by the R connection object.)

    > con <- textConnection(letters)
    > con
     description            class     mode     text   opened can read can write
       "letters" "textConnection"      "r"   "text" "opened"    "yes"      "no"
    > readLines(con, 1)
    [1] "a"
    > readLines(con, 1)
    [1] "b"
    > con.saved <- con
    > readLines(con, 1)
    [1] "c"
    > readLines(con.saved, 1)
    [1] "d"
    > readLines(con, 1)
    [1] "e"
    > identical(con, con.saved)
    [1] TRUE
    > showConnections()
      description class            mode text   isopen   can read can write
    3 "letters"   "textConnection" "r"  "text" "opened" "yes"    "no"
    >

hope this helps,

Tony Plate
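Both symptoms from the original post can be reproduced on a small file. This is only a sketch: the three-line temporary file and the variable names by_name, by_con and err are made up for illustration. Passing the file name makes readLines() open and close the file on every call, so it always returns the first line, while an open connection object keeps its position between calls.

```r
# Stand-in file with three lines (hypothetical data)
path <- tempfile(fileext = ".csv")
writeLines(c("1,a", "2,b", "3,c"), path)

# File name given as a string: each call opens the file afresh,
# reads from the top, and closes it again
by_name <- c(readLines(path, n = 1), readLines(path, n = 1))

# Open connection object: the read position advances between calls
con <- file(path, open = "r")
by_con <- c(readLines(con, n = 1), readLines(con, n = 1))
close(con)

# close() is generic and has no method for a character string,
# so calling it on the file name raises an error
err <- tryCatch(close(path), error = function(e) conditionMessage(e))

unlink(path)
```

Here by_name holds the first line twice, by_con holds the first two lines, and err holds the message of the error raised by close().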