I have been trying to read a random sample of lines from a file into a
data frame using readLines(). The help indicates that readLines() will
start from the current line if the connection is open, but presented with
a closed connection it will open it, start from the beginning, and close
it when finished.
In the code that follows I tried to open the file before reading but
apparently without success, because the result was repeated copies of the
first line:
flines <- 107165
slines <- 100
selected <- sort(sample(flines,slines))
strvec <- rep("",slines)
file("c:/data/perry/data.csv",open="r")
isel <- 0
for (iline in 1:slines) {
   isel <- isel + 1
   cline <- readLines("c:/data/perry/data.csv",n=1)
   if (iline == selected[isel]) strvec[isel] <- cline else
     isel <- isel - 1
}
close("c:/data/perry/data.csv")
sel.flows <- read.table(textConnection(strvec), header=FALSE, sep=",")
There was also an error "no applicable method" for close.
Comments gratefully received.
Murray Jorgensen
You are using the connection the wrong way. You need to do something like:
fcon <- file("c:/data/perry/data.csv", open="r")
for (iline in 1:slines) {
isel <- isel + 1
cline <- readLines(fcon, n=1)
...
}
close(fcon)
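To see the difference concretely, here is a small sketch (the temporary file and its contents are made up for illustration, they are not the poster's data). Calling readLines() with a file name opens and closes a fresh connection on every call, so it always returns the first line; a saved connection object keeps its position between calls.

```r
# Hypothetical three-line file, used only for this illustration
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,a", "2,b", "3,c"), tmp)

# Passing the file name: each call opens a new connection,
# reads from the top, and closes it again
by_name <- c(readLines(tmp, n = 1), readLines(tmp, n = 1))

# Passing an open connection object: each call continues
# where the previous one stopped
fcon <- file(tmp, open = "r")
by_con <- c(readLines(fcon, n = 1), readLines(fcon, n = 1))
close(fcon)

print(by_name)  # "1,a" "1,a"
print(by_con)   # "1,a" "2,b"
unlink(tmp)
```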
BTW, here's how I'd do it (not tested!):
strvec <- rep("",slines)
selected <- sort(sample(flines, slines))
## number of lines to discard before each selected line
## (diff(c(0, selected)) also skips the lines before the first selected one)
skip <- diff(c(0, selected)) - 1
fcon <- file("c:/data/perry/data.csv", open="r")
for (i in seq_along(skip)) {
    ## skip to the selected line
    readLines(fcon, n=skip[i])
    strvec[i] <- readLines(fcon, n=1)
}
close(fcon)
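A self-contained way to try the skip-and-read idea is sketched below. It assumes a made-up temporary file whose first field is the line number, so the result can be checked; the file, its size, and the seed are illustrative only. Note that skip is computed here as diff(c(0, selected)) - 1, so the lines before the first selected line are discarded as well.

```r
# Hypothetical stand-in for data.csv: 1000 lines of "<line number>,<value>"
tmp <- tempfile(fileext = ".csv")
flines <- 1000
writeLines(paste(seq_len(flines), round(runif(flines), 3), sep = ","), tmp)

slines <- 10
set.seed(1)
selected <- sort(sample(flines, slines))
# Number of lines to discard before each selected line
skip <- diff(c(0, selected)) - 1

strvec <- character(slines)
fcon <- file(tmp, open = "r")
for (i in seq_along(skip)) {
  # Advance the connection past the unwanted lines, then keep one line
  if (skip[i] > 0) readLines(fcon, n = skip[i])
  strvec[i] <- readLines(fcon, n = 1)
}
close(fcon)

# Parse the sampled lines into a data frame
sel.flows <- read.table(text = strvec, header = FALSE, sep = ",")
# The first column holds the line numbers, so it should match the sample
stopifnot(all(sel.flows$V1 == selected))
unlink(tmp)
```

Because the file is read once from top to bottom, the cost is roughly one pass up to the last selected line, instead of reopening the file for every line.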
HTH,
Andy
> -----Original Message-----
> From: maj at stats.waikato.ac.nz [mailto:maj at stats.waikato.ac.nz]
> Sent: Wednesday, August 27, 2003 7:19 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Using files as connections
You need to save the connection object returned by file() and then use
that object in other functions. You need to change the appropriate lines
to the following (at least):

> con <- file("c:/data/perry/data.csv",open="r")
> cline <- readLines(con,n=1)
> close(con)

(I don't know if more changes are needed to get it working.)

Note that using the connection object in other functions can have side
effects on the connection object (which is how a connection "remembers"
its point in the file). (Perhaps more accurately, the side effect is on
the internal system data referred to by the R connection object.)

> con <- textConnection(letters)
> con
     description            class             mode             text
       "letters" "textConnection"              "r"           "text"
          opened         can read        can write
        "opened"            "yes"             "no"
> readLines(con, 1)
[1] "a"
> readLines(con, 1)
[1] "b"
> con.saved <- con
> readLines(con, 1)
[1] "c"
> readLines(con.saved, 1)
[1] "d"
> readLines(con, 1)
[1] "e"
> identical(con, con.saved)
[1] TRUE
> showConnections()
  description class            mode text   isopen   can read can write
3 "letters"   "textConnection" "r"  "text" "opened" "yes"    "no"
>

hope this helps,

Tony Plate
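On the "no applicable method" error reported for close(): close() is a generic function with a method for connection objects, so handing it a file name (a character string) fails. A minimal sketch, using a made-up temporary file rather than the original data:

```r
# Hypothetical one-line file, used only for this illustration
tmp <- tempfile()
writeLines("x", tmp)

# close() on a character string: there is no method for class "character",
# so the call signals an error, which is captured here as text
msg <- tryCatch(close(tmp), error = function(e) conditionMessage(e))
print(msg)

# close() on the connection object returned by file() works
con <- file(tmp, open = "r")
close(con)
unlink(tmp)
```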