gianni lavaredo
2012-May-15 20:16 UTC
[R] Problem to resolve a step for reading a large TXT and split in several file
Dear Researchers,
This is the first time I have tried to solve this problem. I have a TXT file
with 1,408,452 rows. I want to split it into several files of at most
1,000,000 rows each, using the following procedure:
# split into two files: one with 1,000,000 rows and one with 408,452 rows
file <- "09G001_72975_7575_25_4025.txt"
fileNames <- strsplit(as.character(file), ".", fixed = TRUE)
fileNames.temp.1 <- unique(as.vector(do.call("rbind", fileNames)[, 1]))
con <- file(file, open = "r")
# n is the number of rows per output file
n <- 1000000
i <- 0
while (length(readLines(con, n = n)) > 0) {
  i <- i + 1
  pv <- read.table(con, header = FALSE, sep = "\t", nrows = n)
  write.table(pv, file = paste(fileNames.temp.1, "_", i, ".txt", sep = ""),
              sep = "\t")
}
close(con)
When I use n = 1,000,000, I get only
"09G001_72975_7575_25_4025_1.txt" (with 1,000,000 rows) in the directory, and
not "09G001_72975_7575_25_4025_2.txt" (with 408,452 rows). I don't understand
where my bug is.
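One thing that may be worth checking here, offered as a guess rather than a confirmed diagnosis: readLines() on an open connection consumes the lines it reads, so calling it in the while condition would leave read.table() to start from whatever comes after them. The toy 5-line file and the names tmp, skipped and pv below are made up for illustration:

```r
# Toy file with 5 lines (hypothetical data, not the real input file).
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(1:5), tmp)

con <- file(tmp, open = "r")
skipped <- readLines(con, n = 2)                  # reads lines 1-2 and advances the connection
pv <- read.table(con, header = FALSE, nrows = 2)  # continues from line 3
close(con)

print(skipped)  # the two lines consumed by readLines()
print(pv$V1)    # the values read.table() actually received
```

If the same thing happens in the loop above, every chunk read in the while condition is discarded, and only the chunk after it is written to disk.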
Furthermore, when I try, for example, to split into 3 files (where n is
469484, i.e. 1408452/3), I get this message:
Error in read.table(con, header = F, sep = "\t", nrow = n) :
  no lines available in input
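This error would be consistent with the loop reading twice per iteration: with n = 469484, the first pass would consume chunk 1 in the while condition and write chunk 2, and the second pass would consume chunk 3 and leave nothing for read.table(). A minimal sketch of one way around this, assuming the goal is only to cut the file into pieces of at most n lines without parsing them. The function split_file, its arguments and the 25-line demo file are hypothetical and only for illustration:

```r
# Hypothetical helper: split a text file into chunks of at most n lines,
# reading each chunk exactly once and writing those same lines back out.
split_file <- function(path, n, out_dir = dirname(path)) {
  base <- tools::file_path_sans_ext(basename(path))
  con <- file(path, open = "r")
  on.exit(close(con))
  out_files <- character(0)
  i <- 0
  repeat {
    lines <- readLines(con, n = n)   # consumes up to n lines from the connection
    if (length(lines) == 0) break    # nothing left to read
    i <- i + 1
    out <- file.path(out_dir, paste0(base, "_", i, ".txt"))
    writeLines(lines, out)           # write the lines that were just read
    out_files <- c(out_files, out)
  }
  out_files
}

# Small demonstration on a temporary 25-line, tab-separated file.
demo <- tempfile(fileext = ".txt")
writeLines(paste(1:25, letters[1:25], sep = "\t"), demo)
parts <- split_file(demo, n = 10)
chunk_sizes <- sapply(parts, function(f) length(readLines(f)))
print(unname(chunk_sizes))
```

If each chunk has to go through a data frame first, the same idea should still apply: read once per iteration and write what was read, for example by parsing the chunk with read.table(text = lines, sep = "\t") before calling write.table().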
Thanks for any help, and sorry for the disturbance.