Andersson, Jafet
2009-Jan-23 13:06 UTC
[R] Write to multiple connections or multiple text files
Hi all,

I want to modify a large number of text files (ca. 4000) by replacing a value found on a particular line in them with a value from an R object.

For a single file I would normally use:

    con <- file("foo.txt", open = "r+")
    content <- readLines(con)
    content[n] <- "test"
    writeLines(content, con)
    close(con)

To repeat this for several files I can write a for loop around it. However, my problem is that this is rather slow. I am therefore wondering if there is any other way to write to multiple connections, in a similar way as one can e.g. write to a large number of rows in a matrix simultaneously?

(Note that seek() is not so practical for me since the number of bytes before the specific line varies between the files; therefore I use readLines() and match the right line instead.)

My systems:
OS: Windows Server 2003 & Linux Red Hat (interchangeably)
R version: 2.7.2

Thanks for any suggestions!

ooo
Jafet Andersson
Eawag - The Swiss Federal Institute of Aquatic Science and Technology
Ueberlandstrasse 133
P.O. Box 611
CH-8600 Duebendorf
Switzerland
Phone: +41 (0)44 823 5358
Fax: +41 (0)44 823 5028
http://www.eawag.ch/index_EN
jim holtman
2009-Jan-23 13:22 UTC
[R] Write to multiple connections or multiple text files
The solution you have seems to read in all the lines of data at once, operate on them, and then write them out as a whole chunk. Having multiple connections open won't really help, since I/O is serial.

So is the code you included the actual code, or just an example? It would help to see the real code, to see if there is some way of optimizing it.

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
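
For reference, the read-modify-write loop discussed in this thread could be sketched roughly as below. This is only an illustration under assumptions: the helper name replace_line(), the "^value" pattern, and the throwaway demo files are all hypothetical and not taken from the original poster's real code. It reads and writes each file by path rather than through a single "r+" connection, and it is not claimed to be faster than the original loop.

```r
# Hypothetical helper: replace the first line matching `pattern` in one file.
replace_line <- function(path, pattern, new_line) {
  content <- readLines(path)        # read the whole file at once
  n <- grep(pattern, content)[1]    # index of the first match, NA if none
  if (is.na(n)) return(FALSE)       # leave the file untouched if no match
  content[n] <- new_line
  writeLines(content, path)         # rewrite the whole file in one call
  TRUE
}

# Self-contained demonstration on throwaway files in a temporary directory.
demo_dir <- tempfile("demo")
dir.create(demo_dir)
files <- file.path(demo_dir, sprintf("foo%d.txt", 1:3))
for (f in files) writeLines(c("header", "value = 0", "footer"), f)

# One replacement value per file, e.g. taken from an R vector.
new_values <- sprintf("value = %d", 1:3)
changed <- mapply(replace_line, files, "^value", new_values)
```

Here `changed` records, per file, whether a matching line was found and replaced, which makes it easy to spot files that did not contain the expected line.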
Charles C. Berry
2009-Jan-23 18:46 UTC
[R] Write to multiple connections or multiple text files
On Fri, 23 Jan 2009, Andersson, Jafet wrote:

> For a single file I would normally use:
> con<-file ("foo.txt", open="r+")
> content<-readLines(con)
> content[n]<-"test"
> writeLines(content,con)
> close(con)

If you know 'n' before opening the connection, or can easily figure it out using the system's grep command (on your Linux side), you would probably be best off using two calls to the system() function along with your system's head, cat, and tail commands to copy the parts you do not alter to a temp file.

Using pipe() with the system's head and tail commands, you pick off the line you modify, and use cat() to append it in between the system() call that copies what precedes it and the call that copies what follows it.

Once the temp file is complete, you replace the original with it.

HTH,

Chuck

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu
UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901
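
A simplified sketch of the head/tail approach described above might look like the following. It assumes a Unix-like system with head and tail on the PATH (so it would not run as-is on the Windows machine), and the function name replace_line_shell() is hypothetical. Unlike the description above, it does not read the old line through pipe(); it simply replaces line n wholesale with new text.

```r
# Hypothetical helper: replace line `n` of `path` using the system's head and
# tail commands, so R never has to read the unchanged lines itself.
replace_line_shell <- function(path, n, new_line) {
  n   <- as.integer(n)
  tmp <- tempfile()
  src <- shQuote(path)   # quote the paths for the shell
  dst <- shQuote(tmp)

  # Copy everything that precedes line n into the temp file.
  if (n > 1L) system(sprintf("head -n %d %s > %s", n - 1L, src, dst))

  # Append the replacement line from R.
  cat(new_line, "\n", sep = "", file = tmp, append = TRUE)

  # Append everything that follows line n.
  system(sprintf("tail -n +%d %s >> %s", n + 1L, src, dst))

  # Replace the original with the completed temp file.
  ok <- file.copy(tmp, path, overwrite = TRUE)
  unlink(tmp)
  ok
}

# Self-contained demonstration on a throwaway file.
f <- tempfile(fileext = ".txt")
writeLines(c("a", "b", "c"), f)
replace_line_shell(f, 2, "test")
result <- readLines(f)
```

Whether this beats readLines()/writeLines() depends on file size: each system() call starts a new process, so for many small files the overhead may outweigh the savings.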