Henrik Bengtsson
2002-Aug-06 10:14 UTC
[R] write.table() adds unnecessary spaces to right align integer column
When using write.table() to write data frames, the integer columns are padded with unnecessary spaces (0x20) so that these columns are right-aligned when you look at them in a text editor. However, I think this is quite a waste of file size. For instance, I am reading a tab-delimited 4200 kB microarray data file and writing it back verbatim using write.table(), and it becomes 5100 kB, an increase of about 20%. Is this problem known, and is there any way to get around it?

Example (R --vanilla):

x <- c(0,1,10,100,1e3,1e5,1e6)
df <- data.frame(a=x, b=as.integer(x), c=as.character(x))
write.table(df, "out.dat")

Output:

"a" "b" "c"
"1" 0e+00       0 "0"
"2" 1e+00       1 "1"
"3" 1e+01      10 "10"
"4" 1e+02     100 "100"
"5" 1e+03    1000 "1000"
"6" 1e+05  100000 "1e+05"
"7" 1e+06 1000000 "1e+06"

str(R.Version()):
 $ platform: chr "i386-pc-mingw32"
 $ arch    : chr "i386"
 $ os      : chr "mingw32"
 $ system  : chr "i386, mingw32"
 $ status  : chr ""
 $ major   : chr "1"
 $ minor   : chr "5.1"
 $ year    : chr "2002"
 $ month   : chr "06"
 $ day     : chr "17"
 $ language: chr "R"

Thanks

Henrik Bengtsson

Dept. of Mathematical Statistics @ Centre for Mathematical Sciences
Lund Institute of Technology/Lund University, Sweden (+2h UTC)
+46 46 2229611 (off), +46 708 909208 (cell), +46 46 2224623 (fax)
h b @ m a t h s . l t h . s e, http://www.maths.lth.se/bioinformatics/

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Prof Brian D Ripley
2002-Aug-06 10:32 UTC
[R] write.table() adds unnecessary spaces to right align integer column
On Tue, 6 Aug 2002, Henrik Bengtsson wrote:

> When using write.table() to write data frames, the integer columns are padded
> with unnecessary spaces (0x20) so that these columns are right-aligned when you
> look at them in a text editor. However, I think this is quite a waste of file
> size. For instance, I am reading a tab-delimited 4200 kB microarray data file
> and writing it back verbatim using write.table(), and it becomes 5100 kB, an
> increase of about 20%. Is this problem known, and is there any way to get
> around it?

Well, that's not really what write.table is for, so the problem is not with write.table but with your usage of it. Is 900 kB of file space important? If so, why are you not using compression?

If you don't want a table for human consumption, use e.g. save(compress=TRUE). And if you want to strip blanks in a text file, use a pipe output connection and strip them with sed, and then compress the file.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
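
The options mentioned in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: it assumes a recent R in which trimws() and gzfile() are available, write_unpadded() is a hypothetical helper name, and the blanks are stripped inside R with trimws() instead of with sed through a pipe connection, so that the sketch does not depend on an external tool.

```r
# Data from the original post.
x  <- c(0, 1, 10, 100, 1e3, 1e5, 1e6)
df <- data.frame(a = x, b = as.integer(x), c = as.character(x),
                 stringsAsFactors = FALSE)

# format() pads every element of a vector to a common width; this is the
# kind of right-alignment padding described in the question.
padded_b <- format(df$b)

# Option 1: the file is not meant for people, so save a compressed image.
rda <- tempfile(fileext = ".rda")
save(df, file = rda, compress = TRUE)

# Option 2: keep a text table, but write it through a gzip connection.
gz  <- tempfile(fileext = ".txt.gz")
con <- gzfile(gz, "w")
write.table(df, con, sep = "\t", quote = FALSE, row.names = FALSE)
close(con)

# Option 3: strip the padding before writing.
# write_unpadded() is a hypothetical helper, not part of R.
write_unpadded <- function(data, file, sep = "\t") {
  # Format each column to character, then drop leading/trailing blanks.
  data[] <- lapply(data, function(col) trimws(format(col)))
  write.table(data, file, sep = sep, quote = FALSE, row.names = FALSE)
}

txt <- tempfile(fileext = ".txt")
write_unpadded(df, txt)
```

The image written by save() is restored with load(rda), and the gzip-compressed table can be read directly with read.table(gz, header = TRUE, sep = "\t"). Whether a given R version pads columns in write.table() at all is worth checking before adding the extra formatting step.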