Chabot Denis
2007-May-22 10:32 UTC
[R] Reducing the size of pdf graphics files produced with R
Hi,

Without trying to print 1000000 points (see <http://finzi.psych.upenn.edu/R/Rhelp02a/archive/42105.html>), I often print maps for which I do not want to lose too much coastline detail, and/or plots with 1000-5000 points (yes, some are on top of each other, but with transparency, i.e. rgb colors with alpha information, this actually comes through as useful information).

But the files are large (not as large as in the thread above of course, 800 KB to about 2 MB), especially when included in a LaTeX document by the dozen.

Acrobat (not the reader, the full program) has an option "reduce file size". I don't know what it does, but it shrinks most of my plots to about 30% of original size, and I cannot detect any loss of detail even when zooming several times. But it is a pain to do this with Acrobat when you generate many plots... And you need to buy Acrobat.

Is this something the pdf device could do in a future version? I tried the "million points" example from the thread above and the 55 MB file was reduced to 6.9 MB, an even better shrinking than I see on my usual plots.

Denis Chabot
Prof Brian Ripley
2007-May-22 16:47 UTC
[R] Reducing the size of pdf graphics files produced with R
From the help page:

  'pdf' writes uncompressed PDF. It is primarily intended for producing PDF graphics for inclusion in other documents, and PDF-includers such as 'pdftex' are usually able to handle compression.

If you are able to contribute a stream compressor, R will produce smaller plots. Otherwise it is unlikely to happen (and in any case it would be a smaller contribution than that of the author of pdf(), who is quite happy with external compressors).

Acrobat does other things (not all of which it tells you about), but compression is the main advantage.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
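A minimal sketch of the point about stream compression, not taken from the message above. It assumes a later R release in which pdf() accepts a `compress` argument (this was added after the thread was written); the helper name `write_plot` and the temporary file paths are made up for illustration. It writes the same 10000-point scatterplot with and without compression and compares the file sizes.

```r
# Write the same scatterplot twice: without and with stream compression.
# Assumption: this R version's pdf() device has a `compress` argument.
set.seed(1)
x <- rnorm(10000)
y <- rnorm(10000)

# Hypothetical helper: draw the plot into `path` and close the device.
write_plot <- function(path, compress) {
  pdf(file = path, width = 5, height = 5, compress = compress)
  on.exit(dev.off())   # the file is only complete once the device is closed
  plot(x, y)
  invisible(path)
}

raw_pdf  <- write_plot(tempfile(fileext = ".pdf"), compress = FALSE)
comp_pdf <- write_plot(tempfile(fileext = ".pdf"), compress = TRUE)

# Compare the sizes in bytes and report the compressed/uncompressed ratio.
sizes <- file.info(c(raw_pdf, comp_pdf))$size
names(sizes) <- c("uncompressed", "compressed")
print(sizes)
print(round(sizes[["compressed"]] / sizes[["uncompressed"]], 2))
```

For PDF files that already exist, tools::compactPDF() is another route in later R versions; it relies on an external qpdf or Ghostscript installation.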
Sundar Dorai-Raj
2007-May-22 17:19 UTC
[R] Reducing the size of pdf graphics files produced with R
You need not buy Acrobat. There are two free software programs that will compress pdf files:

http://www.cutepdf.com
http://www.cs.wisc.edu/~ghost/ (and in particular GSView)

They both allow several levels of compression.

Thanks,

--sundar
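A rough sketch of how a Ghostscript pass could be driven from R rather than through a GUI. It assumes a command-line Ghostscript named `gs` is on the PATH (on Windows the executable has a different name); the function names `gs_rewrite_args` and `gs_rewrite` are made up for illustration. Whether the rewritten file is actually smaller depends on the plot, as the measurements later in this thread show.

```r
# Build the argument vector for a Ghostscript "pdfwrite" pass.
# Paths are shell-quoted because system2() hands the arguments to a shell.
gs_rewrite_args <- function(in_pdf, out_pdf) {
  c("-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
    paste0("-sOutputFile=", shQuote(out_pdf)),
    shQuote(in_pdf))
}

# Hypothetical wrapper: run Ghostscript if it can be found, else do nothing.
gs_rewrite <- function(in_pdf, out_pdf, gs = Sys.which("gs")) {
  if (!nzchar(gs)) {
    message("Ghostscript not found on PATH; nothing done")
    return(invisible(NA_integer_))
  }
  # system2() returns the exit status of the command (0 means success).
  system2(gs, gs_rewrite_args(in_pdf, out_pdf))
}

# Usage (file names are placeholders):
# gs_rewrite("plot.pdf", "plot_gs.pdf")
```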
Waichler, Scott R
2007-May-23 14:24 UTC
[R] Reducing the size of pdf graphics files produced with R
> as you are using MacOS X, you'll have ghostscript installed anyway. so
> try in R `dev2bitmap' with `type = "pdfwrite"'. I believe `gs' _does_
> include compression. a quick test showed at least a reduction by about
> a factor of 2 relative to `pdf()'. probably one can fiddle with the
> ghostscript settings (cf. e.g. `Ps2pdf.htm' in the ghostscript docs:
> you can adjust the resolution for images in the pdf file) to
> improve this, so as a last resort you could indeed export the graphics
> as postscript and do the conversion to `pdf' by adjusting the `ps2pdf'
> switches. but even with the default settings the pdf produced via
> dev2bitmap/ghostscript is the better solution. apart from file size, I
> now and then ran into problems when converting `pdf()' output to
> postscript later on, for instance.

Can you give an example of dev2bitmap usage? I tried using it in place of a pdf() statement. An X11 window opened and my figure flew by, but I didn't get the file output. I also used dev2bitmap after opening a pdf() and just before the dev.off() statement, since the help says it works on the "current device", but again no written output. What am I doing wrong? I tried:

    dev2bitmap(file = plotfile2, type="pdfwrite", width=8.5, height=11, pointsize=12)
    print(myplot())
    dev.off()

and

    pdf(file = plotfile, paper="letter", width=8.5, height=11, pointsize=12)
    print(myplot())
    dev2bitmap(file = plotfile2, type="pdfwrite", width=8.5, height=11, pointsize=12)
    dev.off()

Thanks,
Scott Waichler
scott.waichler _at_ pnl.gov
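A hedged sketch of one way dev2bitmap() can be used: the plot is drawn first, and dev2bitmap() is then called to copy the current device, which is the order used in the test program posted later in this thread. The helper name `plot_to_pdf_via_gs` and its arguments are made up for illustration. Two assumptions are worth stating: Ghostscript must be installed, and the device being copied must record its drawing operations. File devices such as pdf() do not record by default, which may be why the second attempt above wrote nothing, though the thread does not confirm this.

```r
# Hypothetical helper: call `draw_plot()` on an off-screen device, then copy
# that device to `out_file` through Ghostscript's pdfwrite driver.
plot_to_pdf_via_gs <- function(draw_plot, out_file, width = 5, height = 5) {
  # dev2bitmap() needs Ghostscript; look in R_GSCMD first, then on the PATH.
  gs <- Sys.getenv("R_GSCMD", unset = "")
  if (!nzchar(gs)) gs <- Sys.which("gs")
  if (!nzchar(gs)) {
    message("Ghostscript not found; skipping dev2bitmap()")
    return(invisible(FALSE))
  }
  pdf(file = NULL, width = width, height = height)  # off-screen, writes no file
  dev.control(displaylist = "enable")  # record drawing so it can be copied
  on.exit(dev.off())
  draw_plot()                          # 1. draw on the current device
  dev2bitmap(file = out_file, type = "pdfwrite",    # 2. then copy it
             width = width, height = height)
  invisible(file.exists(out_file))
}

# Usage (file name is a placeholder):
# plot_to_pdf_via_gs(function() plot(rnorm(100), rnorm(100)), "test_dev2bitmap.pdf")
```

On an interactive screen device (X11, quartz) the helper is not needed: draw the plot, then call dev2bitmap(file = ..., type = "pdfwrite") while the window is still open.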
Chabot Denis
2007-May-24 14:59 UTC
[R] Reducing the size of pdf graphics files produced with R
Hi again,

Many of you have suggested means other than the pdf device, and/or conversion/compression of the pdf outside of R. I ran some tests on a small, a medium-size and a large figure. Here I summarize the results, which depend very much on the original graphics file. Please note that I wish to retain a vector-based graphics file. You'll find at the end of this message the R program used to produce the graphics files.

Starting with a small graphics file: in order, these were produced by
1) postscript device
2) pdf device
3) bitmap device (pdf output)
4) dev2bitmap, pdf output, from a quartz window
5) quartz device saved to pdf via the command quartz.save
6) quartz device saved to pdf via the save menu in the R gui

-rw-r--r-- 1 chabotd chabotd  243446 May 23 21:00 test_ps_from_R.ps
-rw-r--r-- 1 chabotd chabotd  572513 May 23 21:00 test_pdf_from_R.pdf
-rw-r--r-- 1 chabotd chabotd  600106 May 24 09:21 test_pdf_bitmapR.pdf
-rw-r--r-- 1 chabotd chabotd  600050 May 24 09:22 test_dev2bitmap.pdf
-rw-r--r-- 1 chabotd chabotd 1657446 May 23 21:00 test_pdf_from_quartz.save.pdf
-rw-r--r-- 1 chabotd chabotd  572634 May 23 21:01 test_pdf_from_quartz.menu.pdf

These show how "test_pdf_from_R.pdf" can be shrunk outside of R by
1) the command pdftk
2) opening the pdf in any Mac OS X pdf viewer and doing "print to compressed pdf"

-rw-r--r-- 1 chabotd chabotd  68742 May 24 09:25 test_pdf_pdftk.pdf
-rw-r--r-- 1 chabotd chabotd 100660 May 23 21:16 test_pdf_print_to_comppdf.pdf

Finally, these show 3 conversions from postscript to pdf outside of R
1) command ps2pdf
2) command epstopdf
3) command pstopdf

-rw-r--r-- 1 chabotd chabotd  566626 May 23 21:12 test_ps_ps2pdf.pdf
-rw-r--r-- 1 chabotd chabotd  566587 May 24 10:21 test_ps_epstopdf.pdf
-rw-r--r-- 1 chabotd chabotd 1939788 May 24 10:20 test_ps_pstopdf.pdf

For this first example, all pdf files produced directly within R were of similar size, except one (quartz.save) that was 3x larger.
Producing a postscript file and transforming it into pdf resulted in no significant saving. However, pdf output from R can be shrunk (here to 12% of original size) with pdftk. So far I have found no adverse effect of this shrinking.

I did the same with a larger graphic; this example came from Dave Watson. Using the same blocks as above:

Produced with R:
-rw-r--r-- 1 chabotd chabotd  854320 May 24 09:08 mauna_ps_from_R.eps
-rw-r--r-- 1 chabotd chabotd 1000504 May 24 09:08 mauna_pdf_from_R.pdf
-rw-r--r-- 1 chabotd chabotd   96737 May 24 09:08 mauna_pdf_bitmapR.pdf
-rw-r--r-- 1 chabotd chabotd   97236 May 24 09:17 mauna_dev2bitmap.pdf
-rw-r--r-- 1 chabotd chabotd  468195 May 24 09:08 mauna_pdf_from_quartz.save.pdf
-rw-r--r-- 1 chabotd chabotd  999853 May 24 09:09 mauna_pdf_from_quartz.menu.pdf

PS to pdf outside of R:
-rw-r--r-- 1 chabotd chabotd  95024 May 24 09:11 mauna_ps_ps2pdf.pdf
-rw-r--r-- 1 chabotd chabotd 603021 May 24 10:40 mauna_ps_pstopdf.pdf
-rw-r--r-- 1 chabotd chabotd  95015 May 24 10:40 mauna_ps_epstopdf.pdf

pdf transformation outside of R:
-rw-r--r-- 1 chabotd chabotd 104487 May 24 09:12 mauna_pdf_pdftk.pdf
-rw-r--r-- 1 chabotd chabotd 134663 May 24 09:23 mauna_print_to_comppdf.pdf

For this example, different methods of producing pdf within R gave very different file sizes. The two methods based on quartz performed in reverse order compared to the previous example. Overall, using the bitmap device or postscript transformed to pdf outside of R produced files about 10% the size of the file produced by the pdf device. But the latter could be shrunk almost as much using pdftk.
Finally, a larger-size example:

Produced with R:
-rw-r--r-- 1 chabotd chabotd 1426330 May 23 20:54 fig_ps_from_R.ps
-rw-r--r-- 1 chabotd chabotd 3384788 May 23 20:54 fig_pdf_from_R.pdf
-rw-r--r-- 1 chabotd chabotd 3494689 May 24 09:03 fig_pdf_bitmapR.pdf
-rw-r--r-- 1 chabotd chabotd 3494689 May 24 10:46 fig_dev2bitmap.pdf
-rw-r--r-- 1 chabotd chabotd 3384832 May 23 20:54 fig_pdf_from_quartz.menu.pdf
-rw-r--r-- 1 chabotd chabotd 9583552 May 23 20:52 fig_pdf_from_quartz.save.pdf

PS to pdf outside of R:
-rw-r--r-- 1 chabotd chabotd  3356223 May 23 21:12 fig_ps_ps2pdf.pdf
-rw-r--r-- 1 chabotd chabotd 11397461 May 23 23:51 fig_ps_pstopdf.pdf
-rw-r--r-- 1 chabotd chabotd  3354762 May 23 23:55 fig_ps_epstopdf.pdf

pdf transformation outside of R:
-rw-r--r-- 1 chabotd chabotd 379307 May 23 22:31 fig_pdf_comptk.pdf
-rw-r--r-- 1 chabotd chabotd 520509 May 24 00:19 fig_pdf_print_to_comppdf.pdf

This time, as in the first example, there was little benefit in going the bitmap device or ps to pdf route. Only shrinking the pdf with pdftk was effective.

So examples with a lot of objects on the plot do not seem to benefit from postscript use, but one example with few objects (but objects that were "filled", don't know if it matters) did.

I have never done this in R, but could the pdftk command be run from within an R script? This would allow one to compress automatically when needed.
Thank you all for the suggestions,

Denis

############## R program that produced the above files #################

# example 1, small
pdf(file="test_pdf_from_R.pdf", w=5, h=5, version="1.4", bg="transparent")
plot(rnorm(10000), rnorm(10000))
dev.off()

postscript(file="test_ps_from_R.ps", width=5, height=5, paper="special")
plot(rnorm(10000), rnorm(10000))
dev.off()

bitmap(file = "test_pdf_bitmapR.pdf", width=5, height=5, type = "pdfwrite")
plot(rnorm(10000), rnorm(10000))
dev.off()

plot(rnorm(10000), rnorm(10000))
quartz.save(file="test_pdf_from_quartz.save.pdf", type="pdf")
dev2bitmap(file="test_dev2bitmap.pdf", width=5, height=5, type="pdfwrite")
# here I also manually saved the quartz graphics and called it "test_pdf_from_quartz.menu.pdf"

# Example from Dave Watson
postscript(file = "mauna_ps_from_R.eps", width=5, height=5, horizontal=FALSE, paper="special", onefile=FALSE)
filled.contour(volcano, color=terrain.colors, asp=1)
title(main="volcano data: filled contour map")
dev.off()

pdf(file = "mauna_pdf_from_R.pdf", width=5, height=5)
filled.contour(volcano, color=terrain.colors, asp=1)
title(main="volcano data: filled contour map")
dev.off()

bitmap(file = "mauna_pdf_bitmapR.pdf", width=5, height=5, type = "pdfwrite")
filled.contour(volcano, color=terrain.colors, asp=1)
title(main="volcano data: filled contour map")
dev.off()

# on mac os x only
quartz(w=5, h=5)
filled.contour(volcano, color=terrain.colors, asp=1)
title(main="volcano data: filled contour map")
dev2bitmap(file="mauna_dev2bitmap.pdf", width=5, height=5, type="pdfwrite")
quartz.save(file="mauna_pdf_from_quartz.save.pdf", type="pdf")
# here I also manually saved the quartz graphics and called it "mauna_pdf_from_quartz.menu.pdf"

# example 3, large
x <- rep(1:99, 20)
c <- 0
for (a in 1:3) {
  for (b in c(0.7, 0.9)) {
    c <- c + 1
    nam <- paste("Y", c, sep="")
    assign(nam, a + b*x + rnorm(length(x), 20, 10))
  }
}
the.data <- data.frame(Y1, Y2, Y3, Y4, Y5, Y6)
rm(Y1, Y2, Y3, Y4, Y5, Y6)

pdf(file="fig_pdf_from_R.pdf", w=8, h=8, version="1.4", bg="transparent")
pairs(the.data)
dev.off()

postscript(file="fig_ps_from_R.ps", width=8, height=8, paper="special")
pairs(the.data)
dev.off()

bitmap(file = "fig_pdf_bitmapR.pdf", width=8, height=8, type = "pdfwrite")
pairs(the.data)
dev.off()

# on mac os x only
quartz(w=8, h=8)
pairs(the.data)
dev2bitmap(file="fig_dev2bitmap.pdf", width=8, height=8, type="pdfwrite")
quartz.save(file="fig_pdf_from_quartz.savev2.pdf", type="pdf")
# here I also manually saved the quartz graphics and called it "fig_pdf_from_quartz.menu.pdf"
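On the question of running pdftk from within an R script: a minimal sketch, assuming pdftk is installed and on the PATH, and using its `input output result compress` form. The function names `pdftk_args` and `compress_with_pdftk` and the `_small.pdf` suffix are made up for illustration and do not come from the thread.

```r
# Build the pdftk argument vector: <input> output <result> compress
# Paths are shell-quoted because system2() hands the arguments to a shell.
pdftk_args <- function(in_pdf, out_pdf) {
  c(shQuote(in_pdf), "output", shQuote(out_pdf), "compress")
}

# Hypothetical wrapper: compress `in_pdf` into `out_pdf` and report both sizes.
compress_with_pdftk <- function(in_pdf,
                                out_pdf = sub("\\.pdf$", "_small.pdf", in_pdf)) {
  pdftk <- Sys.which("pdftk")
  if (!nzchar(pdftk)) {
    message("pdftk not found on PATH; nothing done")
    return(invisible(NULL))
  }
  status <- system2(pdftk, pdftk_args(in_pdf, out_pdf))
  if (status != 0) stop("pdftk failed with exit status ", status)
  # Sizes in bytes before and after, so the saving can be checked per file.
  sizes <- file.info(c(in_pdf, out_pdf))$size
  invisible(c(before = sizes[1], after = sizes[2]))
}

# Usage, right after closing the pdf device (file name from the program above):
# pdf(file = "fig_pdf_from_R.pdf", width = 8, height = 8)
# pairs(the.data)
# dev.off()
# print(compress_with_pdftk("fig_pdf_from_R.pdf"))
```

Keeping the original file and writing the result under a new name leaves room to compare the two before replacing anything.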
Paul Hiemstra
2008-Feb-07 10:52 UTC
[R] Reducing the size of pdf graphics files produced with R
Hi all,

Maybe a bit late, but I found a way that worked great for me.

In Windows, download CutePDF.
In Linux (Debian for me), install CUPS and cups-pdf.

Open your pdf with a viewer and print to CutePDF or cups-pdf. Both support a range of compression options. I use cups-pdf and reduced an R output file of 3.6 MB to 0.9 MB. Much better if you want to include it in a LaTeX article.

cheers,
Paul

-- 
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +31302535773
Fax: +31302531145
http://intamap.geo.uu.nl/~paul