Regarding the reasons that make the doc directory large, I wonder if
we can make some changes in R:
1. Use a null graphics device as the default device rather than pdf()
when running Sweave -- this avoids the useless Rplots.pdf:

    options(device = function(...) {
      .Call("R_GD_nullDevice", PACKAGE = "grDevices")
    })

This can also save some time when building the vignette(s) (see
http://yihui.name/en/?p=673).
However, this undocumented null device may not work for certain
graphics. Here is an example where it fails with ggplot2:
http://stackoverflow.com/questions/4692974/ggplot2-code-that-works-interactively-rkward-crashes-under-lyx-pgfsweave-hint/4707745#4707745
Is it possible for someone to look into the null device (Dr Murrell?)
to make it stable enough?
2. Compress the PDF graphics and vignettes using third-party tools,
among which I recommend qpdf (it's free).
    qpdf --stream-data=compress input.pdf output.pdf
This can reduce the size of PDF files a lot without quality loss. I'm
using this tool in the animation package to reduce the size of PDF
animations.
3. Sorry to bring up this issue again, but I don't understand why
Sweave cannot support the png() device alongside pdf() and
postscript(). I'm willing to provide a patch if needed.
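
For point 2, here is a minimal shell sketch of how the qpdf call could be applied to every PDF under a package's doc directory. The function name compress_pdfs and the default directory inst/doc are only illustrative assumptions; the qpdf invocation is the one shown above. A file is replaced only when the recompressed copy is actually smaller.

```shell
#!/bin/sh
# Sketch only: recompress each PDF under a directory with qpdf.
# compress_pdfs and the inst/doc default are hypothetical names.
compress_pdfs() {
    dir=${1:-inst/doc}
    find "$dir" -type f -name '*.pdf' | while IFS= read -r f; do
        tmp="$f.tmp"
        # Same options as the single-file command above.
        if qpdf --stream-data=compress "$f" "$tmp"; then
            # Byte counts; $(( )) strips any padding from wc.
            old=$(($(wc -c < "$f")))
            new=$(($(wc -c < "$tmp")))
            if [ "$new" -lt "$old" ]; then
                mv "$tmp" "$f"      # keep the smaller file
            else
                rm -f "$tmp"        # no gain, keep the original
            fi
        else
            rm -f "$tmp"            # qpdf failed, leave the file alone
        fi
    done
}
```

It could be run from the package root as: compress_pdfs inst/doc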
Thanks!
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
On Sun, Feb 13, 2011 at 6:30 AM, Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:
> Robin Hankin's post reminded me to post about the following recent addition
> to 'Writing R Extensions', in the section on 'Submitting a package to CRAN'
>
>   Ensure that the package sources are not unnecessarily large. ...
>   As a general rule, doc directories should not exceed 5Mb, and
>   where data directories need to be 10Mb or more, consideration should
>   be given to a separate package containing just the data. (Similarly
>   for external data directories, large jar files and other libraries
>   that need to be installed.)
>
> With 2800 packages on CRAN, overall size is becoming a concern and currently
> to install all of CRAN takes 4Gb.  As the attached (I hope) graph shows, the
> 20 packages over 20Mb take a quarter, and those over 5Mb take half.  (And
> this is after we have removed 100Mb from the largest installed package by
> re-compression, and archived the second largest, so Robin's package is
> currently the largest.)  Some of the largest packages are data/jar packages,
> but there are 55 packages with 'doc' directories over 5Mb.  To put that in
> perspective, PDFs of whole books with lots of figures (MASS, Paul's R
> Graphics) are well under 5Mb.
>
> R CMD check in R-devel reports on large packages, and expect in future that
> submitted package sizes will be questioned more often.
>
> There are lots of different reasons why doc directories are large, but the
> major ones are
>
> - installing files that are unneeded, such as Rplots.pdf and .eps
>   figures.
> - using PDF figures of images where PNG would be more appropriate.
> - including less than relevant material (such as how to install R,
>   with screenshots!)
>
> There are several ways to reduce the sizes of PDFs with no loss in quality,
> e.g. Adobe Acrobat Standard/Pro.
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>