Is there an easy way to "thin" a lattice plot? I often create plots from large data sets and use the pdf() command to save them to a file, but the resulting files can be huge, because every point in the underlying dataset is rendered in the plot, even though it isn't possible to see that much detail.

For example:

    require(Hmisc)
    x <- rnorm(1e6)

    pdf("test.pdf")
    Ecdf(x)
    dev.off()

The resulting pdf file is 31MB. Is there any easy way to get a smaller pdf file without having to manually prune the dataset?

Thanks.

- Elliot

--
Elliot Joel Bernstein, Ph.D. | Research Associate | FDO Partners, LLC
134 Mount Auburn Street | Cambridge, MA | 02138
Phone: (617) 503-4619 | Email: elliot.bernstein@fdopartners.com

On Jul 30, 2012, at 2:13 PM, Elliot Joel Bernstein wrote:

> Is there an easy way to "thin" a lattice plot? I often create plots from
> large data sets, and use the "pdf" command to save them to a file, but the
> resulting files can be huge, because every point in the underlying dataset
> is rendered in the plot, even though it isn't possible to see that much
> detail.
>
> For example:
>
> require(Hmisc)
> x <- rnorm(1e6)
>
> pdf("test.pdf")
> Ecdf(x)
> dev.off()
>
> The resulting pdf file is 31MB. Is there any easy way to get a smaller pdf
> file without having to manually prune the dataset?

There are plotting routines that display the density of distributions. I use hexbin fairly frequently, but that is for 2d plots. If you wanted the ECDF of a 1d vector, you could use cumsum() on the output of hist(), or quantile(), choosing the arguments (the breaks or the probabilities) to control the degree of aggregation. Either of these yields an 8KB file on my machine.

    library(lattice)

    # Cumulative proportion at the right edge of each histogram bin
    h <- hist(x, plot = FALSE)
    pdf("test.pdf")
    xyplot( cumsum(h$counts) / sum(h$counts) ~ h$breaks[-1] )
    dev.off()

    # 101 quantiles at evenly spaced probabilities
    pdf("test.pdf")
    xyplot( (0:100)/100 ~ quantile(x, probs = (0:100)/100) )
    dev.off()

David Winsemius, MD
Alameda, CA, USA
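
The quantile-based thinning suggested in the reply above can be sketched in base R alone. This is only an illustration under stated assumptions: it uses plot(ecdf(x)) rather than Hmisc's Ecdf(), a sample of 1e5 rather than 1e6 points to keep it quick, and temporary file names invented for the example. The file sizes it prints will differ from the 31MB and 8KB figures quoted in the thread.

```r
# Sketch: thin a large sample to 101 quantiles before plotting,
# then compare the sizes of the two PDF files. Base R only.
set.seed(1)
x <- rnorm(1e5)                  # assumption: smaller than the thread's 1e6

probs <- (0:100) / 100           # evenly spaced probabilities, 0 to 1
q <- quantile(x, probs = probs, names = FALSE)

full_file <- tempfile(fileext = ".pdf")   # hypothetical output paths
thin_file <- tempfile(fileext = ".pdf")

# Full ECDF: one step per data point ends up in the file
pdf(full_file)
plot(ecdf(x), main = "ECDF, every point")
dev.off()

# Thinned ECDF: only 101 (quantile, probability) pairs are drawn
pdf(thin_file)
plot(q, probs, type = "s", xlab = "x", ylab = "Proportion <= x",
     main = "ECDF, 101 quantiles")
dev.off()

sizes <- file.size(c(full_file, thin_file))
print(sizes)                     # bytes: full plot first, thinned plot second
```

The number of probabilities is the knob that trades file size against resolution. 101 points is David's choice, and a denser grid may be needed if the tails matter.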

On Tue, Jul 31, 2012 at 2:43 AM, Elliot Joel Bernstein <elliot.bernstein at fdopartners.com> wrote:

> Is there an easy way to "thin" a lattice plot? I often create plots from
> large data sets, and use the "pdf" command to save them to a file, but the
> resulting files can be huge, because every point in the underlying dataset
> is rendered in the plot, even though it isn't possible to see that much
> detail.
>
> For example:
>
> require(Hmisc)
> x <- rnorm(1e6)
>
> pdf("test.pdf")
> Ecdf(x)
> dev.off()

(This is not a lattice plot, BTW.)

> The resulting pdf file is 31MB.

Hmm, for me it's 192K. Perhaps you have not bothered to update R recently.

> Is there any easy way to get a smaller pdf
> file without having to manually prune the dataset?

In general, as David noted, you need to do some sort of data summarization; great if tools are available to do that, otherwise you have to do it yourself. In this case, for example, it seems reasonable to do

    Ecdf(quantile(x, probs = ppoints(500, a=1)))

If you don't like to do this yourself, ecdfplot() in latticeExtra will allow

    library(latticeExtra)
    ecdfplot(x, f.value = ppoints(500, a=1))

-Deepayan
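
To see what ppoints(500, a=1) supplies in the suggestion above, the following base-R sketch may help. It assumes a sample of 1e5 points and quantile()'s default type, and it checks agreement with the full ECDF only at the 500 retained points. Between those points, a plot of the thinned data can be off by up to roughly the grid spacing of 1/499 in probability.

```r
# Sketch: what ppoints(500, a = 1) gives, and how closely the 500
# thinned points track the full ECDF. Base R only.
set.seed(1)
x <- rnorm(1e5)                  # assumption: smaller than the thread's 1e6

p <- ppoints(500, a = 1)
# With a = 1, ppoints() reduces to an evenly spaced grid that includes 0 and 1
stopifnot(isTRUE(all.equal(p, seq(0, 1, length.out = 500))))

# 500 quantiles; because p runs from 0 to 1, min(x) and max(x) are kept
q <- quantile(x, probs = p, names = FALSE)

# Evaluate the full ECDF at the thinned points and compare with p
Fn <- ecdf(x)
max_dev <- max(abs(Fn(q) - p))
print(max_dev)                   # largest disagreement at the retained points
```

Because the grid includes both endpoints, the thinned plot still spans the full range of the data, which the default ppoints(500) would not guarantee.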