thr3ads.net - R help - [R] Display time of PDF plots [Sep 2018]


Rich Shepard

2018-Sep-03 17:45 UTC

[R] Display time of PDF plots

This may be an inappropriate forum for this question. If so, please point
me in a better direction.

   A current project includes scatter plots with thousands of points. Saved
as PDF files they display slowly using a pdf viewer or when included in the
PDF output of a LaTeX document.

   Is there a process by which these plots can be 'thinned' so they show the
same overall patterns but with fewer points so they display more quickly?

   Rasterizing them to .jpg files using 'convert' allows them to load
immediately, but the bit-mapped resolution is, of course, much lower than
the vector PDF format.

Rich

Bert Gunter

2018-Sep-03 18:20 UTC

[R] Display time of PDF plots

1. Plot a random sample of the points (e.g. of rows of a matrix/dataframe
containing "x" and "y" columns).

2. See the hexbin package

3. Check out the graphics taskview on cran:
https://cran.r-project.org/web/views/Graphics.html
(though it may be somewhat dated by now)

4. Internet search: e.g. on "display scatterplots with thousands of points".
Typical hit:
https://stackoverflow.com/questions/7714677/scatterplot-with-too-many-points

5. Search/Post on stats.stackexchange.com instead.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
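
Suggestions 1 and 2 above could be sketched roughly as follows. This is only an illustration on simulated data: the data frame `dat`, the sample size of 5000, and the output file names are assumptions, not anything from the original project, and the hexbin part runs only if that package happens to be installed.

```r
## Simulated stand-in for a large scatter plot (hypothetical data)
set.seed(1)
n   <- 50000
dat <- data.frame(x = rnorm(n), y = rnorm(n))

## 1. Thin by plotting a random sample of the rows
keep    <- sample(nrow(dat), size = 5000)
thinned <- dat[keep, ]

thin_pdf <- file.path(tempdir(), "scatter_thinned.pdf")
pdf(thin_pdf, width = 6, height = 6)
plot(thinned$x, thinned$y, pch = 16, cex = 0.4,
     xlab = "x", ylab = "y", main = "Random sample of 5000 points")
dev.off()

## 2. Bin all points into hexagons instead (needs the hexbin package)
if (requireNamespace("hexbin", quietly = TRUE)) {
  library(hexbin)
  hb <- hexbin(dat$x, dat$y, xbins = 40)
  hex_pdf <- file.path(tempdir(), "scatter_hexbin.pdf")
  pdf(hex_pdf, width = 6, height = 6)
  plot(hb, main = "All points, binned into hexagons")
  dev.off()
}
```

Sampling keeps individual points but may drop rare outliers; binning keeps every observation in the counts but no longer shows single points, so which one fits depends on what the plot is meant to show.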



Rich Shepard

2018-Sep-03 18:31 UTC

[R] Display time of PDF plots

On Mon, 3 Sep 2018, Bert Gunter wrote:
> 1. Plot a random sample of the points (e.g. of rows of matrix/dataframe
> containing "x" and "y" columns
>
> 2. See the hexbin package
>
> 3. Check out the graphics taskview on cran:
> https://cran.r-project.org/web/views/Graphics.html
> (though it may be somewhat dated by now)
>
> 4. Internet search:  e.g. on "display scatterplots with thousands of points"
> typical hit:
> https://stackoverflow.com/questions/7714677/scatterplot-with-too-many-points
>
> 5. Search/Post on stats.stackexchange.com instead.

Bert,

   I did a web search without finding useful information. Probably not the
best search terms.

   Will implement your suggestions.

Thanks,

Rich

David L Carlson

2018-Sep-03 18:36 UTC

[R] Display time of PDF plots

If the plot is being displayed on a monitor, it is being bitmapped to the
resolution of the display device regardless of how you save it. Most computer
monitors are about 100dpi.

If the problem is that the points are overprinting, Bert's suggestion to use
hexbin() is the way to go.

If the points are not substantially overprinting, you could just save the plot
in raster format using an LZW-compressed tiff() or png() at the maximum likely
resolution of the display device (take zooming into account by going up to
600 dpi or 1200 dpi, for example). Don't use jpg since it is lossy and you
will get halos when you zoom in.

You can always preserve a vector version for publication. If you have Adobe
Acrobat (not Reader), you can Save As Other | Image | tiff (or png) and set the
resolution before exporting.

----------------------------
David L. Carlson
Department of Anthropology
Texas A&M University
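
The raster route described above might look like the sketch below. The 6 x 6 inch size, the 600 dpi resolution, the simulated data and the file names are all assumptions for illustration; `compression = "lzw"` is the tiff() argument that selects LZW compression. Each device is opened only if this build of R reports support for it.

```r
## Simulated points standing in for the real scatter plot (hypothetical data)
set.seed(1)
x <- rnorm(20000)
y <- rnorm(20000)

## Pixel size follows from physical size times resolution
width_in  <- 6
height_in <- 6
dpi       <- 600
px <- width_in * dpi   # 3600 pixels per side

png_file  <- file.path(tempdir(), "scatter.png")
tiff_file <- file.path(tempdir(), "scatter.tif")

## PNG is lossless, so no halos around points when zooming in
if (capabilities("png")) {
  png(png_file, width = width_in, height = height_in, units = "in", res = dpi)
  plot(x, y, pch = 16, cex = 0.3)
  dev.off()
}

## TIFF with LZW compression is also lossless
if (capabilities("tiff")) {
  tiff(tiff_file, width = width_in, height = height_in, units = "in",
       res = dpi, compression = "lzw")
  plot(x, y, pch = 16, cex = 0.3)
  dev.off()
}
```

Writing the bitmap directly from R at the target resolution avoids a second conversion step from the PDF.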



Rich Shepard

2018-Sep-03 19:10 UTC

[R] Display time of PDF plots

On Mon, 3 Sep 2018, David L Carlson wrote:
> If the plot is being displayed on a monitor, it is being bitmapped to the
> resolution of the display device regardless of how you save it. Most
> computer monitors are about 100dpi.

David,

   I'm looking at the report on the monitor. I suspect that most readers
will, too. But some will print it.

> If the problem is that the points are overprinting, Bert's suggestion to
> use hexbin() is the way to go.

   Most look like overprints, but at the top there are discrete print
characters.

> If the points are not substantially overprinting, you could just save the
> plot in raster format using an LZW-compressed tiff() or png() to the
> maximum likely resolution of the display device (take zooming into account
> by going up to 600dpi or 1200dpi, for example). Don't use jpg since it is
> lossy and you will get halos when you zoom in.

   I used convert to produce .png images but, of course, bit-maps of plots
and text are less sharp than are vector images.

> You can always preserve a vector version for publication. If you have
> Adobe Acrobat (not Reader), you can Save As Other | Image | tiff (or png)
> and set the resolution before exporting.

   'convert', the ImageMagick tool, does this, too.

Thanks,

Rich

Paul Murrell

2018-Sep-03 19:32 UTC

[R] [FORGED] Re: Display time of PDF plots

Hi

Another option is to just rasterize the points (but leave the rest of 
the plot vector).  See ...

https://www.stat.auckland.ac.nz/~paul/Reports/rasterize/rasterize.html

Paul


Rich Shepard

2018-Sep-04 16:41 UTC

[R] Display time of PDF plots

On Mon, 3 Sep 2018, Rich Shepard wrote:
> Is there a process by which these plots can be 'thinned' so they show the
> same overall patterns but with fewer points so they display more quickly?

Bert/Paul/David/John:

   Thanks very much for the suggestions. I think an appropriate way to
illustrate the patterns is to plot the median and maximum for each month
(for all sites). That's the important information and plotting each daily
point over 13 years obscures that information.

   The dataframe is structured this way:

str(rainfall)
'data.frame':	113569 obs. of  6 variables:
  $ name    : chr  "Headworks Portland Water" "Headworks Portland Water" "Headworks Portland Water" "Headworks Portland Water" ...
  $ easting : num  2370575 2370575 2370575 2370575 2370575 ...
  $ northing: num  199338 199338 199338 199338 199338 ...
  $ elev    : num  228 228 228 228 228 228 228 228 228 228 ...
  $ sampdate: Date, format: "2005-01-01" "2005-01-02" ...
  $ prcp    : num  0.59 0.08 0.1 0 0 0.02 0.05 0.1 0 0.02 ...

   There are probably multiple ways of extracting the monthly median and
maximum 'prcp' and I don't know how to identify the appropriate one. Is
there a task view for this type of data manipulation? I've not done anything
like this before and would appreciate a pointer to where I start to learn.

Regards,

Rich

MacQueen, Don

2018-Sep-05 20:05 UTC

[R] Display time of PDF plots

(this is somewhat a change of subject from the original question)

Rich, there are functions such as aggregate() in base R. There are also many
options in CRAN packages.

But I tend to have difficulty getting them to do exactly what I want, and
usually end up rolling my own.

The idea is to split the data into groups by station and month, then calculate
summary stats for each group, then recombine into a new data frame.

## untested with your data, but this kind of approach works well for me
## note that this code assumes easting, northing, and elevation are in fact
## unique within each group
## if they are not, you will get an ERROR

## add a 'month' variable
raindf <- rainfall
raindf$mon <- format(raindf$sampdate,'%Y-%m')

## summarise one station-month group as a one-row data frame
mysum <- function(df) {
  data.frame(name     = unique(df$name),
             easting  = unique(df$easting),
             northing = unique(df$northing),
             elev     = unique(df$elev),
             mon      = unique(df$mon),
             pr.med   = median(df$prcp),
             pr.max   = max(df$prcp))
}

tmpdf <- split(raindf, paste(raindf$name, raindf$mon) )

## at this point, you can check your summary stats function with, for example,
mysum(tmpdf[[1]])
mysum(tmpdf[[2]])

## when satisfied with mysum(), do this
tmpsum <- lapply(tmpdf, mysum)

## recombine
rain.by.mon <- do.call(rbind, tmpsum)

## might still want to create a numeric month to facilitate plotting
## or maybe assign each month to the first of the month, or the 15th, or end or
## whatever makes sense
rain.by.mon$mondt <- as.Date(paste0(rain.by.mon$mon,'-01'))




--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
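
For comparison, the aggregate() route mentioned at the top of this message might look like the sketch below. It summarises by month across all sites, which is what Rich described wanting to plot. The small `rainfall` data frame here is simulated and the site names and output file name are hypothetical; only the column names `name`, `sampdate` and `prcp` are taken from the str() output earlier in the thread.

```r
## Small simulated stand-in for the 'rainfall' data frame (hypothetical values)
set.seed(1)
days <- seq(as.Date("2005-01-01"), as.Date("2005-03-31"), by = "day")
rainfall <- data.frame(
  name     = rep(c("Site A", "Site B"), each = length(days)),
  sampdate = rep(days, times = 2),
  prcp     = round(rexp(2 * length(days), rate = 5), 2)
)

## Month label, as in the split/lapply code above
rainfall$mon <- format(rainfall$sampdate, "%Y-%m")

## Median and maximum per month, over all sites;
## FUN returns two values, so 'prcp' comes back as a two-column matrix
monthly <- aggregate(prcp ~ mon, data = rainfall,
                     FUN = function(v) c(med = median(v), max = max(v)))

## Flatten the matrix column into ordinary columns prcp.med and prcp.max
monthly <- do.call(data.frame, monthly)

## First of the month as a Date, to use on the x axis
monthly$mondt <- as.Date(paste0(monthly$mon, "-01"))

## Two lines instead of thousands of daily points
out_pdf <- file.path(tempdir(), "monthly_prcp.pdf")
pdf(out_pdf, width = 7, height = 4)
plot(monthly$mondt, monthly$prcp.max, type = "b",
     ylim = c(0, max(monthly$prcp.max)),
     xlab = "Month", ylab = "prcp", main = "Monthly maximum and median")
lines(monthly$mondt, monthly$prcp.med, type = "b", lty = 2)
legend("topright", legend = c("maximum", "median"), lty = c(1, 2))
dev.off()
```

Using `prcp ~ name + mon` as the formula instead would give one row per station and month, closer to what the split/lapply version produces.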
 
 

