jiho
2007-Oct-04 08:21 UTC
[R] pdf() device uses fonts to represent points - data alteration?
Hello all,
I discovered that the pdf device uses fonts to represent "points"
symbols (as in plot(...,type="p",...) ). Namely it uses ZapfDingbats
with symbol U+25cf. This can lead to problems when the font is not
available, or available in another version (such as points being
replaced by other symbols or, worse, slightly displaced).
Furthermore, it also causes problems when opening the pdf files for
editing in other programs. I know that for reproducibility one should
avoid doing this but there are cases where R is simply not suited to
produce the end result graphic directly using code (Ex: replace some
colors by CMYK versions for color consistency in print). In addition,
publishers also often like being able to retouch graphics to ensure
font consistency or such, and this will be destructive in the case
of these pdfs. For example, Inkscape interprets points as squares
(more like U+2751 in ZapfDingbats) and Adobe Illustrator does not
even recognize the font (substituting AdobePiStd).
I tried to embed fonts with embedFonts() but this does not solve the
issue with editing (Inkscape produces a kind of star and AI still
chokes on the font) and, worse, it modifies how the original graphic
renders in pdf viewers: the circles are now filled (I believe this is
because this is the default state of the ZapfDingbats character).
So my questions are:
- does anyone have a workaround for this?
- why can't the pdf device use shapes instead of fonts to represent
data points? It would appear to be a much more robust approach and would
ensure that the points are rendered the same everywhere. Font
substitution in axes labels is not as bad since it does not modify
the data itself (at worst the labels are offset a little bit) but
font substitution on the data points can really harm the graphic.
Examples of code:
pdf("test.pdf")
plot(0,0,xlab="",ylab="",bty="n",xaxt="n",yaxt="n");
grid(lty=1);
dev.off()
embedFonts("test.pdf","pdfwrite","test_embed.pdf")
visualize the fonts:
pdffonts test.pdf
and a package with the two pdf files and bitmaps of how they render
or are interpreted in various programs:
http://jo.irisson.free.fr/dropbox/test_R_pdf_fonts.zip
Thank you in advance for your attention and help.
JiHO
---
http://jo.irisson.free.fr/
Paul Murrell
2007-Oct-31 11:00 UTC
[R] pdf() device uses fonts to represent points - data alteration?
Hi

jiho wrote:
> [...]
> So my questions are:
> - does anyone have a work around this?
> - why can't the pdf device use shapes instead of fonts to represent
> data point?
> [...]

If I recall correctly, the PDF device uses a character for small circles
because that looks better. There is no PDF circle primitive, so circles
have to be drawn using Bezier curves. The original author may be able to
elaborate on that.

Two suggestions for workarounds:
(i) produce PostScript and then convert to PDF using something like
Ghostscript (e.g., ps2pdf)
(ii) use an almost-but-not-quite opaque colour, e.g., rgb(0, 0, 0, .99),
for the points. If the points are not fully opaque, the character is
not used.

Paul
--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
paul at stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/
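The two workarounds suggested above can be sketched in R roughly as follows. This is only an illustration under a few assumptions: the output file names and the helper draw_test_plot() are made up for the example, the figure is the minimal one from the original post, and the conversion step assumes that Ghostscript's ps2pdf is on the PATH (it is skipped otherwise).

```r
# Sketch of the two workarounds; file names are illustrative only.
out_dir <- tempdir()

# Hypothetical helper: the minimal figure from the original post,
# with an optional colour for the plotted point.
draw_test_plot <- function(col = "black") {
  plot(0, 0, xlab = "", ylab = "", bty = "n", xaxt = "n", yaxt = "n",
       col = col)
  grid(lty = 1)
}

# Workaround (ii): draw the points in an almost opaque colour.
# Semi-transparent colours need PDF version 1.4 or later.
pdf_alpha <- file.path(out_dir, "test_alpha.pdf")
pdf(pdf_alpha, version = "1.4")
draw_test_plot(col = rgb(0, 0, 0, 0.99))
invisible(dev.off())

# Workaround (i): write PostScript first, then convert it to PDF.
ps_file <- file.path(out_dir, "test.ps")
pdf_from_ps <- file.path(out_dir, "test_from_ps.pdf")
postscript(ps_file)
draw_test_plot()
invisible(dev.off())

ps2pdf <- unname(Sys.which("ps2pdf"))
if (nzchar(ps2pdf)) {
  system2(ps2pdf, shQuote(c(ps_file, pdf_from_ps)))
} else {
  message("ps2pdf not found on the PATH; skipping the conversion step")
}
```

Running pdffonts on the resulting files, as in the original post, is one way to check which fonts each PDF still references.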
Thomas Petzoldt
2007-Nov-01 19:18 UTC
[R] pdf() device uses fonts to represent points - data alteration?
Hello,

I had the same problem. When opening PDFs with a recent developer
version of Inkscape, all circles were replaced by the letter "q"; see a
screenshot of the imported figure:

http://www.simecol.de/figs/R_pdf_inkscape.png

I spent at least two hours trying different development versions of
Inkscape, different versions of R, reading docs, trying different
machines, installing fonts etc., and finally gave up. Now, the two
workarounds of Paul Murrell indeed solved the problem for *me*.
Thank you. Here are example results of workarounds (i) and (ii):

http://www.simecol.de/figs/R_ps_pdf_inkscape.png

or

http://www.simecol.de/figs/R_pdftrans_inkscape.png

One problem remains. I wanted to demonstrate post-editing PDFs with
Inkscape to motivate students to use R for graphics even if they
don't want to "become programmers". However, double conversion (via
PostScript) or the magic of transparency and opaqueness is not
yet the way to increase the trust of point-and-click users in R.
Maybe this is a topic for r-devel?

Thomas
--
Thomas Petzoldt
Technische Universitaet Dresden
Institut fuer Hydrobiologie
01062 Dresden
GERMANY
http://tu-dresden.de/Members/thomas.petzoldt?set_language=en
jiho
2007-Nov-01 19:40 UTC
[R] pdf() device uses fonts to represent points - data alteration?
On 2007-November-01, at 20:18, Thomas Petzoldt wrote:
> [...]
> One problem remains. I wanted to demonstrate post-editing PDFs with
> inkscape to motivate students to use R for graphics even if they
> dont want to "become programmers". However, double conversion (via
> postscript) or the magics of transparency and opaqueness are not
> yet the way to increase the trust of point-and-click users to R.
> Maybe this is a topic for r-devel?

By the way, depending on which OS you use, you may find an entirely
SVG-based workflow more suitable:

R with the RSvgDevice package -1-> SVG figures -2-> Inkscape -3->
whatever you like (SVG, PNG, PDF...)

This hands all transparency, fonts etc. over to Inkscape, so it is fine
on that side. The only "problem" with this workflow for me is that many
of my plots stay between stages 1 and 2 and I like to be able to view
them quickly. I would need a quick SVG viewer but there are none on
OS X. If you are on Linux, many document viewers (eog, evince, gThumb)
can display SVGs so you would be all set.

JiHO
---
http://jo.irisson.free.fr/
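As a rough illustration of stage 1 of that workflow, here is a minimal sketch that writes the test figure from the start of this thread to an SVG file. Note the substitution: it uses the svg() device that ships with R's grDevices (which needs a cairo-enabled build) rather than the RSvgDevice package mentioned above, and the output file name is made up for the example.

```r
# Stage 1 of the SVG workflow: write the figure as SVG from R.
# Assumption: base svg() device instead of the RSvgDevice package.
svg_file <- file.path(tempdir(), "test.svg")

if (capabilities("cairo")) {
  svg(svg_file, width = 7, height = 7)
  plot(0, 0, xlab = "", ylab = "", bty = "n", xaxt = "n", yaxt = "n")
  grid(lty = 1)
  invisible(dev.off())
} else {
  message("This R build has no cairo support; svg() is not available")
}
```

The resulting file can then be opened in Inkscape for stages 2 and 3.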