jiho
2007-Oct-04 08:21 UTC
[R] pdf() device uses fonts to represent points - data alteration?
Hello all,

I discovered that the pdf device uses fonts to represent "points" symbols (as in plot(...,type="p",...) ). Namely it uses ZapfDingbats with symbol U+25cf. This can lead to problems when the font is not available, or available in another version (such as points being replaced by other symbols, or worse: slightly displaced).

Furthermore, it also causes problems when opening the pdf files for editing in other programs. I know that for reproducibility one should avoid doing this, but there are cases where R is simply not suited to produce the end result graphic directly using code (e.g., replacing some colors by CMYK versions for color consistency in print). In addition, publishers often like being able to retouch graphics to ensure font consistency or such, and this will be destructive in the case of these pdfs. For example, Inkscape interprets points as squares (more like U+2751 in ZapfDingbats) and Adobe Illustrator does not even recognize the font (substituting AdobePiStd).

I tried to embed fonts with embedFonts() but this does not solve the issue with editing (Inkscape produces a kind of star and AI still chokes on the font) and, worse, it modifies how the original graphic renders in pdf viewers: the circles are now filled (I believe this is because this is the default state of the ZapfDingbats character).

So my questions are:
- does anyone have a workaround for this?
- why can't the pdf device use shapes instead of fonts to represent data points? It would appear to be a much more robust approach and would ensure that the points are rendered the same everywhere. Font substitution in axes labels is not as bad since it does not modify the data itself (at worst the labels are offset a little bit) but font substitution on the data points can really harm the graphic.
Examples of code:

pdf("test.pdf")
plot(0,0,xlab="",ylab="",bty="n",xaxt="n",yaxt="n"); grid(lty=1);
dev.off()
embedFonts("test.pdf","pdfwrite","test_embed.pdf")

visualize the fonts:

pdffonts test.pdf

and a package with the two pdf files and bitmaps of how they render or are interpreted in various programs:
http://jo.irisson.free.fr/dropbox/test_R_pdf_fonts.zip

Thank you in advance for your attention and help.

JiHO
---
http://jo.irisson.free.fr/
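For readers without the pdffonts command-line tool, the font check above can be approximated from within R. This is only a rough sketch: the helper name pdf_mentions_font and the temporary output path are made up for illustration, and searching the raw bytes for the font name is a crude substitute for pdffonts, not an equivalent. Whether the name is found depends on the R version and on the pdf() settings in use.

```r
# Hypothetical helper (not part of R): reports whether a font name occurs
# anywhere in the raw bytes of a file. PDF font dictionaries are normally
# stored uncompressed, so a byte search is a crude but workable check.
pdf_mentions_font <- function(path, font = "ZapfDingbats") {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  length(grepRaw(font, bytes, fixed = TRUE)) > 0
}

# Reproduce the test plot from the post in a temporary file (assumed path).
test_file <- file.path(tempdir(), "test.pdf")
pdf(test_file)
plot(0, 0, xlab = "", ylab = "", bty = "n", xaxt = "n", yaxt = "n")
grid(lty = 1)
invisible(dev.off())

# TRUE means the font name appears in the file; the result depends on the
# R version and pdf() defaults, so no particular value is claimed here.
print(pdf_mentions_font(test_file))
```

The same helper can be pointed at the output of embedFonts() or of any of the workarounds discussed in this thread to compare results.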
Paul Murrell
2007-Oct-31 11:00 UTC
[R] pdf() device uses fonts to represent points - data alteration?
Hi

jiho wrote:
> Hello all,
>
> I discovered that the pdf device uses fonts to represent "points"
> symbols (as in plot(...,type="p",...) ). Namely it uses ZapfDingbats
> with symbol U+25cf. This can lead to problems when the font is not
> available, or available in another version (such as points being
> replaced by other symbols, or worse: slightly displaced).
> Furthermore, it also causes problems when opening the pdf files for
> editing in other programs. I know that for reproducibility one should
> avoid doing this but there are cases where R is simply not suited to
> produce the end result graphic directly using code (Ex: replace some
> colors by CMYK versions for color consistency in print). In addition,
> publishers also often like being able to retouch graphics to ensure
> font consistency or such, and this will be destructive in the case
> of these pdfs. For example, Inkscape interprets points as squares
> (more like U+2751 in ZapfDingbats) and Adobe Illustrator does not
> even recognize the font (substituting AdobePiStd).
> I tried to embed fonts with embedFonts() but this does not solve the
> issue with editing (Inkscape produces a kind of star and AI still
> chokes on the font) and worse, it modifies how the original graphic
> renders in pdf viewers: the circles are now filled (I believe this is
> because this is the default state of the ZapfDingbats character).
>
> So my questions are:
> - does anyone have a workaround for this?
> - why can't the pdf device use shapes instead of fonts to represent
> data points? It would appear as a much more robust approach and would
> ensure that the points are rendered the same everywhere. Font
> substitution in axes labels is not as bad since it does not modify
> the data itself (at worst the labels are offset a little bit) but
> font substitution on the data points can really harm the graphic.

If I recall correctly, the PDF device uses a character for small circles because that looks better.
There is no PDF circle primitive, so circles have to be drawn using Bezier curves. The original author may be able to elaborate on that.

Two suggestions for workarounds:

(i) produce PostScript and then convert to PDF using something like ghostscript (e.g., ps2pdf)

(ii) use an almost-but-not-quite opaque colour, e.g., rgb(0, 0, 0, .99) for the points. If the points are not fully opaque, the character is not used.

Paul

> Examples of code:
> pdf("test.pdf")
> plot(0,0,xlab="",ylab="",bty="n",xaxt="n",yaxt="n"); grid(lty=1);
> dev.off()
> embedFonts("test.pdf","pdfwrite","test_embed.pdf")
>
> visualize the fonts:
> pdffonts test.pdf
>
> and a package with the two pdf files and bitmaps of how they render
> or are interpreted in various programs:
> http://jo.irisson.free.fr/dropbox/test_R_pdf_fonts.zip
>
> Thank you in advance for your attention and help.
>
> JiHO
> ---
> http://jo.irisson.free.fr/
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
paul at stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/
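The two workarounds suggested above might look roughly like the following when applied to the test plot from the original post. This is a sketch under assumptions rather than a tested recipe: the output file names are invented, the ps2pdf step is skipped when the tool is not found on the PATH, and version = "1.4" is passed explicitly because semi-transparent colours need at least PDF 1.4.

```r
out_dir <- tempdir()  # assumed output location, for illustration only

# Workaround (i): draw to PostScript, then convert with Ghostscript's ps2pdf.
ps_file <- file.path(out_dir, "test.ps")
pdf_from_ps <- file.path(out_dir, "test_from_ps.pdf")
postscript(ps_file)
plot(0, 0, xlab = "", ylab = "", bty = "n", xaxt = "n", yaxt = "n")
grid(lty = 1)
invisible(dev.off())

ps2pdf <- Sys.which("ps2pdf")
if (nzchar(ps2pdf)) {
  # Conversion only runs when ps2pdf is installed.
  system2(ps2pdf, c(shQuote(ps_file), shQuote(pdf_from_ps)))
} else {
  message("ps2pdf not found; skipping the PostScript-to-PDF conversion")
}

# Workaround (ii): draw the points in an almost-opaque colour, so that the
# device does not use the font character for them.
alpha_file <- file.path(out_dir, "test_alpha.pdf")
pdf(alpha_file, version = "1.4")
plot(0, 0, xlab = "", ylab = "", bty = "n", xaxt = "n", yaxt = "n",
     col = rgb(0, 0, 0, 0.99))
grid(lty = 1)
invisible(dev.off())
```

Workaround (ii) keeps everything inside R, at the cost of a colour specification that looks odd to anyone reading the code later; a short comment next to the rgb() call explaining the reason is probably worthwhile.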
Thomas Petzoldt
2007-Nov-01 19:18 UTC
[R] pdf() device uses fonts to represent points - data alteration?
Hello,

I had the same problem. When opening PDFs with a recent developer version of Inkscape, all circles were replaced by the letter "q"; see a screenshot of the imported figure:

http://www.simecol.de/figs/R_pdf_inkscape.png

I spent at least two hours trying different development versions of Inkscape, different versions of R, reading docs, trying different machines, installing fonts etc., finally giving up. Now, the two workarounds of Paul Murrell indeed solved the problem for *me*. Thank you. Here are example results of workarounds (i) and (ii):

http://www.simecol.de/figs/R_ps_pdf_inkscape.png

or

http://www.simecol.de/figs/R_pdftrans_inkscape.png

One problem remains. I wanted to demonstrate post-editing PDFs with Inkscape to motivate students to use R for graphics even if they don't want to "become programmers". However, double conversion (via PostScript) or the magic of transparency and opaqueness is not yet the way to increase the trust of point-and-click users in R. Maybe this is a topic for r-devel?

Thomas

--
Thomas Petzoldt
Technische Universitaet Dresden
Institut fuer Hydrobiologie
01062 Dresden
GERMANY
http://tu-dresden.de/Members/thomas.petzoldt?set_language=en
jiho
2007-Nov-01 19:40 UTC
[R] pdf() device uses fonts to represent points - data alteration?
On 2007-November-01, at 20:18, Thomas Petzoldt wrote:
> I had the same problem. When opening PDFs with a recent developer
> version of Inkscape all circles were replaced by the letter "q",
> see a screenshot of the imported figure:
>
> http://www.simecol.de/figs/R_pdf_inkscape.png
>
> I spent at least two hours trying different development versions of
> Inkscape, different versions of R, reading docs, trying different
> machines, installing fonts etc., finally giving up. Now, the two
> workarounds of Paul Murrell indeed solved the problem for *me*.
> Thank you. Here are example results of workarounds (i) and (ii):
>
> http://www.simecol.de/figs/R_ps_pdf_inkscape.png
>
> or
>
> http://www.simecol.de/figs/R_pdftrans_inkscape.png
>
> One problem remains. I wanted to demonstrate post-editing PDFs with
> Inkscape to motivate students to use R for graphics even if they
> don't want to "become programmers". However, double conversion (via
> postscript) or the magics of transparency and opaqueness are not
> yet the way to increase the trust of point-and-click users in R.
> Maybe this is a topic for r-devel?

By the way, depending on what OS you are on, you may find an entirely SVG workflow more suitable:

R with RSvgDevice package -1-> SVG figures -2-> Inkscape -3-> whatever you like (SVG, PNG, PDF...)

This hands all transparency, fonts etc. over to Inkscape, so it is fine on this side. The only "problem" with this workflow for me is that many of my plots stay between stages 1 and 2 and I like to be able to view them quickly. I would need a quick SVG viewer but there are none on OS X. If you are on Linux, many document viewers (eog, evince, gThumb) can display SVGs so you would be all set.

JiHO
---
http://jo.irisson.free.fr/
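Stage 1 of the SVG workflow above could be sketched as follows. Note the substitution: instead of the RSvgDevice package named in the message, this uses the svg() device from grDevices, which needs an R build with cairo support. The output path is again an assumption made for illustration.

```r
# Assumed output path; the resulting file can be opened directly in Inkscape.
svg_file <- file.path(tempdir(), "test.svg")

if (capabilities("cairo")) {
  # svg() from grDevices is used here in place of RSvgDevice.
  svg(svg_file, width = 7, height = 7)
  plot(0, 0, xlab = "", ylab = "", bty = "n", xaxt = "n", yaxt = "n")
  grid(lty = 1)
  invisible(dev.off())
} else {
  message("This R build has no cairo support; svg() is unavailable")
}
```

How plotting symbols are represented in the SVG depends on the device, so it is worth opening the result in Inkscape before building a workflow around it.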