thr3ads.net - R devel - [Rd] RFC: Kerning, postscript() and pdf() [Oct 2008]


Prof Brian Ripley

2008-Oct-12 15:36 UTC

[Rd] RFC: Kerning, postscript() and pdf()

Ei-ji Nakama has pointed out (from another Japanese user, I believe) that 
postscript() and pdf() have not been handling kerning correctly, and this 
is a request for opinions about how we should correct it.

Kerning is the adjustment of the spacing between letters from their 
natural width, so that for example 'Yo' is usually typeset with the o 
closer to the Y than 'Yl' would be.  Kerning is not very well 
standardized, so that for example R's default Helvetica and its URW clone 
(Nimbus Sans) have quite different ideas of the amount of kerning 
corrections for 'Yo'. This matters, because not many people actually see
Helvetica when viewing R's PostScript or PDF output, but rather a similar 
face like Nimbus Sans or Arial, or in the case of Acrobat Reader, a not 
very similar face.  Kerning is only a feature of some proportionally 
spaced fonts and so not of Courier nor CJK fonts.

The current position (R <= 2.8.0) is that string widths have been 
computed using kerning from the Adobe Font Metric files for the nominal 
font, but the strings have been displayed without using kerning (at least 
in the viewers we are aware of, and the PostScript and PDF reference 
manuals mandate that behaviour, if rather obscurely).  This means that in 
strings such as 'You', the width used in the string placement differs 
from that actually displayed.
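
The mismatch can be sketched with a toy metric table.  The widths and the single kerning pair below are illustrative assumptions in AFM-style units (1/1000 of the font size), not values read from a real AFM file, and the two helper functions are hypothetical names used only for this sketch.

```r
# Illustrative AFM-style metrics in 1/1000 of the font size.
# These values are assumptions for the sketch, not read from an AFM file.
char_width <- c(Y = 667, o = 556, u = 556, " " = 278)
kern_pairs <- c("Yo" = -140)

# Width as the plain sum of glyph widths (what the viewer displays).
width_unkerned <- function(s) {
  chars <- strsplit(s, "")[[1]]
  sum(char_width[chars])
}

# Width with pair-kerning corrections added (what the width
# computation has been using).
width_kerned <- function(s) {
  chars <- strsplit(s, "")[[1]]
  pairs <- paste0(head(chars, -1), tail(chars, -1))
  adj <- kern_pairs[pairs]          # NA where a pair has no kerning entry
  sum(char_width[chars]) + sum(adj, na.rm = TRUE)
}

xx <- "You You You You You You You You"
displayed <- width_unkerned(xx)
computed  <- width_kerned(xx)
displayed - computed   # eight 'Yo' pairs, 140 units each
```

With these assumed numbers the displayed string is 1120 units (1.12 times the font size) wider than the width used to place it.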

For postscript(), this doesn't have much impact, as centring or right 
justification ('hadj' in text()) is done by PostScript code and computes
the width from the actual font used (and so copes well with font 
substitution).  It might affect the fine layout in plotmath, but using 
strings which would be kerned in annotations is not common.

For pdf() the effect is more commonly seen, as all text is set 
left-justified, and the computed width is used to centre/right-justify.
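
A rough sketch of why centred and right-justified text suffers on pdf(): the start of the string is derived from the computed width, so any error in that width moves the text.  The widths below are made-up numbers in points, and left_edge() is a hypothetical helper, not part of R.

```r
# Left edge chosen for text of assumed width w, justified at x with
# adj = 0 (left), 0.5 (centre) or 1 (right).
left_edge <- function(x, w, adj) x - adj * w

# Hypothetical widths in points for one string: the width used in the
# computation (with kerning) and the width actually displayed (without).
w_computed  <- 100
w_displayed <- 110

adj <- c(left = 0, centre = 0.5, right = 1)
start <- left_edge(200, w_computed, adj)     # where the text is started
ideal <- left_edge(200, w_displayed, adj)    # where it ought to start
start - ideal                                # misplacement in points
```

Left-justified text is unaffected, centred text is off by half the width error, and right-justified text by all of it.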

There are several things we could do:

A.  Do nothing, for back compatibility.  After all, this has been going on 
for years and no one has complained until last month.

B.  Ignore kerning, and hence change the string width computations to 
match the current display.  This is more attractive than it appears at 
first sight -- as far as I know all other devices ignore kerning, and we 
are increasingly used to seeing 'typeset' output without kerning.  It 
would be desirable when copying graphs by e.g. dev.copy2eps from devices 
that do not kern.

C.  Insert kerning corrections by splitting up strings, so e.g. 'You' is
set as (Y)-140 kc(ou): this is what TeX engines do.

D.  Compute the position of each letter in the string and place them 
individually.

C and D would give visually identical output when the font used is exactly 
as specified, and hopefully also when a substitute font is used with the 
same glyph widths (as substituting Nimbus Sans for Helvetica, at least for 
some versions of each), but where the substitute is a poor match, C ought 
to look more elegant but line up less well.  D would produce much larger 
files than C.
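
The difference between C and D can be sketched as follows.  This is only an illustration under assumptions: the metrics are toy values, tj_array() and glyph_offsets() are hypothetical helpers, and the PDF TJ operator is used as one way of writing a split string (it is not stated above which operators an implementation would emit).  Escaping of parentheses and backslashes in the string is ignored.

```r
# Illustrative metrics in 1/1000 of the font size (assumed values).
char_width <- c(Y = 667, o = 556, u = 556)
kern_pairs <- c("Yo" = -140)

# Option C: split the string wherever a kerned pair occurs and put the
# correction between the pieces.  TJ subtracts its numbers from the
# current position, so a kern of -140 is written as 140.
tj_array <- function(s) {
  chars <- strsplit(s, "")[[1]]
  out <- "[("
  for (i in seq_along(chars)) {
    out <- paste0(out, chars[i])
    if (i < length(chars)) {
      k <- kern_pairs[paste0(chars[i], chars[i + 1])]
      if (!is.na(k)) out <- paste0(out, ") ", -k, " (")
    }
  }
  paste0(out, ")] TJ")
}

# Option D: compute the x offset of every glyph so that each one can be
# placed individually.
glyph_offsets <- function(s) {
  chars <- strsplit(s, "")[[1]]
  n <- length(chars)
  k <- kern_pairs[paste0(head(chars, -1), tail(chars, -1))]
  k[is.na(k)] <- 0
  adv <- char_width[chars]
  adv[-n] <- adv[-n] + k            # advance after each glyph but the last
  setNames(c(0, cumsum(adv)[-n]), chars)
}

tj_array("You")
glyph_offsets("You")
```

C needs one extra number per kerned pair, whereas D needs a position for every glyph, which is where the larger files come from.  With a substitute font, C keeps that font's own glyph advances (so the letters stay evenly set but the total width drifts), while D pins each glyph to the nominal font's positions.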

We do have the option of not changing the output when there is no kerning. 
That would be by far the most common case except that some fonts 
(including Helvetica but not Nimbus Sans) kern between punctuation and a 
space, e.g. ', '.  I'm inclined to believe that most uses of ',' in R 
graphical output are not punctuation (certainly true of R's own examples), 
and also that we nowadays do not expect to see kerning involving spaces.

Ei-ji Nakama provided an implementation of C for pdf() and D for 
postscript() (thanks Ei-ji, and apologies that we did not have a chance to 
discuss the principles first).  I'm inclined to suggest that we should go 
forwards with at most two of these alternatives, and those two should be 
the same for postscript() and pdf() -- my own inclination is to B and C.

So questions:

1) Do people feel strongly that we should preserve graphical output from 
past versions of R, even when there are known bugs?  I can see the need to 
reproduce published figures, but normally this would also need using the 
same version of R.

2) Is kerning worth pursuing?

3) If so, is elegant looking output more important than exact layout?

4) If we allow kerning, should it be the default (or only) option?

To see that sometimes there can be a large effect, try in postscript() or 
pdf()

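xx <- 'You You You You You You You You'
plot(0,0,xlim=c(0,1),ylim=c(0,1),type='n')
abline(v=0)                  # left edge of the string
text(0, 0.5, xx, adj=0)      # left-justified at x = 0
abline(v=strwidth(xx))       # computed string width
x2 <- strsplit(xx, "")       # split into single characters
w <- sapply(x2, strwidth)    # per-character widths (no kerning)
abline(v=sum(w))             # sum of the single-character widths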

The leftmost of the right pair of lines is the computed width, the 
rightmost the (normal) displayed width.

Unless there are cogent reasons to bring this forward to 2.8.1, any 
changes would be as from 2.9.0.

Brian Ripley

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Duncan Murdoch

2008-Oct-12 15:57 UTC


[Rd] RFC: Kerning, postscript() and pdf()

On 12/10/2008 11:36 AM, Prof Brian Ripley wrote:
> Ei-ji Nakama has pointed out (from another Japanese user, I believe) that 
> postscript() and pdf() have not been handling kerning correctly, and this 
> is a request for opinions about how we should correct it.
> 
> So questions:
> 
> 1) Do people feel strongly that we should preserve graphical output from 
> past versions of R, even when there are known bugs?  I can see the need to 
> reproduce published figures, but normally this would also need using the 
> same version of R.

I think we can make this sort of change in 2.9.0.

> 2) Is kerning worth pursuing?

I think that is up to you and other people who might do the work; I 
don't think I'll contribute to it.

> 3) If so, is elegant looking output more important than exact layout?

I suppose it matters how bad the exact layout looks, but I think your 
comment above that exact layout will produce much larger files is of 
more concern.  We're sure to get complaints if "much larger" is 
noticeable.  Another concern is whether text searches in .pdf or .ps 
files get confused by the difference.

> 4) If we allow kerning, should it be the default (or only) option?

If we do it, I think we should make it the default.  Whether it is 
optional depends on how much work that would be (so it is mainly up to 
the implementor).


Prof Brian Ripley

2008-Oct-16 09:03 UTC


[Rd] RFC: Kerning, postscript() and pdf()

I've now implemented B and C in R-devel, with C as the default.
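
For readers on a released R (2.9.0 or later, as far as I am aware), the choice between B and C is available as the useKerning argument of pdf() and postscript().  A minimal sketch of comparing the two width computations, where width_on_pdf() is a hypothetical helper and the device is written to a temporary file:

```r
# Width in inches of a string on a pdf() device, with or without kerning
# in the width computation.
width_on_pdf <- function(s, kern) {
  f <- tempfile(fileext = ".pdf")
  pdf(f, useKerning = kern)
  on.exit({ dev.off(); unlink(f) })
  plot.new()
  strwidth(s, units = "inches")
}

xx <- "You You You You You You You You"
w_kern   <- width_on_pdf(xx, TRUE)    # option C (kerning used)
w_nokern <- width_on_pdf(xx, FALSE)   # option B (kerning ignored)
w_nokern - w_kern                     # extra width when kerning is ignored
```

With the default Helvetica metrics the kerned width should come out smaller, since 'Yo' is a kerned pair in that font.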

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
