thr3ads.net - R devel - [Rd] qqline (PR#764) [Dec 2000]

Setzer.Woodrow@epamail.epa.gov

2000-Dec-11 23:15 UTC

[Rd] qqline (PR#764)

I think qqline does not do exactly what it is advertised to do
("`qqline' adds a line to a normal quantile-quantile plot which passes
through the first and third quartiles.").  Consider the graph:

tmp <- qnorm(ppoints(10))
qqnorm(tmp)
qqline(tmp)

The line (which I expected to go through all the points) has a slightly
shallower slope than do the points plotted by qqnorm.  I think the
problem is that qqline bases its line on the relationship between the
quartiles in the data and the large-sample expected quartiles for a normal
distribution; qqnorm bases its plot on the relationship between the
quantiles in the data and an approximation to the (finite-sample) expected
quantiles for a normal distribution.  In qqnorm, the x-coordinates of the
first and third quartiles of the data vector ('tmp' in this case) are not
qnorm(c(0.25,0.75)) (which is what qqline uses), but rather something like
quantile(qnorm(ppoints(length(tmp))),c(0.25,0.75)).  I say "something like"
because it is exactly right when the quartiles fall on data points, and an
approximation otherwise.
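The two candidate x-coordinates can be compared numerically, without
plotting anything.  The following is only a sketch of the argument above;
the variable names (x_large, x_finite, y_q and so on) are made up for
illustration and are not part of qqline or qqnorm.

```r
n     <- 10
probs <- c(0.25, 0.75)
tmp   <- qnorm(ppoints(n))            # the example data from the report

# x-coordinates of the quartiles under the two approaches
x_large  <- qnorm(probs)                                        # large-sample quartiles
x_finite <- quantile(qnorm(ppoints(n)), probs, names = FALSE)   # finite-sample version

# y-coordinates: sample quartiles of the data
y_q <- quantile(tmp, probs, names = FALSE)

slope_large  <- diff(y_q) / diff(x_large)
slope_finite <- diff(y_q) / diff(x_finite)

print(rbind(large = x_large, finite = x_finite))
print(c(slope_large = slope_large, slope_finite = slope_finite))
```

For this particular 'tmp' the finite-sample slope is exactly 1, because the
data are themselves qnorm(ppoints(n)), while the large-sample slope comes
out below 1, which matches the shallower line seen in the plot.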

The following definition for qqline reflects this point:

function (y, ...)
{
    y <- y[!is.na(y)]
    n <- length(y)
    y <- quantile(y, c(0.25, 0.75))
    x <- quantile(qnorm(ppoints(n)), c(0.25, 0.75))
    slope <- diff(y)/diff(x)
    int <- y[1] - slope * x[1]
    abline(int, slope, ...)
}

I'm not sure it makes very much of a difference, though, for looking at
real data, instead of something like expected quantiles.
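One way to gauge how much it matters: both definitions use the same sample
quartiles for y, so the ratio of the two slopes depends only on the sample
size n, not on the data.  The sketch below computes that ratio for a few
sample sizes; slope_ratio is a hypothetical helper written for this
illustration, not an existing R function.

```r
# Ratio (proposed slope) / (current slope) as a function of n only.
# The y quartiles cancel, leaving diff(x_large) / diff(x_finite).
slope_ratio <- function(n, probs = c(0.25, 0.75)) {
  x_large  <- qnorm(probs)
  x_finite <- quantile(qnorm(ppoints(n)), probs, names = FALSE)
  diff(x_large) / diff(x_finite)
}

sizes  <- c(10, 100, 1000)
ratios <- sapply(sizes, slope_ratio)
names(ratios) <- paste0("n=", sizes)
print(round(ratios, 4))
```

The ratio is above 1 for small n and shrinks toward 1 as n grows, so the
discrepancy should mostly be visible in small samples.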

--please do not edit the information below--

Version:
 platform = Windows
 arch = x86
 os = Win32
 system = x86, Win32
 status  major = 1
 minor = 1.1
 year = 2000
 month = August
 day = 15
 language = R

Windows 9x 4.10 (build 1998)

Search Path:
 .GlobalEnv, package:MASS, package:logspline, Autoloads, package:base

R. Woodrow Setzer, Jr.                                 Phone: (919) 541-0128
Experimental Toxicology Division                       Fax:  (919) 541-5394
Pharmacokinetics Branch
NHEERL MD-74; US EPA; RTP, NC 27711


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To:
r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Martin Maechler

2000-Dec-12 14:24 UTC

[Rd] qqline (PR#764)

>>>>> "Setzer" == Setzer Woodrow <Setzer.Woodrow@epamail.epa.gov> writes:

    Setzer> I think qqline does not do exactly what it is advertised to do
    Setzer> ("`qqline' adds a line to a normal quantile-quantile plot which
    Setzer> passes through the first and third quartiles.").

yes, the above may not be clear enough.

    Setzer> Consider the graph:

    Setzer> tmp <- qnorm(ppoints(10))
    Setzer> qqnorm(tmp)
    Setzer> qqline(tmp)

    Setzer> The line (which I expected to go through all the points) has a
    Setzer> slightly shallower slope than do the points plotted by
    Setzer> qqnorm.  I think the problem is that qqline bases its line on
    Setzer> the relationship between the quartiles in the data and the
    Setzer> large-sample expected quartiles for a normal distribution;
    Setzer> qqnorm bases its plot on the relationship between the quantiles
    Setzer> in the data and an approximation to the (finite-sample)
    Setzer> expected quantiles for a normal distribution.  In qqnorm, the
    Setzer> x-coordinates of the first and third quartiles of the data
    Setzer> vector ('tmp' in this case) are not qnorm(c(0.25,0.75))
    Setzer> (which is what qqline uses), but rather something like
    Setzer> quantile(qnorm(ppoints(length(tmp))),c(0.25,0.75)).  I say
    Setzer> "something like" because it is exactly right when the quartiles
    Setzer> fall on data points, and an approximation otherwise.

good analysis!

    Setzer> The following definition for qqline reflects this point:

    Setzer> function (y, ...)
    Setzer> {
    Setzer> y <- y[!is.na(y)]
    Setzer> n <- length(y)
    Setzer> y <- quantile(y, c(0.25, 0.75))
    Setzer> x <- quantile(qnorm(ppoints(n)),c(0.25, 0.75))
    Setzer> slope <- diff(y)/diff(x)
    Setzer> int <- y[1] - slope * x[1]
    Setzer> abline(int, slope, ...)
    Setzer> }

    Setzer> I'm not sure it makes very much of a difference, though, for
    Setzer> looking at real data, instead of something like expected
    Setzer> quantiles.

The Development Version of R (R 1.2 in a few days) has

 function (y, ...) 
 {
     y <- quantile(y[!is.na(y)], c(0.25, 0.75))
     x <- qnorm(c(0.25, 0.75))
     slope <- diff(y)/diff(x)
     int <- y[1] - slope * x[1]
     abline(int, slope, ...)
 }

which I think *does* what you suggest it should do.

HOWEVER I was quite a bit astonished to see 
that the slope is still too small (for small sample sizes only).

 par(mfrow=c(2,2))
 for(n in 9:12) {
     x <- qnorm(ppoints(n))
     qqnorm(x, main=paste("n=",n))
     qqline(x)
 }
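The same check can be done numerically rather than by eye.  This is a
minimal sketch that reuses the slope computation from the R 1.2 definition
quoted above; qq_slope is a name invented here for illustration and simply
returns the slope instead of drawing the line.

```r
# Slope of the line through the quartiles, computed as in the
# development-version qqline shown above, but returned rather than plotted.
qq_slope <- function(y) {
  y <- quantile(y[!is.na(y)], c(0.25, 0.75), names = FALSE)
  x <- qnorm(c(0.25, 0.75))
  diff(y) / diff(x)
}

slopes <- sapply(9:12, function(n) qq_slope(qnorm(ppoints(n))))
names(slopes) <- paste0("n=", 9:12)
print(round(slopes, 3))
```

Since the plotted points lie on the line y = x for these inputs, any slope
below 1 here corresponds to the "too small" slope seen in the four panels.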

But I think we are now doing what Tukey defined in his EDA book(s)
and what the other S engines do as well.
 {as a matter of fact, R should also return the (int, slope) vector !}

Note that you can also play with the "a =" argument of ppoints;
it's not directly clear to me which value is "optimal" for the
above purpose...
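As one possible reading of that suggestion, the sketch below varies 'a'
when generating the test data and reports the resulting quartile slope for
n = 10.  The helper slope_for_a and the particular grid of 'a' values are
assumptions made for this illustration; note also that qqnorm itself calls
ppoints(n) with the default 'a', so this changes only the data, not the
plotting positions.

```r
# Quartile-based slope for data generated as qnorm(ppoints(n, a = a)),
# measured against the large-sample quartiles qnorm(c(0.25, 0.75)).
slope_for_a <- function(a, n = 10) {
  pts <- qnorm(ppoints(n, a = a))
  y   <- quantile(pts, c(0.25, 0.75), names = FALSE)
  diff(y) / diff(qnorm(c(0.25, 0.75)))
}

a_values <- c(0, 0.25, 0.375, 0.5)
slopes_a <- sapply(a_values, slope_for_a)
names(slopes_a) <- paste0("a=", a_values)
print(round(slopes_a, 3))
```

On this grid the slope increases with 'a' but stays below 1, so none of
these values removes the discrepancy for n = 10.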

---------

Martin Maechler <maechler@stat.math.ethz.ch>
http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO D10	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><