richard_raubertas@merck.com
2003-Aug-15 00:38 UTC
[Rd] plot.lm mislabels points with na.exclude (PR#3750)
R 1.7.1 on Windows XP

The "normal Q-Q plot" produced by plot.lm() mislabels points
when the model is fitted using na.action=na.exclude. Example:

    x <- 1:50
    y <- x + rnorm(50)
    y[c(5,10,15)] <- NA   # insert some NA's
    y[40] <- 50           # add an outlier
    plot(lm(y ~ x, na.action=na.omit))     # outlier correctly labeled in all
                                           # four plots
    plot(lm(y ~ x, na.action=na.exclude))  # labels attached to wrong points
                                           # in the QQ plot (only)

Rich Raubertas
Biometrics Research
Merck & Co.

------------------------------------------------------------------------------
Notice: This e-mail message, together with any attachments, contains information of Merck & Co., Inc. (Whitehouse Station, New Jersey, USA), and/or its affiliates (which may be known outside the United States as Merck Frosst, Merck Sharp & Dohme or MSD) that may be confidential, proprietary copyrighted and/or legally privileged, and is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please immediately return this by e-mail and then delete it.
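A likely reason only the na.exclude fit is affected can be sketched without plotting. With na.exclude, residuals() pads the result with NA back to the length of the original data, whereas with na.omit the dropped cases are simply absent. The seed and the variable names below (fit_omit, fit_excl, pos_omit, pos_excl) are illustrative assumptions, not part of the report:

```r
set.seed(1)                       # arbitrary seed, not in the original report
x <- 1:50
y <- x + rnorm(50)
y[c(5, 10, 15)] <- NA             # insert some NA's
y[40] <- 50                       # add an outlier

fit_omit <- lm(y ~ x, na.action = na.omit)
fit_excl <- lm(y ~ x, na.action = na.exclude)

r_omit <- residuals(fit_omit)     # dropped cases are absent
r_excl <- residuals(fit_excl)     # dropped cases are padded with NA

length(r_omit)                    # number of complete cases
length(r_excl)                    # number of original observations
which(is.na(r_excl))              # positions of the padded cases

# Position of the outlier in each residual vector
pos_omit <- unname(which.max(abs(r_omit)))
pos_excl <- unname(which.max(abs(r_excl)))
c(pos_omit, pos_excl)
```

The outlier sits at a different position in the two vectors, so any code that mixes an index computed on the padded vector with coordinates computed on the NA-free vector will point at the wrong observation.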
andy_liaw@merck.com
2003-Aug-15 04:09 UTC
[Rd] plot.lm mislabels points with na.exclude (PR#3750)
Here's one possible fix (may not be very efficient).
Change lines 82-83 in $R_HOME/src/base/R/plot.lm.R to the following:

    if (id.n > 0) {
        qqx <- rep(NA, n)
        qqy <- rep(NA, n)
        qqx[!is.na(rs)] <- qq$x
        qqy[!is.na(rs)] <- qq$y
        text.id(qqx[show.rs], qqy[show.rs], show.rs, adj.x = TRUE)
    }

Andy
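The padding idea in this fix can be tried outside plot.lm(). The following is a minimal sketch under stated assumptions: rs is a made-up vector standing in for NA-padded standardized residuals, show.rs is chosen here as the two largest absolute residuals, and qqnorm() is called on the NA-free values to mimic a result that has dropped the NAs. The internal helper text.id() is not used.

```r
rs <- c(0.3, NA, -1.2, 2.5, NA, 0.1)   # hypothetical NA-padded residuals
n  <- length(rs)

# Q-Q coordinates for the non-missing values only (shorter than rs)
qq <- qqnorm(rs[!is.na(rs)], plot.it = FALSE)

# Pad the coordinates back to length n so they line up with rs
qqx <- rep(NA_real_, n)
qqy <- rep(NA_real_, n)
qqx[!is.na(rs)] <- qq$x
qqy[!is.na(rs)] <- qq$y

# Indices of the two most extreme residuals, in terms of the padded vector
show.rs <- order(-abs(rs))[1:2]

qq$y[show.rs]   # unpadded lookup: picks the wrong points
qqy[show.rs]    # padded lookup: matches rs[show.rs]
```

The padded lookup returns the same values as rs[show.rs], while the unpadded lookup is shifted by the number of NAs that precede each index.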
maechler@stat.math.ethz.ch
2003-Aug-15 15:55 UTC
[Rd] plot.lm mislabels points with na.exclude (PR#3750)
>>>>> "AndyL" == andy liaw <andy_liaw@merck.com>
>>>>>     on Fri, 15 Aug 2003 04:08:59 +0200 (MET DST) writes:

    AndyL> Here's one possible fix (may not be very efficient).
    AndyL> Change lines 82-83 in $R_HOME/src/base/R/plot.lm.R to the following:

    AndyL> if (id.n > 0) {
    AndyL>     qqx <- rep(NA, n)
    AndyL>     qqy <- rep(NA, n)
    AndyL>     qqx[!is.na(rs)] <- qq$x
    AndyL>     qqy[!is.na(rs)] <- qq$y
    AndyL>     text.id(qqx[show.rs], qqy[show.rs], show.rs, adj.x = TRUE)
    AndyL> }

Thank you, Andy.

I dug a bit further, however.
I'd argue the bug is in qqnorm(): It shouldn't drop NA's in
its result, list(x= ., y=.).

R-devel will contain a fixed qqnorm.default() function
which will also solve this plot.lm() behavior.

Martin Maechler <maechler@stat.math.ethz.ch>  http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16   Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich   SWITZERLAND
phone: x-41-1-632-3408   fax: ...-1228   <><
andy_liaw@merck.com
2003-Aug-15 16:11 UTC
[Rd] plot.lm mislabels points with na.exclude (PR#3750)
> From: maechler@stat.math.ethz.ch [mailto:maechler@stat.math.ethz.ch]
>
> I dug a bit further, however.
> I'd argue the bug is in qqnorm(): It shouldn't drop NA's in
> its result, list(x= ., y=.).
>
> R-devel will contain a fixed qqnorm.default() function
> which will also solve this plot.lm() behavior.

I completely agree. ?qqnorm does not say what it does with NAs.
Maybe it should?

Best,
Andy
maechler@stat.math.ethz.ch
2003-Aug-15 16:20 UTC
[Rd] plot.lm mislabels points with na.exclude (PR#3750)
>>>>> "AndyL" == andy liaw <andy_liaw@merck.com>
>>>>>     on Fri, 15 Aug 2003 16:11:18 +0200 (MET DST) writes:

    AndyL> I completely agree. ?qqnorm does not say what it
    AndyL> does with NAs. Maybe it should?

It now does (in R-devel):

  >> Value:
  >>
  >>      For 'qqnorm' and 'qqplot', a list with components
  >>
  >>        x: The x coordinates of the points that were/would be plotted
  >>
  >>        y: The original 'y' vector, i.e., the corresponding y
  >>           coordinates _including 'NA's_.

Regards,
Martin
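The documented behaviour can be checked directly. This is a small sketch that assumes an R version containing the qqnorm.default() change described above; the input vector is made up for illustration.

```r
y  <- c(1.5, NA, -0.4, 0.9)        # hypothetical data with one missing value
qq <- qqnorm(y, plot.it = FALSE)   # compute coordinates without drawing

length(qq$x) == length(y)          # result keeps the original length
is.na(qq$x)                        # NA where y is NA
is.na(qq$y)                        # y is returned including its NAs
```

Because the returned x and y have the same length and ordering as the input, an index into the original (NA-padded) residual vector can be used on the Q-Q coordinates unchanged.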