Dear Community,

I want to identify outliers in my data, but I don't know how to use the identify() command on the plots I obtain.

I've gone through the help files and adapted the mahalanobis() example for my purpose:

    NormalMultivarianteComparefunc <- function(x) {
        Sx <- cov(x)
        D2 <- mahalanobis(x, colMeans(x), Sx)
        plot(density(D2, bw=.5), main="Squared Mahalanobis distances, n=nrow(x), p=ncol(x)")
        rug(D2)
        qqplot(qchisq(ppoints(nrow(x)), df=ncol(x)), D2,
               main = expression("Q-Q plot of Mahalanobis" * ~D^2 *
                                 " vs. quantiles of" * ~ chi[ncol(x)]^2))
        abline(0, 1, col = 'gray')
    }

Then I run NormalMultivarianteComparefunc(y), where y is the data frame with the data. Now, let's say

    y = replicate(5, rnorm(100))

    ## what should I write now to identify data from the plot?
    ## identify(y)
    ## warning: no point within 0.25 inches

I know I can use aq.plot, but I would be very grateful if you could help me with identify().

By the way, in the function, how can the title show the values of the variables instead of the literal text "ncol(x)" or "nrow(x)"?

Thanks in advance, user at host.com

--
View this message in context: http://r.789695.n4.nabble.com/outlier-identify-in-qqplot-tp4076587p4076587.html
Sent from the R help mailing list archive at Nabble.com.
Try this:

    qqInteractive <- function(..., IDENTIFY = TRUE){
        qqplot(...) -> X
        if(IDENTIFY) return(identify(X))
        invisible(X)
    }

The trick is that identify() wants the coordinates of the points as they are drawn in the scatter plot. Those are not the inputs to qqplot() but a transformation of them (qqplot() sorts both inputs), which is why identify(y) on the raw data finds no point near the click.

Michael

On Wed, Nov 16, 2011 at 9:52 AM, agent dunham <crosspide at hotmail.com> wrote:
> [...]
> ##what should I write now to identify data from the plot??
> ##/identify(y)
> warning: no point within 0.25 inches
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
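
To make the point about transformed coordinates concrete, here is a rough sketch (not code from the thread) of how a point picked on the Q-Q plot could be mapped back to a row of the data, and how the title could show the actual value of ncol(y) using bquote(). The names qq, ord, ttl and top3, the seed, and the axis labels are assumptions made for illustration. The interactive part only runs in an interactive session.

```r
set.seed(1)                                 # arbitrary seed, only for reproducibility
y  <- replicate(5, rnorm(100))              # example data from the question
D2 <- mahalanobis(y, colMeans(y), cov(y))   # squared Mahalanobis distances

# qqplot() sorts both inputs; with plot.it = FALSE it only returns the
# coordinates it would have drawn, as a list with components x and y.
qq <- qqplot(qchisq(ppoints(nrow(y)), df = ncol(y)), D2, plot.it = FALSE)

# The k-th plotted point belongs to row ord[k] of y.
ord <- order(D2)

# bquote() substitutes the value of .(ncol(y)) into the plotmath title.
ttl <- bquote("Q-Q plot of Mahalanobis" ~ D^2 ~ "vs. quantiles of" ~
              chi[.(ncol(y))]^2)

if (interactive()) {
  plot(qq$x, qq$y, main = ttl,
       xlab = "Chi-squared quantiles", ylab = "Squared Mahalanobis distance")
  abline(0, 1, col = "gray")
  # Click on points, then end the selection; labels are original row numbers.
  picked <- identify(qq$x, qq$y, labels = ord)
  print(ord[picked])                        # rows of y that were clicked
}

# Non-interactive alternative: rows with the three largest distances.
top3 <- tail(ord, 3)
```

For the density plot title, paste() should be enough, e.g. main = paste0("Squared Mahalanobis distances, n=", nrow(y), ", p=", ncol(y)).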
Do you want qqnorm() instead of qqplot()? [Reproducibility also involves posting the code you used that led to the error / warning.]

This seems to work for me:

    # mydata <- source("http://r.789695.n4.nabble.com/file/n4623493/mydata.txt")[[1]] # Have to drop visible attribute
    lmmodel <- lm(log(vdep) ~ v1 + sqrt(v2) + v3 + v5 + v6 + v7 + v8 + v9 + v10, data = mydata)
    qqnorm(residuals(lmmodel))

    # Or if interactive:
    qqnormInt <- function(..., IDENTIFY = TRUE){
        qqnorm(...) -> X
        if(IDENTIFY) return(identify(X))
        invisible(X)
    }
    qqnormInt(residuals(lmmodel))

Michael

On Thu, May 10, 2012 at 9:20 AM, agent dunham <crosspide at hotmail.com> wrote:
> Find the data attached,
>
> http://r.789695.n4.nabble.com/file/n4623493/mydata.txt
>
> The model would be
> lmmodel <- lm(log(vdep) ~ v1 + sqrt(v2) + v3 + v5 + v6 + v7 + v8 + v9 + v10, data = mydata)
>
> Thanks again,
>
> user at host.com
>
> --
> View this message in context: http://r.789695.n4.nabble.com/outlier-identify-in-qqplot-tp4076587p4623493.html
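
One difference from the qqplot() case may be worth spelling out: qqnorm() returns its coordinates in the original order of the data, so the index that identify() returns is already the observation number. A small sketch of this follows. Since mydata is not reproduced here, it uses the built-in cars data and a simple lm(dist ~ speed) as stand-ins; these, and the names fit, res, qq and worst, are assumptions for illustration and not the model from the thread.

```r
# Stand-in model on a built-in data set (assumption: replaces lmmodel/mydata).
fit <- lm(dist ~ speed, data = cars)
res <- residuals(fit)

# qqnorm() keeps the data order: qq$y is res itself, qq$x the matching
# theoretical normal quantiles.
qq <- qqnorm(res, plot.it = FALSE)

if (interactive()) {
  qqnorm(res)
  qqline(res)
  # Click on points, then end the selection; labels are observation names.
  picked <- identify(qq$x, qq$y, labels = names(res))
  print(cars[picked, ])                     # the rows that were clicked
}

# Non-interactive alternative: the three largest absolute residuals.
worst <- order(abs(res), decreasing = TRUE)[1:3]
```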