thr3ads.net - R help - [R] outlier detection methods in r? [Apr 2000]

Robert L. Sandefur

2000-Apr-21 12:35 UTC

[R] outlier detection methods in r?

hi -

 if I sample from a normal distribution with something like
n100<-rnorm(100,0,1)
and add an outlier with 
n100[10]<-4
then
qqnorm(n100)
visually shows the point 4 as an outlier
and calculating the probability of a value of 4 or bigger in 100 samples of
norm(0,1)
gives
> 1-exp(log(pnorm(4,0,1))*100)
[1] 0.003162164

If I have more than 1 sample above the outlier threshold the math is a bit
more complicated but doable.
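
The calculation above, and its extension to more than one value above the threshold, can be sketched with the binomial distribution: if each of n independent N(0,1) draws exceeds the threshold with probability p, the number of exceedances is Binomial(n, p). This is only a minimal sketch under that independence assumption; the variable names (threshold, p_exceed, p_at_least_1, p_at_least_2) are illustrative and not from the original post.

```r
# Probability that at least k of n independent N(0,1) draws exceed a threshold.
# Assumption: draws are independent, so the exceedance count is Binomial(n, p).
n <- 100
threshold <- 4

# Single-draw upper tail probability P(X > threshold)
p_exceed <- pnorm(threshold, mean = 0, sd = 1, lower.tail = FALSE)

# At least one exceedance: same quantity as 1 - pnorm(4)^100 in the post
p_at_least_1 <- pbinom(0, size = n, prob = p_exceed, lower.tail = FALSE)

# At least two exceedances: the "more than 1 sample above threshold" case
p_at_least_2 <- pbinom(1, size = n, prob = p_exceed, lower.tail = FALSE)

print(p_at_least_1)
print(p_at_least_2)
```

The first value should agree with the 0.003162164 computed above; the second is necessarily smaller, since two exceedances are rarer than one.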

My questions are 

1) are there better ways to assess probability of outliers (ie value(s) above
threshold from a given distribution)?
2) are they implemented in r?

(I checked outlier and extreme in r help and found nothing helpful)

thanx

bob sandefur

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Manuel Castejon Limas

2000-Apr-24 16:07 UTC


[R] outlier detection methods in r?

> Subject: [R] outlier detection methods in r?
> hi -
>  if I sample from a normal distribution with something like
> n100<-rnorm(100,0,1)
> and add an outlier with
> n100[10]<-4
> then
> qqnorm(n100)
> visually shows the point 4 as an outlier
> and calculating the probability of a value of 4 or bigger in 100 samples
> of norm(0,1) gives
> > 1-exp(log(pnorm(4,0,1))*100)
> [1] 0.003162164
>
> If I have more than 1 sample above the outlier threshold the math is a bit
> more complicated but doable.
>
> My questions are
> 1) are there better ways to assess probability of outliers (ie value(s)
> above threshold from a given distribution)?
> 2) are they implemented in r?

1)
The term "a given distribution" makes things a lot more difficult, as far as
outlier detection is concerned.
If we are talking about normal distributions, or multivariate normal
distributions, the method based on Mahalanobis distances is the one I
prefer.
If the sample comes from a normal distribution, its squared Mahalanobis
distance follows a chi-square distribution (with as many degrees of freedom
as there are variables), so you can always assess whether a certain
point is above the threshold determined by your significance level.

2)
You can find mahalanobis() in the base package.
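
A minimal sketch of the Mahalanobis approach on the example from the question follows. Some assumptions to flag: the seed and the significance level alpha are arbitrary choices made here for illustration, not values from the thread; in current R versions mahalanobis() is provided by the stats package, which is loaded by default; and because the mean and variance are estimated from the same sample that contains the outlier, the chi-square cutoff is only approximate.

```r
# Flag points whose squared Mahalanobis distance exceeds a chi-square cutoff.
set.seed(1)                 # arbitrary seed, only for reproducibility
n100 <- rnorm(100, 0, 1)
n100[10] <- 4               # planted outlier, as in the question

# mahalanobis() expects a matrix with one row per observation
x <- matrix(n100, ncol = 1)

# Note: mahalanobis() returns SQUARED distances
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

alpha <- 0.001              # hypothetical significance level
cutoff <- qchisq(1 - alpha, df = ncol(x))

# Indices of observations beyond the cutoff
flagged <- which(d2 > cutoff)
print(flagged)
```

With a single variable this reduces to comparing the squared z-score against a chi-square quantile with one degree of freedom; the same code works unchanged for a matrix with several columns, which is where the method is most useful.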




