thr3ads.net - R help - [R] good method of removing outliers? [Dec 2011]

Michael

2011-Dec-30 17:03 UTC

[R] good method of removing outliers?

Happy holidays all!

I know it's very subjective to decide whether a data point is an outlier or
not...

But are there reasonably good and realistic methods of identifying outliers
in R?

Thanks a lot!

Joshua Wiley

2011-Dec-30 17:15 UTC

Hi Michael,

I'm afraid this is one of those cases where the short answer is "No"
and the long answer is, "No."

If you are working with a data set stored in a data frame, something like:

sapply(mtcars, function(x) if (is.numeric(x)) range(x, na.rm = TRUE)
else c(NA, NA))

should give you the range for all numeric variables, which is a
simple check for whether any values fall outside the possible range
(say you have an age variable with a -3 or 320).  Beyond that, you can
inspect the data visually, but ultimately, you have to decide what an
outlier is and justify it.
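
As a rough sketch of the range check described above: the data frame
`patients`, its `age` column, and the bounds 0 and 120 are made up for
illustration, and the boxplot rule at the end is just one common visual
convention, not a definition of an outlier.

```r
# Hypothetical data: an age column containing two impossible values.
patients <- data.frame(id = 1:6, age = c(34, -3, 51, 320, 47, NA))

# Assumed plausible bounds for age; choose these from domain knowledge.
lower <- 0
upper <- 120

# TRUE where a non-missing value falls outside the plausible range.
out_of_range <- !is.na(patients$age) &
  (patients$age < lower | patients$age > upper)

# Rows that need a closer look.
patients[out_of_range, ]

# Values a boxplot would draw as individual points (1.5 * IQR rule).
boxplot.stats(patients$age)$out
```

Rows flagged this way are candidates for checking against the original
records, not automatic deletions.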

Cheers,

Josh

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

Peter Langfelder

2011-Dec-30 17:31 UTC

What kind of data do you have? For simple numeric data, there are
various methods for removing outliers developed for robust estimation
and I'm sure they are implemented in R. For example, this link

http://www.unt.edu/benchmarks/archives/2001/december01/rss.htm

describes how to calculate a robust measure of correlation that
includes a method to downweight (or remove) outliers.
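
To illustrate the general idea with base R only (this is not necessarily
the method described at the link): the sketch below flags points by a
robust z-score built from the median and MAD, then compares the Pearson
correlation with and without the flagged point. The data and the 3.5
cutoff are assumptions chosen for the example.

```r
# Made-up data: y is roughly 2 * x, except for one wild value at the end.
x <- 1:10
y <- c(2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2, 15.9, 18.1, -40)

# Robust z-score: centre on the median and scale by the MAD, so the
# wild value cannot inflate its own yardstick the way it inflates sd().
robust_z <- (y - median(y)) / mad(y)

# Assumed cutoff; 3.5 is a common convention, not a rule.
flagged <- abs(robust_z) > 3.5

cor_all  <- cor(x, y)                      # dragged down by the wild point
cor_kept <- cor(x[!flagged], y[!flagged])  # after dropping flagged points

which(flagged)
c(all = cor_all, kept = cor_kept)
```

A rank-based alternative that avoids removing anything is
`cor(x, y, method = "spearman")`.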

For identifying outlier samples in a multivariate setting, the
possibilities are even more varied, from simple hierarchical
clustering and visual identification of outliers to network
connectivity methods, etc.
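
A minimal sketch of the hierarchical clustering approach, using
simulated data (30 ordinary samples plus one planted far away) and
average linkage; both choices are assumptions made for illustration. A
sample that joins the tree only at the very top is a candidate outlier.

```r
set.seed(42)

# 30 ordinary two-dimensional samples plus one planted at (20, 20).
samples <- rbind(matrix(rnorm(60), ncol = 2), c(20, 20))

# Cluster on Euclidean distances with average linkage.
tree <- hclust(dist(samples), method = "average")

# Cut the tree into two groups; an isolated sample ends up alone.
groups <- cutree(tree, k = 2)
sizes <- table(groups)

# Index of any sample that forms a cluster of size one.
singleton <- which(groups %in% as.integer(names(sizes)[sizes == 1]))
singleton

# plot(tree)  # uncomment to inspect the dendrogram visually
```

On real data the cut is rarely this clean, so looking at the dendrogram
is usually more informative than a fixed `k`.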

HTH,

Peter

Paul

2011-Dec-30 23:01 UTC

Ignoring the moral questions for a moment (it totally depends on your
definition of an outlier, your dataset, its distribution, etc.), for
the technical implementation, try the outliers package
(http://www.stats.bris.ac.uk/R/web/packages/outliers/index.html), which
implements the Grubbs, Dixon, and Cochran tests.  Also, see this
Stack Overflow answer of mine that shows an implementation of the Lund
test for outliers within a regression
( http://stackoverflow.com/a/1444548/74658 ).
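
For reference, a base-R sketch of the two-sided Grubbs test for a single
outlier, so the logic is visible without installing anything. The helper
name `grubbs_check` and the data are made up for this example; the
outliers package provides its own `grubbs.test()`, which is what you
would normally call. The test assumes the data are approximately normal
apart from the suspect value.

```r
# Hypothetical helper: two-sided Grubbs test for one outlier.
grubbs_check <- function(x, alpha = 0.05) {
  x <- x[!is.na(x)]          # note: index below refers to x after NA removal
  n <- length(x)
  stopifnot(n >= 3)

  # Most extreme observation and its distance from the mean in sd units.
  idx <- which.max(abs(x - mean(x)))
  g <- abs(x[idx] - mean(x)) / sd(x)

  # Critical value from the t distribution with n - 2 degrees of freedom.
  t_crit <- qt(alpha / (2 * n), df = n - 2, lower.tail = FALSE)
  g_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))

  list(index = idx, value = x[idx], statistic = g,
       critical = g_crit, is_outlier = g > g_crit)
}

# Made-up measurements with one suspicious value.
x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 15.0)
result <- grubbs_check(x)
result
```

The test addresses one suspect value at a time; applying it repeatedly
changes the error rate, which is part of why the result still has to be
justified rather than applied mechanically.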

Regards,

Paul.
