Melanie Vida
2005-Feb-25 18:29 UTC
[R] Temporal Analysis of variable x; How to select the outlier threshold in R?
For a financial data set with large variance, I'm trying to find the outlier threshold of one variable "x" over a two year period. I qqplot(x2001, x2002) and found a normal distribution. The latter part of the normal distribution did not look linear though. Is there a suitable method in R to find the outlier threshold of this variable from 2001 and 2002 in R?
Achim Zeileis
2005-Feb-25 18:49 UTC
[R] Temporal Analysis of variable x; How to select the outlier threshold in R?
Please stop posting (almost) identical questions! You already posted a very similar question to R-help (and received two answers) and you posted the same question on R-SIG-Finance! As both Uwe and Christian indicated in their answers, your question is very vague. If you want to receive better answers, it would help to ask better questions. Please also read the posting guide at http://www.R-project.org/posting-guide.html

On Fri, 25 Feb 2005 13:29:38 -0500 Melanie Vida wrote:

> For a financial data set with large variance, I'm trying to find the
> outlier threshold of one variable "x" over a two year period. I

To reiterate Uwe: "This depends on your definition of an outlier and the model for your data".

> qqplot(x2001, x2002) and found a normal distribution. The latter part

I'm not sure how you could do that from that plot...

> of the normal distribution did not look linear though. Is there a
> suitable method in R to find the outlier threshold of this variable
> from 2001 and 2002 in R?

If you think it appropriate you could fit a normal model and cut at a quantile of your choice.

Z
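The suggestion above (fit a normal model, then cut at a chosen quantile) might look roughly like the sketch below. The simulated data, the seed, and the 0.99 cut-off are assumptions made for illustration; the thread does not show the real data or name a quantile.

```r
# Simulated stand-in for the poster's variable x (assumption: real data not shown)
set.seed(1)
x <- c(rnorm(200, mean = 0, sd = 1000), 8000)  # one gross value appended on purpose

# Fit a normal model via the sample mean and standard deviation
mu    <- mean(x)
sigma <- sd(x)

# Cut at an upper quantile of the fitted model (0.99 is an arbitrary choice)
p <- 0.99
threshold <- qnorm(p, mean = mu, sd = sigma)

# Values above the threshold are flagged as candidate outliers
flagged <- x[x > threshold]
```

Note that the mean and standard deviation are themselves pulled around by extreme values, so this cut is only sensible if the normal model is a reasonable description of the bulk of the data.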
(Ted Harding)
2005-Feb-25 18:54 UTC
[R] Temporal Analysis of variable x; How to select the outli
On 25-Feb-05 Melanie Vida wrote:
> For a financial data set with large variance, I'm trying to
> find the outlier threshold of one variable "x" over a two
> year period.
> I qqplot(x2001, x2002) and found a normal distribution.
> The latter part of the normal distribution did not look linear
> though.
> Is there a suitable method in R to find the outlier threshold
> of this variable from 2001 and 2002 in R?

I don't see how you can infer a normal distribution from qqplot(), which simply compares the distribution of x2002 with the distribution of x2001. See ?qqplot for what's available and what each function shows.

You can check normality with a normal Q-Q plot, e.g.,

  qqnorm(x2001)
  qqnorm(x2002)

or

  qqnorm(c(x2001,x2002))

if you want to look at their combined distribution.

Have another look at your data, with an appropriate method!

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 25-Feb-05 Time: 18:54:19
------------------------------ XFMail ------------------------------
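To make the distinction between qqplot() and qqnorm() concrete, here is a small sketch on simulated data. The vectors x2001 and x2002 below are made-up stand-ins (the poster's data is not available), and the choice of one normal and one skewed sample is purely illustrative.

```r
# Simulated stand-ins for x2001 and x2002 (assumption: real data not shown)
set.seed(2)
x2001 <- rnorm(100)            # roughly normal
x2002 <- rexp(100, rate = 1)   # clearly right-skewed

# Two-sample Q-Q plot: compares the two empirical distributions with each other.
# A straight line here only says the two samples have a similar shape.
qqplot(x2001, x2002)

# Normal Q-Q plot: compares one sample against theoretical normal quantiles
qqnorm(x2002)
qqline(x2002)  # reference line through the first and third quartiles

# qqnorm() also returns the plotted coordinates, which can be inspected directly
qq <- qqnorm(x2001, plot.it = FALSE)
```

A curved upper tail in the qqnorm() plot, as with x2002 here, is the usual visual sign of right skew rather than of a few isolated outliers.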
bogdan romocea
2005-Mar-01 16:20 UTC
[R] Temporal Analysis of variable x; How to select the outlier threshold in R?
I'm not sure I understand. You have financial data and want to throw away some outliers?? Why would you ever do this?

First of all, I'd suggest you pay close attention to what the data is trying to say. Maybe your distribution is not normal after all (see tests for normality etc). Maybe you shouldn't force your normality assumption upon the data.

-----Original Message-----
From: Melanie Vida [mailto:mvida at mitre.org]
Sent: Friday, February 25, 2005 1:30 PM
To: r-help
Subject: [R] Temporal Analysis of variable x; How to select the outlier threshold in R?

For a financial data set with large variance, I'm trying to find the outlier threshold of one variable "x" over a two year period. I qqplot(x2001, x2002) and found a normal distribution. The latter part of the normal distribution did not look linear though. Is there a suitable method in R to find the outlier threshold of this variable from 2001 and 2002 in R?

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
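One of the "tests for normality" mentioned in the reply above is the Shapiro-Wilk test, available in base R as shapiro.test(). The sketch below runs it on simulated data; the samples, the seed, and the use of a log-normal as the "financial-looking" example are assumptions for illustration only.

```r
# Simulated samples (assumption: real data not shown)
set.seed(3)
x_normal <- rnorm(200)    # drawn from a normal distribution
x_skewed <- rlnorm(200)   # log-normal: heavily right-skewed

# Shapiro-Wilk test; the null hypothesis is that the sample is normal
res_normal <- shapiro.test(x_normal)
res_skewed <- shapiro.test(x_skewed)

# A small p-value is evidence against normality
res_normal$p.value
res_skewed$p.value
```

With large samples the test rejects even for minor departures from normality, so it is usually read alongside a qqnorm() plot rather than instead of one.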
Melanie Vida
2005-Mar-01 18:25 UTC
[R] Temporal Analysis of variable x; How to select the outlier threshold in R?
--- bogdan romocea <br44114 at yahoo.com> wrote:

> I'm not sure I understand.
> You have financial data and want to throw away some outliers??
> Why would you ever do this?

I would select an outlier threshold in order to extract the subset of the data "x" that shows a significant difference in financial contributions over the two years. "x" represents the dollar-value change in allocations to an account over a 2 year period.

> First of all, I'd suggest you pay close attention to what the data is
> trying to say. Maybe your distribution is not normal after all (see
> tests for normality etc). Maybe you shouldn't force your normality
> assumption upon the data.

A plot from qq.plot(x) or qqnorm(x) indicated that the data was not normally distributed. I also used shapiro.test(), which gave a p-value << 0.05. To select the outlier threshold, I ended up using the following:

  outlier_threshold <- quantile(x, 3/4) + 1.5 * IQR(x)

-Melanie
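The threshold used in the final message is the upper fence of Tukey's boxplot rule (third quartile plus 1.5 times the interquartile range). A minimal sketch of applying it follows; the simulated x, the seed, and the names extreme and lower_threshold are illustrative assumptions, not part of the original post.

```r
# Simulated stand-in for the dollar-value changes (assumption: real data not shown)
set.seed(4)
x <- c(rlnorm(200, meanlog = 5, sdlog = 1), 1e5)  # one very large value appended

# Upper fence: third quartile plus 1.5 times the interquartile range
q3 <- quantile(x, 3/4, names = FALSE)
outlier_threshold <- q3 + 1.5 * IQR(x)

# Subset of observations above the threshold
extreme <- x[x > outlier_threshold]

# The matching lower fence, if unusually small values matter as well
lower_threshold <- quantile(x, 1/4, names = FALSE) - 1.5 * IQR(x)
```

Because the rule relies on quartiles rather than the mean and standard deviation, it does not assume normality, although on strongly skewed data it tends to flag a sizeable share of the upper tail.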