similar to: Outlier statistics question

Displaying 20 results from an estimated 4000 matches similar to: "Outlier statistics question"

2009 Feb 14
2
implementing Grubbs outlier test on a large dataframe
Hi! I'm trying to run an outlier test once per row in a large data frame. Ideally, I'd then add the p-value and the number flagged as an outlier to the data frame as two new columns. The Grubbs outlier test requires a vector, and I'm confused about how to turn each row of my data frame into a vector, followed by doing a Grubbs test for each row containing the vector of numbers
2010 Sep 15
1
cochran-grubbs tests results
Hello, I'm new to the R world and I don't know much about statistics, but I now have to analyze some data and I already have some initial questions: I have 5 sets of area measurements and each set has 5 repetitions. My first step is to check the data for outliers. I've used the outliers package. I have to use the Cochran test, and the Grubbs test in case I find any outlier. The problem
2005 Apr 14
2
grubbs.test
Dear All, I have small samples of data (between 6 and 15) for numerous time series points. I am assuming the data for each time point is normally distributed. The problem is that the data arrives sporadically and I would like to detect the number of outliers after I have six data points for any time period. Essentially, I would like to detect the number of outliers when I have 6 data points then
2004 Jun 30
1
outlier tests
I have been learning about some outlier tests -- Dixon and Grubbs, specifically -- for small data sets. When I try help.start() and search for outlier tests, the only response I manage to find is the Bonferroni test available from the car package... are there any other packages that offer outlier tests? Are the Dixon and Grubbs tests "good" for small samples or are others more
2012 Apr 18
1
Peirce's criterion
Hello all, I would like to rigorously test whether observations in my dataset are outliers. I guess all the main tests in R (Grubbs) impose the assumption of normality. My data is surely not normal, so I would like to use something else. As far as I can tell from Wikipedia, Peirce's criterion is just that. The data I am interested in testing is: 1) Continuous on the unit interval 2)
2006 Jul 20
2
(robust) mixed-effects model with covariate
Dear all, I am unsure about how to specify a model in R and I thought of asking some advice to the list. I have two groups ("Group"= A, B) of subjects, with each subject undertaking a test before and after a certain treatment ("Time"= pre, post). Additionally, I want to enter the age of the subject as a covariate (the performance on the test is affected by age),
2011 Dec 30
3
good method of removing outliers?
Happy holidays all! I know it's very subjective to determine whether a data point is an outlier or not... But are there reasonably good and realistic methods of identifying outliers in R? Thanks a lot!
2004 Sep 23
6
detection of outliers
Hi, this is both a statistical and an R question... what would be the best way / test to detect an outlier value among a series of 10 to 30 values? For instance, if we have the following dataset: 10,11,12,15,20,22,25,30,500 I'd like to have a way to identify the last value as an outlier (only one direction). One way would be to calculate abs(mean - median) and if elevated (to what extent?) delete the
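For the dataset quoted in this question, one common way to make a median-based check concrete is a robust z-score built from the median and the median absolute deviation (MAD). This is a different technique from the abs(mean - median) idea in the post, and the cutoff of 3 is an assumed convention rather than anything the poster stated. A minimal base-R sketch:

```r
# Data quoted in the question above.
x <- c(10, 11, 12, 15, 20, 22, 25, 30, 500)

# Robust centre and spread: median and MAD.
# mad() multiplies by 1.4826 by default so it is comparable to a
# standard deviation under normality.
centre <- median(x)
spread <- mad(x)

# Robust z-scores. The cutoff of 3 is an assumption for illustration.
cutoff <- 3
robust_z <- (x - centre) / spread

# One-sided check, since the poster only cares about large values.
flagged <- x[robust_z > cutoff]
print(flagged)
```

Because the median and MAD are barely moved by the single extreme value, the check is not masked by the very point it is trying to detect, which is the main weakness of mean-based rules on samples this small.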
2007 Apr 25
1
How to identify and exclude the outliers with R?
Hello, everyone, I want to ask a simple question. If I have a set of data, and I want to identify how many outliers there are in the data, which packages and functions can I use? Thanks. Shao chunxuan.
2012 Feb 09
1
Outlier removal techniques
Hello, I need to analyse a data matrix with dimensions of 30x100. Before analysing the data there is, however, a need to remove outliers from the data. I have read quite a lot about outlier removal already and I think the most common technique for that seems to be Principal Component Analysis (PCA). However, I think that this technique is quite subjective. When is an outlier an outlier? I uploaded
2008 Jun 18
2
randomForest outlier
I try to use ?randomForest to find variables that are the most important to divide my dataset (continuous, categorical variables) in two given groups. But when I plot the outliers: plot(outlier(FemMalSex_NAavoid88.rf33, cls=FemMalSex_NAavoid88$Sex), type="h",col=c("red","green")[as.numeric(FemMalSex_NAavoid88$Sex)]) it seems to me that all my values appear as
2005 Feb 25
2
outlier threshold
For the analysis of financial data with a large variance, what is the best way to select an outlier threshold? Listed below, is there a best method to select an outlier threshold and how does R calculate it? In R, how do you find the outlier threshold through an interquartile range? In R, how do you find the outlier threshold using the hist command? In R, how do you find the outlier threshold
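The interquartile-range threshold asked about here is usually taken to mean Tukey's boxplot fences: values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR are flagged. A minimal base-R sketch follows. The data vector is hypothetical, and the multiplier 1.5 is the conventional default rather than something given in the post.

```r
# Hypothetical data for illustration only; not taken from the post.
x <- c(2, 3, 3, 4, 5, 5, 6, 7, 40)

# Quartiles using R's default quantile algorithm (type 7).
q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Tukey fences. k = 1.5 is the usual convention (an assumption here).
k <- 1.5
lower <- q[1] - k * iqr
upper <- q[2] + k * iqr

# Values outside the fences are flagged as potential outliers.
flagged <- x[x < lower | x > upper]
print(c(lower = lower, upper = upper))
print(flagged)
```

For heavy-tailed financial data a larger multiplier may be more appropriate, since the 1.5 rule tends to flag many legitimate observations when the distribution is far from normal.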
2009 Feb 14
6
Outlier Detection for timeseries
Hello R users, Can someone tell me whether there is a package in R that can do outlier detection and gives output similar to what I got from SAS below? Many thanks in advance for any help! Outlier Details Approx Chi-
2005 Apr 22
2
Hoaglin Outlier Method
I am a new user of R so please bear with me. I have reviewed some R books, FAQs and such but the volume of material is great. I am in the process of porting my current SAS and SVS Script code to Lotus Approach, R and WordPerfect. My question is, can you help me determine the best R method to implement the Hoaglin Outlier Method? It is used in the Appendix A and B of the following link.
2011 May 04
1
Outlier removal by Principal Component Analysis : error message
Hi, I am currently analyzing Raman spectroscopic data with the hyperSpec package. I consulted the documentation for this package and found an example workflow dedicated to Raman spectroscopy (see http://hyperspec.r-forge.r-project.org/chondro.pdf). I am currently trying to remove outliers using PCA just as they did in the documentation, but I get an error message I can't
2006 Mar 14
2
bwplot and outlier symbols
Hi, I was just trying to figure out how to beautify my bwplot output. I figured most of the things out on my own. The one thing that still puzzles me is the symbols for the outliers. I can easily change the form of the median symbol by using "pch" but I don't know how to do this for outliers. Obviously the "outpch" of the
2000 Apr 21
1
outlier detection methods in r?
hi - if I sample from a normal distribution with something like n100<-rnorm(100,0,1) and add an outlier with n100[10]<-4 then qqnorm(n100) visually shows the point 4 as an outlier and calculating the probability of a value of 4 or bigger in 100 samples of norm(0,1) gives > 1-exp(log(pnorm(4,0,1))*100) [1] 0.003162164 If I have more than 1 sample above outlier threshold the math is a
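The calculation quoted in this post is the probability that at least one of 100 independent N(0,1) draws is 4 or larger, i.e. 1 - pnorm(4)^100. The post is cut off before it says how the case of several exceedances was handled; one natural reading, assumed here, is a binomial tail over the number of draws above the threshold. A minimal base-R sketch:

```r
n <- 100          # number of independent N(0,1) draws, as in the post
threshold <- 4    # outlier threshold, as in the post

# Probability that a single draw is at or above the threshold.
p_single <- pnorm(threshold, mean = 0, sd = 1, lower.tail = FALSE)

# Probability that at least one of n draws exceeds the threshold.
# Algebraically the same as the post's 1 - exp(log(pnorm(4, 0, 1)) * 100).
p_any <- 1 - (1 - p_single)^n

# Assumed extension: the number of exceedances is Binomial(n, p_single),
# so P(at least 2 exceedances) is an upper binomial tail.
p_two_or_more <- pbinom(1, size = n, prob = p_single, lower.tail = FALSE)

print(p_any)
print(p_two_or_more)
```

The first value agrees with the 0.003162164 reported in the post. Both figures rely on the draws being independent and on the mean and standard deviation being known rather than estimated from the same sample.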
2005 Feb 25
4
Temporal Analysis of variable x; How to select the outlier threshold in R?
For a financial data set with large variance, I'm trying to find the outlier threshold of one variable "x" over a two-year period. I ran qqplot(x2001, x2002) and found a normal distribution. The latter part of the normal distribution did not look linear though. Is there a suitable method in R to find the outlier threshold of this variable from 2001 and 2002?
2012 Sep 28
2
changing outlier shapes of boxplots using lattice
Hello, this is Elaine. I am using the lattice package to generate boxplots. Using Richard's code, the display was almost perfect except for the outlier shape. With the following code, the outliers are drawn as vertical lines. However, I want the outliers to be empty circles. Please kindly advise how to modify the code to change the outlier shapes. Thank you. code package (lattice) dataN <-
2004 Jan 21
1
outlier identification: is there a redundancy-invariant substitution for Mahalanobis distances?
Dear R-experts, Searching the help archives I found a recommendation to do multivariate outlier identification by Mahalanobis distances based on a robustly estimated covariance matrix and compare the resulting distances to a chi^2-distribution with p (number of your variables) degrees of freedom. I understand that compared to Euclidean distances this has the advantage of being scale-invariant.