similar to: some thoughts on outlier detection, need help!

Displaying 20 results from an estimated 3000 matches similar to: "some thoughts on outlier detection, need help!"

2004 Jan 21
1
outlier identification: is there a redundancy-invariant substitution for mahalanobis distances?
Dear R-experts, Searching the help archives I found a recommendation to do multivariate outlier identification by mahalanobis distances based on a robustly estimated covariance matrix and compare the resulting distances to a chi^2-distribution with p (number of your variables) degrees of freedom. I understand that compared to euclidean distances this has the advantage of being scale-invariant.
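The approach described in this post can be sketched as follows. This is a minimal illustration, not the original poster's code: it assumes the recommended package MASS is installed, uses its `cov.rob()` MCD estimator as one possible robust covariance estimate, and runs on simulated data with a 97.5% chi-squared cutoff chosen only for illustration.

```r
library(MASS)  # assumption: MASS (a recommended package) is available

set.seed(1)
p <- 3
# Simulated data (hypothetical): 100 clean points plus 5 planted outliers
clean    <- matrix(rnorm(100 * p), ncol = p)
outliers <- matrix(rnorm(5 * p, mean = 10), ncol = p)
x <- rbind(clean, outliers)

# Robust location and scatter via the minimum covariance determinant
rob <- cov.rob(x, method = "mcd")

# Squared robust Mahalanobis distances of every row
d2 <- mahalanobis(x, center = rob$center, cov = rob$cov)

# Compare against a chi-squared quantile with p degrees of freedom
cutoff  <- qchisq(0.975, df = p)
flagged <- which(d2 > cutoff)
```

Here `flagged` holds the row indices whose robust squared distance exceeds the cutoff; the cutoff level is a tuning choice, not something fixed by the method.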
2011 Nov 16
2
outlier identify in qqplot
Dear Community, I want to identify outliers in my data. I don't know how to use the identify command in the plots obtained. I've gone through the help files and used the mahalanobis example for my purpose: NormalMultivarianteComparefunc <- function(x) { Sx <- cov(x) D2 <- mahalanobis(x, colMeans(x), Sx) plot(density(D2, bw=.5), main="Squared Mahalanobis distances, n=nrow(x),
2005 Aug 08
2
computationally singular
Hi, I have a dataset which has around 138 variables and 30,000 cases. I am trying to calculate a mahalanobis distance matrix for them and my procedure is like this: Suppose my data is stored in mymatrix > S<-cov(mymatrix) # this is fine > D<-sapply(1:nrow(mymatrix), function(i) mahalanobis(mymatrix, mymatrix[i,], S)) Error in solve.default(cov, ...) : system is computationally
2000 Apr 21
1
outlier detection methods in r?
hi - if I sample from a normal distribution with something like n100<-rnorm(100,0,1) and add an outlier with n100[10]<-4 then qqnorm(n100) visually shows the point 4 as an outlier and calculating the probability of a value of 4 or bigger in 100 samples of norm(0,1) gives > 1-exp(log(pnorm(4,0,1))*100) [1] 0.003162164 If I have more than 1 sample above outlier threshold the math is a
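For reference, the calculation quoted above is 1 - pnorm(4)^100, the chance that at least one of 100 independent N(0,1) draws is 4 or larger. One possible way to extend it to k or more exceedances, assuming independent draws, is the binomial tail. The helper name `p_at_least` is hypothetical and not from the thread.

```r
n <- 100
# P(X >= 4) for a single N(0,1) draw
p_exceed <- pnorm(4, mean = 0, sd = 1, lower.tail = FALSE)

# Hypothetical helper: P(at least k of n independent draws exceed the threshold)
p_at_least <- function(k, n, p) {
  pbinom(k - 1, size = n, prob = p, lower.tail = FALSE)
}

p_one <- p_at_least(1, n, p_exceed)  # same quantity as 1 - pnorm(4)^100
p_two <- p_at_least(2, n, p_exceed)  # two or more values at or above 4
```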
2005 Aug 08
2
selecting outliers
Hi everybody, I'd like to know if there's an easy way to extract outlier records from a dataset, in order to perform further analysis on them. Thanks Alessandro
2004 May 26
0
Outlier identification according to Hardin & Rocke (1999)
I'm trying to use a paper by Hardin & Rocke: http://handel.cipic.ucdavis.edu/~dmrocke/Robdist5.pdf as a guide for a function to identify outliers in multivariate data. Attached below is a function that is my attempt to reproduce their method and also a test to see what fraction of the data are identified as outliers. Using this function I am able to reproduce their results regarding the
2009 Aug 05
0
get NA from outlier{randomForest}
Hi I have a data frame like this: V1 V2 V3 V4 Min. :0.01146 Min. :0.0006714 Min. :0.004912 Min. : 0 1st Qu.:0.03938 1st Qu.:0.0072805 1st Qu.:0.052719 1st Qu.:1150 Median :0.04224 Median :0.0077581 Median :0.056388 Median :1150 Mean :0.04010 Mean :0.0074669 Mean :0.052602 Mean :1173 3rd
2011 Sep 26
2
Mahalanobis Distance
Hello R helpers, I'm trying to use Mahalanobis distance to calculate the distance between two time series, to make some comparisons with Euclidean distance, DTW, etc., but I'm having some difficulties. I have, for example, two objects: s.1 <- c( 5.6324702, 1.3994353, -3.2572327, -3.8311846, -1.2248719, 0.9894694, -2.2835332, -5.1969285, -5.2823988, -3.1499400, -1.7307950, 2.8221209,
2004 Mar 26
1
Mahalanobis
Dear all, Why isn't it possible to calculate Mahalanobis distances with R for a matrix with 1 more row (observations) than the number of columns (variables)? > mydata <- matrix(runif(12,-5,5), 4, 3) > mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata)) [1] 2.25 2.25 2.25 2.25 > mydata <- matrix(runif(420,-5,5), 21, 20) > mahalanobis(x=mydata,
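The constant 2.25 in the first call is what one would expect: when the number of rows n is exactly one more than the number of columns p (and the sample covariance is nonsingular), every observation sits at the same squared Mahalanobis distance (n - 1)^2 / n from the sample mean, here 3^2 / 4 = 2.25. A small check on simulated data, offered as an illustration rather than as a reply from the thread:

```r
set.seed(42)
n <- 6; p <- 5                        # one more row than columns
mydata <- matrix(runif(n * p, -5, 5), n, p)

# Squared distances of each row from the sample mean, using the sample covariance
d2 <- mahalanobis(mydata, center = colMeans(mydata), cov = var(mydata))

# All n distances coincide with (n - 1)^2 / n, whatever the data values are
expected <- (n - 1)^2 / n
same <- isTRUE(all.equal(d2, rep(expected, n), tolerance = 1e-6))
```

With this few observations the distances carry no information about which points are unusual, so more rows than p + 1 are needed for outlier screening.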
2009 Jul 20
2
mahalanobis distance
http://www.nabble.com/file/p24569511/mahalanobis.txt mahalanobis.txt http://www.nabble.com/file/p24569511/concentrations.txt concentrations.txt Dear Forum members, I have a problem calculating mahalanobis distances. My data file mahalanobis.txt and categories file concentrations.txt are attached. I do the following steps: x <- as.matrix(read.table("mahalanobis.txt", header=TRUE))
2010 Jan 30
2
Questions on Mahalanobis Distance
Hello, I am a new R user and trying to learn how to implement the mahalanobis function to measure the distance between 2 population centroids. I have used STATISTICA to calculate these differences, but was hoping to learn to do the analysis in R. I have implemented the code as below, but my results are very different from that of STATISTICA, and I believe I may not have interpreted the help
2011 Mar 22
1
Using the mahalanobis( ) function
Hello all, I am a 2 month newbie to R and am stumped. I have a data set that I've run multivariate stats on using the manova function (I included the data set). Now it comes time for a table of effect sizes with significance. The univariate tests are easy. Where I run into trouble filling in the table of effect sizes is the Mahalanobis D as an effect size. I've included the table so
2011 Mar 20
1
Using the Mahalanobis Function
Hello all, I am a 2 month newbie to R and am stumped. I have a data set that I've run multivariate stats on using the manova function (I included the data set). Now it comes time for a table of effect sizes with significance. The univariate tests are easy. Where I run into trouble filling in the table of effect sizes is the Mahalanobis D as an effect size. I've included the table so
2010 Jun 22
1
Mahalanobis distance
I am a new R user. I have a question about Mahalanobis distance. Actually, I have 300 rows and 7 columns. The columns are different measurements and the 300 rows are genes. The genes can be classified into 4 categories. I used dist() with Euclidean distance and cmdscale to do an MDS plot, but found out that Mahalanobis distance may be better. How do I use mahalanobis() to generate a similar dist object which I can use
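One way to get a `dist` object of pairwise Mahalanobis distances, sketched here on simulated data (not the poster's), is to transform the columns with the Cholesky factor of the covariance matrix and then call `dist()`: Euclidean distances between the transformed rows equal Mahalanobis distances between the original rows. Using the overall covariance `cov(x)` is an assumption; a pooled within-group covariance may suit grouped data better.

```r
set.seed(7)
x <- matrix(rnorm(300 * 7), nrow = 300, ncol = 7)  # hypothetical 300 x 7 data

S <- cov(x)
R <- chol(S)                 # upper triangular, t(R) %*% R equals S
z <- x %*% solve(R)          # whitened rows: cov(z) is the identity

d_mahal <- dist(z)                   # pairwise Mahalanobis distances as a "dist" object
mds     <- cmdscale(d_mahal, k = 2)  # usable like any other dist object
```

This avoids calling `mahalanobis()` once per row, which would build a full n x n matrix through repeated solves.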
2007 Feb 20
1
Mahalanobis distance and probability of group membership using Hotelling's T2 distribution
I want to calculate the probability that a group will include a particular point using the squared Mahalanobis distance to the centroid. I understand that the squared Mahalanobis distance is distributed as chi-squared but that for a small number of random samples from a multivariate normal population the Hotelling's T2 (T squared) distribution should be used. I cannot find a function for
2008 Oct 09
2
vectorization instead of using loop
Dear all, I've sent this question 2 days ago and got response from Sarah. Thanks for that. But unfortunately, it did not really solve our problem. The main issue is that we want to use our own (manipulated) covariance matrix in the calculation of the mahalanobis distance. Does anyone know how to vectorize the below code instead of using a loop (which slows it down)? I'd really appreciate
2005 Dec 14
1
About help on 'mahalanobis'
Hi, help on 'mahalanobis' (in the stats package in Rv2.2.0) now says: "Description: Returns the Mahalanobis distance of all rows in 'x' and the vector mu='center' with respect to Sigma='cov'. This is (for vector 'x') defined as D^2 = (x - mu)' Sigma^{-1} (x - mu)" It does return D^2 as written. However,
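A quick check of the point made here, that `mahalanobis()` returns the squared distance D^2 rather than D, using simulated data as an illustration:

```r
set.seed(3)
x     <- matrix(rnorm(50 * 4), ncol = 4)   # hypothetical data
mu    <- colMeans(x)
Sigma <- cov(x)

# Value returned by the stats function for the first row
d2_fun <- mahalanobis(x[1, ], center = mu, cov = Sigma)

# Direct evaluation of (x - mu)' Sigma^{-1} (x - mu)
delta     <- x[1, ] - mu
d2_manual <- drop(t(delta) %*% solve(Sigma) %*% delta)

# The distance itself needs an explicit square root
d <- sqrt(d2_fun)
```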
2008 Dec 08
1
Clustering with Mahalanobis Distance
Dear R ExpeRts, I'm having memory difficulties using Mahalanobis distance to cluster in R. I was wondering if anyone has done it with a matrix of 6525x17 (or something similar to that size). I have a matrix of 6525 genes and 17 samples. I have my R memory increased to the max and am still getting "cannot allocate vector of size" errors. My matrix "x" is
2005 Jun 24
1
Mahalanobis distances
Dear R community Have just recently got back into R after a long break and have been amazed at how much it has grown, and how active the list is! Thank you so much to all those who contribute to this amazing project. My question: I am trying to calculate Mahalanobis distances for a matrix called "fgmatrix" >dim(fgmatrix) [1] 76 15 >fg.cov <- cov.wt(fgmatrix)
2005 Jul 06
1
Help: Mahalanobis distances between 'Species' from iris
Dear R list, I'm trying to calculate Mahalanobis distances for 'Species' of 'iris' data as obtained below: Squared Distance to Species From Species: Setosa Versicolor Virginica Setosa 0 89.86419 179.38471 Versicolor 89.86419 0 17.20107 Virginica 179.38471 17.20107 0 The distances above were obtained with proc