similar to: outlier detection methods in r?

Displaying 20 results from an estimated 1000 matches similar to: "outlier detection methods in r?"

2005 Aug 04
1
some thoughts on outlier detection, need help!
Dear listers: I have an idea for doing outlier detection and I need to use R to implement it first. I hope I can get some input from all the gurus here. I selected a distance-based approach. Step 1: calculate the distance between any two rows of a data frame. Considering the scaling among different variables, I chose Mahalanobis, using the variance as the scaler. Step 2: Let k be the number of
2004 Jan 21
1
outlier identification: is there a redundancy-invariant substitution for mahalanobis distances?
Dear R-experts, Searching the help archives I found a recommendation to do multivariate outlier identification by mahalanobis distances based on a robustly estimated covariance matrix and compare the resulting distances to a chi^2-distribution with p (number of your variables) degrees of freedom. I understand that compared to euclidean distances this has the advantage of being scale-invariant.
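A minimal sketch of the recommendation described in this snippet (Mahalanobis distances from a robustly estimated covariance matrix, compared against a chi-squared distribution with p degrees of freedom). The simulated data, the choice of `MASS::cov.rob` with `method = "mcd"` as the robust estimator, and the 0.975 quantile are all assumptions made for illustration; the original post names none of them.

```r
# Robust Mahalanobis distances with a chi-squared cutoff (illustrative sketch).
library(MASS)  # recommended package shipped with standard R installations

set.seed(1)
p <- 3
x <- matrix(rnorm(200 * p), ncol = p)  # simulated data (assumption)
x[1:5, ] <- x[1:5, ] + 8               # plant five obvious outliers

# Robust estimates of centre and covariance (MCD is an assumed choice).
rob <- cov.rob(x, method = "mcd")

# mahalanobis() returns squared distances.
d2 <- mahalanobis(x, rob$center, rob$cov)

# Compare against a chi-squared quantile with p degrees of freedom;
# the 0.975 level is an assumption, not taken from the post.
cutoff  <- qchisq(0.975, df = p)
flagged <- which(d2 > cutoff)
```

Because the centre and covariance come from a robust fit, the planted rows do not inflate the estimates that are used to judge them.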
2011 Nov 16
2
outlier identify in qqplot
Dear Community, I want to identify outliers in my data. I don't know how to use identify command in the plots obtained. I've gone through help files and use mahalanobis example for my purpose: NormalMultivarianteComparefunc <- function(x) { Sx <- cov(x) D2 <- mahalanobis(x, colMeans(x), Sx) plot(density(D2, bw=.5), main="Squared Mahalanobis distances, n=nrow(x),
2024 Dec 02
2
EFI 64bit and Kernel 32 bit [redux]
Good day, Geoff. On 2024-12-02 03:22, Geoff Winkless wrote: > [...] > > I tried (a truncated version of) your instructions with my kernel and > it boots fine under qemu. Sadly that same kernel will not boot on my > real hardware (an ASRock N100-based board). > Just to confirm, are these 2 points all true?: 1. On the "ASRock N100-based board," your
2005 Aug 08
2
selecting outliers
Hi everybody, I'd like to know if there's an easy way to extract outlier records from a dataset, in order to perform further analysis on them. Thanks Alessandro
2007 Aug 10
7
Help wit matrices
Hello all, I am working with a 1000x1000 matrix, and I would like to return a 1000x1000 matrix that tells me which values in the matrix are greater than a threshold value (1 or 0 indicator). I have tried mat2<-as.matrix(as.numeric(mat1>0.25)) but that returns a 1:100000 matrix. I have also tried for loops, but they are grossly inefficient. Thanks for all your help in advance. Lanre
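One possible way to get the indicator matrix asked for here, offered as a sketch and not as the answer given in the thread. `as.numeric()` drops the `dim` attribute, which is why the attempt above loses the matrix shape; arithmetic on the logical matrix keeps it. The random `mat1` below is an assumption standing in for the poster's data.

```r
# 0/1 indicator matrix for values above a threshold, without loops.
set.seed(42)
mat1 <- matrix(runif(1000 * 1000), nrow = 1000)  # stand-in data (assumption)

# Comparison yields a logical matrix; multiplying by 1 converts it to
# numeric while preserving the dimensions.
mat2 <- (mat1 > 0.25) * 1

# Alternative: keep the logical matrix and change only its storage mode.
mat3 <- mat1 > 0.25
storage.mode(mat3) <- "integer"
```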
2023 Feb 20
1
[PATCH v2] ocfs2: fix non-auto defrag path not working issue
This fixes three issues on move extents ioctl without auto defrag: a) In ocfs2_find_victim_alloc_group(), we have to convert bits to block first in case of global bitmap. b) In ocfs2_probe_alloc_group(), when finding enough bits in block group bitmap, we have to back off move_len to start pos as well, otherwise it may corrupt filesystem. c) In ocfs2_ioctl_move_extents(), set me_threshold both for
2004 May 26
0
Outlier identification according to Hardin & Rocke (1999)
I'm trying to use a paper by Hardin & Rocke: http://handel.cipic.ucdavis.edu/~dmrocke/Robdist5.pdf as a guide for a function to identify outliers in multivariate data. Attached below is a function that is my attempt to reproduce their method and also a test to see what fraction of the data are identified as outliers. Using this function I am able to reproduce their results regarding the
2009 Aug 11
1
Help on a combinatorial task (lists?)
Hello! I have the following combinatorial problem. Consider the cumulative sums of all permutations of a given weight vector 'w'. I need to know how often the weight in a certain position brings the cumulative sum to or above the given threshold 'q'. In other words, how often is each weight decisive in raising the cumulative sum above 'q'? Here is what I do: w <-
2023 Feb 17
1
[PATCH] ocfs2: fix non-auto defrag path not working issue
This commit fixes three issues on the non-auto defrag path (defragfs.ocfs2 doesn't set OCFS2_MOVE_EXT_FL_AUTO_DEFRAG on range.me_flags): - For ocfs2_find_victim_alloc_group(), old code forgot to enlarge the bitmap range for the global_bitmap case. Old code could generate negative vict_bit. - For ocfs2_probe_alloc_group(), old code forgot to back off move_len when finding enough bitmap space. Old code has
2005 Feb 25
2
outlier threshold
For the analysis of financial data with a large variance, what is the best way to select an outlier threshold? Listed below, is there a best method to select an outlier threshold and how does R calculate it? In R, how do you find the outlier threshold through an interquartile range? In R, how do you find the outlier threshold using the hist command? In R, how do you find the outlier threshold
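For the interquartile-range question in this snippet, a minimal sketch of the common 1.5 × IQR fence rule. The simulated log-normal values and the 1.5 multiplier are assumptions for illustration; the post does not specify either, and this is not presented as the best threshold for financial data.

```r
# Outlier thresholds from the interquartile range (1.5 * IQR fences).
set.seed(7)
x <- c(rlnorm(200, meanlog = 0, sdlog = 1), 500)  # simulated values plus one extreme point (assumption)

q   <- quantile(x, c(0.25, 0.75), names = FALSE)  # first and third quartiles
iqr <- q[2] - q[1]                                # same value as IQR(x)

lower <- q[1] - 1.5 * iqr  # lower fence
upper <- q[2] + 1.5 * iqr  # upper fence

flagged_values <- x[x < lower | x > upper]
```

`boxplot.stats(x)$out` applies a similar rule but is based on hinges from `fivenum()`, so its result can differ slightly from the quantile-based fences above.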
2012 Feb 09
1
Outlier removal techniques
Hello, I need to analyse a data matrix with dimensions of 30x100. Before analysing the data there is, however, a need to remove outliers from the data. I have read quite a lot about outlier removal already and I think the most common technique for that seems to be Principal Component Analysis (PCA). However, I think that this technique is quite subjective. When is an outlier an outlier? I uploaded
2009 Feb 14
2
implementing Grubbs outlier test on a large dataframe
Hi! I'm trying to implement an outlier test once/row in a large dataframe. Ideally, I'd do this then add the Pvalue results and the number flagged as an outlier as two new separate columns to the dataframe. Grubbs outlier test requires a vector and I'm confused how to make each row of my dataframe a vector, followed by doing a Grubbs test for each row containing the vector of numbers
2004 Jun 30
1
outlier tests
I have been learning about some outlier tests -- Dixon and Grubbs, specifically -- for small data sets. When I try help.start() and search for outlier tests, the only response I manage to find is the Bonferroni test available from the CAR package... are there any other packages that offer outlier tests? Are the Dixon and Grubbs tests "good" for small samples or are others more
2010 Jul 14
1
randomForest outlier return NA
Dear R-users, I have a problem with randomForest{outlier}. After running the following code ( that produces a silly data set and builds a model with randomForest ): ####################### library(randomForest) set.seed(0) ## build data set X <- rbind( matrix( runif(n=400,min=-1,max=1), ncol = 10 ) , rep(1,times= 10 ) ) Y <- matrix( nrow = nrow(X), ncol = 1) for( i in (1:nrow(X))){
2006 Apr 28
1
Error in rm.outlier method
Hi, I am trying to use the rm.outlier method but am encountering the following error: > y <- rnorm(100) > rm.outlier(y) Error: Error in if (nrow(x) != ncol(x)) stop("x must be a square matrix") : argument is of length zero What's wrong here? TIA Sachin
2010 Nov 30
3
Outlier statistics question
I have a statistical question. The data sets I am working with are right-skewed so I have been plotting the log transformations of my data. I am using a Grubbs Test to detect outliers in the data, but I get different outcomes depending on whether I run the test on the original data or the log(data). Here is one of the problematic sets: fgf2p50=c(1.563,2.161,2.529,2.726,2.442,5.047)
2009 Sep 12
1
medcouple-based outlier detection in R
I need to detect outliers in a large data set which is highly right-skewed. I plan to use medcouple-based outlier detection. Is there any support for medcouple-based outlier detection in R? Are there any other routines in R to perform outlier detection in highly right-skewed data? Manuj Sharma
2011 Sep 29
1
rm.outlier produces a list
Hello, Why does rm.outlier produce a list for me? I know it's something about my data because I can't make a mock-up that reproduces the issue. Any ideas? My data goes in as a matrix and comes out as a list: > class(dat) [1] "matrix" > dat = rm.outlier(dat) > class(dat) [1] "list" > Thanks, Ben
2005 Feb 25
4
Temporal Analysis of variable x; How to select the outlier threshold in R?
For a financial data set with large variance, I'm trying to find the outlier threshold of one variable "x" over a two-year period. I ran qqplot(x2001, x2002) and found a normal distribution. The latter part of the normal distribution did not look linear though. Is there a suitable method in R to find the outlier threshold of this variable from 2001 and 2002?