thr3ads.net - similar to: "outlier"

Displaying 20 results from an estimated 8000 matches similar to: "outlier"

2011 Dec 06

Why can't I figure this out? :S

Hi, so I don't speak computer and I have no idea what this code is telling the program to do, but I apparently need to be able to find and isolate influential observations. The problem is that I have no idea what the error means or where in the code it comes from. The error I get is below the code: { ## OLS results NameC <- lm(gpanew ~ female + female:lastinit + agenew + canadian + mom_ed + yearstudy) ## default:

problems using lqs()

2003 Feb 10

Dear List-members, I found a strange behaviour in the lqs function. Suppose I have the following data: y <- c(7.6, 7.7, 4.3, 5.9, 5.0, 6.5, 8.3, 8.2, 13.2, 12.6, 10.4, 10.8, 13.1, 12.3, 10.4, 10.5, 7.7, 9.5, 12.0, 12.6, 13.6, 14.1, 13.5, 11.5, 12.0, 13.0, 14.1, 15.1) x1 <- c(8.2, 7.6,, 4.6, 4.3, 5.9, 5.0, 6.5, 8.3, 10.1, 13.2, 12.6, 10.4, 10.8, 13.1, 13.3, 10.4, 10.5, 7.7, 10.0, 12.0,

how to identify the outliers

2002 Nov 26

Hello R-users, Is there any more sophisticated way to identify the outliers in a dataset other than seeing them in a boxplot? I want to exclude them from further analysis and I am interested in their position in my data vector. Rado -- Radoslav Bonk M.S. Dept. of Physical Geography and Geoecology Faculty of Sciences, Comenius University Mlynska Dolina 842 15, Bratislava, SLOVAKIA tel: +421 2 602

Ltsreg and nsamp="exact"

2003 Jun 18

I'm trying to use least trimmed squares using ltsreg with nsamp="exact". When I use the following: rg <- ltsreg(x,y,nsamp="exact") I get: Error in lqs.default(x, y, nsamp = "exact", method = "lts") : NAs in foreign function call (arg 10) In addition: Warning message: NAs introduced by coercion Incidentally, there are no missings in x or y,

a question about LMS and what constitutes outliers

2005 Oct 06

Hi, I have been using the lqs function with method='lms'. However the results I get are a little different from the results noted by Rousseeuw & Leroy (Robust Regression and Outlier Detection) and I was wondering how to use these results for outlier detection. I'm using the stackloss dataset, for which the original Rousseeuw et al. program points out that observations 1,2,3,4

outlier identify in qqplot

2011 Nov 16

Dear Community, I want to identify outliers in my data. I don't know how to use identify command in the plots obtained. I've gone through help files and use mahalanobis example for my purpose: NormalMultivarianteComparefunc <- function(x) { Sx <- cov(x) D2 <- mahalanobis(x, colMeans(x), Sx) plot(density(D2, bw=.5), main="Squared Mahalanobis distances, n=nrow(x),

selecting outliers

2005 Aug 08

Hi everybody, I'd like to know if there's an easy way for extracting outliers record from a dataset, in order to perform further analysis on them. Thanks Alessandro

how to test robustness of correlation

2006 Jan 25

Hi there: As you all know, correlation is not a very robust procedure. Sometimes a correlation can be driven by a few outliers. There are a few ways to improve the robustness of (Pearson) correlation, either by an outlier removal procedure or by a resampling technique. I am wondering if there is any R package or R code that has incorporated outlier removal or resampling procedure in

Average R-squared of model1 to model n

2004 Jun 06

Hi, We have a question about interpreting R-squared. The actual outputs for a test dataset are X=(x1,x2, ..., xn). model 1 predicted the outputs as Y1=(y11,y12,..., y1n) model 2 predicted the outputs as Y2=(y21,y22,..., y2n) ... model m predicted the outputs as Ym=(ym1,ym2,..., ymn) Now we have two ways to calculate R-squared to evaluate the average performance of the committee model. (a)

outlier identification: is there a redundancy-invariant substitution for mahalanobis distances?

2004 Jan 21

Dear R-experts, Searching the help archives I found a recommendation to do multivariate outlier identification by mahalanobis distances based on a robustly estimated covariance matrix and compare the resulting distances to a chi^2-distribution with p (number of your variables) degrees of freedom. I understand that compared to euclidean distances this has the advantage of being scale-invariant.
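The recommendation described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's own code: it assumes `cov.rob()` from the MASS package (MCD method) as the robust covariance estimator, the predictor columns of the built-in `stackloss` data as example input, and a 97.5% chi-square quantile as the cutoff.

```r
library(MASS)  # cov.rob() comes from MASS (assumed choice of robust estimator)

set.seed(1)                        # cov.rob() relies on random subsampling
x <- as.matrix(stackloss[, 1:3])   # example data: the three stackloss predictors
p <- ncol(x)                       # number of variables

rob <- cov.rob(x, method = "mcd")           # robust centre and covariance matrix
d2  <- mahalanobis(x, rob$center, rob$cov)  # squared robust Mahalanobis distances

cutoff  <- qchisq(0.975, df = p)   # assumed cutoff: chi-square with p degrees of freedom
flagged <- which(d2 > cutoff)      # positions of observations beyond the cutoff
flagged
```

The 0.975 quantile is a common convention rather than anything stated in the post; a different level only changes the `qchisq()` call.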

Simple simulation in R

2003 Aug 26

Hello all I have a feeling this is very simple......but I am not sure how to do it My boss has two variables, one is an average of 4 numbers, the other is an average of 3 of those numbers i.e var1 = (X1 + X2 + X3 + X4)/4 var2 = (X1 + X2 + X3)/3 all of the X variables are supposed to be measuring similar constructs not surprisingly, these are highly correlated (r = .98), the question is how
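Since the question itself is cut off, the following is only a guess at the kind of simulation meant. It assumes the four X variables are independent standard normal draws (an assumption, since the post says they measure similar constructs and are therefore probably correlated) and shows how much correlation between the two averages arises purely from the shared terms.

```r
set.seed(42)
n <- 10000                            # hypothetical number of simulated cases

# Assumption: X1..X4 are independent standard normal variables
X <- matrix(rnorm(n * 4), ncol = 4)

var1 <- rowMeans(X)                   # (X1 + X2 + X3 + X4) / 4
var2 <- rowMeans(X[, 1:3])            # (X1 + X2 + X3) / 3

r_sim    <- cor(var1, var2)           # simulated correlation
r_theory <- sqrt(3) / 2               # value implied by independence and equal variances
c(simulated = r_sim, theoretical = r_theory)
```

Under these assumptions the two averages are already strongly correlated by construction, so an observed r of .98 would reflect both the overlap and whatever correlation exists among the X variables themselves.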

Outlier identification according to Hardin & Rocke (1999)

2004 May 26

I'm trying to use a paper by Hardin & Rocke: http://handel.cipic.ucdavis.edu/~dmrocke/Robdist5.pdf as a guide for a function to identify outliers in multivariate data. Attached below is a function that is my attempt to reproduce their method and also a test to see what fraction of the data are identified as outliers. Using this function I am able to reproduce their results regarding the

Robust estimate of variance

1999 Feb 09

Has anybody written or located a robust version of Var(X)?

Outlier statistics question

2010 Nov 30

I have a statistical question. The data sets I am working with are right-skewed so I have been plotting the log transformations of my data. I am using a Grubbs Test to detect outliers in the data, but I get different outcomes depending on whether I run the test on the original data or the log(data). Here is one of the problematic sets: fgf2p50=c(1.563,2.161,2.529,2.726,2.442,5.047)
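The difference between the two scales can be seen by computing the Grubbs statistic directly. The sketch below uses only base R and the data vector from the post; the helper names `grubbs_stat` and `grubbs_crit` are hypothetical, and the two-sided 5% critical value formula is the standard textbook one, not necessarily what the poster's software reports.

```r
fgf2p50 <- c(1.563, 2.161, 2.529, 2.726, 2.442, 5.047)  # data from the post

# Grubbs statistic: largest absolute deviation from the mean, in SD units
grubbs_stat <- function(x) max(abs(x - mean(x))) / sd(x)

# Two-sided critical value at level alpha for sample size n (textbook formula)
grubbs_crit <- function(n, alpha = 0.05) {
  t2 <- qt(alpha / (2 * n), df = n - 2, lower.tail = FALSE)^2
  (n - 1) / sqrt(n) * sqrt(t2 / (n - 2 + t2))
}

g_raw <- grubbs_stat(fgf2p50)        # statistic on the original scale
g_log <- grubbs_stat(log(fgf2p50))   # statistic after the log transformation
crit  <- grubbs_crit(length(fgf2p50))

c(raw = g_raw, log = g_log, critical = crit)
```

Because the log transformation pulls in the right tail, the largest value sits fewer standard deviations from the mean on the log scale, which is why the test can give different answers for the same data.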

Outlier removal techniques

2012 Feb 09

Hello, I need to analyse a data matrix with dimensions of 30x100. Before analysing the data there is, however, a need to remove outliers from the data. I have already read quite a lot about outlier removal and I think the most common technique for that seems to be Principal Component Analysis (PCA). However, I think that this technique is quite subjective. When is an outlier an outlier? I uploaded

Email or attachment blocked

2004 Jul 16

An email or attachment sent from your address (or with your address as the sender) has been rejected by Allerød Kommune. Spam and viruses are typically sent under the guise of other senders, so the blocked email need not have originated directly from you. (However, always remember to keep an up-to-date antivirus program on your computer.) You can, if you wish, scan your computer with the free tool

(OT) Does pearson correlation assume bivariate normality of the data?

2009 May 26

Dear all, The other day I was reading this post [1] that slightly surprised me: "To reject the null of no correlation, an hypothsis test based on the normal distribution. If normality is not the base assumption your working from then p-values, significance tests and conf. intervals dont mean much (the value of the coefficient is not reliable) " (BOB SAMOHYL). To me this implied that in

outlier tests

2004 Jun 30

I have been learning about some outlier tests -- Dixon and Grubb, specifically -- for small data sets. When I try help.start() and search for outlier tests, the only response I manage to find is the Bonferroni test available from the CAR package... are there any other packages that offer outlier tests? Are the Dixon and Grubb tests "good" for small samples or are others more

implementing Grubbs outlier test on a large dataframe

2009 Feb 14

Hi! I'm trying to run an outlier test once per row in a large dataframe. Ideally, I'd do this and then add the p-value results and the number flagged as an outlier as two new separate columns of the dataframe. The Grubbs outlier test requires a vector, and I'm confused about how to turn each row of my dataframe into a vector and then run a Grubbs test for each row containing the vector of numbers

randomForest outlier

2008 Jun 18

I try to use ?randomForest to find variables that are the most important to divide my dataset (continuous, categorical variables) in two given groups. But when I plot the outliers: plot(outlier(FemMalSex_NAavoid88.rf33, cls=FemMalSex_NAavoid88$Sex), type="h",col=c("red","green")[as.numeric(FemMalSex_NAavoid88$Sex)]) it seems to me that all my values appear as
