similar to: implementing Grubbs outlier test on a large dataframe

Displaying 20 results from an estimated 1100 matches similar to: "implementing Grubbs outlier test on a large dataframe"

2010 Sep 15
1
cochran-grubbs tests results
Hello, I'm new to the R world and I don't know much about statistics, but now I have to analyze some data and I already have some first queries: I have 5 sets of area measures and each set has 5 repetitions. My first step is to check the data for outliers. I've used the outliers package. I have to use the Cochran test and the Grubbs test in case I find any outlier. The problem
2010 Nov 30
3
Outlier statistics question
I have a statistical question. The data sets I am working with are right-skewed so I have been plotting the log transformations of my data. I am using a Grubbs Test to detect outliers in the data, but I get different outcomes depending on whether I run the test on the original data or the log(data). Here is one of the problematic sets: fgf2p50=c(1.563,2.161,2.529,2.726,2.442,5.047)
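The discrepancy described in this post can be reproduced in base R. The following is only a minimal sketch: `grubbs_stat` is a hypothetical helper (not from the post or from any package) that computes the two-sided Grubbs statistic and its critical value, and the significance level of 0.05 is an assumption. The poster's own test may have used different settings.

```r
# Data quoted in the post above
fgf2p50 <- c(1.563, 2.161, 2.529, 2.726, 2.442, 5.047)

# Hypothetical helper: two-sided Grubbs statistic G = max|x - mean| / sd,
# plus the critical value derived from the t distribution.
grubbs_stat <- function(x, alpha = 0.05) {
  n <- length(x)
  dev <- abs(x - mean(x))
  g <- max(dev) / sd(x)
  # Upper t quantile at level alpha / (2n) with n - 2 degrees of freedom
  t_crit <- qt(alpha / (2 * n), df = n - 2, lower.tail = FALSE)
  g_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))
  list(G = g, critical = g_crit, suspect = x[which.max(dev)])
}

raw <- grubbs_stat(fgf2p50)         # test on the original scale
logged <- grubbs_stat(log(fgf2p50)) # test on the log scale

# Compare each statistic with its critical value
print(c(raw = raw$G, critical = raw$critical))
print(c(logged = logged$G, critical = logged$critical))
```

Under these assumptions the largest value is the suspect on both scales, but the statistic exceeds the critical value only on the original scale. The log transform pulls the right tail in, so the same point sits closer to the mean in standard-deviation units.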
2005 Apr 14
2
grubbs.test
Dear All, I have small samples of data (between 6 and 15) for numerous time series points. I am assuming the data for each time point is normally distributed. The problem is that the data arrives sporadically and I would like to detect the number of outliers after I have six data points for any time period. Essentially, I would like to detect the number of outliers when I have 6 data points then
2004 Jun 30
1
outlier tests
I have been learning about some outlier tests -- Dixon and Grubbs, specifically -- for small data sets. When I try help.start() and search for outlier tests, the only response I manage to find is the Bonferroni test available from the car package... are there any other packages that offer outlier tests? Are the Dixon and Grubbs tests "good" for small samples or are others more
2006 Jul 20
2
(robust) mixed-effects model with covariate
Dear all, I am unsure about how to specify a model in R and I thought of asking some advice to the list. I have two groups ("Group"= A, B) of subjects, with each subject undertaking a test before and after a certain treatment ("Time"= pre, post). Additionally, I want to enter the age of the subject as a covariate (the performance on the test is affected by age),
2013 May 17
0
Using grubbs test for residuals to find outliers
Hi, I am a new user of R. This is a conceptual doubt regarding screening out outliers from the dataset in regression. I read that Cook's distance can be used and, if we want to remove influential observations, we can use the metric (>4/n) (n = number of observations) to remove any outliers. I also came across Grubbs' test to identify outliers in univariate distributions (assumed normal) but I
2012 Apr 18
1
Peirce's criterion
Hello all, I would like to rigorously test whether observations in my dataset are outliers. I guess all the main tests in R (Grubbs) impose the assumption of normality. My data is surely not normal, so I would like to use something else. As far as I can tell from Wikipedia, Peirce's criterion is just that. The data I am interested in testing is: 1) Continuous on the unit interval 2)
2011 Dec 30
3
good method of removing outliers?
Happy holidays all! I know it's very subjective to determine whether some data is an outlier or not... But are there reasonably good and realistic methods of identifying outliers in R? Thanks a lot!
2007 Apr 25
1
How to identify and exclude the outliers with R?
Hello, everyone, I want to ask a simple question. If I have a set of data, and I want to identify how many outliers there are in the data, which packages and functions can I use? Thanks. Shao chunxuan.
2004 Sep 23
6
detection of outliers
Hi, this is both a statistical and an R question... what would be the best way / test to detect an outlier value among a series of 10 to 30 values? For instance, if we have the following dataset: 10,11,12,15,20,22,25,30,500 I'd like to have a way to identify the last value as an outlier (only one direction). One way would be to calculate abs(mean - median) and if elevated (to what extent?) delete the
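One way to flag a single extreme value in the quoted dataset is a median/MAD rule. This is a different technique from the abs(mean - median) idea in the post. The sketch below uses base R only; the cutoff of 3.5 is a common convention and an assumption here, not something taken from the thread.

```r
# Dataset quoted in the post above
x <- c(10, 11, 12, 15, 20, 22, 25, 30, 500)

# Robust z-score: distance from the median in units of the MAD.
# mad() applies the 1.4826 consistency constant by default.
robust_z <- (x - median(x)) / mad(x)

# One-directional check: only unusually large values are flagged.
cutoff <- 3.5  # assumed threshold, adjust to taste
flagged <- x[robust_z > cutoff]
print(flagged)
```

Because the median and the MAD are barely affected by the extreme value, the score of the suspect point is not masked by its own influence on the mean and standard deviation.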
2017 Jun 03
2
New var
Thank you all for the useful suggestion. I did some of my homework. library(data.table) DFM <- read.table(header=TRUE, text='obs start end 1 2/1/2015 1/1/2017 2 4/11/2010 1/1/2011 3 1/4/2006 5/3/2007 4 10/1/2007 1/1/2008 5 6/1/2011 1/1/2012 6 10/5/2004 12/1/2004',stringsAsFactors = FALSE) DFM DFM$D =as.numeric(difftime(as.Date(DFM$end,format="%m/%d/%Y"),
2017 Jun 04
2
New var
Thank you Jeff and All, Within a given time period (say 700 days, from the start day), I am expecting measurements taken at each time interval. In this case "0" means measurement taken, "1" means not taken (stopped or opted out), and "-1" means don't consider that time period for that individual. This will be compared with the actual measurements taken (Observed-
2017 Jun 04
0
New var
# read.table is NOT part of the data.table package #library(data.table) DFM <- read.table( text= 'obs start end 1 2/1/2015 1/1/2017 2 4/11/2010 1/1/2011 3 1/4/2006 5/3/2007 4 10/1/2007 1/1/2008 5 6/1/2011 1/1/2012 6 10/5/2004 12/1/2004 ',header = TRUE, stringsAsFactors = FALSE) # cleaner way to compute D DFM$start <- as.Date( DFM$start, format="%m/%d/%Y" ) DFM$end
2017 Jun 04
0
New var
Since the number of choices is small (6), how about this? Starting with Jeff's initial DFM: DFM <- structure(list(obs = 1:6, start = structure(c(16467, 14710, 13152, 13787, 15126, 12696), class = "Date"), end = structure(c(17167, 14975, 13636, 13879, 15340, 12753), class = "Date"), D = c(700, 265, 484, 92, 214, 57), bin = structure(c(6L, 3L, 5L, 1L, 3L, 1L), .Label
2017 Jan 17
2
bug in rbind?
I suspect there may be a bug in base::rbind.data.frame. Below is a minimal example of the problem: m <- matrix (1:12, 3) dfm <- data.frame (c = 1 : 3, m = I (m)) str (dfm) m.names <- m rownames (m.names) <- letters [1:3] dfm.names <- data.frame (c = 1 : 3, m = I (m.names)) str (dfm.names) rbind (m, m.names) rbind (m.names, m) rbind (dfm, dfm.names) #not working rbind
2011 Aug 27
2
Am having trouble calling a function
In my main R program, I have source("retaanalysis/Functions/doAirport.R") .... stuff to read data and calculate ads sapply(ads, function(x) {doAirport(x, base)} ) And doAirport has # analyze the flights for a given airport doAirport = function(df, base) { # Get rid of unused runway factor levels (from other airports) df$lrw <- drop.levels(df$lrw) # In gdata package #
2017 Jul 16
2
About doing figures
Hi R users, I still have the problem about plotting. I wanted to put the datasets on one figure, x-axis represents values B, y-axis represents values C, while different colors label column A. Each record uses a circle on the figure, while hollow circles represent DF=1 and solid circles represent DF=2. I put my code below, but the A labels do not correspond to the true record, so I don't know
2017 Jul 16
2
About doing figures
Hi Jim, For true color, I meant that the points in the figure do not correspond to the values from the dataframe. Also, why to use rainbow(9) here? And the legend is straight in the middle, is it possible to reformat it to the very bottom? Thanks again. On Sun, Jul 16, 2017 at 2:50 AM, Jim Lemon <drjimlemon at gmail.com> wrote: > Hi lily, > As I have no idea of what the "true
2017 Jul 16
0
About doing figures
Hi lily, As I have no idea of what the "true record" is, I can only guess. Maybe this will help: # get some fairly distinct colors rainbow_colors<-rainbow(9) # this should sort the numbers in dfm$A dfm$Acolor<-factor(dfm$A) plot(dfm$B,dfm$C,pch=ifelse(dfm$DF==1,1,19), col=rainbow_colors[as.numeric(dfm$Acolor)]) legend("bottom",legend=sort(unique(dfm$A)),
2017 Jul 16
0
About doing figures
For more than 10 records, how to reformat the colors? Also, how to show the first legend only, but at the bottom, while the second legend in your code is not necessary? In all, the same A values have the same color, but different symbols in DF==1 and DF==2. Thanks for your help. On Sun, Jul 16, 2017 at 9:28 AM, lily li <chocold12 at gmail.com> wrote: > Hi Jim, > > For true color,