similar to: Removing outliers

Displaying 20 results from an estimated 4000 matches similar to: "Removing outliers"

2011 Oct 21
1
cph/nomogram Design/RMS package hazard ratio: interquartile vs per unit
Hello, I am constructing a nomogram using cph and nomogram commands in Dr. Harrell's Design/RMS package. The HR that I obtain for dichotomous and categorical variables are identical to those that I obtain using STATA stcox. However, the inter-quartile HR I obtain for continuous variables is obviously different, since STATA gives me HR for each unit (year, centimeter, etc) like coxph would
2011 Feb 24
1
Boxplot not doing what I think it should
My box plot below is drawing its upper whisker all the way to the last point, instead of showing the point as an outlier. Am I misunderstanding, or is it a bug? Help(boxplot) states for the parameter "range" that "this determines how far the plot whiskers extend out from the box. If range is positive, the whiskers extend to the most extreme data point which is no more than range times the
2009 Jul 12
2
box and whisker (PR#13821)
In a Box and Whisker plot, I thought that when there are outliers both above and below the whiskers, then the whiskers should both be the same length (plus or minus 1.5 times the inter-quartile range). If you look at the plot for SilwoodWeather on p.155 of The R Book you will see that for November (month = 11) the upper whisker is shorter than the lower, while for other months with
2012 Oct 17
2
loop of quartile groups
Greetings R users, My goal is to generate quartile groups of each variable in my data set. I would like each experiment to have its designated group added as a subsequent column. I can accomplish this individually with the following code: brks <- with(data_variables, cut2(var2, g=4)) #I don't want the actual numbers, I need a numbered group data$test1=factor(brks,
2005 Feb 25
2
outlier threshold
For the analysis of financial data with a large variance, what is the best way to select an outlier threshold? Listed below, is there a best method to select an outlier threshold and how does R calculate it? In R, how do you find the outlier threshold through an interquartile range? In R, how do you find the outlier threshold using the hist command? In R, how do you find the outlier threshold
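One common convention for the interquartile-range part of this question is Tukey's rule: flag values lying more than 1.5 times the IQR beyond the quartiles. The following is only a minimal sketch on made-up data, not an answer taken from the thread; the vector `x` and the multiplier `k` are illustrative assumptions. Note that boxplot() applies a similar rule but uses hinges (see fivenum()) rather than quantile().

```r
# Illustrative data (made up): one value sits far from the rest
x <- c(10, 12, 11, 13, 12, 14, 11, 95)

k   <- 1.5                                  # conventional multiplier (an assumption; tune as needed)
q   <- quantile(x, probs = c(0.25, 0.75))   # quartiles, default type 7
iqr <- IQR(x)                               # equals q[2] - q[1]

# Fences beyond which a value is treated as an outlier
lower <- unname(q[1] - k * iqr)
upper <- unname(q[2] + k * iqr)

outliers <- x[x < lower | x > upper]

print(c(lower = lower, upper = upper))
print(outliers)
```

For heavy-tailed financial data a larger `k`, or a rule based on a robust scale estimate, may be more appropriate; the choice is a judgment call rather than something R decides.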
2010 Oct 26
2
Forcing results from lm into datframe
Hi I need some help getting results from multiple linear models into a dataframe. Let me explain the problem. I have a dataframe with ejection fraction results measured over a number of quartiles and grouped by base_study. My dataframe (800 different base_studies) looks like > afvtprelvefs basestudy quartile ef ef_std entropy CBP0908020 1 21.6 0.53 3.27
2008 Jun 13
2
Quartile regression question
I have data that looks like lake,loglength,logweight 1,2.369215857,1.929418926 1,2.426511261,2.230448921 1,2.434568904,2.298853076 1,2.437750563,2.298853076 1,2.442479769,2.230448921 1,2.445604203,2.356025857 ... 102,2.722633923,3.310268367 102,2.781755375,3.502153893 102,2.836324116,3.683407299 102,2.802773725,3.583312152 102,2.790285164,3.546419267 102,2.806179974,3.599118565
2009 Sep 22
5
use of class variable in r as in Proc means of sas
Hi everyone, I need to calculate quartile values of a variable grouped by another variable, the same as with the aggregate function (where only median, mean or similar functions are possible, I think). Could you please help me achieve the same for other quartile values (5, 10, 25, 75, 90) as for the median using aggregate? Thanks in advance. data : zip price 60000 567000 60001 478654 60004 485647 60001
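A rough sketch of one way to get several percentiles per group in base R, using split() and sapply() with quantile(). The data frame `d` below is made up to mirror the zip/price layout in the question, and the names `d`, `probs`, and `res` are illustrative, not from the thread.

```r
# Illustrative data (made up), two zip codes with three prices each
d <- data.frame(
  zip   = c(60000, 60000, 60000, 60001, 60001, 60001),
  price = c(100, 200, 300, 10, 20, 30)
)

probs <- c(0.05, 0.10, 0.25, 0.75, 0.90)

# split() groups prices by zip; sapply() applies quantile() to each group.
# t() puts one row per zip and one column per percentile.
res <- t(sapply(split(d$price, d$zip), quantile, probs = probs))

print(res)
```

Extra arguments such as `probs` are passed through to quantile(), which is also how aggregate() can be used with functions other than mean or median.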
2010 Jan 22
2
Quartiles and Inter-Quartile Range
Why am I getting a wrong result for quartiles? here is my code: > cbiomass = c(910, 1058, 929, 1103, 1056, 1022, 1255, 1121, 1111, 1192, > 1074, 1415) > summary(cbiomass) > IQR(cbiomass) The result R gives me is: For the summary > Min. 1st Qu. Median Mean 3rd Qu. Max. 910 1048 1088 1104 1139 1415 For IQR > 91.25 ********* The true Q1 is 1039
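The gap between the two Q1 values in this question comes down to which quartile definition is used. As a rough sketch (not part of the original thread): quantile(), which summary() and IQR() rely on, defaults to type 7 interpolation, whereas fivenum() returns Tukey's hinges, the medians of the lower and upper halves of the sorted data. The names `q7` and `hinges` are illustrative.

```r
# Data from the question
cbiomass <- c(910, 1058, 929, 1103, 1056, 1022, 1255, 1121, 1111, 1192, 1074, 1415)

# Default quartiles (type 7): linear interpolation between order statistics
q7 <- quantile(cbiomass, probs = c(0.25, 0.75))

# Tukey hinges: elements 2 and 4 of the five-number summary
hinges <- fivenum(cbiomass)[c(2, 4)]

print(q7)             # 25%: 1047.5, 75%: 1138.75 (summary() rounds these to 1048 and 1139)
print(hinges)         # 1039.0 1156.5
print(IQR(cbiomass))  # 91.25
```

Neither result is wrong; textbooks that take the median of each half arrive at the hinge value 1039, while R's default interpolates. The `type` argument of quantile() selects among the other definitions.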
2003 Oct 28
4
random number generation
Hi every one, I am trying to generate a normally distributed random variable with the following descriptive statistics, min=1, max=99, variance=125, mean=38.32, 1st quartile=38, median=40, 3rd quartile=40, skewness=-0.274. I know the "rnorm" will allow me to simulate random numbers with mean 38.32 and Sd=11.18(sqrt(125)). But I need to have the above mentioned descriptive
2012 Aug 06
2
Splitting Data Into Different Series
Dear R Community, I'm trying to write a loop to split my data into different series. I need to make a new matrix (or series) according to the series code. For instance, every time the "code" column assumes the value "433" I need to save "date", "value", and "code" into the "dados433" matrix. Please take a look at the following
2011 Jul 08
1
Getting wrong NA values using "for" cmd
Hi There, I'm facing one problem to construct a vector using the "for" command: I have one matrix named 'dados' (same as /data/ from portuguese), for example: > dados[140:150,] [,1] [,2] [,3] [1,] 212.7298 0.14 0.11 [2,] 213.3778 0.14 0.11 [3,] 214.0257 0.15 0.11 [4,] 214.6737 0.15 0.12 [5,] 215.3217 0.15 0.12 [6,] 215.9696 0.15 0.12 [7,] 216.6176 0.16
2007 Oct 09
3
Summary vs fivenum results for Q3
I've just started using R and am still a neophyte, but I found the following curious result. I'm using the current version of R (2.5.1 (2007-06-27) ). Why are the results for the third quartile different in the output from the summary and fivenum commands? For the following data set 457 514 530 530 538 560 687 745 745 778 786 790 792
2017 May 18
2
Bug: floating point bug in nclass.FD can cause hist() to crash
Hello everybody, This is a bug involving functions in core R package: graphics::hist.default, grDevices::nclass.FD, and base::pretty.default. It is not yet on Bugzilla. I cannot submit it myself, as I do not have an account. Could somebody else add it for me, perhaps? That would be much appreciated. Kind regards, Sietse Sietse Brouwer Summary ------- Floating point errors can cause a data
2017 Oct 13
2
How to define proper breaks in RFM analysis
> On Oct 13, 2017, at 2:51 AM, PIKAL Petr <petr.pikal at precheza.cz> wrote: > > Hi > > You expect us to solve your problem but you ignore advice already received. > > Your data are unreadable, use dput(yourdata) instead. see ?dput > >> test<-read.table("clipboard", heade=T) > Error in scan(file = file, what = what, sep = sep, quote = quote,
2003 Mar 06
1
Problems with variable types.
Hi all, I have problems in a dataframe variables types. Look: from a loop function: for(...){ ... dados.fin <- rbind(dados.fin, c(L=j, A=j^2, Nsp=nsps, N=length(amosfin$SP), AmT="am",NAm=nam, AMST=amst)) dados.fin <- rbind(dados.fin, c(L=j, A=j^2,
2008 Jan 02
1
Plot.svm error
Hi all, Sorry to be bothering again with probably an easy error to fix, but I've been trying to solve the problem and haven't been able yet to do it. So I'm doing this: > dados<-read.table("b.txt",sep="",nrows=30000) >
2011 Jul 06
3
Tables and merge
----- Original Message ----- From: "Silvano" <silvano at uel.br> To: <r-help at r-project.org> Sent: Thursday, June 30, 2011 9:07 AM Subject: Tables and merge > Hi, > > I have 21 files which is common variable CODE. > Each file refers to a question. > > I would like to join the 21 files into one, to construct > tables for each question by CODE. >
2017 Oct 13
0
How to define proper breaks in RFM analysis
Hemant's problem is that the indicators are not distributed uniformly. With a uniform distribution, categorization gives a reasonably optimal separation of cases. One approach would be to drop categorization and calculate the overall score as the mean of the standardized indicator scores. Whether this is an option I do not know. I did offer an "eyeball" set of breaks in a previous
2008 Mar 28
1
Defining reference category for a cph model summary inside of a "for" loop
I have the following code. > f <- cph(formula = Surv(TimeToDeath, Dead == "Yes") ~1,data=single.dat, x=T, y=T, surv=T) > for(i in c('A', 'B', 'C', 'D', 'E', 'F')){ > f <-update(f,as.formula(paste('Surv(TimeToDeath, Dead == "Yes")~',i,sep=''))) > print(summary(f, paste(i,"=1st