Displaying 20 results from an estimated 2000 matches similar to: "Stat question: How to deal w/ negative outliers?"

2013 Apr 03
5
Can package plyr also calculate the mode?
I am trying to replicate the SAS proc univariate in R. I got most of the stats I needed for a by grouping in a data frame using: all1 <- ddply(all,"ACT_NAME", summarise, mean=mean(COUNTS), sd=sd(COUNTS), q25=quantile(COUNTS,.25),median=quantile(COUNTS,.50), q75=quantile(COUNTS,.75), q90=quantile(COUNTS,.90), q95=quantile(COUNTS,.95), q99=quantile(COUNTS,.99) )
2013 May 23
3
Removing rows w/ smaller value from data frame
Hello, I have a column called max_date in my data frame and I only want to keep the bigger values for the same activity. How can I do that? data frame: activity max_dt A 2013-03-05 B 2013-03-28 A 2013-03-28 C 2013-03-28 B 2013-03-01 Thank you for your help
2013 Apr 05
2
How to perform a grouped shapiro wilk test on dataframe
Hello, I was wondering if it is possible to perform, on a dataframe called 'all', a Shapiro-Wilk normality test for COUNTS by the grouping variable ACTIVITY? Could it be done using plyr? I saw an e.g. that applies to an array but not to a dataframe: lapply(split(dataset1$Height,dataset1$Group),shapiro.test) Any thoughts would be much appreciated. My dataframe is in shape: dat ACTIVIT
2012 Nov 30
5
subset data frame by variable with missing value
Hello, I have a variable in a data frame that contains NA values. I just want to subset so that I get the obs where that variable is missing. In SAS I would do: data missing; set test; if myvalue=' '; run; How can I perform this simple task in R? Thanks in advance for your help.
2012 Oct 10
3
How to replicate SAS by group processing in R
Hello, I am trying to re-code all my programs from SAS into R. In SAS I use the following code: proc sort data=upper; by tdate stock_symbol expire strike; run; data upper1; set upper; by tdate stock_symbol expire strike; if first.expire then output; rename strike=astrike; run; on the following data set: tdate stock_symbol expiration strike 9/11/2012 C 9/16/2012
2012 Sep 18
4
Conditional operations in R
Hello, I am a newbie to R coming from SAS background. I am trying to program the following: I have a monthly data frame with 2 variables: client pct_total A 15% B 10% C 10% D 9% E 8% F 6% G 4% I need to come up w/ a monthly list of clients that make 50% or just above it every month so I can pass them to the rest of the program.
2012 Oct 19
4
Creating a new by variable in a dataframe
Hello, I have a dataframe w/ 3 variables of interest: transaction,date(tdate) & time(event_tim). How could I create a 4th variable (last_trans) that would flag the last transaction of the day for each day? In SAS I use: proc sort data=all6; by tdate event_tim; run; /*Create last transaction flag per day*/ data all6; set all6; by tdate event_tim; last_trans=last.tdate; Thanks
2012 Sep 13
3
Cannot install package xlsx
I get following error message: trying URL 'http://cran.stat.ucla.edu/bin/windows/contrib/2.15/xlsx_0.4.2.zip' Content type 'application/zip' length 365611 bytes (357 Kb) opened URL downloaded 357 Kb Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) : cannot open the connection In addition: Warning messages: 1: In
2012 Aug 31
2
Conditional merging in R & if then statement
1) I am wondering how the following SQL statement can be written in R language w/o using sqldf: create table detail2 as select a.* from detail a, pdetail b where a.TDATE=b.TDATE and (a.STIM >= b.STIM and a.STIM <=b.MAXTIM) 2) When I try if-then in R it only applies to the 1st row & not to the whole dataset like in SAS. How do you get round that? in SAS: data summary; set all1;
2012 Nov 13
1
Using lubridate to increment date by business days only
Hello, I know how to increment a date by calendar days: ticker$ldate <- ticker$tdate + days(5) How do I increment it by business days only so that weekends are not counted? So for example Friday, November 2 + 5 days becomes Friday, November 9 & not Wednesday, November 7. Thanks for your help.
2012 Aug 24
1
if then in R versus SAS
I am new to R and I have the following SAS statements: if otype='M' and ocond='1' and entry='a.Prop' then MOC=1; else MOC=0; How would I translate that into R code? Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/if-then-in-R-versus-SAS-tp4641225.html Sent from the R help mailing list archive at Nabble.com.
2012 Sep 13
1
Passing values to sqlQuery like SAS macro
Hello, We lost our SAS licence & I am busy transferring my old SAS programs to the R environment. I am very new to R. In 1 program I was creating SAS macro vars & passing them into a SQL query to run against the server. There are 3 variables: firm, begindt, enddt. The # of values for each varies month to month. Is there any way I could do the same thing in R & pass the aforementioned values
2011 Mar 14
0
Question re estimating SE for interquantile regression coefficients
Hi, I am a student new to R. I would like to estimate the standard error for the difference in interquantile regression coefficients but do not know how to do so. For each quantile I estimated the regression coefficient, bootstrapped for the SE and saved the coefficient, and then took the difference between the two, e.g. per90<-rq(y~x, tau = c(0.9),data = data, weights= mec),
2012 Aug 23
3
Concatenating data frames in R versus SAS
I am trying to concatenate 2 datasets that don't have exactly the same columns. In SAS I did: data summary; set agency prop; run; No problem. In R I get an error message: summary <-rbind(agency,prop) Error in match.names(clabs, names(xi)) : names do not match previous names But when I use rbind.fill, that overwrites the second file w/ the first one. Is there a way to replicate the SAS process
2012 Nov 15
3
Can you have a by variable in Lag function as in SAS
Hello, I want to use lag on a time variable but I have to take date into consideration, i.e. I don't want days to overlap: I don't want my first time of today to match my last time of yesterday. In SAS I would use: data x; set y; by date tim; previous=lag(tim); if first.date then do; previous=.; end; run; How can I do something similar in R? I can't find
2007 Mar 28
2
fitting data with conditions
I am working on the following question. I know the distribution (lognormal), and I also know the 99%, 90% and 1% quantiles. Is there a way in R to find the lognormal distribution, that is, the corresponding logmean and logsd? Many thanks for your help. Regards, Yvonne
2013 Jan 10
1
Subset in, not in
Hello, I need to subset my dataframe into 2 parts; in: mm <- subset(agr1, subset=lmpcrd %in% c(11697,149823,7654)) not in: but where do I stick the " !" in the above? I've tried every position. Thanks for your help. -- View this message in context: http://r.789695.n4.nabble.com/Subset-in-not-in-tp4655178.html Sent from the R help mailing list archive at Nabble.com.
2009 Aug 19
2
mild and extreme outliers in boxplot
dear all, could somebody tell me how I can plot mild outliers as a circle(?) and extreme outliers as an asterisk(*) in a box-whisker plot? Thanks very much in advance -- View this message in context: http://www.nabble.com/mild-and-extreme-outliers-in-boxplot-tp25040545p25040545.html Sent from the R help mailing list archive at Nabble.com.
2013 May 17
0
Using grubbs test for residuals to find outliers
Hi, I am a new user of R. This is a conceptual doubt regarding screening out outliers from the dataset in regression. I read up that Cook's distance can be used and if we want to remove influential observations, we can use the metric (>4/n) (n=no of observations) to remove any outliers. I also came across Grubbs' test to identify outliers in univariate distributions (assumed normal) but I
2012 Sep 05
1
How to effectively remove Outliers from a binary logistic regression in R
Hello there, greetings from Germany. I have a simple question for you. I have run a binary logistic model, but there are lots of outliers distorting the real results. I have tried to get rid of the outliers using the following commands: remove = -c(56, 303, 365, 391, 512, 746, 859, 940, 1037, 1042, 1138, 1355) MIGRATION.rebuild <- glm(MIGRATION, subset=remove)