thr3ads.net - similar to: "Stat question: How to deal w/ negative outliers?"

Displaying 20 results from an estimated 2000 matches similar to: "Stat question: How to deal w/ negative outliers?"

Can package plyr also calculate the mode?

2013 Apr 03

I am trying to replicate SAS PROC UNIVARIATE in R. I got most of the statistics I needed, by group, in a data frame using: all1 <- ddply(all, "ACT_NAME", summarise, mean=mean(COUNTS), sd=sd(COUNTS), q25=quantile(COUNTS,.25), median=quantile(COUNTS,.50), q75=quantile(COUNTS,.75), q90=quantile(COUNTS,.90), q95=quantile(COUNTS,.95), q99=quantile(COUNTS,.99))
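Base R has no built-in function for the statistical mode, so one possible approach is a small helper. The sketch below uses made-up data in place of the poster's `all` data frame, and `stat_mode` is a hypothetical name, not a plyr function.

```r
# Toy data standing in for the poster's data frame (values are made up).
dat <- data.frame(
  ACT_NAME = c("A", "A", "A", "B", "B", "B", "B"),
  COUNTS   = c(3, 3, 5, 7, 8, 8, 8)
)

# Hypothetical helper: the most frequent value.
# On ties, the value that appears first in x wins.
stat_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# One mode per group, using base R's tapply().
modes <- tapply(dat$COUNTS, dat$ACT_NAME, stat_mode)
print(modes)
```

A helper like this could presumably be added to the ddply() call as one more summary term, for example mode=stat_mode(COUNTS).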

Removing rows w/ smaller value from data frame

2013 May 23

Hello, I have a column called max_date (max_dt in the data below) in my data frame, and I only want to keep the largest value for each activity. How can I do that? Data frame (columns activity, max_dt): A 2013-03-05; B 2013-03-28; A 2013-03-28; C 2013-03-28; B 2013-03-01. Thank you for your help.
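One way to keep only the latest row per activity, sketched in base R with the five rows from the post. The names `df`, `sorted`, and `latest` are illustrative, and the dates are assumed to be stored as Date values.

```r
# The five rows quoted in the post.
df <- data.frame(
  activity = c("A", "B", "A", "C", "B"),
  max_dt   = as.Date(c("2013-03-05", "2013-03-28", "2013-03-28",
                       "2013-03-28", "2013-03-01")),
  stringsAsFactors = FALSE
)

# Sort by activity, newest date first within each activity.
sorted <- df[order(df$activity, -as.numeric(df$max_dt)), ]

# Keep the first (newest) row of each activity.
latest <- sorted[!duplicated(sorted$activity), ]
print(latest)
```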

How to perform a grouped shapiro wilk test on dataframe

2013 Apr 05

Hello, I was wondering whether it is possible to perform a Shapiro-Wilk normality test on COUNTS, grouped by the variable ACTIVITY, in a data frame called 'all'. Could it be done using plyr? I saw an example that applies to an array but not to a data frame: lapply(split(dataset1$Height,dataset1$Group),shapiro.test) Any thoughts would be much appreciated. My dataframe is in shape: dat ACTIVIT
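The split() and lapply() pattern quoted in the post also works on data frame columns. A minimal sketch follows, with simulated data standing in for the poster's data frame; the group labels and values are invented for illustration.

```r
# Simulated stand-in for the poster's data frame (values are made up).
set.seed(1)
dat <- data.frame(
  ACTIVITY = rep(c("run", "walk"), each = 20),
  COUNTS   = c(rnorm(20, mean = 50, sd = 5), rexp(20, rate = 0.1)),
  stringsAsFactors = FALSE
)

# One Shapiro-Wilk test per ACTIVITY group.
tests <- lapply(split(dat$COUNTS, dat$ACTIVITY), shapiro.test)

# Collect the p-values into a named vector for easier reading.
pvals <- sapply(tests, function(t) t$p.value)
print(pvals)
```

Note that shapiro.test() needs between 3 and 5000 non-missing values per group.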

subset data frame by variable with missing value

2012 Nov 30

Hello, I have a variable in a data frame that contains NA values. I just want to subset so that I get the observations where that variable is missing. In SAS I would do: data missing; set test; if myvalue=' '; run; How can I perform this simple task in R? Thanks in advance for your help.
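In R, missing values are tested with is.na() rather than by comparison. A minimal sketch, assuming a data frame named `test` with a column `myvalue` as in the SAS step (the rows are made up):

```r
# Made-up rows; the names follow the SAS step in the post.
test <- data.frame(
  id      = 1:4,
  myvalue = c("x", NA, "y", NA),
  stringsAsFactors = FALSE
)

# Keep only the rows where myvalue is missing.
missing_rows <- test[is.na(test$myvalue), ]
print(missing_rows)
```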

How to replicate SAS by group processing in R

2012 Oct 10

Hello, I am trying to re-code all my programs from SAS into R. In SAS I use the following code: proc sort data=upper; by tdate stock_symbol expire strike; run; data upper1; set upper; by tdate stock_symbol expire strike; if first.expire then output; rename strike=astrike; run; on the following data set: tdate stock_symbol expiration strike 9/11/2012 C 9/16/2012

Conditional operations in R

2012 Sep 18

Hello, I am a newbie to R coming from a SAS background. I am trying to program the following: I have a monthly data frame with 2 variables (client, pct_total): A 15%, B 10%, C 10%, D 9%, E 8%, F 6%, G 4%. I need to come up with a monthly list of the clients that together make up 50% or just above it every month, so I can pass them to the rest of the program.
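One reading of the request is: sort clients by share, accumulate, and stop at the first client that brings the running total to 50% or more. The sketch below uses the figures from the post, with the percentages assumed to be stored as plain numbers rather than "15%" strings.

```r
# Figures from the post, with percentages as plain numbers (an assumption).
monthly <- data.frame(
  client    = c("A", "B", "C", "D", "E", "F", "G"),
  pct_total = c(15, 10, 10, 9, 8, 6, 4),
  stringsAsFactors = FALSE
)

# Largest clients first, then a running total of their shares.
monthly <- monthly[order(-monthly$pct_total), ]
running <- cumsum(monthly$pct_total)

# Index of the first client at which the running total reaches 50.
# (This is NA if the total never reaches 50; that case is not handled here.)
k <- which(running >= 50)[1]
top_clients <- monthly$client[seq_len(k)]
print(top_clients)
```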

Creating a new by variable in a dataframe

2012 Oct 19

Hello, I have a dataframe w/ 3 variables of interest: transaction,date(tdate) & time(event_tim). How could I create a 4th variable (last_trans) that would flag the last transaction of the day for each day? In SAS I use: proc sort data=all6; by tdate event_tim; run; /*Create last transaction flag per day*/ data all6; set all6; by tdate event_tim; last_trans=last.tdate; Thanks
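SAS's last.tdate flag can be approximated in base R with duplicated(..., fromLast = TRUE) after sorting. A sketch with made-up rows, assuming event_tim sorts correctly as text in HH:MM:SS form:

```r
# Made-up transactions; column names follow the post.
all6 <- data.frame(
  tdate     = as.Date(c("2012-10-18", "2012-10-18", "2012-10-19",
                        "2012-10-19", "2012-10-19")),
  event_tim = c("09:30:00", "15:59:58", "09:31:10", "12:00:00", "15:45:00"),
  stringsAsFactors = FALSE
)

# Equivalent of: proc sort; by tdate event_tim;
all6 <- all6[order(all6$tdate, all6$event_tim), ]

# A row is the last of its day if no later row has the same tdate.
all6$last_trans <- as.integer(!duplicated(all6$tdate, fromLast = TRUE))
print(all6)
```

Dropping fromLast = TRUE gives the counterpart of first.tdate.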

Cannot install package xlsx

2012 Sep 13

I get following error message: trying URL 'http://cran.stat.ucla.edu/bin/windows/contrib/2.15/xlsx_0.4.2.zip' Content type 'application/zip' length 365611 bytes (357 Kb) opened URL downloaded 357 Kb Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) : cannot open the connection In addition: Warning messages: 1: In

Conditional merging in R & if then statement

2012 Aug 31

1) I am wondering how the following SQL statement can be written in R without using sqldf: create table detail2 as select a.* from detail a, pdetail b where a.TDATE=b.TDATE and (a.STIM >= b.STIM and a.STIM <=b.MAXTIM) 2) When I try if then in R, it only applies to the 1st row and not to the whole dataset like in SAS. How do you get round that? in SAS: data summary; set all1;
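For the first part, one possible base R translation is to merge on TDATE and then filter on the range condition. The rows below are invented for illustration; only the column names come from the SQL in the post.

```r
# Invented rows; column names follow the SQL statement.
detail <- data.frame(
  TDATE = c("2012-08-30", "2012-08-30", "2012-08-31"),
  STIM  = c(100, 500, 200),
  stringsAsFactors = FALSE
)
pdetail <- data.frame(
  TDATE  = c("2012-08-30", "2012-08-31"),
  STIM   = c(50, 300),
  MAXTIM = c(150, 400),
  stringsAsFactors = FALSE
)

# Equi-join on TDATE; the shared STIM column gets suffixes .a and .b.
joined <- merge(detail, pdetail, by = "TDATE", suffixes = c(".a", ".b"))

# Range condition: a.STIM >= b.STIM and a.STIM <= b.MAXTIM.
keep <- joined$STIM.a >= joined$STIM.b & joined$STIM.a <= joined$MAXTIM

# Equivalent of "select a.*": keep only the columns of detail.
detail2 <- data.frame(
  TDATE = joined$TDATE[keep],
  STIM  = joined$STIM.a[keep],
  stringsAsFactors = FALSE
)
print(detail2)
```

This builds the full join before filtering, so it may use a lot of memory when many rows share the same TDATE.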

Using lubridate to increment date by business days only

2012 Nov 13

Hello, I know how to increment a date by calendar days: ticker$ldate <- ticker$tdate + days(5) How do I increment it by business days only, so that weekends are not counted? For example, Friday November 2 + 5 days should become Friday November 9, not Wednesday November 7. Thanks for your help.
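A simple way to skip weekends without extra packages is to step one day at a time and count only weekdays. `add_business_days` below is a hypothetical helper, not a lubridate function. It handles a single date and ignores public holidays.

```r
# Hypothetical helper: advance a single date by n business days (Mon-Fri).
add_business_days <- function(start, n) {
  d <- as.Date(start)
  while (n > 0) {
    d <- d + 1
    # POSIXlt weekday: 0 = Sunday, 6 = Saturday.
    wday <- as.POSIXlt(d)$wday
    if (!(wday %in% c(0, 6))) {
      n <- n - 1
    }
  }
  d
}

# The example from the post: Friday 2 November 2012 plus 5 business days.
result <- add_business_days("2012-11-02", 5)
print(result)
```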

if then in R versus SAS

2012 Aug 24

I am new to R and I have the following SAS statements: if otype='M' and ocond='1' and entry='a.Prop' then MOC=1; else MOC=0; How would I translate that into R code? Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/if-then-in-R-versus-SAS-tp4641225.html Sent from the R help mailing list archive at Nabble.com.
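R's if() tests a single condition, so the usual row-wise counterpart of a SAS if/then/else is the vectorised ifelse(). A sketch with made-up rows, assuming the three variables are character columns of a data frame called `dat` (a hypothetical name):

```r
# Made-up rows; column names follow the SAS statement.
dat <- data.frame(
  otype = c("M", "M", "L"),
  ocond = c("1", "2", "1"),
  entry = c("a.Prop", "a.Prop", "a.Prop"),
  stringsAsFactors = FALSE
)

# & is the element-wise "and", so the condition is evaluated for every row.
dat$MOC <- ifelse(dat$otype == "M" & dat$ocond == "1" & dat$entry == "a.Prop",
                  1, 0)
print(dat)
```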

Passing values to sqlQuery like SAS macro

2012 Sep 13

Hello, We lost our SAS licence and I am busy transferring my old SAS programs to the R environment. I am very new to R. In 1 program I was creating SAS macro vars and passing them into a SQL query to run against the server. There are 3 variables: firm, begindt, enddt. The number of values for each varies month to month. Is there any way I could do the same thing in R and pass the aforementioned values
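One common substitute for SAS macro variables is to build the query text with paste0() and sprintf() before sending it. The sketch below only constructs the string. The table name `trades`, the column `tdate`, and all values are assumptions made for illustration.

```r
# Illustrative values; in practice these would change month to month.
firm    <- c("ABC", "XYZ")
begindt <- "2012-08-01"
enddt   <- "2012-08-31"

# Turn the vector of firms into a quoted, comma-separated IN list.
in_list <- paste0("'", firm, "'", collapse = ", ")

# Hypothetical table and column names (trades, tdate).
query <- sprintf(
  "SELECT * FROM trades WHERE firm IN (%s) AND tdate BETWEEN '%s' AND '%s'",
  in_list, begindt, enddt
)
print(query)

# With RODBC the string would then be sent as: sqlQuery(channel, query)
```

Pasting values into SQL is only reasonable for trusted inputs, since nothing here escapes quotes.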

Question re estimating SE for interquantile regression coefficients

2011 Mar 14

Hi, I am a student new to R. I would like to estimate the standard error for the difference in interquantile regression coefficients, but do not know how to do so. For each quantile I estimated the regression coefficient, bootstrapped for the SE, saved the coefficient, and then took the difference between the two, e.g. per90<-rq(y~x, tau = c(0.9),data = data, weights= mec),

Concatenating data frames in R versus SAS

2012 Aug 23

I am trying to concatenate 2 datasets that don't have exactly the same columns. In SAS I did: data summary; set agency prop; run; No problem. In R I get an error message: summary <-rbind(agency,prop) Error in match.names(clabs, names(xi)) : names do not match previous names But when I use rbind.fill, that overwrites the second file with the first one. Is there a way to replicate the SAS process
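A base R way to mimic the SAS set statement is to pad each data frame with the columns it lacks and then call rbind(). The data and the helper name `pad` below are invented for illustration.

```r
# Invented data frames with only one column in common.
agency <- data.frame(id = 1:2, fee = c(10, 20))
prop   <- data.frame(id = 3:4, rate = c(0.5, 0.7))

# Every column that appears in either data frame.
all_cols <- union(names(agency), names(prop))

# Hypothetical helper: add missing columns as NA, then fix the column order.
pad <- function(d) {
  for (col in setdiff(all_cols, names(d))) {
    d[[col]] <- NA
  }
  d[all_cols]
}

summary_df <- rbind(pad(agency), pad(prop))
print(summary_df)
```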

Can you have a by variable in Lag function as in SAS

2012 Nov 15

Hello, I want to use lag on a time variable but I have to take date into consideration, i.e. I don't want days to overlap: I don't want my first time of today to match my last time of yesterday. In SAS I would use: data x; set y; by date tim; previous=lag(tim); if first.date then do; previous=.; end; run; How can I do something similar in R? I can't find
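A grouped lag can be sketched in base R with ave(), which applies a function within each group and returns the results in the original row order. The rows below are made up, and tim is assumed to be numeric.

```r
# Made-up rows; names follow the SAS step.
y <- data.frame(
  date = as.Date(c("2012-11-14", "2012-11-14", "2012-11-15",
                   "2012-11-15", "2012-11-15")),
  tim  = c(930, 1000, 915, 945, 1530)
)

# Equivalent of: by date tim;
y <- y[order(y$date, y$tim), ]

# Shift tim down by one row within each date.
# The first row of every date gets NA, like "if first.date then previous=.".
y$previous <- ave(y$tim, as.character(y$date),
                  FUN = function(x) c(NA, head(x, -1)))
print(y)
```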

fitting data with conditions

2007 Mar 28

I am working on the following question. I know the distribution (lognormal), and in addition I know the 99%, the 90%, and the 1% quantiles. Is there a way in R to find the lognormal distribution, that is, the corresponding logmean and logsd? Many thanks for your help. Regards, Yvonne
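For a lognormal distribution, log(q_p) = meanlog + sdlog * qnorm(p), so the known quantiles lie on a straight line against qnorm(p). With three quantiles and two parameters, a least-squares line is one reasonable fit. In the sketch below the quantile values are generated from assumed parameters (meanlog = 1, sdlog = 0.5) purely to have numbers to work with.

```r
# Probabilities named in the post.
p <- c(0.01, 0.90, 0.99)

# Stand-in quantile values, generated from assumed parameters.
q <- qlnorm(p, meanlog = 1, sdlog = 0.5)

# Regress log-quantiles on standard normal quantiles:
# the intercept estimates meanlog and the slope estimates sdlog.
fit <- lm(log(q) ~ qnorm(p))
meanlog_hat <- unname(coef(fit)[1])
sdlog_hat   <- unname(coef(fit)[2])
print(c(meanlog = meanlog_hat, sdlog = sdlog_hat))
```

With real quantiles that are not exactly lognormal, the fit is a compromise, and the residuals show how far off each quantile is.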

Subset in, not in

2013 Jan 10

Hello, I need to subset my dataframe into 2 parts; in: mm <- subset(agr1, subset=lmpcrd %in% c(11697,149823,7654)) not in: but where do I stick the " !" in the above? I've tried every position. Thanks for your help. -- View this message in context: http://r.789695.n4.nabble.com/Subset-in-not-in-tp4655178.html Sent from the R help mailing list archive at Nabble.com.
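The negation goes in front of the whole %in% expression, wrapped in parentheses. A sketch with made-up rows for `agr1`, where only the three ids come from the post:

```r
# Made-up rows; only the three ids come from the post.
agr1 <- data.frame(lmpcrd = c(11697, 222, 149823, 7654, 333))
ids  <- c(11697, 149823, 7654)

# "in": rows whose lmpcrd is in the list.
mm_in <- subset(agr1, lmpcrd %in% ids)

# "not in": negate the whole %in% test.
mm_out <- subset(agr1, !(lmpcrd %in% ids))
print(mm_out)
```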

mild and extreme outliers in boxplot

2009 Aug 19

dear all, could somebody tell me how I can plot mild outliers as a circle(?) and extreme outliers as an asterisk(*) in a box-whisker plot? Thanks very much in advance -- View this message in context: http://www.nabble.com/mild-and-extreme-outliers-in-boxplot-tp25040545p25040545.html Sent from the R help mailing list archive at Nabble.com.

Using grubbs test for residuals to find outliers

2013 May 17

Hi, I am a new user of R. This is a conceptual question regarding screening out outliers from the dataset in regression. I read that Cook's distance can be used, and that if we want to remove influential observations, we can use the threshold (>4/n) (n = number of observations) to remove any outliers. I also came across Grubbs' test to identify outliers in univariate distributions (assumed normal) but i

How to effectively remove Outliers from a binary logistic regression in R

2012 Sep 05

Hello there, greetings from Germany. I have a simple question for you. I have run a binary logistic model, but there are lots of outliers distorting the real results. I have tried to get rid of the outliers using the following commands: remove = -c(56, 303, 365, 391, 512, 746, 859, 940, 1037, 1042, 1138, 1355) MIGRATION.rebuild <- glm(MIGRATION, subset=remove)
