similar to: Data frame manipulation by eliminating rows containing extreme values


2011 Oct 19
1
Subsetting data by eliminating redundant variables
Dear All, I am new to R and have a question that may be easy. I have a large data set with more than 250 variables, and I am reducing the number of variables with the redun function, as in the example below: n <- 100 x1 <- runif(n) x2 <- runif(n) x3 <- x1 + x2 + runif(n)/10 x4 <- x1 + x2 + x3 + runif(n)/10 x5 <- factor(sample(c('a','b','c'),n,replace=TRUE)) x6 <-
2016 Apr 19
5
Interquartile Range
That didn't work Jim! Thanks anyway On Mon, Apr 18, 2016 at 9:02 PM, Jim Lemon <drjimlemon at gmail.com> wrote: > Hi Michael, > At a guess, try this: > > iqr<-function(x) { > return(paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-") > } > > .col3_Range=iqr(datat$tenure) > > Jim > > > > On Tue, Apr 19, 2016 at
2009 Aug 19
2
mild and extreme outliers in boxplot
Dear all, could somebody tell me how I can plot mild outliers as a circle and extreme outliers as an asterisk (*) in a box-whisker plot? Thanks very much in advance.
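One possible way to approach the question above, as a minimal base-R sketch. It assumes the common convention that a mild outlier lies more than 1.5 IQR beyond a quartile and an extreme outlier more than 3 IQR beyond it; the helper name `classify_outliers` and the data are made up for illustration. Note that `boxplot()` itself uses hinges from `fivenum()`, which can differ slightly from `quantile()`.

```r
# Classify each point by its distance beyond the quartiles, in IQR units.
classify_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  # Distance outside the box; zero or negative for points inside it.
  dist <- pmax(q[1] - x, x - q[2]) / iqr
  cut(dist, breaks = c(-Inf, 1.5, 3, Inf),
      labels = c("none", "mild", "extreme"))
}

x <- c(1:20, 40, 80)            # made-up data: 40 is mild, 80 is extreme
cls <- classify_outliers(x)

# Draw the box without the default outlier symbols, then add our own.
boxplot(x, outline = FALSE, ylim = range(x))
points(rep(1, sum(cls == "mild")), x[cls == "mild"], pch = 1)       # circle
points(rep(1, sum(cls == "extreme")), x[cls == "extreme"], pch = 8) # asterisk
```

For several groups the same idea applies per group, with the x position of `points()` set to the group's box index.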
2011 Feb 23
5
mgcv: beta coefficient and 95%CI
Hi, I am doing environmental research. The model is as follows: gam(y1 ~ x1 + s(x2) + s(x3) + s(x4), family = gaussian, fit = TRUE) I would like to obtain the beta coefficient and 95% CI of x4 (or s(x4)). What should I do? Thanks, Lung
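A hedged sketch of one way to get a single coefficient and interval for x4. It assumes x4 is entered as a linear (parametric) term, since a smooth s(x4) has no single beta; the data are simulated purely for illustration, and the interval uses a normal approximation.

```r
library(mgcv)  # recommended package shipped with standard R installations

set.seed(42)
n <- 200
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n), x4 = runif(n))
# Simulated response; the true coefficient of x4 is 2 (demo assumption).
d$y1 <- 1 + 0.5 * d$x1 + sin(2 * pi * d$x2) + d$x3^2 + 2 * d$x4 +
  rnorm(n, sd = 0.3)

# x4 enters linearly here so that it has a single beta coefficient.
fit <- gam(y1 ~ x1 + s(x2) + s(x3) + x4, family = gaussian, data = d)

# Table of parametric terms: estimate, standard error, t value, p value.
ptab <- summary(fit)$p.table
beta <- ptab["x4", "Estimate"]
se   <- ptab["x4", "Std. Error"]
ci   <- beta + c(-1, 1) * qnorm(0.975) * se   # approximate 95% interval
```

If x4 stays inside s(), the usual alternative is a pointwise band around the fitted smooth, for example from `predict(fit, type = "terms", se.fit = TRUE)`.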
2016 Apr 19
2
Interquartile Range
Hi, I am trying to show an interquartile range while grouping values using the function ddply(). So my function call now is like groupedAll <- ddply(data ,~groupColumn ,summarise ,col1_mean=mean(col1) ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
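The grouped "Q1-Q3" range asked about here could be sketched in base R as follows. This is not the poster's ddply() call: it swaps in `split()` and `sapply()` to stay self-contained, the helper name `iqr_range` is made up, and the example data frame is the one William Dunlap uses elsewhere in this thread.

```r
# Return the interquartile range as a "Q1-Q3" character string.
iqr_range <- function(x) {
  q <- round(quantile(x, c(0.25, 0.75), names = FALSE), 0)
  paste(q[1], q[2], sep = "-")
}

data <- data.frame(groupColumn = rep(1:5, 1:5), col1 = 2^(0:14))

# One string per group, named by the group value.
res <- sapply(split(data$col1, data$groupColumn), iqr_range)
```

With plyr loaded, the same helper should drop into `summarise` as `col1_range = iqr_range(col1)`.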
2012 Jul 13
2
significance test interquartile ranges
Hi, I have two non-normal distributions and use interquartile ranges as a dispersion measure. Now I am looking for a test of whether the interquartile ranges of the two distributions are significantly different. Any ideas? Thanks, joerg
2016 Apr 19
0
Interquartile Range
> That didn't work Jim! It always helps to say how the suggestion did not work. Jim's function had a typo in it - was that the problem? Or did you not change the call to ddply to use that function. Here is something that might "work" for you: library(plyr) data <- data.frame(groupColumn=rep(1:5,1:5), col1=2^(0:14)) myIqr <- function(x) {
2016 Apr 19
0
Interquartile Range
Hi Michael, At a guess, try this: iqr<-function(x) { return(paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-") } .col3_Range=iqr(datat$tenure) Jim On Tue, Apr 19, 2016 at 11:15 AM, Michael Artz <michaeleartz at gmail.com> wrote: > Hi, > I am trying to show an interquartile range while grouping values using > the function ddply(). So my function
2016 Apr 19
0
Interquartile Range
Are you aware that there *already is* a function that does this? ?IQR (also your "function" iqr" is just a character string and would have to be parsed and evaluated to become a function. But this is a TERRIBLE way to do things in R as it completely circumvents R's central functional programming paradigm). Cheers, Bert Bert Gunter "The trouble with having an open mind
2016 Apr 19
2
Interquartile Range
To be precise: paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-") is an expression that evaluates to a character string: "round(quantile(x,.25),0) - round(quantile(x,0.75),0)" no matter what the argument of your function, x. Hence return(paste(...)) will return this exact character string and never evaluates x. Cheers, Bert Bert Gunter "The
2002 Sep 15
7
loess crash
Hi, I have a data frame with 6563 observations. I can run a regression with loess using four explanatory variables. If I add a fifth, R crashes. There are no missings in the data, and if I run a regression with any four of the five explanatory variables, it works. Its only when I go from four to five that it crashes. This leads me to believe that it is not an obvious problem with the data,
2016 Apr 19
1
Interquartile Range
Hi, that did not work for me either. The value I got returned from that function was "<rounded mean> - <rounded mean>" :(. Thanks for the reply though. On Tue, Apr 19, 2016 at 10:34 AM, William Dunlap <wdunlap at tibco.com> wrote: > > That didn't work Jim! > > It always helps to say how the suggestion did not work. Jim's > function had a typo
2016 Apr 19
2
Interquartile Range
... and I'm getting another cup of coffee... -- Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Tue, Apr 19, 2016 at 10:30 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote: > NO NO -- I am wrong! The paste() expression is
2016 Apr 19
2
Interquartile Range
If you show us, not just tell us about, a self-contained example someone might show you a non-hacky way of getting the job done. (I don't see an argument to plyr::ddply called 'transform'.) Bill Dunlap TIBCO Software wdunlap tibco.com On Tue, Apr 19, 2016 at 12:18 PM, Michael Artz <michaeleartz at gmail.com> wrote: > Oh thanks for that clarification Bert! Hope you enjoyed
2016 Apr 19
0
Interquartile Range
NO NO -- I am wrong! The paste() expression is of course evaluated. It's just that a character string is returned of the form "something - something". I apologize for the confusion. -- Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County"
2005 Sep 22
2
R: extracting elements in a matrix
Dear R-users, for a given matrix of dimension, say, (n,p), I'd like to extract for every column those elements that are bigger than twice the interquartile range of the corresponding column. Can I get these elements without using a loop? Thank you for your help, Frank
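A loop-free sketch of the extraction described above, reading the request literally (an element is kept when it exceeds 2 * IQR of its own column). The matrix is made up for illustration.

```r
m <- cbind(c(1, 2, 3, 4, 100),
           c(10, 20, 30, 40, 50))

# Per-column threshold: twice the interquartile range.
thr <- 2 * apply(m, 2, IQR)

# Logical matrix marking elements above their column's threshold.
big <- sweep(m, 2, thr, FUN = ">")

# Group the selected elements by column; the factor levels keep
# columns that have no qualifying elements as empty entries.
extracted <- split(m[big], factor(col(m)[big], levels = seq_len(ncol(m))))
```

If "bigger than" was meant as distance from the median or from a quartile, only the line that builds `big` needs to change.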
2016 Apr 20
2
Interquartile Range
Well, instead of your functions try: Mode <- function(x) { tabx <- table(x) tabx[which.max(tabx)] } and use R's IQR function instead of yours. ... so I still don't get why you want to return a character string instead of a value for the IQR; and the mode of a sample defined as above is generally a bad estimator of the mode of the distribution. To say more than that would
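Laid out on separate lines, the suggestion above might look like this. The Mode function is the one quoted in the message; the sample vector is made up. As written, Mode returns a one-element table whose name is the modal value and whose value is its count.

```r
Mode <- function(x) {
  tabx <- table(x)
  tabx[which.max(tabx)]
}

x <- c(3, 1, 3, 2, 3, 2)   # made-up sample

m <- Mode(x)
modal_value <- names(m)       # the most frequent value, as a character string
modal_count <- as.integer(m)  # how many times it occurs

spread <- IQR(x)              # built-in interquartile range, a single number
```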
2016 Apr 19
0
Interquartile Range
Oh thanks for that clarification Bert! Hope you enjoyed your coffee! I ended up just using the transform argument in the ddply function. It worked and it repeated, then I called a mode function in another call to ddply that summarised. Kinda hacky but oh well! On Tue, Apr 19, 2016 at 12:31 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote: > ... and I'm getting another cup of
2012 Aug 08
1
dimnames in array
Hello, I'm working with an array; I'm trying to make it so that an array of dim(42,2,2) has names whose length corresponds to that of the array, and am hoping someone with experience with this can see what I'm not doing correctly: data11 = array(0,c(41,2,2)) y = lsoda(x0,times,fhn$fn.ode,pars)#This is make.fhn() from colloc infer package# y = y[,2:3]
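As a general illustration of the rule involved: each component of `dimnames()` must be NULL or have exactly as many entries as the matching extent in `dim()`. The sketch below uses the 41 x 2 x 2 array from the code in the message; all names are hypothetical.

```r
data11 <- array(0, c(41, 2, 2))

# Each name vector is built to the length of its own dimension.
dimnames(data11) <- list(
  time  = paste0("t", seq_len(dim(data11)[1])),  # 41 names
  var   = c("V", "R"),                           # 2 names (hypothetical)
  group = c("a", "b")                            # 2 names (hypothetical)
)

first_cell <- data11["t1", "V", "a"]
```

Supplying 42 names for a dimension of extent 41 fails with a length mismatch error, so the name vectors are best derived from `dim()` rather than typed by hand.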
2023 Jun 11
1
Problem with filling dataframe's column
Dear Rui; Many thanks for your email. I used one of your snippets, "data2$LU[which(data2$Layer == "Level 12")] <- "Park"", and it works correctly for me. Actually I need to expand the code so as to consider all "Levels" in the "Layer" column. There are more than a hundred levels in the Layer column. If I use your provided code, I have to write it
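One common way to avoid writing a separate assignment per level is a lookup vector, sketched below. The column names `Layer` and `LU` come from the message; the data and every mapping except "Level 12" to "Park" are invented for illustration.

```r
data2 <- data.frame(
  Layer = c("Level 12", "Level 3", "Level 12", "Level 7"),
  stringsAsFactors = FALSE
)

# Named vector: names are Layer values, elements are the LU labels.
lookup <- c("Level 12" = "Park",
            "Level 3"  = "Forest",   # hypothetical mapping
            "Level 7"  = "Road")     # hypothetical mapping

# Index the lookup by Layer; layers missing from the lookup become NA.
data2$LU <- unname(lookup[data2$Layer])
```

With more than a hundred levels, the lookup could be read from a two-column file and joined with `merge()` instead of being typed in.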