Again, IQR returns both a .25 and a .75 value and it failed, which is
why I didn't use it before. Also, the first function just returns the same
value repeating. Since they are the same, before the second call, using the
mode function is just a way to grab one value. I could have used average,
min, max, they all would have returned the same thing.

Mike

On Tue, Apr 19, 2016 at 7:24 PM, Marc Schwartz <marc_schwartz at me.com> wrote:
> Hi,
>
> Jumping into this thread mainly on the point of the mode of the
> distribution, while also supporting Bert's comments below on theory.
>
> If the vector 'x' that is being passed to this function is an integer
> vector, then a tabulation of the integers can yield a 'mode', presuming of
> course that there is only one unique mode. You may have to decide how you
> want to handle a multi-modal discrete distribution.
>
> If the vector 'x' is continuous (e.g. contains floating point values),
> then a tabulation is going to be problematic for a variety of reasons.
>
> In that case, prior discussions on this point have yielded the following
> estimation of the mode of a continuous distribution by using:
>
> Mode <- function(x) {
>   D <- density(x)
>   D$x[which.max(D$y)]
> }
>
> where the second line of the function gets you the value of 'x' at the
> maximum of the density estimate. Of course, there is still the possibility
> of a multi-modal distribution and the nuances of which kernel is used,
> etc., etc.
>
> Food for thought.
>
> Regards,
>
> Marc Schwartz
>
>
> > On Apr 19, 2016, at 7:07 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> >
> > Well, instead of your functions try:
> >
> > Mode <- function(x) {
> >   tabx <- table(x)
> >   tabx[which.max(tabx)]
> > }
> >
> > and use R's IQR function instead of yours.
> >
> > ...
> > so I still don't get why you want to return a character string
> > instead of a value for the IQR;
> > and the mode of a sample defined as above is generally a bad estimator
> > of the mode of the distribution. To say more than that would take me
> > too far afield. Post on stats.stackexchange.com if you want to know
> > why (if it's even relevant).
> >
> > Cheers,
> > Bert
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Tue, Apr 19, 2016 at 4:25 PM, Michael Artz <michaeleartz at gmail.com> wrote:
> >> Hi,
> >> Here is what I am doing
> >>
> >> notGroupedAll <- ddply(data
> >>                  ,~groupColumn
> >>                  ,summarise
> >>                  ,col1_mean=mean(col1)
> >>                  ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
> >>                  ,col3_Range=myIqr(col3)
> >>                  )
> >>
> >> groupedAll <- ddply(data
> >>                  ,~groupColumn
> >>                  ,summarise
> >>                  ,col1_mean=mean(col1)
> >>                  ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
> >>                  ,col3_Range=Mode(col3)
> >>                  )
> >>
> >> #custom Mode function
> >> Mode <- function(x) {
> >>   ux <- unique(x)
> >>   ux[which.max(tabulate(match(x, ux)))]
> >> }
> >>
> >> #the range function
> >> myIqr <- function(x) {
> >>   paste(round(quantile(x,0.375),0),round(quantile(x,0.625),0),sep="-")
> >> }
> >>
> >>
> >> Here is what I am doing!! :)
> >>
> >>
> >>
> >> On Tue, Apr 19, 2016 at 2:57 PM, William Dunlap <wdunlap at tibco.com> wrote:
> >>>
> >>> If you show us, not just tell us about, a self-contained example
> >>> someone might show you a non-hacky way of getting the job done.
> >>> (I don't see an argument to plyr::ddply called 'transform'.)
> >>>
> >>> Bill Dunlap
> >>> TIBCO Software
> >>> wdunlap tibco.com
> >>>
> >>> On Tue, Apr 19, 2016 at 12:18 PM, Michael Artz <michaeleartz at gmail.com> wrote:
> >>>>
> >>>> Oh thanks for that clarification Bert! Hope you enjoyed your coffee! I
> >>>> ended up just using the transform argument in the ddply function. It
> >>>> worked and it repeated, then I called a mode function in another call to
> >>>> ddply that summarised. Kinda hacky but oh well!
> >>>>
> >>>> On Tue, Apr 19, 2016 at 12:31 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> >>>>>
> >>>>> ... and I'm getting another cup of coffee...
> >>>>>
> >>>>> -- Bert
> >>>>>
> >>>>> On Tue, Apr 19, 2016 at 10:30 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> >>>>>> NO NO -- I am wrong! The paste() expression is of course evaluated.
> >>>>>> It's just that a character string is returned of the form
> >>>>>> "something - something".
> >>>>>>
> >>>>>> I apologize for the confusion.
> >>>>>>
> >>>>>> -- Bert
> >>>>>>
> >>>>>> On Tue, Apr 19, 2016 at 10:25 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> >>>>>>> To be precise:
> >>>>>>>
> >>>>>>> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >>>>>>>
> >>>>>>> is an expression that evaluates to a character string:
> >>>>>>> "round(quantile(x,.25),0) - round(quantile(x,0.75),0)"
> >>>>>>>
> >>>>>>> no matter what the argument of your function, x.
> >>>>>>> Hence
> >>>>>>>
> >>>>>>> return(paste(...)) will return this exact character string and never
> >>>>>>> evaluates x.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Bert
> >>>>>>>
> >>>>>>> On Tue, Apr 19, 2016 at 8:34 AM, William Dunlap via R-help <r-help at r-project.org> wrote:
> >>>>>>>>> That didn't work Jim!
> >>>>>>>>
> >>>>>>>> It always helps to say how the suggestion did not work. Jim's
> >>>>>>>> function had a typo in it - was that the problem? Or did you not
> >>>>>>>> change the call to ddply to use that function? Here is something
> >>>>>>>> that might "work" for you:
> >>>>>>>>
> >>>>>>>> library(plyr)
> >>>>>>>>
> >>>>>>>> data <- data.frame(groupColumn=rep(1:5,1:5), col1=2^(0:14))
> >>>>>>>> myIqr <- function(x) {
> >>>>>>>>   paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >>>>>>>> }
> >>>>>>>> ddply(data, ~groupColumn, summarise, col1_myIqr=myIqr(col1),
> >>>>>>>>       col1_IQR=stats::IQR(col1))
> >>>>>>>> #   groupColumn col1_myIqr col1_IQR
> >>>>>>>> #1            1        1-1        0
> >>>>>>>> #2            2        2-4        1
> >>>>>>>> #3            3      12-24       12
> >>>>>>>> #4            4    112-320      208
> >>>>>>>> #5            5  2048-8192     6144
> >>>>>>>>
> >>>>>>>> The important point is that
> >>>>>>>>
> >>>>>>>> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >>>>>>>> is not a function, it is an expression. ddply wants functions.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Bill Dunlap
> >>>>>>>> TIBCO Software
> >>>>>>>> wdunlap tibco.com
> >>>>>>>>
> >>>>>>>> On Tue, Apr 19, 2016 at 7:56 AM, Michael Artz <michaeleartz at gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> That didn't work Jim!
> >>>>>>>>>
> >>>>>>>>> Thanks anyway
> >>>>>>>>>
> >>>>>>>>> On Mon, Apr 18, 2016 at 9:02 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Michael,
> >>>>>>>>>> At a guess, try this:
> >>>>>>>>>>
> >>>>>>>>>> iqr<-function(x) {
> >>>>>>>>>>  return(paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >>>>>>>>>> }
> >>>>>>>>>>
> >>>>>>>>>> .col3_Range=iqr(datat$tenure)
> >>>>>>>>>>
> >>>>>>>>>> Jim
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Apr 19, 2016 at 11:15 AM, Michael Artz <michaeleartz at gmail.com> wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>> I am trying to show an interquartile range while grouping values
> >>>>>>>>>>> using the function ddply(). So my function call now is like
> >>>>>>>>>>>
> >>>>>>>>>>> groupedAll <- ddply(data
> >>>>>>>>>>>                  ,~groupColumn
> >>>>>>>>>>>                  ,summarise
> >>>>>>>>>>>                  ,col1_mean=mean(col1)
> >>>>>>>>>>>                  ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
> >>>>>>>>>>>                  ,col3_Range=paste(as.character(round(quantile(datat$tenure,c(.25)))),
> >>>>>>>>>>>                     as.character(round(quantile(data$tenure,c(.75)))), sep = "-")
> >>>>>>>>>>>                  )
> >>>>>>>>>>>
> >>>>>>>>>>> #custom Mode function
> >>>>>>>>>>> Mode <- function(x) {
> >>>>>>>>>>>   ux <- unique(x)
> >>>>>>>>>>>   ux[which.max(tabulate(match(x, ux)))]
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> I am not sure what is going wrong on my interquartile range
> >>>>>>>>>>> function, it works on its own outside of ddply()
> >>>>>>>>>>>
> >>>>>>>>>>> ______________________________________________
> >>>>>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>>>>>>> PLEASE do read the
> >>>>>>>>>>> posting guide http://www.R-project.org/posting-guide.html
> >>>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.
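
Marc's distinction between a tabulated mode for discrete values and a density-based estimate for continuous values can be illustrated with a small sketch. The function names `mode_discrete` and `mode_density` are made up for this example and do not appear in the thread; returning all tied modes and the simulated normal data are likewise assumptions chosen for illustration.

```r
# Discrete data: the most frequent value(s). Ties are all returned,
# which makes a multi-modal sample visible instead of hiding it.
# Note: names() of a table are character strings.
mode_discrete <- function(x) {
  tab <- table(x)
  names(tab)[tab == max(tab)]
}

# Continuous data: location of the peak of a kernel density estimate.
# Extra arguments (e.g. bw, kernel) are passed on to density().
mode_density <- function(x, ...) {
  d <- density(x, ...)
  d$x[which.max(d$y)]
}

# Two values tie for most frequent, so both come back
tied_modes <- mode_discrete(c(1L, 2L, 2L, 3L, 3L))

# Simulated data centred at 5; the estimate lands near 5 but its exact
# value depends on the kernel and bandwidth
set.seed(1)
density_mode <- mode_density(rnorm(1000, mean = 5))
```

The density-based estimate is only as good as the bandwidth choice, so it is worth plotting `density(x)` before trusting a single number.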
???

IQR returns a single number.

> IQR(rnorm(10))
[1] 1.090168

To your 2nd response:
"I could have used average, min, max, they all would have returned the
same thing."

I can only respond: huh?? Are all your values identical?

You really need to provide a small reproducible example as requested
by the posting guide -- I certainly don't get it, and I'm done
guessing. Maybe others will see what I am missing and say something
useful. I clearly can't.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Apr 19, 2016 at 5:29 PM, Michael Artz <michaeleartz at gmail.com> wrote:
> Again, IQR returns both a .25 and a .75 value and it failed, which is
> why I didn't use it before. Also, the first function just returns the same
> value repeating. Since they are the same, before the second call, using the
> mode function is just a way to grab one value. I could have used average,
> min, max, they all would have returned the same thing.
>
> Mike
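
Bert's point that `IQR()` yields one number, while the two quartiles themselves come from `quantile()`, can be checked directly. This is a minimal sketch; the four values are simply illustrative powers of two, and the variable names are not from the thread.

```r
x <- c(64, 128, 256, 512)

# IQR() returns a single number: the distance between the quartiles
iqr_value <- IQR(x)

# quantile() returns the two quartiles themselves (default type 7)
quartiles <- quantile(x, c(0.25, 0.75))

# The difference of the quartiles reproduces IQR(x)
iqr_by_hand <- unname(diff(quartiles))

iqr_value      # 208
quartiles      # 25% = 112, 75% = 320
iqr_by_hand    # 208
```

So a function that should report the lower and upper quartile has to be built on `quantile()`; `IQR()` alone cannot supply both ends.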
I already found a solution, you suggested I try to find a non hacky solution, which was not really my priority. I should have declined politely, which I will do now. Or, ifyou just want me to post reproducible code because you are bored or because you like solving problems then let me know and I will accommodate. You have been helpful and I wouldnt mind in that case. Also, IQR was not a help from the beginning. If it supplies one value, then its not even a candidate to be helpful for my problem. I already talked about the format i was looking for. I dont think I violated any posting guideline, I asked for help, and people pointed me in a direction and it helped me. Thanks again, I appreciate it. On Apr 19, 2016 10:53 PM, "Bert Gunter" <bgunter.4567 at gmail.com> wrote:> ??? > > IQR returns a single number. > > > IQR(rnorm(10)) > [1] 1.090168 > > To your 2nd response: > "I could have used average, min, max, they all would have returned the > same thing., " > > I can only respond: huh?? Are all your values identical? > > You really need to provide a small reproducible example as requested > by the posting guide -- I certainly don't get it, and I'm done > guessing. Maybe others will see what I am missing and say something > useful. I clearly can't. > > Cheers, > Bert > > > > > > Bert Gunter > > "The trouble with having an open mind is that people keep coming along > and sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Tue, Apr 19, 2016 at 5:29 PM, Michael Artz <michaeleartz at gmail.com> > wrote: > > Again, IQR returns two both a .25 and a .75 value and it failed, which is > > why I didn't use it before. Also, the first function just returns tha > same > > value repeating. Since they are the same, before the second call, using > the > > mode function is just a way to grab one value. I could have used average, > > min, max, they all would have returned the same thing. 
> > > > Mike > > > > On Tue, Apr 19, 2016 at 7:24 PM, Marc Schwartz <marc_schwartz at me.com> > wrote: > >> > >> Hi, > >> > >> Jumping into this thread mainly on the point of the mode of the > >> distribution, while also supporting Bert's comments below on theory. > >> > >> If the vector 'x' that is being passed to this function is an integer > >> vector, then a tabulation of the integers can yield a 'mode', presuming > of > >> course that there is only one unique mode. You may have to decide how > you > >> want to handle a multi-modal discrete distribution. > >> > >> If the vector 'x' is continuous (e.g. contains floating point values), > >> then a tabulation is going to be problematic for a variety of reasons. > >> > >> In that case, prior discussions on this point, have yielded the > following > >> estimation of the mode of a continuous distribution by using: > >> > >> Mode <- function(x) { > >> D <- density(x) > >> D$x[which.max(D$y)] > >> } > >> > >> where the second line of the function gets you the value of 'x' at the > >> maximum of the density estimate. Of course, there is still the > possibility > >> of a multi-modal distribution and the nuances of which kernel is used, > etc., > >> etc. > >> > >> Food for thought. > >> > >> Regards, > >> > >> Marc Schwartz > >> > >> > >> > On Apr 19, 2016, at 7:07 PM, Bert Gunter <bgunter.4567 at gmail.com> > wrote: > >> > > >> > Well, instead of your functions try: > >> > > >> > Mode <- function(x) { > >> > tabx <- table(x) > >> > tabx[which.max(tabx)] > >> > } > >> > > >> > and use R's IQR function instead of yours. > >> > > >> > ... so I still don't get why you want to return a character string > >> > instead of a value for the IQR; > >> > and the mode of a sample defined as above is generally a bad estimator > >> > of the mode of the distribution. To say more than that would take me > >> > too far afield. Post on stats.stackexchange.com if you want to know > >> > why (if it's even relevant). 
> >> > > >> > Cheers, > >> > Bert > >> > Bert Gunter > >> > > >> > "The trouble with having an open mind is that people keep coming along > >> > and sticking things into it." > >> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > >> > > >> > > >> > On Tue, Apr 19, 2016 at 4:25 PM, Michael Artz <michaeleartz at gmail.com > > > >> > wrote: > >> >> Hi, > >> >> Here is what I am doing > >> >> > >> >> notGroupedAll <- ddply(data > >> >> ,~groupColumn > >> >> ,summarise > >> >> ,col1_mean=mean(col1) > >> >> ,col2_mode=Mode(col2) #Function I wrote for getting > the > >> >> mode shown below > >> >> ,col3_Range=myIqr(col3) > >> >> ) > >> >> > >> >> groupedAll <- ddply(data > >> >> ,~groupColumn > >> >> ,summarise > >> >> ,col1_mean=mean(col1) > >> >> ,col2_mode=Mode(col2) #Function I wrote for getting > the > >> >> mode shown below > >> >> ,col3_Range=Mode(col3) > >> >> ) > >> >> > >> >> #custom Mode function > >> >> Mode <- function(x) { > >> >> ux <- unique(x) > >> >> ux[which.max(tabulate(match(x, ux)))] > >> >> > >> >> #the range function > >> >> myIqr <- function(x) { > >> >> paste(round(quantile(x,0.375),0),round(quantile(x,0.625),0),sep="-") > >> >> } > >> >> > >> >> > >> >> } > >> >> > >> >> > >> >> Here is what I am doing!! :) > >> >> > >> >> > >> >> > >> >> On Tue, Apr 19, 2016 at 2:57 PM, William Dunlap <wdunlap at tibco.com> > >> >> wrote: > >> >>> > >> >>> If you show us, not just tell us about, a self-contained example > >> >>> someone might show you a non-hacky way of getting the job done. > >> >>> (I don't see an argument to plyr::ddply called 'transform'.) > >> >>> > >> >>> Bill Dunlap > >> >>> TIBCO Software > >> >>> wdunlap tibco.com > >> >>> > >> >>> On Tue, Apr 19, 2016 at 12:18 PM, Michael Artz > >> >>> <michaeleartz at gmail.com> > >> >>> wrote: > >> >>>> > >> >>>> Oh thanks for that clarification Bert! Hope you enjoyed your > coffee! > >> >>>> I > >> >>>> ended up just using the transform argument in the ddply function. 
> It > >> >>>> worked > >> >>>> and it repeated, then I called a mode function in another call to > >> >>>> ddply that > >> >>>> summarised. Kinda hacky but oh well! > >> >>>> > >> >>>> On Tue, Apr 19, 2016 at 12:31 PM, Bert Gunter > >> >>>> <bgunter.4567 at gmail.com> > >> >>>> wrote: > >> >>>>> > >> >>>>> ... and I'm getting another cup of coffee... > >> >>>>> > >> >>>>> -- Bert > >> >>>>> Bert Gunter > >> >>>>> > >> >>>>> "The trouble with having an open mind is that people keep coming > >> >>>>> along > >> >>>>> and sticking things into it." > >> >>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > >> >>>>> > >> >>>>> > >> >>>>> On Tue, Apr 19, 2016 at 10:30 AM, Bert Gunter > >> >>>>> <bgunter.4567 at gmail.com> > >> >>>>> wrote: > >> >>>>>> NO NO -- I am wrong! The paste() expression is of course > >> >>>>>> evaluated. > >> >>>>>> It's just that a character string is returned of the form > >> >>>>>> "something - > >> >>>>>> something". > >> >>>>>> > >> >>>>>> I apologize for the confusion. > >> >>>>>> > >> >>>>>> -- Bert > >> >>>>>> > >> >>>>>> > >> >>>>>> > >> >>>>>> > >> >>>>>> Bert Gunter > >> >>>>>> > >> >>>>>> "The trouble with having an open mind is that people keep coming > >> >>>>>> along > >> >>>>>> and sticking things into it." > >> >>>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip > ) > >> >>>>>> > >> >>>>>> > >> >>>>>> On Tue, Apr 19, 2016 at 10:25 AM, Bert Gunter > >> >>>>>> <bgunter.4567 at gmail.com> > >> >>>>>> wrote: > >> >>>>>>> To be precise: > >> >>>>>>> > >> >>>>>>> > paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-") > >> >>>>>>> > >> >>>>>>> is an expression that evaluates to a character string: > >> >>>>>>> "round(quantile(x,.25),0) - round(quantile(x,0.75),0)" > >> >>>>>>> > >> >>>>>>> no matter what the argument of your function, x. Hence > >> >>>>>>> > >> >>>>>>> return(paste(...)) will return this exact character string and > >> >>>>>>> never > >> >>>>>>> evaluates x. 
> >> >>>>>>> > >> >>>>>>> > >> >>>>>>> Cheers, > >> >>>>>>> Bert > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> Bert Gunter > >> >>>>>>> > >> >>>>>>> "The trouble with having an open mind is that people keep coming > >> >>>>>>> along > >> >>>>>>> and sticking things into it." > >> >>>>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic > strip ) > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> On Tue, Apr 19, 2016 at 8:34 AM, William Dunlap via R-help > >> >>>>>>> <r-help at r-project.org> wrote: > >> >>>>>>>>> That didn't work Jim! > >> >>>>>>>> > >> >>>>>>>> It always helps to say how the suggestion did not work. Jim's > >> >>>>>>>> function had a typo in it - was that the problem? Or did you > not > >> >>>>>>>> change the call to ddply to use that function. Here is > something > >> >>>>>>>> that might "work" for you: > >> >>>>>>>> > >> >>>>>>>> library(plyr) > >> >>>>>>>> > >> >>>>>>>> data <- data.frame(groupColumn=rep(1:5,1:5), col1=2^(0:14)) > >> >>>>>>>> myIqr <- function(x) { > >> >>>>>>>> > >> >>>>>>>> > >> >>>>>>>> > paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-") > >> >>>>>>>> } > >> >>>>>>>> ddply(data, ~groupColumn, summarise, col1_myIqr=myIqr(col1), > >> >>>>>>>> col1_IQR=stats::IQR(col1)) > >> >>>>>>>> # groupColumn col1_myIqr col1_IQR > >> >>>>>>>> #1 1 1-1 0 > >> >>>>>>>> #2 2 2-4 1 > >> >>>>>>>> #3 3 12-24 12 > >> >>>>>>>> #4 4 112-320 208 > >> >>>>>>>> #5 5 2048-8192 6144 > >> >>>>>>>> > >> >>>>>>>> The important point is that > >> >>>>>>>> > >> >>>>>>>> > >> >>>>>>>> > paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-") > >> >>>>>>>> is not a function, it is an expression. ddplyr wants > functions. 
> >> >>>>>>>> > >> >>>>>>>> > >> >>>>>>>> Bill Dunlap > >> >>>>>>>> TIBCO Software > >> >>>>>>>> wdunlap tibco.com > >> >>>>>>>> > >> >>>>>>>> On Tue, Apr 19, 2016 at 7:56 AM, Michael Artz > >> >>>>>>>> <michaeleartz at gmail.com> > >> >>>>>>>> wrote: > >> >>>>>>>> > >> >>>>>>>>> That didn't work Jim! > >> >>>>>>>>> > >> >>>>>>>>> Thanks anyway > >> >>>>>>>>> > >> >>>>>>>>> On Mon, Apr 18, 2016 at 9:02 PM, Jim Lemon > >> >>>>>>>>> <drjimlemon at gmail.com> > >> >>>>>>>>> wrote: > >> >>>>>>>>> > >> >>>>>>>>>> Hi Michael, > >> >>>>>>>>>> At a guess, try this: > >> >>>>>>>>>> > >> >>>>>>>>>> iqr<-function(x) { > >> >>>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> > return(paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-") > >> >>>>>>>>>> } > >> >>>>>>>>>> > >> >>>>>>>>>> .col3_Range=iqr(datat$tenure) > >> >>>>>>>>>> > >> >>>>>>>>>> Jim > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>>> On Tue, Apr 19, 2016 at 11:15 AM, Michael Artz > >> >>>>>>>>>> <michaeleartz at gmail.com> > >> >>>>>>>>>> wrote: > >> >>>>>>>>>>> Hi, > >> >>>>>>>>>>> I am trying to show an interquartile range while grouping > >> >>>>>>>>>>> values > >> >>>>>>>>> using > >> >>>>>>>>>>> the function ddply(). 
So my function call now is like > >> >>>>>>>>>>> > >> >>>>>>>>>>> groupedAll <- ddply(data > >> >>>>>>>>>>> ,~groupColumn > >> >>>>>>>>>>> ,summarise > >> >>>>>>>>>>> ,col1_mean=mean(col1) > >> >>>>>>>>>>> ,col2_mode=Mode(col2) #Function I wrote for getting the > >> >>>>>>>>>>> mode shown below > >> >>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>>> > ,col3_Range=paste(as.character(round(quantile(datat$tenure,c(.25)))), > >> >>>>>>>>>>> as.character(round(quantile(data$tenure,c(.75)))), sep > "-") > >> >>>>>>>>>>> ) > >> >>>>>>>>>>> > >> >>>>>>>>>>> #custom Mode function > >> >>>>>>>>>>> Mode <- function(x) { > >> >>>>>>>>>>> ux <- unique(x) > >> >>>>>>>>>>> ux[which.max(tabulate(match(x, ux)))] > >> >>>>>>>>>>> } > >> >>>>>>>>>>> > >> >>>>>>>>>>> I am not sure what is going wrong with my interquartile range > >> >>>>>>>>>>> function; it works on its own outside of ddply() > >> >>>>>>>>>>> > >> >>>>>>>>>>> [[alternative HTML version deleted]] > >> >>>>>>>>>>> > >> >>>>>>>>>>> ______________________________________________ > >> >>>>>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, > >> >>>>>>>>>>> see > >> >>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help > >> >>>>>>>>>>> PLEASE do read the posting guide > >> >>>>>>>>>> http://www.R-project.org/posting-guide.html > >> >>>>>>>>>>> and provide commented, minimal, self-contained, reproducible > >> >>>>>>>>>>> code. > >> >>>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> [[alternative HTML version deleted]] > >> >>>>>>>>> > >> >>>>>>>>> ______________________________________________ > >> >>>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, > >> >>>>>>>>> see > >> >>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help > >> >>>>>>>>> PLEASE do read the posting guide > >> >>>>>>>>> http://www.R-project.org/posting-guide.html > >> >>>>>>>>> and provide commented, minimal, self-contained, reproducible > >> >>>>>>>>> code.
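For anyone skimming the thread, the difference between the original call and Bill Dunlap's version can be sketched roughly as below. This is only a sketch on the toy data from the thread, not the poster's real data. The assumption being illustrated is that inside summarise a bare column name such as col1 refers to the rows of the current group, whereas data$col1 reaches back to the whole data frame, so every group gets the same answer. The output column names per_group and whole_col are made up for illustration.

```r
library(plyr)

# Toy data borrowed from the thread: five groups of sizes 1 to 5.
data <- data.frame(groupColumn = rep(1:5, 1:5), col1 = 2^(0:14))

# Format the rounded 25th and 75th percentiles of x as "low-high".
myIqr <- function(x) {
  q <- round(quantile(x, c(0.25, 0.75), names = FALSE), 0)
  paste(q[1], q[2], sep = "-")
}

res <- ddply(data, ~groupColumn, summarise,
             per_group = myIqr(col1),       # col1 is the current group's slice
             whole_col = myIqr(data$col1))  # data$col1 ignores the grouping
print(res)
```

The whole_col column should come out identical in every row, which matches the "it repeated" behaviour described earlier in the thread. When this is run at top level, data is found in the global environment; inside another function the lookup may behave differently.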