I have a problem where I need to calculate the corresponding cohort percentile ranks for each of several variables.

Essentially, what I need is a function that will calculate the distribution-free percentiles from each variable's data vector, returning a corresponding vector of percentiles, e.g.:

percentile.my.data <- /function/(my.data)

I tried to make ecdf() perform this task but was unsuccessful.

I'd be grateful for any help or advice...

--
View this message in context: http://r.789695.n4.nabble.com/data-vector-to-corresonding-percentile-ranks-tp4228971p4228971.html
I'm not sure I understand the question, but does quantile() do what you want?

On Fri, Dec 23, 2011 at 10:28 AM, Steve Jones <sjones64 at jhmi.edu> wrote:
> I have a problem where I need to calculate the corresponding cohort
> percentile ranks for each of several variables.
>
> Essentially, what I need is a function that will calculate the
> distribution-free percentiles from each variable's data vector, returning a
> corresponding vector of percentiles:
>
> e.g.:
>
> percentile.my.data<-/function/(my.data)
>
> I tried to make ecdf() perform this task but was unsuccessful.
>
> I'd be grateful for any help or advice...

--
Sarah Goslee
http://www.functionaldiversity.org
On Dec 23, 2011, at 10:28 AM, Steve Jones wrote:

> I have a problem where I need to calculate the corresponding cohort
> percentile ranks for each of several variables.
>
> Essentially, what I need is a function that will calculate the
> distribution-free percentiles from each variable's data vector,
> returning a corresponding vector of percentiles:
>
> e.g.:
>
> percentile.my.data<-/function/(my.data)
>
> I tried to make ecdf() perform this task but was unsuccessful.

Unsuccessful? How? Seems like a reasonable strategy:

    set.seed(123)
    x <- rnorm(1000)
    xCdist <- ecdf(x)

Seems to give sensible results.

    > x[1]
    [1] -0.7104066
    > 100*xCdist(x[1])
    [1] 23.4
    > x[2]
    [1] 0.2568837
    > 100*xCdist(x[2])
    [1] 60

> I'd be grateful for any help or advice...

My advice would be to post the code you were trying, so that you can get help understanding what difficulties you need to overcome.

--
David Winsemius, MD
West Hartford, CT
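The function returned by ecdf() is vectorized, so it can be evaluated at the whole data vector at once rather than one element at a time. The following is a minimal sketch of that idea, assuming "percentile rank" means the percentage of observations less than or equal to each value; the wrapper name percentile_rank is hypothetical and not from the thread.

```r
# Hypothetical helper (name is an assumption, not from the thread):
# percentile rank of each element within its own vector.
percentile_rank <- function(x) {
  Fn <- ecdf(x)   # empirical cumulative distribution function of x
  100 * Fn(x)     # evaluate the step function at every observation
}

set.seed(123)
x <- rnorm(1000)
p <- percentile_rank(x)

length(p) == length(x)   # one percentile per original observation
```

With this definition the largest observation always gets 100, and the smallest gets 100/n rather than 0.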
It's far more useful to send your explanation to the list than it is to send it just to me. I've taken the liberty of doing so.

But this does sound like a job for ecdf() - what did you do, and what went wrong?

On Fri, Dec 23, 2011 at 12:23 PM, Steven Jones <sjones64 at jhmi.edu> wrote:
> Actually, what I am trying to do is very simple. I want to take a large
> vector of data, about 1.5 million observations and return a corresponding
> vector of percentiles, each corresponding to the elements of the original
> data vector in a 1:1 correspondence.
>
> Original data vector:      x<--(x1,x2,x3,...xn)
>                                 ^  ^  ^
>                                 |  |  |
> Derived percentile vector: p<--(p1,p2,p3,...pn)
>
> Where pn<--is the percentile value of xn determined from the rank in the
> data vector x.
>
> Ideally, there is a function which could return a vector p from my
> original data vector x.
>
> My understanding, which may be incorrect, is that quantile() function
> returns the value of x for a specified percentile rank in a vector.
>
> Steve
>
> -----Original Message-----
> From: Sarah Goslee [mailto:sarah.goslee at gmail.com]
> Sent: Friday, December 23, 2011 12:11 PM
> To: Steven Jones
> Cc: r-help at r-project.org
> Subject: Re: [R] data vector to corresonding percentile ranks
>
> I'm not sure I understand the question, but does quantile() do what you want?
>
> On Fri, Dec 23, 2011 at 10:28 AM, Steve Jones <sjones64 at jhmi.edu> wrote:
>> I have a problem where I need to calculate the corresponding cohort
>> percentile ranks for each of several variables.
>>
>> Essentially, what I need is a function that will calculate the
>> distribution-free percentiles from each variable's data vector, returning a
>> corresponding vector of percentiles:
>>
>> e.g.:
>>
>> percentile.my.data<-/function/(my.data)
>>
>> I tried to make ecdf() perform this task but was unsuccessful.
>>
>> I'd be grateful for any help or advice...
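The mapping described above (each x value to a percentile derived from its rank) can also be written directly with rank(). This is only a sketch under stated assumptions: tied values share the highest rank so the result agrees with 100 * ecdf(x)(x), missing values stay missing, and the helper name and the small example data are made up for illustration.

```r
# Hypothetical helper (name is an assumption): percentile ranks from ranks.
# ties.method = "max" gives tied values their highest rank, which matches
# the "percentage of observations <= value" definition used by ecdf().
percentile_rank <- function(x) {
  n <- sum(!is.na(x))   # number of non-missing observations
  100 * rank(x, ties.method = "max", na.last = "keep") / n
}

x <- c(10, 20, 20, 30, 40)   # made-up example values
p <- percentile_rank(x)      # 20 60 60 80 100

# quantile() goes the other way: from a probability to a data value.
q <- quantile(x, probs = 0.5)

# For several variables, apply the helper to each column of a data frame.
df <- data.frame(a = c(3, 1, 2), b = c(5, 5, 9))   # made-up example values
pct <- as.data.frame(lapply(df, percentile_rank))
```

Other conventions exist (for example "average" ties, or dividing by n + 1 so that no observation is at exactly 100), so the choice of ties.method and denominator should follow whatever definition of percentile rank the analysis calls for.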