Dear list,

I'm trying to implement the following function, but I get an error message and I don't understand where the error is:

    # outliers' identification:
    iqr=lapply(bb,function(){
    inner_fencesl=quantile(x,0.25)-1.5*IQR(x)
    inner_fencesh=quantile(x,0.75)+1.5*IQR(x)
    outer_fencesl=quantile(x,0.25)-3*IQR(x)
    outer_fencesh=quantile(x,0.75)+3*IQR(x)})

where bb is a data frame containing all the variables over which the function must be applied.

Thanks for your attention!
Try this:

    lapply(bb, function(x) {
      q <- quantile(x, c(0.25, 0.75), names = FALSE)
      # rows: lower and upper fence; columns: inner (1.5 * IQR) and outer (3 * IQR)
      q + c(-1, 1) %o% (c(1.5, 3) * IQR(x))
    })

On Wed, May 12, 2010 at 1:44 PM, n.vialma@libero.it <n.vialma@libero.it> wrote:
> Dear list,
> I'm trying to implement the following function, but what I get is an error
> message and I don't understand where is the error:
> [...]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
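The fence computation discussed above can also be written so that each fence comes back under its own name, which makes the result easier to check by eye. This is only a sketch: the data frame `bb` below is a made-up stand-in for the poster's data, and `fences` is a hypothetical helper name, not something from the thread.

```r
# Hypothetical stand-in for the poster's data frame `bb`
set.seed(1)
bb <- data.frame(a = rnorm(50), b = runif(50), c = rexp(50))

# Return the four fences for one numeric vector, by name
fences <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]  # equals IQR(x) with the default quantile type
  c(inner_low  = q[1] - 1.5 * iqr,
    inner_high = q[2] + 1.5 * iqr,
    outer_low  = q[1] - 3 * iqr,
    outer_high = q[2] + 3 * iqr)
}

iqr_list <- lapply(bb, fences)  # one named vector per column
iqr_tab  <- sapply(bb, fences)  # the same values as a 4 x ncol(bb) matrix
iqr_tab
```

Using `sapply()` instead of `lapply()` is purely a matter of presentation; both visit the columns of the data frame one at a time.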
There is too little information to answer your question definitively.

However, an obvious reason is that you want to apply the function over the columns of a data.frame, which is done with apply(), but you are trying to apply it over the elements of a list using lapply(). A list is not a data.frame and vice versa, which would be a good reason for your function to fail.

The example below works:

    data=data.frame(y=rnorm(100),x=rnorm(100),e=rnorm(100))
    f=function(x){quantile(x,probs=0.25)}
    apply(data,2,f)

Also, for debugging, you want to take the whole apart and check whether the individual parts work. Does the "function()" work when applied to just one column, for example? If not, then you also have a problem with the definition of the function. If so, you would have to check whether the quantile or IQR function fails in one of the instances.

HTH,
Daniel

-- 
View this message in context: http://r.789695.n4.nabble.com/function-tp2196408p2196445.html
Sent from the R help mailing list archive at Nabble.com.
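The debugging advice above (test the pieces one at a time) might look like the following sketch. The `label` column is an assumption added purely to provoke a failure; nothing in the thread says the poster's data contains such a column.

```r
set.seed(42)
data <- data.frame(y = rnorm(100), x = rnorm(100), e = rnorm(100))
# Hypothetical non-numeric column, added only to make one call fail
data$label <- factor(sample(letters, 100, replace = TRUE))

f <- function(x) quantile(x, probs = 0.25)

# Step 1: does the function work on a single column?
f(data$y)

# Step 2: try every column separately and record which calls succeed
ok <- vapply(
  data,
  function(col) !inherits(try(f(col), silent = TRUE), "try-error"),
  logical(1)
)
ok
```

Columns where `ok` is `FALSE` are the ones to inspect more closely, for example with `str(data)`.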
On May 12, 2010, at 1:28 PM, Daniel Malter wrote:

> However, an obvious reason is that you want to apply the function over
> columns of a data.frame, which is done with apply(), but you try to apply
> the function over elements of a list using lapply(). A list is not a
> data.frame and vice versa,

Not correct. Using your example below:

    > is.list(data)
    [1] TRUE

> which would be a good reason for your function to fail.

Maybe, maybe not. lapply() works quite well with dataframes:

    > lapply(data, sum)
    $y
    [1] 2.982636

    $x
    [1] -4.718842

    $e
    [1] 0.969399

-- 
David.

> The below example works:
>
> data=data.frame(y=rnorm(100),x=rnorm(100),e=rnorm(100))
> f=function(x){quantile(x,probs=0.25)}
> apply(data,2,f)
> [...]

David Winsemius, MD
West Hartford, CT
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
> Sent: Wednesday, May 12, 2010 10:41 AM
> To: Daniel Malter
> Cc: r-help at r-project.org
> Subject: Re: [R] function
>
> [...]
> Maybe, maybe not. lapply() works quite well with dataframes:
>
> > lapply(data, sum)
> [...]

Furthermore, apply() can work quite badly on data.frames. If all columns have the same type it is generally ok, as in

    > d <- data.frame(x=c(1,-10,31.4159265),y=c(666,.05,9.999))
    > apply(d,2,max)
            x         y
     31.41593 666.00000

But if you add a character (or POSIXct or ...) column you get bad results

    > d$name <- c("Joe", "Jack", "Socrates")
    > apply(d,2,max)
              x           y        name
    " 31.41593"    " 9.999"  "Socrates"

Use plyr or [ls]apply on data.frames, not apply.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
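The reason apply() misbehaves here is that it converts the data frame to a matrix first, and a matrix can hold only one type. The sketch below makes that coercion visible and shows one possible column-wise alternative; the names `num_cols` and `col_max` are illustrative only.

```r
d <- data.frame(x = c(1, -10, 31.4159265), y = c(666, 0.05, 9.999))
d$name <- c("Joe", "Jack", "Socrates")

# apply() works on as.matrix(d), and a mixed data frame becomes a character matrix
m <- as.matrix(d)
typeof(m)  # "character"

# Working column by column keeps each column's own type
num_cols <- vapply(d, is.numeric, logical(1))
col_max  <- sapply(d[num_cols], max)
col_max
```

Selecting the numeric columns before summarising is one option among several; which columns should be included is a decision for the analyst.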
On May 12, 2010, at 2:50 PM, Daniel Malter wrote:

> Fair enough, my mistake. However, I am quite fascinated how that focuses
> everybody else on picking on the initial answer and diverts everybody away
> from answering the actual question. All the more it points to the second
> paragraph of my reply, namely that all modular components of the function
> should be checked.
>
> For example:
>
> lapply(data,function(){quantile(x,probs=0.25)})
> #does not work, but
>
> lapply(data,function(x){quantile(x,probs=0.25)})
> #works
>
> So perhaps missing out on the x in the definition of the function is the
> only problem. But for that, we would need the error message.

Or as the message says (over and over): "... reproducible code."

Maybe everyone that posts in html could be returned a mail-server generated reply that has that text highlighted in blinking color.

> Daniel

-- 
David Winsemius, MD
West Hartford, CT
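Since the thread never shows the actual error message, here is a small sketch that captures it for the two variants quoted above. The data frame is simulated. The message should mention an unused argument, though the exact wording may vary with the R version and locale.

```r
set.seed(1)
data <- data.frame(y = rnorm(100), x = rnorm(100), e = rnorm(100))

# function() declares no arguments, but lapply() passes each column as the first one
msg <- tryCatch(
  lapply(data, function() quantile(x, probs = 0.25)),
  error = function(e) conditionMessage(e)
)
msg

# Declaring the argument lets lapply() hand each column to the function
res <- lapply(data, function(x) quantile(x, probs = 0.25))
res
```

Posting the text stored in `msg` along with the code is the kind of reproducible report the list footer asks for.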
Dear list,

I would like to ask you a question. I'm trying to build the production time series with the Divisia index. The final step requires the following calculations:

a) PROD(2006)=PROD(2007)/1+[DELTA_PROD(2007)]
b) PROD(2005)=PROD(2006)+[1+DELTA_PROD(2006)]
c) PROD(2004)=PROD(2005)+[1+DELTA_PROD(2005)]

My question is: how can I tell R to take the value generated in the previous step (for example, the production of 2005 needs the value of the production of 2006) in order to generate the production time series?

(PS: my data.frame is not set up as a time series.)

Thanks for your attention!
That's a bit more clear.

    > Prod2007=2
    > Delta=c(4,3,5)
    > Delta <- 1+Delta/100
    > Series <- Prod2007+cumsum(Delta)
    > Series
    [1] 3.04 4.07 5.12

On Thu, Jun 3, 2010 at 1:21 PM, n.vialma@libero.it <n.vialma@libero.it> wrote:

> What I would like to do is for example:
> Suppose that I have the following values
>
> a)PROD(2006)=PROD(2007)/1+[DELTA_PROD(2007)]
> b)PROD(2005)=PROD(2006)+[1+DELTA_PROD(2006)]
> c)PROD(2004)=PROD(2005)+[1+DELTA_PROD(2005)]
>
> where prod(2007)=2
> DELTA_PROD(2007)=4
> DELTA_PROD(2006)=3
> DELTA_PROD(2005)=5
>
> so prod(2007) is like the starting value of production from which the
> construction of the time series starts. So:
>
> prod(2006)=2+[1+4/100] which is equal to 3.04
>
> so I will have:
>
> prod(2005)=3.04+[1+3/100]
>
> and so on
>
> ----Original message----
> From: jorismeys@gmail.com
> Date: 03/06/2010 13.05
> To: "n.vialma@libero.it"<n.vialma@libero.it>
> Cc: <r-help@r-project.org>
> Subject: Re: [R] function
>
> This is what you asked for.
>
> > Prod2007 <- 1:10
> > Prod2006 <- Prod2007/1+c(0,diff(Prod2007))
> > Prod2005 <- Prod2006+(1+c(0,diff(Prod2006)))
> > Prod2004 <- Prod2005+(1+c(0,diff(Prod2005)))
> > Prod2006
> [1] 1 3 4 5 6 7 8 9 10 11
> > Prod2005
> [1] 2 6 6 7 8 9 10 11 12 13
> > Prod2004
> [1] 3 11 7 9 10 11 12 13 14 15
>
> Sure that's what you want?
>
> On Thu, Jun 3, 2010 at 12:30 PM, n.vialma@libero.it <n.vialma@libero.it> wrote:
>> Dear list,
>> I would like to ask you a question. I'm trying to build the time series'
>> production with the Divisia index.
>> [...]

-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
Joris.Meys@Ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
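The additive recursion in this message can also be written as an explicit loop, which makes the "use the value from the previous step" idea visible. This is only a sketch using the numbers quoted in the thread; the `years` labels and the variable names are added here for readability.

```r
prod2007 <- 2
delta <- c(4, 3, 5)  # DELTA_PROD for 2007, 2006, 2005, in percent
years <- 2006:2004   # labels for the values being generated

series <- numeric(length(delta))
prev <- prod2007
for (i in seq_along(delta)) {
  prev <- prev + (1 + delta[i] / 100)  # builds on the value from the previous step
  series[i] <- prev
}
names(series) <- years
series
```

The loop gives the same numbers as the `cumsum()` call; the vectorised form is shorter, while the loop is easier to adapt if the update rule changes.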
Dear list,

I would like to ask you a question. I'm trying to build the production time series with the Divisia index. The final step requires the following calculations:

a) PROD(2006)=PROD(2007)/[1+DELTA_PROD(2007)/100]
b) PROD(2005)=PROD(2006)/[1+DELTA_PROD(2006)/100]
c) PROD(2004)=PROD(2005)/[1+DELTA_PROD(2005)/100]

My question is: how can I tell R to take the value generated in the previous step (for example, the production of 2005 needs the value of the production of 2006) in order to generate the production time series?

I have the values of the following variables:

prod(2007)=2
delta_prod(2007)=3
delta_prod(2006)=5
delta_prod(2005)=4

What I would like to do is

prod(2006)=2/(1+3/100), which is approximately 1.94
prod(2005)=1.94/(1+5/100)

and so on...

(PS: my data.frame is not set up as a time series.)

Thanks for your attention!
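One possible way to express this backward recursion in R, sketched with the numbers given in the message: dividing repeatedly by each growth factor is the same as dividing the starting value once by the running product of those factors. The variable names and year labels below are illustrative assumptions.

```r
prod2007 <- 2
delta <- c(3, 5, 4)  # delta_prod for 2007, 2006, 2005, in percent
growth <- 1 + delta / 100

# Divide the starting value by the cumulative product of the growth factors
series <- prod2007 / cumprod(growth)
names(series) <- 2006:2004
round(series, 2)  # 2006: 1.94, 2005: 1.85, 2004: 1.78

# The same recursion spelled out step by step; [-1] drops the starting value
series2 <- Reduce(function(prev, g) prev / g, growth,
                  init = prod2007, accumulate = TRUE)[-1]
```

If the deltas are stored in a data frame column, they need to be ordered from the most recent year backwards before `cumprod()` is applied.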