chipmaney
2010-Jan-27 19:40 UTC
[R] using functions with multiple arguments in the "apply" family
Typically, the apply family wants you to use vectors to run functions on. However, I have a function, kruskal.test, that requires 2 arguments.

    kruskal.test(Herb.df$Score,Herb.df$Year)

This easily computes the KW ANOVA statistic for any difference across years.

However, my data has multiple sites on which KW needs to be run. Here's the data:

    Herb.df <- data.frame(Score=rep(c(2,4,6,6,6,5,7,8,6,9),2),
                          Year=rep(c(rep(1,5),rep(2,5)),2),
                          Site=c(rep(3,10),rep(4,10)))

However, if I try this:

    tapply(Herb.df,Herb.df$Site,function(.data)
           kruskal.test(.data$Indicator_Rating,.data$Year))

    Error in tapply(Herb.df, Herb.df$ID, function(.data) kruskal.test(.data$Indicator_Rating,  :
      arguments must have same length

How can I vectorize the kruskal.test() for all sites using tapply() in lieu of a loop?

--
View this message in context: http://n4.nabble.com/using-functions-with-multiple-arguments-in-the-apply-family-tp1312027p1312027.html
Sent from the R help mailing list archive at Nabble.com.
Peter Ehlers
2010-Jan-28 19:53 UTC
[R] using functions with multiple arguments in the "apply" family
chipmaney wrote:
> typically, the apply family wants you to use vectors to run functions on.
> However, I have a function, kruskal.test, that requires 2 arguments.
>
> kruskal.test(Herb.df$Score,Herb.df$Year)
>
> This easily computes the KW ANOVA statistic for any difference across
> years....
>
> However, my data has multiple sites on which KW needs to be run...
>
> here's the data:
>
> Herb.df<-
> data.frame(Score=rep(c(2,4,6,6,6,5,7,8,6,9),2),Year=rep(c(rep(1,5),rep(2,5)),2),Site=c(rep(3,10),rep(4,10)))
>
> However, if I try this:
>
> tapply(Herb.df,Herb.df$Site,function(.data)
> kruskal.test(.data$Indicator_Rating,.data$Year))
>
>> Error in tapply(Herb.df, Herb.df$ID, function(.data)
> kruskal.test(.data$Indicator_Rating, :
> arguments must have same length
>
> How can I vectorize the kruskal.test() for all sites using tapply() in lieu
> of a loop?

Your example data makes little sense; you have precisely the same data for both sites and you have only two sites (why do kruskal.test on two sites?). Finally, you need to decide what your response variable is: 'Score' or 'Indicator_Rating'.

So here's some made-up data and the use of by() to apply the test to each site:

    dat <- data.frame(y = rnorm(60), yr=gl(4,5,60), st=gl(3,20))
    with(dat, by(dat, st, function(x) kruskal.test(y~yr, data=x)))

See the last example in ?by.

-Peter Ehlers

--
Peter Ehlers
University of Calgary
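For the data exactly as posted in the question, the same per-site idea can also be sketched with split() and lapply(). This is only an illustration, assuming 'Score' is the intended response; the names by_site, kw and p_values are made up for the example. The reason the tapply() call failed, at least in R versions of that time, is that tapply() expects X to be an atomic vector as long as the grouping factor, whereas a data frame has length 3 here (its number of columns).

```r
# Data as posted in the original question
Herb.df <- data.frame(
  Score = rep(c(2, 4, 6, 6, 6, 5, 7, 8, 6, 9), 2),
  Year  = rep(c(rep(1, 5), rep(2, 5)), 2),
  Site  = c(rep(3, 10), rep(4, 10))
)

# Split the whole data frame into one data frame per site
# (assumption: 'Score' is the response, 'Year' the grouping variable)
by_site <- split(Herb.df, Herb.df$Site)

# Run the two-argument test on each piece; each result is an "htest" object
kw <- lapply(by_site, function(d) kruskal.test(d$Score, d$Year))

# Pull the p-values out into a named numeric vector, one entry per site
p_values <- sapply(kw, function(res) res$p.value)
print(p_values)
```

Because both sites carry identical rows in the posted data, the two p-values come out equal, which is the oddity pointed out in the reply above.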
casalott
2010-Aug-12 00:50 UTC
[R] using functions with multiple arguments in the "apply" family
I can actually answer this!! I was trying to figure out how to use sapply for a function I wrote with multiple arguments.

Suppose the function is called FUN(a,b), where "a" is a number and "b" is a number. You can use

    mapply(FUN, a = VECTOR, b = VECTOR)

where each vector supplies the values for one argument. It will output a vector or a matrix (depending on the output of your function).

This will also work:

    mapply(FUN, a = VECTOR, b = NUMBER)

and will apply your function to each element of "a" with the same value of "b" each time.

Let me know if it works!
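A small sketch of the mapply() pattern described above, followed by one way it might be applied to the original per-site question. The function add_scaled and all variable names are hypothetical, chosen only for illustration, and 'Score' is assumed to be the response. Note that when the results cannot be simplified to a vector or matrix, mapply() returns a list instead, which is why the last call extracts a single number from each test.

```r
# Hypothetical two-argument function standing in for FUN(a, b)
add_scaled <- function(a, b) a + 10 * b

# Element-wise over two vectors:
# add_scaled(1, 4), add_scaled(2, 5), add_scaled(3, 6)
res_pairs <- mapply(add_scaled, a = 1:3, b = 4:6)

# A single number for b is recycled across every element of a
res_fixed <- mapply(add_scaled, a = 1:3, b = 2)

# Data as posted in the original question
Herb.df <- data.frame(
  Score = rep(c(2, 4, 6, 6, 6, 5, 7, 8, 6, 9), 2),
  Year  = rep(c(rep(1, 5), rep(2, 5)), 2),
  Site  = c(rep(3, 10), rep(4, 10))
)

# Split both arguments by site, then walk the two lists in parallel;
# returning only the p-value lets mapply() simplify to a named vector
p_by_site <- mapply(
  function(score, year) kruskal.test(score, year)$p.value,
  split(Herb.df$Score, Herb.df$Site),
  split(Herb.df$Year,  Herb.df$Site)
)
print(p_by_site)
```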