Hi everybody,

I am a recent convert from SAS, so please excuse me if this is all very obvious.

I want to use the runs test (runs.test() in package tseries) to test the randomness of a certain variable in a survey for each interviewer. I tried to use by(), but it doesn't seem to work with runs.test() as the function. Here is what I have.

Consider a data frame with two variables and 40 observations. Column 1 is the name of the interviewer and column 2 is a variable that can be either 0 or 1. They are called "interviewer" and "var". This generates such a data frame:

exampledata <- data.frame(interviewer=rep(letters[1:2], 1), var=round(runif(40)))

I do the runs test on "var" and it works:

runs.test(as.factor(exampledata$var))

        Runs Test

data:  as.factor(exampledata$var)
Standard Normal = -1.626, p-value = 0.1039
alternative hypothesis: two.sided

I can categorise the data by "interviewer" and get means using by(), and that works perfectly:

by(exampledata$var, exampledata$interviewer, mean)

exampledata$interviewer: a
[1] 0.4
------------------------------------------------------------
exampledata$interviewer: b
[1] 0.35

Why is it impossible to use runs.test() as the function in by() instead of mean?

by(exampledata, exampledata$interviewer, runs.test(as.factor(exampledata$var)))
Error in FUN(X[[1L]], ...) : could not find function "FUN"

Can someone please tell me why this is the case? I tried aggregate() too, but with the same result.

Thanks,
Christiaan
Thanks to Patric and Christian for the swift and accurate help.

I still had to learn the difference between passing a function and passing a call to that function. I also had to coerce the first argument to a factor, but then it worked:

by(as.factor(exampledata$var), exampledata$interviewer, runs.test)

Regards,
Christiaan

On 14/01/2009, christiaan pauw <cjpauw at gmail.com> wrote:
> exampledata <- data.frame(interviewer=rep(letters[1:2], 1),
> var=round(runif(40)))
>
> I do the runs test on "var" and it works
> runs.test(as.factor(exampledata$var))
>
> I can categorise the data by "interviewer" and get means using by(),
> and that works perfectly
> by(exampledata$var, exampledata$interviewer, mean)
>
> Why is it impossible to use runs.test() as the function in by()
> instead of mean
> by(exampledata, exampledata$interviewer,
> runs.test(as.factor(exampledata$var)))
> Error in FUN(X[[1L]], ...) : could not find function "FUN"
>
> thanks
> Christiaan
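
The difference between passing a function and passing the result of a call can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame d is made up, and count_runs() is a hypothetical stand-in for runs.test() so that the example runs without the tseries package.

```r
# Hypothetical stand-in for tseries::runs.test(): counts the runs in a vector.
count_runs <- function(x) length(rle(as.vector(x))$lengths)

# Small made-up data set, six observations per interviewer.
d <- data.frame(interviewer = rep(c("a", "b"), each = 6),
                var = c(0, 0, 1, 1, 0, 1,   1, 1, 1, 0, 0, 0))

# Passing the function itself: by() calls it once per interviewer.
per_group <- by(d$var, d$interviewer, count_runs)

# Passing the *result* of a call: the third argument is now a number,
# not a function, so by() has nothing to call and stops with an error.
bad <- try(by(d$var, d$interviewer, count_runs(d$var)), silent = TRUE)

print(per_group)
print(inherits(bad, "try-error"))
```

In the second call R evaluates count_runs(d$var) first and hands by() a single number, which is the same situation as passing runs.test(as.factor(exampledata$var)) instead of runs.test.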
I always have trouble with "by" myself. I don't usually get an answer as quickly as I did here. I usually give up and move to "aggregate" or "ave".

First I limited the data given to "by" in the first argument to just the variable in question. Then, by the time R sends the information to your function, it may not be named what it was in the global environment, so I packaged the two functions "runs.test" and "as.factor" inside a wrapper function that receives each subset under the name "x". Seems to give possibly sensible results. See if this works for you.

> by(exampledata$var, exampledata$interviewer, function(x) {runs.test(as.factor(x))} )
exampledata$interviewer: a

        Runs Test

data:  as.factor(x)
Standard Normal = -0.8823, p-value = 0.3776
alternative hypothesis: two.sided

----------------------------------------------------------------------------------------------
exampledata$interviewer: b

        Runs Test

data:  as.factor(x)
Standard Normal = 0, p-value = 1
alternative hypothesis: two.sided

-- 
David Winsemius

On Jan 14, 2009, at 3:13 AM, christiaan pauw wrote:
> by(exampledata, exampledata$interviewer,
> runs.test(as.factor(exampledata$var)))
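
For readers who also find "by" awkward, the wrapper-function idea carries over to split() with lapply(), and to tapply(). The following is a minimal sketch in base R, not a definitive recipe: the data frame d is made up, and count_runs() is a hypothetical stand-in for runs.test() so the example does not depend on tseries.

```r
# Hypothetical stand-in for tseries::runs.test(): counts the runs in a vector.
count_runs <- function(x) length(rle(as.vector(x))$lengths)

# Small made-up data set, six observations per interviewer.
d <- data.frame(interviewer = rep(c("a", "b"), each = 6),
                var = c(0, 0, 1, 1, 0, 1,   1, 1, 1, 0, 0, 0))

# Wrapper function: each interviewer's values arrive as x,
# are converted to a factor, and are then passed on.
wrapper <- function(x) count_runs(as.factor(x))

# The same wrapper used with three grouping idioms.
res_by     <- by(d$var, d$interviewer, wrapper)
res_tapply <- tapply(d$var, d$interviewer, wrapper)
res_split  <- lapply(split(d$var, d$interviewer), wrapper)

print(res_by)
print(res_tapply)
print(unlist(res_split))
```

With tseries loaded, replacing count_runs with runs.test inside wrapper should give one test result per interviewer; in that case the split() and lapply() form returns a plain list of results, which can be easier to inspect than a "by" object.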