Hi everybody
I am a recent convert from SAS so please excuse me if this is all very
obvious:
I want to use the runs test {runs.test() in package tseries} to test the
randomness of a certain variable in a survey for each interviewer. I tried
to use the by() statement but it doesn't seem to work with runs.test() as the
function.
Here is what I have: Consider a data frame with two variables and 40
observations. Column 1 is the name of the interviewer and column 2 is a
variable that can be either 0 or 1. The columns are named "interviewer" and
"var". This generates such a data frame:
exampledata<-data.frame(interviewer=rep(letters[1:2], 20),
var=round(runif(40)))
I do the runs test on "var" and it works
runs.test(as.factor(exampledata$var))
Runs Test
data: as.factor(exampledata$var)
Standard Normal = -1.626, p-value = 0.1039
alternative hypothesis: two.sided
I can categorise the data by "interviewer" and get means using the by()
statement and that works perfectly
by(exampledata$var, exampledata$interviewer, mean)
exampledata$interviewer: a
[1] 0.4
------------------------------------------------------------
exampledata$interviewer: b
[1] 0.35
Why is it impossible to use runs.test() as the function in the by()
statement instead of mean?
by(exampledata, exampledata$interviewer,
runs.test(as.factor(exampledata$var)))
Error in FUN(X[[1L]], ...) : could not find function "FUN"
Can someone please tell me why this is the case? I tried aggregate() too but
with the same result.
thanks
Christiaan
Thanks to Patric and Christian for the swift and accurate help. I still had
to learn the difference between calling a function and passing the function
itself to another function. I also had to coerce the first argument to a
factor, but then it worked:
by(as.factor(exampledata$var), exampledata$interviewer, runs.test)
Regards
Christiaan
On 14/01/2009, christiaan pauw <cjpauw at gmail.com> wrote:
> Why is it impossible to use runs.test() as the function in the by()
> statement instead of mean
> by(exampledata, exampledata$interviewer,
> runs.test(as.factor(exampledata$var)))
> Error in FUN(X[[1L]], ...) : could not find function "FUN"
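For the archive, the difference between passing a function and passing a call can be sketched in base R alone. This is only an illustration under assumptions: count_runs() is a hypothetical stand-in for runs.test() (it just counts runs with rle()), and the seed is arbitrary.

```r
# Toy data in the same shape as the example: two interviewers, a 0/1 variable.
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
exampledata <- data.frame(interviewer = rep(letters[1:2], 20),
                          var = round(runif(40)))

# Hypothetical helper standing in for runs.test(): counts the runs in x.
count_runs <- function(x) length(rle(as.vector(x))$lengths)

# Passing the function itself: by() calls it once per interviewer.
per_group <- by(exampledata$var, exampledata$interviewer, count_runs)

# Passing a call: count_runs(exampledata$var) is evaluated on the whole
# column first, so by() receives a number instead of a function and fails.
bad <- tryCatch(
  by(exampledata$var, exampledata$interviewer, count_runs(exampledata$var)),
  error = function(e) "error"
)
```

Here per_group holds one run count per interviewer, while bad ends up as the string "error" because the third argument was not a function.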
I always have trouble with "by" myself. I don't usually get an answer
as quickly as I did here. I usually give up and move to "aggregate" or
"ave".
First I limited the data given to "by" in the first argument to just
the variable in question. By the time R passes each group's data to
your function, it is no longer called what it was in the global
environment, so I wrapped the two functions "runs.test" and
"as.factor" in an anonymous function that receives the group's data as
its argument "x". This seems to give sensible results. See if this
works for you.
> by(exampledata$var, exampledata$interviewer,
function(x) {runs.test(as.factor(x))} )
exampledata$interviewer: a
Runs Test
data: as.factor(x)
Standard Normal = -0.8823, p-value = 0.3776
alternative hypothesis: two.sided
----------------------------------------------------------------------------------------------
exampledata$interviewer: b
Runs Test
data: as.factor(x)
Standard Normal = 0, p-value = 1
alternative hypothesis: two.sided
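If you only want the p-values per interviewer, something along these lines might work. It is a sketch that assumes the tseries package is installed and that each element returned by by() is an "htest" object with a p.value component; the seed and the name pvals are my own choices, not from your data.

```r
# Runs only if tseries is available; otherwise the sketch does nothing.
if (requireNamespace("tseries", quietly = TRUE)) {
  set.seed(1)  # arbitrary seed, only to make the sketch reproducible
  exampledata <- data.frame(interviewer = rep(letters[1:2], 20),
                            var = round(runif(40)))

  # One runs test per interviewer, using the same wrapper as above.
  res <- by(exampledata$var, exampledata$interviewer,
            function(x) tseries::runs.test(as.factor(x)))

  # Pull the p-value out of each test result.
  pvals <- sapply(res, function(h) h$p.value)
  print(pvals)
}
```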
--
David Winsemius
On Jan 14, 2009, at 3:13 AM, christiaan pauw wrote:
> by(exampledata, exampledata$interviewer,
> runs.test(as.factor(exampledata$var)))