Hi everyone,
I have data from an experiment in which human participants were
instructed to generate a random sequence of yes/no answers under 4
different conditions. I want to test how successful they were in doing
this. More specifically, I want to test the null hypothesis that the 4
conditions come from a single population with a given level of
randomness. Some searching turned up the runs.test() function in the
tseries package. This looks promising, but I'm not sure how to
proceed.
A simplified version of the data structure, with only 2 conditions, is:
> Data <- data.frame(participant = rep(1:4, each = 4),
+                    question = factor(rep(c("one", "two", "three", "four"), 4)),
+                    condition = factor(rep(1:2, each = 8)),
+                    answer = factor(sign(rnorm(16)), labels = c("yes", "no")))
> Data
   participant question condition answer
1            1      one         1     no
2            1      two         1    yes
3            1    three         1    yes
4            1     four         1     no
5            2      one         1    yes
6            2      two         1    yes
7            2    three         1     no
8            2     four         1    yes
9            3      one         2    yes
10           3      two         2     no
11           3    three         2    yes
12           3     four         2    yes
13           4      one         2     no
14           4      two         2     no
15           4    three         2     no
16           4     four         2     no
My questions are:
1) Can I test my hypothesis using the runs.test() function? If not, is
there a better approach?
2) Does it make sense to run runs.test() once per condition, ignoring
the participant variable, or do I need to run it separately for each
participant? If the latter, how can I combine the results to test for
differences across conditions?
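To make question 2 concrete, here is a rough sketch of the two options as I understand them. It assumes that runs.test() from tseries accepts a two-level factor and returns an object with a p.value element. The seed and the names p_by_condition and p_by_participant are only for illustration, and I am not claiming either option is statistically valid.

```r
library(tseries)  # provides runs.test()

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Same structure as the simplified data above (question column omitted)
Data <- data.frame(participant = rep(1:4, each = 4),
                   condition = factor(rep(1:2, each = 8)),
                   answer = factor(sign(rnorm(16)), levels = c(-1, 1),
                                   labels = c("yes", "no")))

# Option A: one runs test per condition, ignoring participant.
# Note that this joins the sequences of different participants end to end.
p_by_condition <- sapply(split(Data$answer, Data$condition),
                         function(a) runs.test(a)$p.value)

# Option B: one runs test per participant.
# With only 4 answers each, these p-values are unlikely to be informative.
p_by_participant <- sapply(split(Data$answer, Data$participant),
                           function(a) runs.test(a)$p.value)

p_by_condition
p_by_participant
```

Option A gives one p-value per condition and option B one per participant. Neither directly compares conditions, which is the part I do not know how to do.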
Thanks,
Ista