On Fri, 16 Apr 2010, Fischer, Felix wrote:

> Hello everyone,
>
> i have a question regarding the sampling process in boot().

"PLEASE ... provide commented, minimal, self-contained, reproducible code."

Which means something a correspondent could actually run.

But before that, a careful reading of ?boot should get you started. Note these bits:

Arguments:

    data: The data as a vector, ...

    statistic: A function which when applied to data returns a vector
          containing the statistic(s) of interest.  When
          sim="parametric", [snip] In all other cases statistic must
          take at least two arguments.  The first argument passed will
          always be the original data. The second will be a vector of
          indices, frequencies or weights which define the bootstrap
          sample. ...

HTH,

Chuck

>
> I try to bootstrap F-values for a repeated measures ANOVA to get a
> confidence interval of F-values. Unfortunately, while the aov works
> fine, it fails in the boot()-function. I think the problem might be that
> the resampling process fails to select both lines of data representing
> the 2 measuring times for one subject and I therefore get missing cases.
>
> The data is organised like this:
> subject ort mz PHQ
> 1       1   1  x
> 1       1   2  y
> 2       1   1  z
> 2       1   2  zz
> ...
>
>
> Is there any way to specify, that both lines need to be selected?
>
>
> Thanks a lot!
> Felix Fischer
>
> P.S. If you need to have a look to my code:
>
> F_values <- function(formula, data, indices) {
>   d <- data[indices,] # allows boot to select sample
>   fit=aov(formula,data=d) #fit model
>   return(c(summary(fit)[1][[1]][[1]]$`F value`, summary(fit)[2][[1]][[1]]$`F value`)) #return F-values
> }
>
> results <- boot(data=anova.daten, statistic=F_values,
>   R=10, formula=PHQ_Sum_score~mz*ort+Error(subject/mz))
>
>
> Dipl. Psych. Felix Fischer
>
> Medizinische Klinik mit Schwerpunkt Psychosomatik
> Charité -- Universitätsmedizin Berlin
> Luisenstr. 13a
> 10117 Berlin
>
> Tel.: 030 - 450 553575
> Email: felix.fischer at charite.de<mailto:felix.fischer at charite.de>
>

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu             UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

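The quoted part of ?boot can be illustrated with a minimal sketch. The data here are made up and `mean_stat` is a hypothetical name; the only point is the two-argument form of the statistic, where boot() passes the original data first and a vector of resampled indices second.

```r
library(boot)

# Statistic in the two-argument form described in ?boot:
# 'data' is always the original data, 'indices' defines the bootstrap sample.
mean_stat <- function(data, indices) {
  mean(data[indices])
}

set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)   # made-up example data

res <- boot(data = x, statistic = mean_stat, R = 200)

res$t0          # statistic evaluated on the original data
length(res$t)   # one replicate per bootstrap sample
```

For a data frame, the same pattern would index rows (`data[indices, ]`), which is why each row is resampled independently unless the statistic regroups the rows itself.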
Thank you for your answer, and sorry for the missing example.
In fact, I think I solved the issue with some data manipulation inside the
statistic function: I split the data into one set per measuring time, selected
the cases at random, and then combined the two measuring times again. The
results look promising to me, but if someone is aware of problems, please let
me know.
This code should run:
library(boot)
anova.daten=data.frame(subject=sort(rep(1:10,2)), mz=rep(1:2,10),
ort=sort(rep(1:2,10)),PHQ_Sum_score=rnorm(20,10,2))  #generate data
summary(aov(PHQ_Sum_score~mz*ort+Error(subject/mz),data=anova.daten))
 F_values <- function(formula, data1, indices) {
    data2=subset(data1, data1$mz==2)  #subsetting data for each measuring time
    data3=subset(data1, data1$mz==1)
    data4 <- data3[indices,] # allows boot to select sample
    subjekte=na.omit(data4$subject)
    data5=rbind(data3[subjekte,], data2[subjekte,]) #combine data
    #convert repeated subjects to unique subjects
    data5$subject=factor(rep(1:length(subjekte),2))
    fit=aov(formula,data=data5)                #fit model
    #return F-values
    return(c(summary(fit)[1][[1]][[1]]$`F value`,
             summary(fit)[2][[1]][[1]]$`F value`))
    }
  
  results <- boot(data=anova.daten, statistic=F_values,           
     R=10, formula=PHQ_Sum_score~mz*ort+Error(subject/mz))      #bootstrap
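
One thing to watch in the function above: boot() draws indices from 1 to nrow(anova.daten), i.e. 1 to 20, while data3 has only 10 rows. Draws above 10 therefore produce NA rows that na.omit() drops, and the number of resampled subjects changes from replicate to replicate. A possible alternative, given only as a sketch, is to hand boot() the vector of subject IDs and rebuild the long data inside the statistic. The names `F_by_subject` and `long_data` are hypothetical, the data are simulated, and resampling within each `ort` group via `strata` is an assumption made here so that both groups are always present.

```r
library(boot)

set.seed(42)
# Simulated data in the same layout as above (values are made up)
dat <- data.frame(
  subject       = rep(1:10, each = 2),
  mz            = rep(1:2, times = 10),
  ort           = rep(1:2, each = 10),
  PHQ_Sum_score = rnorm(20, 10, 2)
)

# 'ids' is what boot() resamples (one entry per subject);
# 'long_data' is the complete data with two rows per subject.
F_by_subject <- function(ids, indices, long_data, formula) {
  picked <- ids[indices]
  # Take both rows of every drawn subject and relabel them, so that a
  # subject drawn twice is treated as two distinct subjects.
  parts <- lapply(seq_along(picked), function(k) {
    rows <- long_data[long_data$subject == picked[k], ]
    rows$subject <- k
    rows
  })
  d <- do.call(rbind, parts)
  d$subject <- factor(d$subject)
  d$mz      <- factor(d$mz)
  d$ort     <- factor(d$ort)
  s <- summary(aov(formula, data = d))
  between <- s[[1]][[1]]   # subject stratum: ort
  within  <- s[[2]][[1]]   # subject:mz stratum: mz, mz:ort
  c(between$`F value`[1], within$`F value`[1:2])
}

ids <- unique(dat$subject)
ort_by_subject <- dat$ort[match(ids, dat$subject)]

results2 <- boot(data = ids, statistic = F_by_subject, R = 20,
                 strata = ort_by_subject, long_data = dat,
                 formula = PHQ_Sum_score ~ mz * ort + Error(subject / mz))
dim(results2$t)   # R rows, one column per F value (ort, mz, mz:ort)
```

With this layout every replicate contains both measuring times of each drawn subject and the same number of subjects as the original data. Whether stratifying by `ort` is appropriate depends on the design, so treat that part as a choice rather than a requirement.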
Thanks a lot,
Felix Fischer