An embedded and charset-unspecified text was scrubbed.
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100416/d33e3e7f/attachment.pl>
On Fri, 16 Apr 2010, Fischer, Felix wrote:

> Hello everyone,
>
> i have a question regarding the sampling process in boot().

"PLEASE ... provide commented, minimal, self-contained, reproducible code."

Which means something a correspondent could actually run.

But before that, a careful reading of ?boot should get you started. Note these bits:

Arguments:

    data: The data as a vector, ...

    statistic: A function which when applied to data returns a vector
        containing the statistic(s) of interest. When sim="parametric",
        [snip] In all other cases statistic must take at least two
        arguments. The first argument passed will always be the
        original data. The second will be a vector of indices,
        frequencies or weights which define the bootstrap sample. ...

HTH,

Chuck

>
> I try to bootstrap F-values for a repeated measures ANOVA to get a
> confidence interval of F-values. Unfortunately, while the aov works
> fine, it fails in the boot()-function. I think the problem might be that
> the resampling process fails to select both lines of data representing
> the 2 measuring times for one subject and I therefore get missing cases.
>
> The data is organised like this:
> subject  ort  mz  PHQ
> 1        1    1   x
> 1        1    2   y
> 2        1    1   z
> 2        1    2   zz
> ...
>
>
> Is there any way to specify, that both lines need to be selected?
>
>
> Thanks a lot!
> Felix Fischer
>
> P.S. If you need to have a look to my code:
>
> F_values <- function(formula, data, indices) {
>   d <- data[indices,] # allows boot to select sample
>   fit=aov(formula,data=d) #fit model
>   return(c(summary(fit)[1][[1]][[1]]$`F value`, summary(fit)[2][[1]][[1]]$`F value`)) #return F-values
> }
>
> results <- boot(data=anova.daten, statistic=F_values,
>                 R=10, formula=PHQ_Sum_score~mz*ort+Error(subject/mz))
>
>
> Dipl. Psych. Felix Fischer
>
> Medizinische Klinik mit Schwerpunkt Psychosomatik
> Charité -- Universitätsmedizin Berlin
> Luisenstr. 13a
> 10117 Berlin
>
> Tel.: 030 - 450 553575
> Email: felix.fischer at charite.de
>
>
> [[alternative HTML version deleted]]
>
>

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu             UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
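
To make the quoted part of ?boot concrete, here is a minimal sketch of the two-argument statistic contract. The toy vector x and the function name mean_stat are made up for illustration and do not come from the thread; the statistic is a plain mean, not the ANOVA under discussion.

```r
library(boot)

set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)   # toy data, an assumption for illustration

# boot() calls statistic(data, indices): the first argument is always the
# original data, the second is the index vector defining one bootstrap sample.
mean_stat <- function(data, indices) mean(data[indices])

res <- boot(data = x, statistic = mean_stat, R = 200)

res$t0      # statistic evaluated on the original data (indices 1..n)
dim(res$t)  # one row per replicate, one column per returned statistic
```

Because the indices always run over the rows (or elements) of whatever is passed as data, resampling a long-format data frame row by row will split up the two rows that belong to one subject.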
Thank you for your answer. Sorry for the missing example.

In fact, I think I solved the issue with some data manipulation inside the function: I split the data (one set for each measuring time), selected the cases at random, and then combined the two measuring times again. The results look promising to me, but if someone is aware of problems, please let me know.

This code should run:

library(boot)

anova.daten=data.frame(subject=sort(rep(1:10,2)), mz=rep(1:2,10), ort=sort(rep(1:2,10)),PHQ_Sum_score=rnorm(20,10,2)) #generate data

summary(aov(PHQ_Sum_score~mz*ort+Error(subject/mz),data=anova.daten))

F_values <- function(formula, data1, indices) {
  data2=subset(data1, data1$mz==2) #subsetting data for each measuring time
  data3=subset(data1, data1$mz==1)
  data4 <- data3[indices,] # allows boot to select sample
  subjekte=na.omit(data4$subject)
  data5=rbind(data3[subjekte,], data2[subjekte,]) #combine data
  data5$subject=factor(rep(1:length(subjekte),2)) #convert repeated subjects to unique subjects
  fit=aov(formula,data=data5) #fit model
  return(c(summary(fit)[1][[1]][[1]]$`F value`, summary(fit)[2][[1]][[1]]$`F value`)) #return F-values
}

results <- boot(data=anova.daten, statistic=F_values, R=10, formula=PHQ_Sum_score~mz*ort+Error(subject/mz)) #bootstrap

Thanks a lot,
Felix Fischer
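
One possible concern with the approach above: boot() draws indices from 1 to 20 (the number of rows in anova.daten), but the per-time-point subset has only 10 rows, so roughly half of the drawn indices yield NA rows that na.omit() then drops, and the number of resampled subjects varies between replicates. A hedged alternative, sketched below, is to hand boot() the vector of subject IDs as its data, so that each index selects a whole subject with both measuring times. The names f_values, long_data, ids and group are hypothetical, the simulated data are for illustration only, and the strata argument is used here merely so that every replicate contains subjects from both ort groups.

```r
library(boot)

set.seed(42)
n_subj <- 20
# Simulated long-format data (an assumption, same column names as the thread)
long_data <- data.frame(
  subject       = rep(seq_len(n_subj), each = 2),
  mz            = factor(rep(1:2, times = n_subj)),   # within-subject factor
  ort           = factor(rep(1:2, each = n_subj)),    # between-subject factor
  PHQ_Sum_score = rnorm(2 * n_subj, mean = 10, sd = 2)
)

# statistic(ids, indices, ...): boot() resamples subject IDs, not rows
f_values <- function(ids, indices, long_data) {
  picked <- ids[indices]
  # Copy both rows of each picked subject; relabel so repeats count as new subjects
  blocks <- lapply(seq_along(picked), function(k) {
    block <- long_data[long_data$subject == picked[k], ]
    block$subject <- k
    block
  })
  d <- do.call(rbind, blocks)
  d$subject <- factor(d$subject)
  fit <- aov(PHQ_Sum_score ~ mz * ort + Error(subject / mz), data = d)
  s <- summary(fit)
  between <- s[[1]][[1]]   # subject stratum: ort, Residuals
  within  <- s[[2]][[1]]   # subject:mz stratum: mz, mz:ort, Residuals
  c(ort    = between[1, "F value"],
    mz     = within[1, "F value"],
    mz_ort = within[2, "F value"])
}

ids   <- unique(long_data$subject)
group <- long_data$ort[match(ids, long_data$subject)]  # ort group of each subject

results <- boot(data = ids, statistic = f_values, R = 200,
                strata = group, long_data = long_data)
results$t0   # F values on the original data
```

Percentile intervals could then be requested with boot.ci(results, type = "perc", index = 1) and likewise for the other two columns. Whether a confidence interval for an F value is a meaningful quantity is a separate statistical question that this sketch does not address.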