Dear R-experts,

While comparing groups, it is better to assess confidence intervals of those differences rather than comparing confidence intervals for each group. I am trying to calculate the CI of the difference between the two Cramer's V values, not the CI of each group's Cramer's V estimate.

Here below is my toy R example. It produces error messages. Any help would be highly appreciated.

##############################
library(questionr)
library(boot)

gender1<-c("M","F","F","F","M","M","F","F","F","M","M","F","M","M","F","M","M","F","M","F","F","F","M","M","M","F","F","M","M","M","F","M","F","F","F","M","M","F","M","F")
color1<-c("blue","green","black","black","green","green","blue","blue","green","black","blue","green","blue","black","black","blue","green","blue","green","black","blue","blue","black","black","green","green","blue","green","black","green","blue","black","black","blue","green","green","green","blue","blue","black")

gender2<-c("F","F","F","M","M","F","M","M","M","F","F","M","F","M","F","F","M","M","M","F","M","M","M","F","F","F","M","M","M","F","M","M","M","F","F","F","M","F","F","F")
color2<-c("green","blue","black","blue","blue","blue","green","blue","green","black","blue","black","blue","blue","black","blue","blue","green","blue","black","blue","blue","black","black","green","blue","black","green","blue","green","black","blue","black","blue","green","blue","green","green","blue","black")

f1=data.frame(gender1,color1)
tab1<-table(gender1,color1)
e1<-cramer.v(tab1)

f2=data.frame(gender2,color2)
tab2<-table(gender2,color2)
e2<-cramer.v(tab2)

f3<-data.frame(e1-e2)

cramerdiff=function(x,w){
y<-tapply(x[w,1], x[w,2],cramer.v)
y[1]-y[2]
}

results<-boot(data=f3,statistic=cramerdiff,R=2000)
results

boot.ci(results,type="all")
##############################
Daniel Nordlund
2022-Jun-04 07:31 UTC
[R] bootstrap CI of the difference between 2 Cramer's V
On 5/28/2022 11:21 AM, varin sacha via R-help wrote:
> I am trying to calculate the CIs of the difference between the two Cramer's V and not the CI to the estimate of each group's Cramer's V.
>
> Here below my toy R example. There are error messages. Any help would be highly appreciated.

I don't know if someone responded offline, but if not, there are a couple of problems with your code.

First, the f3 dataframe is not what you think it is: data.frame(e1-e2) holds only the single difference e1-e2, so boot has no observations to resample. Second, your cramerdiff function isn't going to produce the results that you want.

I would put your data into a single dataframe with a variable designating which group each row came from. Then use that variable as the strata variable in the boot function to resample within groups. So something like this:

f1 <- data.frame(gender=gender1,color=color1,group='grp1')
f2 <- data.frame(gender=gender2,color=color2,group='grp2')
f3 <- rbind(f1,f2)

cramerdiff <- function(x, ndx) {
   # with strata, row i of x[ndx,] belongs to the same group as row i of x,
   # so the original group column can be used to split the resampled rows

   # calculate cramer.v for group 1 bootstrap sample
   g1 <-x[ndx,][x[,3]=='grp1',]
   cramer_g1 <- cramer.v(table(g1[,1:2]))
   # calculate cramer.v for group 2 bootstrap sample
   g2 <-x[ndx,][x[,3]=='grp2',]
   cramer_g2 <- cramer.v(table(g2[,1:2]))
   # calculate difference
   cramer_g1-cramer_g2
   }

# use strata parameter in function boot to resample within each group
results <- boot(data=f3,statistic=cramerdiff, strata=as.factor(f3$group),R=2000)
results
boot.ci(results)

Hope this is helpful,

Dan

--
Daniel Nordlund
Port Townsend, WA USA
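
For readers who want to see what "resample within groups" amounts to, here is a minimal base-R sketch of a stratified percentile bootstrap for the difference between two Cramer's V values. It is an illustration under stated assumptions, not the code from the thread: the data are simulated placeholders rather than the poster's vectors, the names cramers_v, make_group, one_rep and dat are made up for this example, and cramers_v is a hand-written stand-in using the usual formula sqrt(chi^2 / (n * min(r - 1, c - 1))) rather than questionr's cramer.v. It gives only a percentile interval, not the other interval types that boot.ci can report.

```r
# Sketch only: base-R stratified percentile bootstrap for V1 - V2.
# All data below are simulated placeholders (assumption).

# Cramer's V from a two-way table: sqrt(chi^2 / (n * min(r - 1, c - 1)))
cramers_v <- function(tab) {
  # small expected counts trigger a chisq.test warning; silence it here
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  unname(sqrt(chi2 / (sum(tab) * min(dim(tab) - 1))))
}

set.seed(1)

# Hypothetical helper: simulate one group of n observations
make_group <- function(n, label) {
  data.frame(gender = sample(c("F", "M"), n, replace = TRUE),
             color  = sample(c("black", "blue", "green"), n, replace = TRUE),
             group  = label)
}
dat <- rbind(make_group(40, "grp1"), make_group(40, "grp2"))

# One bootstrap replicate: resample rows with replacement inside each
# group separately, compute V per group, and return the difference.
one_rep <- function(d) {
  v <- sapply(split(d, d$group), function(g) {
    g <- g[sample(nrow(g), replace = TRUE), ]
    cramers_v(table(g$gender, g$color))
  })
  unname(v["grp1"] - v["grp2"])
}

diffs <- replicate(2000, one_rep(dat))

# 95% percentile interval for the difference
ci <- quantile(diffs, c(0.025, 0.975), na.rm = TRUE)
ci
```

Resampling each group separately keeps both group sizes fixed at their observed values in every replicate, which is the same thing the strata argument of boot arranges.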