thr3ads.net - R help - [R] bootstrap CI of the difference between 2 Cramer's V [Jun 2022]

If this information is useful, please help other people find it:
Share via:

varin sacha

2022-Jun-05 16:21 UTC

[R] bootstrap CI of the difference between 2 Cramer's V

Dear Daniel,
Dear R-experts,

I really thank you a lot Daniel. Nobody had answered to me offline. So, thanks.
I have tried in the same vein for the Goodman-Kruskal gamma for ordinal data.
There is an error message at the end of the code. Thanks for your help.


##############################
library(ryouready)
library(boot)

shopping1<-c("tr?s important","important","pas
important","pas important","important","tr?s
important","important","pas important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","important")

statut1<-c("riche","pas riche","moyennement
riche","moyennement riche","riche","pas
riche","moyennement riche","moyennement
riche","riche","pas riche","moyennement
riche","riche","pas riche","pas
riche","riche","moyennement
riche","riche","pas riche","pas
riche","pas
riche","riche","riche","moyennement
riche","riche","riche","moyennement
riche","moyennement riche","moyennement
riche","pas riche","pas
riche","riche","pas riche","riche","pas
riche","riche","moyennement
riche","riche","pas riche","moyennement
riche","riche")

shopping2<-c("important","pas important","tr?s
important","tr?s important","important","tr?s
important","pas important","important","pas
important","tr?s
important","important","important","important","important","pas
important","tr?s important","tr?s
important","important","pas important","tr?s
important","pas important","tr?s important","pas
important","tr?s important","important","tr?s
important","important","pas important","pas
important","important","pas important","tr?s
important","pas important","pas
important","important","important","tr?s
important","tr?s important","pas important","pas
important")

statut2<-c("moyennement riche","pas
riche","riche","moyennement riche","moyennement
riche","moyennement riche","pas
riche","riche","riche","pas
riche","moyennement
riche","riche","riche","riche","riche","riche","pas
riche","moyennement riche","moyennement
riche","pas riche","moyennement riche","pas
riche","pas riche","pas riche","moyennement
riche","riche","moyennement
riche","riche","pas
riche","riche","moyennement
riche","blue","moyennement riche","pas
riche","pas riche","riche","riche","pas
riche","pas riche","pas riche")

f1 <- data.frame(shopping=shopping1,statut=statut1,group='grp1')
f2 <- data.frame(shopping=shopping2,statut=statut2,group='grp2')
f3 <- rbind(f1,f2)

G <- function(x, index) {
?? 
# calculate goodman for group 1 bootstrap sample
?? g1 <-x[index,][x[,3]=='grp1',]
?? goodman_g1 <- cor(data[index,][1,2])
??
?# calculate goodman for group 2 bootstrap sample
?? g2 <-x[index,][x[,3]=='grp2',]
?? goodman_g2 <- cor(data[index,][3,4])
??
?# calculate difference
?? goodman_g1-goodman_g2
?? }
?

# use strata parameter in function boot to resample within each group
results <- boot(data=f3,statistic=G, strata=as.factor(f3$group),R=2000)

results
boot.ci(results)
##############################



Le samedi 4 juin 2022 ? 09:31:36 UTC+2, Daniel Nordlund <djnordlund at
gmail.com> a ?crit :





On 5/28/2022 11:21 AM, varin sacha via R-help wrote:> Dear R-experts,
>
> While comparing groups, it is better to assess confidence intervals of
those differences rather than comparing confidence intervals for each group.
> I am trying to calculate the CIs of the difference between the two
Cramer's V and not the CI to the estimate of each group?s Cramer's V.
>
> Here below my toy R example. There are error messages. Any help would be
highly appreciated.
>
> ##############################
> library(questionr)
> library(boot)
>
>
gender1<-c("M","F","F","F","M","M","F","F","F","M","M","F","M","M","F","M","M","F","M","F","F","F","M","M","M","F","F","M","M","M","F","M","F","F","F","M","M","F","M","F")
>
color1<-c("blue","green","black","black","green","green","blue","blue","green","black","blue","green","blue","black","black","blue","green","blue","green","black","blue","blue","black","black","green","green","blue","green","black","green","blue","black","black","blue","green","green","green","blue","blue","black")
>
>
gender2<-c("F","F","F","M","M","F","M","M","M","F","F","M","F","M","F","F","M","M","M","F","M","M","M","F","F","F","M","M","M","F","M","M","M","F","F","F","M","F","F","F")
>
color2<-c("green","blue","black","blue","blue","blue","green","blue","green","black","blue","black","blue","blue","black","blue","blue","green","blue","black","blue","blue","black","black","green","blue","black","green","blue","green","black","blue","black","blue","green","blue","green","green","blue","black")
>
> f1=data.frame(gender1,color1)
> tab1<-table(gender1,color1)
> e1<-cramer.v(tab1)
>
> f2=data.frame(gender2,color2)
> tab2<-table(gender2,color2)
> e2<-cramer.v(tab2)
>
> f3<-data.frame(e1-e2)
>
> cramerdiff=function(x,w){
> y<-tapply(x[w,1], x[w,2],cramer.v)
> y[1]-y[2]
> }
>
> results<-boot(data=f3,statistic=cramerdiff,R=2000)
> results
>
> boot.ci(results,type="all")
> ##############################
>
>? 
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
I don't know if someone responded offline, but if not, there are a 
couple of problems with your code. ? First, the f3 dataframe is not what 
you think it is.? Second, your cramerdiff function isn't going to 
produce the results that you want.

I would put your data into a single dataframe with a variable 
designating which group data came from.? Then use that variable as the
strata variable in the boot function to resample within groups.? So 
something like this:

f1 <- data.frame(gender=gender1,color=color1,group='grp1')
f2 <- data.frame(gender=gender2,color=color2,group='grp2')
f3 <- rbind(f1,f2)

cramerdiff <- function(x, ndx) {
?? # calculate cramer.v for group 1 bootstrap sample
?? g1 <-x[ndx,][x[,3]=='grp1',]
?? cramer_g1 <- cramer.v(table(g1[,1:2]))
?? # calculate cramer.v for group 2 bootstrap sample
?? g2 <-x[ndx,][x[,3]=='grp2',]
?? cramer_g2 <- cramer.v(table(g2[,1:2]))
?? # calculate difference
?? cramer_g1-cramer_g2
?? }
# use strata parameter in function boot to resample within each group
results <- boot(data=f3,statistic=cramerdiff, 
strata=as.factor(f3$group),R=2000)

results
boot.ci(results)



Hope this is helpful,

Dan

-- 
Daniel Nordlund
Port Townsend, WA? USA


-- 
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

Daniel Nordlund

2022-Jun-06 00:40 UTC

head link

[R] bootstrap CI of the difference between 2 Cramer's V

There are a few problems with the "rewrite" of the code, both 
syntactically and conceptually.
1. Goodman-Kruskal gamma is for ordinal data.? You should create your 
"shopping" and "statut" variables as factors, ordered from
lowest to
highest using the levels= parameter in? the function, factor
2. In your function, G, you use "data[index,][1,2]"? where you should 
have used either "g1[,c(1,2)]", or "g2[,c(1,2)]".? You
should read up on
Indexing using [] on data frames, to make sure you understand what the 
original code was doing.
3.? The base cor function does not calculate a Goodman-Kruskal gamma 
(unless somebody has written a new version).? So you need to find an 
appropriate function and you may need to structure your data differently 
for calculating gamma, depending on what parameters the function 
demands.? Google is your friend here, search for??? "R Goodman Kruskal 
gamma"

Since this is looking like homework to me, I suggest you ask your 
instructor about some of this.

Best of luck,

Dan


On 6/5/2022 9:21 AM, varin sacha wrote:> Dear Daniel,
> Dear R-experts,
>
> I really thank you a lot Daniel. Nobody had answered to me offline. So,
thanks.
> I have tried in the same vein for the Goodman-Kruskal gamma for ordinal
data. There is an error message at the end of the code. Thanks for your help.
>
>
> ##############################
> library(ryouready)
> library(boot)
>
> shopping1<-c("tr?s important","important","pas
important","pas important","important","tr?s
important","important","pas important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","important")
>
> statut1<-c("riche","pas riche","moyennement
riche","moyennement riche","riche","pas
riche","moyennement riche","moyennement
riche","riche","pas riche","moyennement
riche","riche","pas riche","pas
riche","riche","moyennement
riche","riche","pas riche","pas
riche","pas
riche","riche","riche","moyennement
riche","riche","riche","moyennement
riche","moyennement riche","moyennement
riche","pas riche","pas
riche","riche","pas riche","riche","pas
riche","riche","moyennement
riche","riche","pas riche","moyennement
riche","riche")
>
> shopping2<-c("important","pas important","tr?s
important","tr?s important","important","tr?s
important","pas important","important","pas
important","tr?s
important","important","important","important","important","pas
important","tr?s important","tr?s
important","important","pas important","tr?s
important","pas important","tr?s important","pas
important","tr?s important","important","tr?s
important","important","pas important","pas
important","important","pas important","tr?s
important","pas important","pas
important","important","important","tr?s
important","tr?s important","pas important","pas
important")
>
> statut2<-c("moyennement riche","pas
riche","riche","moyennement riche","moyennement
riche","moyennement riche","pas
riche","riche","riche","pas
riche","moyennement
riche","riche","riche","riche","riche","riche","pas
riche","moyennement riche","moyennement
riche","pas riche","moyennement riche","pas
riche","pas riche","pas riche","moyennement
riche","riche","moyennement
riche","riche","pas
riche","riche","moyennement
riche","blue","moyennement riche","pas
riche","pas riche","riche","riche","pas
riche","pas riche","pas riche")
>
> f1 <- data.frame(shopping=shopping1,statut=statut1,group='grp1')
> f2 <- data.frame(shopping=shopping2,statut=statut2,group='grp2')
> f3 <- rbind(f1,f2)
>
> G <- function(x, index) {
>     
> # calculate goodman for group 1 bootstrap sample
>  ?? g1 <-x[index,][x[,3]=='grp1',]
>  ?? goodman_g1 <- cor(data[index,][1,2])
>    
>  ?# calculate goodman for group 2 bootstrap sample
>  ?? g2 <-x[index,][x[,3]=='grp2',]
>  ?? goodman_g2 <- cor(data[index,][3,4])
>    
>  ?# calculate difference
>  ?? goodman_g1-goodman_g2
>  ?? }
>   
>
> # use strata parameter in function boot to resample within each group
> results <- boot(data=f3,statistic=G, strata=as.factor(f3$group),R=2000)
>
> results
> boot.ci(results)
> ##############################
>
>
>
> Le samedi 4 juin 2022 ? 09:31:36 UTC+2, Daniel Nordlund <djnordlund at
gmail.com> a ?crit :
>
>
>
>
>
> On 5/28/2022 11:21 AM, varin sacha via R-help wrote:
>> Dear R-experts,
>>
>> While comparing groups, it is better to assess confidence intervals of
those differences rather than comparing confidence intervals for each group.
>> I am trying to calculate the CIs of the difference between the two
Cramer's V and not the CI to the estimate of each group?s Cramer's V.
>>
>> Here below my toy R example. There are error messages. Any help would
be highly appreciated.
>>
>> ##############################
>> library(questionr)
>> library(boot)
>>
>>
gender1<-c("M","F","F","F","M","M","F","F","F","M","M","F","M","M","F","M","M","F","M","F","F","F","M","M","M","F","F","M","M","M","F","M","F","F","F","M","M","F","M","F")
>>
color1<-c("blue","green","black","black","green","green","blue","blue","green","black","blue","green","blue","black","black","blue","green","blue","green","black","blue","blue","black","black","green","green","blue","green","black","green","blue","black","black","blue","green","green","green","blue","blue","black")
>>
>>
gender2<-c("F","F","F","M","M","F","M","M","M","F","F","M","F","M","F","F","M","M","M","F","M","M","M","F","F","F","M","M","M","F","M","M","M","F","F","F","M","F","F","F")
>>
color2<-c("green","blue","black","blue","blue","blue","green","blue","green","black","blue","black","blue","blue","black","blue","blue","green","blue","black","blue","blue","black","black","green","blue","black","green","blue","green","black","blue","black","blue","green","blue","green","green","blue","black")
>>
>> f1=data.frame(gender1,color1)
>> tab1<-table(gender1,color1)
>> e1<-cramer.v(tab1)
>>
>> f2=data.frame(gender2,color2)
>> tab2<-table(gender2,color2)
>> e2<-cramer.v(tab2)
>>
>> f3<-data.frame(e1-e2)
>>
>> cramerdiff=function(x,w){
>> y<-tapply(x[w,1], x[w,2],cramer.v)
>> y[1]-y[2]
>> }
>>
>> results<-boot(data=f3,statistic=cramerdiff,R=2000)
>> results
>>
>> boot.ci(results,type="all")
>> ##############################
>>
>>    
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> I don't know if someone responded offline, but if not, there are a
> couple of problems with your code. ? First, the f3 dataframe is not what
> you think it is.? Second, your cramerdiff function isn't going to
> produce the results that you want.
>
> I would put your data into a single dataframe with a variable
> designating which group data came from.? Then use that variable as the
> strata variable in the boot function to resample within groups.? So
> something like this:
>
> f1 <- data.frame(gender=gender1,color=color1,group='grp1')
> f2 <- data.frame(gender=gender2,color=color2,group='grp2')
> f3 <- rbind(f1,f2)
>
> cramerdiff <- function(x, ndx) {
>  ?? # calculate cramer.v for group 1 bootstrap sample
>  ?? g1 <-x[ndx,][x[,3]=='grp1',]
>  ?? cramer_g1 <- cramer.v(table(g1[,1:2]))
>  ?? # calculate cramer.v for group 2 bootstrap sample
>  ?? g2 <-x[ndx,][x[,3]=='grp2',]
>  ?? cramer_g2 <- cramer.v(table(g2[,1:2]))
>  ?? # calculate difference
>  ?? cramer_g1-cramer_g2
>  ?? }
> # use strata parameter in function boot to resample within each group
> results <- boot(data=f3,statistic=cramerdiff,
> strata=as.factor(f3$group),R=2000)
>
> results
> boot.ci(results)
>
>
>
> Hope this is helpful,
>
> Dan
>
-- 
Daniel Nordlund
Port Townsend, WA  USA


-- 
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

Rui Barradas

2022-Jun-06 19:17 UTC

head link

[R] bootstrap CI of the difference between 2 Cramer's V

Hello,

Here is your code, corrected. It uses a Goodman-Kruskal gamma function 
in package DescTools.
Function G probably doesn't need tryCatch but I had errors in some test 
runs.


library(boot)

G <- function(x, index, cols = 1:2) {
   y <- x[index, cols]    # bootstrapped data
   g <- x$group[index]    # and groups
   # calculate gamma for each group bootstrap sample
   # (trap errors in case one of the groups is empty)
   g <- tryCatch(
     lapply(split(y[1:2], g), \(x) {
       tbl <- table(x)
       DescTools::GoodmanKruskalGamma(tbl)
     }),
     error = function(e) list(NA_real_, NA_real_)
   )
   # calculate difference
   g[[1]] - g[[2]]
}

set.seed(2022)
# use strata parameter in function boot to resample within each group
results <- boot(
   data = f3,
   statistic = G,
   R = 2000,
   strata = as.factor(f3$group),
   cols = 1:2
)

results
boot.ci(results)


Hope this helps,

Rui Barradas


?s 17:21 de 05/06/2022, varin sacha via R-help escreveu:> Dear Daniel,
> Dear R-experts,
> 
> I really thank you a lot Daniel. Nobody had answered to me offline. So,
thanks.
> I have tried in the same vein for the Goodman-Kruskal gamma for ordinal
data. There is an error message at the end of the code. Thanks for your help.
> 
> 
> ##############################
> library(ryouready)
> library(boot)
> 
> shopping1<-c("tr?s important","important","pas
important","pas important","important","tr?s
important","important","pas important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","tr?s important","important","pas
important","pas important","important","tr?s
important","important")
> 
> statut1<-c("riche","pas riche","moyennement
riche","moyennement riche","riche","pas
riche","moyennement riche","moyennement
riche","riche","pas riche","moyennement
riche","riche","pas riche","pas
riche","riche","moyennement
riche","riche","pas riche","pas
riche","pas
riche","riche","riche","moyennement
riche","riche","riche","moyennement
riche","moyennement riche","moyennement
riche","pas riche","pas
riche","riche","pas riche","riche","pas
riche","riche","moyennement
riche","riche","pas riche","moyennement
riche","riche")
> 
> shopping2<-c("important","pas important","tr?s
important","tr?s important","important","tr?s
important","pas important","important","pas
important","tr?s
important","important","important","important","important","pas
important","tr?s important","tr?s
important","important","pas important","tr?s
important","pas important","tr?s important","pas
important","tr?s important","important","tr?s
important","important","pas important","pas
important","important","pas important","tr?s
important","pas important","pas
important","important","important","tr?s
important","tr?s important","pas important","pas
important")
> 
> statut2<-c("moyennement riche","pas
riche","riche","moyennement riche","moyennement
riche","moyennement riche","pas
riche","riche","riche","pas
riche","moyennement
riche","riche","riche","riche","riche","riche","pas
riche","moyennement riche","moyennement
riche","pas riche","moyennement riche","pas
riche","pas riche","pas riche","moyennement
riche","riche","moyennement
riche","riche","pas
riche","riche","moyennement
riche","blue","moyennement riche","pas
riche","pas riche","riche","riche","pas
riche","pas riche","pas riche")
> 
> f1 <- data.frame(shopping=shopping1,statut=statut1,group='grp1')
> f2 <- data.frame(shopping=shopping2,statut=statut2,group='grp2')
> f3 <- rbind(f1,f2)
> 
> G <- function(x, index) {
>     
> # calculate goodman for group 1 bootstrap sample
>  ?? g1 <-x[index,][x[,3]=='grp1',]
>  ?? goodman_g1 <- cor(data[index,][1,2])
>    
>  ?# calculate goodman for group 2 bootstrap sample
>  ?? g2 <-x[index,][x[,3]=='grp2',]
>  ?? goodman_g2 <- cor(data[index,][3,4])
>    
>  ?# calculate difference
>  ?? goodman_g1-goodman_g2
>  ?? }
>   
> 
> # use strata parameter in function boot to resample within each group
> results <- boot(data=f3,statistic=G, strata=as.factor(f3$group),R=2000)
> 
> results
> boot.ci(results)
> ##############################
> 
> 
> 
> Le samedi 4 juin 2022 ? 09:31:36 UTC+2, Daniel Nordlund <djnordlund at
gmail.com> a ?crit :
> 
> 
> 
> 
> 
> On 5/28/2022 11:21 AM, varin sacha via R-help wrote:
>> Dear R-experts,
>>
>> While comparing groups, it is better to assess confidence intervals of
those differences rather than comparing confidence intervals for each group.
>> I am trying to calculate the CIs of the difference between the two
Cramer's V and not the CI to the estimate of each group?s Cramer's V.
>>
>> Here below my toy R example. There are error messages. Any help would
be highly appreciated.
>>
>> ##############################
>> library(questionr)
>> library(boot)
>>
>>
gender1<-c("M","F","F","F","M","M","F","F","F","M","M","F","M","M","F","M","M","F","M","F","F","F","M","M","M","F","F","M","M","M","F","M","F","F","F","M","M","F","M","F")
>>
color1<-c("blue","green","black","black","green","green","blue","blue","green","black","blue","green","blue","black","black","blue","green","blue","green","black","blue","blue","black","black","green","green","blue","green","black","green","blue","black","black","blue","green","green","green","blue","blue","black")
>>
>>
gender2<-c("F","F","F","M","M","F","M","M","M","F","F","M","F","M","F","F","M","M","M","F","M","M","M","F","F","F","M","M","M","F","M","M","M","F","F","F","M","F","F","F")
>>
color2<-c("green","blue","black","blue","blue","blue","green","blue","green","black","blue","black","blue","blue","black","blue","blue","green","blue","black","blue","blue","black","black","green","blue","black","green","blue","green","black","blue","black","blue","green","blue","green","green","blue","black")
>>
>> f1=data.frame(gender1,color1)
>> tab1<-table(gender1,color1)
>> e1<-cramer.v(tab1)
>>
>> f2=data.frame(gender2,color2)
>> tab2<-table(gender2,color2)
>> e2<-cramer.v(tab2)
>>
>> f3<-data.frame(e1-e2)
>>
>> cramerdiff=function(x,w){
>> y<-tapply(x[w,1], x[w,2],cramer.v)
>> y[1]-y[2]
>> }
>>
>> results<-boot(data=f3,statistic=cramerdiff,R=2000)
>> results
>>
>> boot.ci(results,type="all")
>> ##############################
>>
>>    
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> I don't know if someone responded offline, but if not, there are a
> couple of problems with your code. ? First, the f3 dataframe is not what
> you think it is.? Second, your cramerdiff function isn't going to
> produce the results that you want.
> 
> I would put your data into a single dataframe with a variable
> designating which group data came from.? Then use that variable as the
> strata variable in the boot function to resample within groups.? So
> something like this:
> 
> f1 <- data.frame(gender=gender1,color=color1,group='grp1')
> f2 <- data.frame(gender=gender2,color=color2,group='grp2')
> f3 <- rbind(f1,f2)
> 
> cramerdiff <- function(x, ndx) {
>  ?? # calculate cramer.v for group 1 bootstrap sample
>  ?? g1 <-x[ndx,][x[,3]=='grp1',]
>  ?? cramer_g1 <- cramer.v(table(g1[,1:2]))
>  ?? # calculate cramer.v for group 2 bootstrap sample
>  ?? g2 <-x[ndx,][x[,3]=='grp2',]
>  ?? cramer_g2 <- cramer.v(table(g2[,1:2]))
>  ?? # calculate difference
>  ?? cramer_g1-cramer_g2
>  ?? }
> # use strata parameter in function boot to resample within each group
> results <- boot(data=f3,statistic=cramerdiff,
> strata=as.factor(f3$group),R=2000)
> 
> results
> boot.ci(results)
> 
> 
> 
> Hope this is helpful,
> 
> Dan
>

R help - Jun 2022 - bootstrap CI of the difference between 2 Cramer's V

[R] bootstrap CI of the difference between 2 Cramer's V

[R] bootstrap CI of the difference between 2 Cramer's V

[R] bootstrap CI of the difference between 2 Cramer's V