Wuming Gong
2005-Sep-02 06:49 UTC
[R] Calculating Goodman-Kurskal's gamma using delta method
Dear list, I have a problem on calculating the standard error of Goodman-Kurskal's gamma using delta method. I exactly follow the method and forumla described in Problem 3.27 of Alan Agresti's Categorical Data Analysis (2nd edition). The data I used is also from the job satisfaction vs. income example from that book. job <- matrix(c(1, 3, 10, 6, 2, 3, 10, 7, 1, 6, 14, 12, 0, 1, 9, 11), nrow = 4, ncol = 4, byrow = TRUE, dimnames = list(c("< 15,000", "15,000 - 25,000", "25,000 - 40,000", "> 40,000"), c("VD", "LD", "MS", "VS"))) The following code is for calculating gamma value, which is consistent with the result presented in section 2.4.5 of that book. C <- 0 D <- 0 for (i in 1:nrow(job)){ for (j in 1:ncol(job)){ pi_c <- 0 pi_d <- 0 for (h in 1:nrow(job)){ for (k in 1:ncol(job)){ if ((h > i & k > j) | (h < i & k < j)){ pi_c <- pi_c + job[h, k]/sum(job) } if ((h > i & k < j) | (h < i & k > j)){ pi_d <- pi_d + job[h, k]/sum(job) } } } C <- C + job[i, j] * pi_c D <- D + job[i, j] * pi_d } } gamma <- (C - D) / (C + D) # gamma = 0.221 for this example. The following code is for calculating stardard error of gamma. sigma.squared <- 0 for (i in 1:nrow(job)){ for (j in 1:ncol(job)){ pi_c <- 0 pi_d <- 0 for (h in 1:nrow(job)){ for (k in 1:ncol(job)){ if ((h > i & k > j) | (h < i & k < j)){ pi_c <- pi_c + job[h, k]/sum(job) } if ((h > i & k < j) | (h < i & k > j)){ pi_d <- pi_d + job[h, k]/sum(job) } } } phi <- 4 * (pi_c * D - pi_d * C) / (C + D)^2 sigma.squared <- sigma.squared + phi^2 } } se <- (sigma.squared/sum(job))^.5 # 0.00748, which is different from the SE 0.117 given in section 3.4.3 of that book. I am not able to figure out what is the problem with my code... Could anyone point out what the problem is? Thanks. Wuming
Frank E Harrell Jr
2005-Sep-02 13:03 UTC
[R] Calculating Goodman-Kurskal's gamma using delta method
Wuming Gong wrote:> Dear list, > > I have a problem on calculating the standard error of > Goodman-Kurskal's gamma using delta method. I exactly follow the > method and forumla described in Problem 3.27 of Alan Agresti's > Categorical Data Analysis (2nd edition). The data I used is also from > the job satisfaction vs. income example from that book. > > job <- matrix(c(1, 3, 10, 6, 2, 3, 10, 7, 1, 6, 14, 12, 0, 1, 9, 11), > nrow = 4, ncol = 4, byrow = TRUE, dimnames = list(c("< 15,000", > "15,000 - 25,000", "25,000 - 40,000", "> 40,000"), c("VD", "LD", "MS", > "VS"))) > > The following code is for calculating gamma value, which is consistent > with the result presented in section 2.4.5 of that book. > > C <- 0 > D <- 0 > for (i in 1:nrow(job)){ > for (j in 1:ncol(job)){ > pi_c <- 0 > pi_d <- 0 > for (h in 1:nrow(job)){ > for (k in 1:ncol(job)){ > if ((h > i & k > j) | (h < i & k < j)){ > pi_c <- pi_c + job[h, k]/sum(job) > } > > if ((h > i & k < j) | (h < i & k > j)){ > pi_d <- pi_d + job[h, k]/sum(job) > } > } > } > > C <- C + job[i, j] * pi_c > D <- D + job[i, j] * pi_d > } > } > gamma <- (C - D) / (C + D) # gamma = 0.221 for this example. > > The following code is for calculating stardard error of gamma. > sigma.squared <- 0 > for (i in 1:nrow(job)){ > for (j in 1:ncol(job)){ > pi_c <- 0 > pi_d <- 0 > for (h in 1:nrow(job)){ > for (k in 1:ncol(job)){ > if ((h > i & k > j) | (h < i & k < j)){ > pi_c <- pi_c + job[h, k]/sum(job) > } > > if ((h > i & k < j) | (h < i & k > j)){ > pi_d <- pi_d + job[h, k]/sum(job) > } > } > } > phi <- 4 * (pi_c * D - pi_d * C) / (C + D)^2 > > sigma.squared <- sigma.squared + phi^2 > } > } > > se <- (sigma.squared/sum(job))^.5 # 0.00748, which is different from > the SE 0.117 given in section 3.4.3 of that book. > > I am not able to figure out what is the problem with my code... Could > anyone point out what the problem is? > > Thanks. > > WumingSave your time (and much execution time) by using the Hmisc package's rcorr.cens function with the argument outx=TRUE. rcorr.cens using a standard U-statistic variance estimator. -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University