Stewart Schultz
2007-May-23 11:42 UTC
[R] Replicated LR goodness-of-fit tests, heterogeneity G, with loglm?
I have numerous replicated goodness-of-fit experiments (observed compared to expected counts in categories) and these replicates are nested within a factor. The expected counts in each cell are external (from a scientific model being tested). The calculations I need within each level of the nesting factor are a heterogeneity G test, with the total G and the pooled G across replicates. Then I would like to form an F ratio equal to the ratio of pooled G divided by its degrees of freedom to heterogeneity G divided by its degrees of freedom. The F ratio would (I think) test the hypothesis that the badness-of-fit in the pooled data is greater than would be expected by chance from the heterogeneity among replicates. It seems that the function loglm is the closest within R to what I want. But I can¢t see how it can be used when the expected proportions are externally provided. I¢ve appended here a function I wrote that more or less does what I want (with the nesting factor ignored) but I would prefer to use something like loglm because of the additional information it offers and its flexibility with hierarchical models. Thanks, Stu --------- hetg = function() { #creating some fake data tran=gl(10,10) #tran are the random replicates cov= factor(rep(1:10,10)) #10 levels of this factor for each level of tran tleng = c(219, 312, 178, 322, 311, 242, 235, 235, 266, 193) #weighting for #each transect (of each level of tran) obscounts = rpois(100, 50) #these are the observed response count data expprop = rep(c(.1, .1, .1, .05, .15, .1, .1, .1, .1, .1), 10) #expected # proportion of counts within 10 levels within each level of tran vv = tapply(obscounts, tran, sum) #get marginal sum of counts rr = rep(vv, each=10) #rep the marginal sums across each cell expcounts = expprop*rr #the vector of expected counts #Now calculate the G for each level of tran (likelihood ratios) G = vector() for (i in levels(tran)) { obsi obscounts[tran==i] expi expcounts[tran==i] G[i] 2*sum((obsi*log(obsi/expi))) } dfs = rep(max(as.integer(levels(cov)))-1, max(as.integer(levels(cov)))) probs = pchisq(G, dfs, lower.tail=F) #and the lower tail probability of the G #get a weighted average for the pooled expectations weightexp = list() weights = vector() for (i in levels(tran)) { expi = expprop[tran==i] lengi tleng[as.integer(i)] counti sum(obscounts[tran==i]) weightexp[[i]] expi*lengi*counti weights[i] lengi*counti } sum = rep(0, length(weightexp[[1]])) for (i in 1:length(weightexp)) { sum = sum + weightexp[[i]] } expproppooled sum/sum(weights) #Now the pooled G obs tapply(obscounts,cov,sum) exp expproppooled*sum(obs) Gp 2*sum(obs*log(obs/exp)) dfp = max(as.integer(levels(cov)))-1 probp = pchisq(Gp, dfp, lower.tail=F) #total G Gt = sum(G) dft = sum(dfs) probt = pchisq(Gt, dft, lower.tail=F) #heterogeneity G Gh = Gt-Gp dfh = dft - dfp probh = pchisq(Gh, dfh, lower.tail=F) #F ratio Fratio (Gp/dfp)/(Gh/dfh) probf = pf(Fratio, dfp, dfh, lower.tail=F) res = list(TotG=c(Gt, dft, probt), PooledG=c(Gp, dfp, probp), HetG=c(Gh, dfh, probh), Fratio=c(Fratio, dfp, dfh, probf)) res } #end hetg ____________________________________________________________________________________Take the Internet to Go: Yahoo!Go puts the Internet in your pocket: mail, news, photos & more. [[alternative HTML version deleted]]