Stewart Schultz
2007-May-23 11:42 UTC
[R] Replicated LR goodness-of-fit tests, heterogeneity G, with loglm?
I have numerous replicated goodness-of-fit experiments (observed compared to
expected counts in categories) and these
replicates are nested within a factor.
The expected counts in each cell are external (from a
scientific model being tested). The
calculations I need within each level of the nesting factor are a heterogeneity
G test, with the total G and the pooled G across replicates. Then I would like
to form an F ratio equal to
the ratio of pooled G divided by its degrees of freedom to heterogeneity G
divided
by its degrees of freedom. The F ratio
would (I think) test the hypothesis that the badness-of-fit in the pooled data
is greater than would be expected by chance from the heterogeneity among
replicates.
It seems that the function loglm is the closest within R to what I want. But I
can¢t see
how it can be used when the expected
proportions are externally provided.
I¢ve appended here a function I wrote that more or less does
what I want (with the nesting factor ignored) but I would prefer to use
something
like loglm because of the additional information it offers and its
flexibility with hierarchical models.
Thanks,
Stu
---------
hetg = function() {
#creating some fake data
tran=gl(10,10) #tran
are the random replicates
cov= factor(rep(1:10,10))
#10 levels of this factor for each level of tran
tleng = c(219, 312, 178, 322, 311, 242, 235, 235, 266, 193)
#weighting for
#each transect (of
each level of tran)
obscounts = rpois(100, 50)
#these are the observed response count data
expprop = rep(c(.1, .1, .1, .05, .15, .1, .1, .1, .1, .1),
10) #expected
# proportion of
counts within 10 levels within each level of tran
vv = tapply(obscounts, tran, sum) #get marginal sum of
counts
rr = rep(vv, each=10) #rep the marginal sums across each
cell
expcounts = expprop*rr #the vector of expected counts
#Now calculate the G for each level of tran (likelihood
ratios)
G = vector()
for (i in levels(tran)) {
obsi obscounts[tran==i]
expi expcounts[tran==i]
G[i] 2*sum((obsi*log(obsi/expi)))
}
dfs = rep(max(as.integer(levels(cov)))-1, max(as.integer(levels(cov))))
probs = pchisq(G, dfs, lower.tail=F) #and the lower tail probability of the G
#get a weighted
average for the pooled expectations
weightexp = list()
weights = vector()
for (i in
levels(tran)) {
expi = expprop[tran==i]
lengi tleng[as.integer(i)]
counti sum(obscounts[tran==i])
weightexp[[i]] expi*lengi*counti
weights[i] lengi*counti
}
sum = rep(0,
length(weightexp[[1]]))
for (i in
1:length(weightexp)) {
sum = sum +
weightexp[[i]]
}
expproppooled sum/sum(weights)
#Now the pooled G
obs tapply(obscounts,cov,sum)
exp expproppooled*sum(obs)
Gp 2*sum(obs*log(obs/exp))
dfp = max(as.integer(levels(cov)))-1
probp = pchisq(Gp,
dfp, lower.tail=F)
#total G
Gt = sum(G)
dft = sum(dfs)
probt = pchisq(Gt,
dft, lower.tail=F)
#heterogeneity G
Gh = Gt-Gp
dfh = dft - dfp
probh = pchisq(Gh,
dfh, lower.tail=F)
#F ratio
Fratio (Gp/dfp)/(Gh/dfh)
probf = pf(Fratio,
dfp, dfh, lower.tail=F)
res = list(TotG=c(Gt, dft, probt), PooledG=c(Gp, dfp,
probp),
HetG=c(Gh, dfh, probh), Fratio=c(Fratio, dfp, dfh, probf))
res
} #end hetg
____________________________________________________________________________________Take
the Internet to Go: Yahoo!Go puts the Internet in your pocket: mail, news,
photos & more.
[[alternative HTML version deleted]]
