thr3ads.net - R help - [R] Replicated LR goodness-of-fit tests, heterogeneity G, with loglm? [May 2007]

If this information is useful, please help other people find it:
Share via:

Stewart Schultz

2007-May-23 11:42 UTC

[R] Replicated LR goodness-of-fit tests, heterogeneity G, with loglm?

I have numerous replicated goodness-of-fit experiments (observed compared to
expected counts in categories) and these
replicates are nested within a factor. 
The expected counts in each cell are external (from a
scientific model being tested).  The
calculations I need within each level of the nesting factor are a heterogeneity
G test, with the total G and the pooled G across replicates.  Then I would like
to form an F ratio equal to
the ratio of pooled G divided by its degrees of freedom to heterogeneity G
divided
by its degrees of freedom.  The F ratio
would (I think) test the hypothesis that the badness-of-fit in the pooled data
is greater than would be expected by chance from the heterogeneity among
replicates.


 


It seems that the function loglm is the closest within R to what I want.  But I
can¢t see
how it can be used when the expected
proportions are externally provided.


 


I¢ve appended here a function I wrote that more or less does
what I want (with the nesting factor ignored) but I would prefer to use
something
like loglm because of the additional information it offers and its
flexibility with hierarchical models.



 


Thanks,


Stu


 


---------


 


hetg = function() {


#creating some fake data


tran=gl(10,10)   #tran
are the random replicates


cov= factor(rep(1:10,10))
#10 levels of this factor for each level of tran


tleng = c(219, 312, 178, 322, 311, 242, 235, 235, 266, 193)
#weighting for


   #each transect (of
each level of tran)


obscounts = rpois(100, 50) 
#these are the observed response count data


expprop = rep(c(.1, .1, .1, .05, .15, .1, .1, .1, .1, .1),
10)  #expected 


   # proportion of
counts within 10 levels within each level of tran


vv = tapply(obscounts, tran, sum) #get marginal sum of
counts


rr = rep(vv, each=10) #rep the marginal sums across each
cell


expcounts = expprop*rr #the vector of expected counts


 


#Now calculate the G for each level of tran (likelihood
ratios)


G = vector()


for (i in levels(tran)) {


   obsi obscounts[tran==i]


   expi expcounts[tran==i]


   G[i] 2*sum((obsi*log(obsi/expi)))


}


 


dfs = rep(max(as.integer(levels(cov)))-1, max(as.integer(levels(cov))))










probs = pchisq(G, dfs, lower.tail=F)  #and the lower tail probability of the G


   #get a weighted
average for the pooled expectations


    weightexp = list()


    weights = vector()


    for (i in
levels(tran)) {


      expi = expprop[tran==i]


      lengi tleng[as.integer(i)]


      counti sum(obscounts[tran==i])


      weightexp[[i]] expi*lengi*counti


      weights[i] lengi*counti


    }


    sum = rep(0,
length(weightexp[[1]]))


    for (i in
1:length(weightexp)) {     


      sum = sum +
weightexp[[i]]    


    }


    expproppooled sum/sum(weights)


    


    #Now the pooled G


    obs tapply(obscounts,cov,sum)


    exp expproppooled*sum(obs)


    Gp 2*sum(obs*log(obs/exp))


    dfp =  max(as.integer(levels(cov)))-1


    probp = pchisq(Gp,
dfp, lower.tail=F)


    


    #total G


    Gt = sum(G)


    dft = sum(dfs)


    probt = pchisq(Gt,
dft, lower.tail=F)


    


    #heterogeneity G


    Gh = Gt-Gp


    dfh = dft - dfp


    probh = pchisq(Gh,
dfh, lower.tail=F)


    


    #F ratio


    Fratio (Gp/dfp)/(Gh/dfh)


    probf = pf(Fratio,
dfp, dfh, lower.tail=F)


 


res = list(TotG=c(Gt, dft, probt), PooledG=c(Gp, dfp,
probp),


HetG=c(Gh, dfh, probh), Fratio=c(Fratio, dfp, dfh, probf))


 


res


 } #end hetg






       
____________________________________________________________________________________Take
the Internet to Go: Yahoo!Go puts the Internet in your pocket: mail, news,
photos & more.

	[[alternative HTML version deleted]]

Reasonably Related Threads

Search for more apparently analagous threads

R help - May 2007 - Replicated LR goodness-of-fit tests, heterogeneity G, with loglm?

[R] Replicated LR goodness-of-fit tests, heterogeneity G, with loglm?

Reasonably Related Threads

Wisdom of the Ancients