Hi,

I'm analyzing experimental results where two different events ("T1" and "T2") can occur or not during an experiment. I ran my experiments with one factor ("Substrate") with two levels ("Sand" and "Clay").
I would like to know whether or not "Substrate" affects the occurrence probability of the two events. Moreover, for each condition I would like to test my experimental contingency table for heterogeneity against a theoretical one (from simulations).

However, my problem is that several cells have sampling zeroes. My experiments can't be done again to fill these cells. Thus the Chi-square requirements are not fulfilled and I have to find another statistical method.

After spending hours searching for a solution, I thought I could use a loglinear model to answer my questions, but:
- I'm not sure I can use a loglinear model: do I fulfill the required conditions?
- would this method address my hypothesis?
- I'm not sure I really understand how I have to use loglin().

Here is the data frame of my results.

    DF<-data.frame(Subs=c(rep("Sand",4),rep("Clay",4)),T1=rep(c("YES","YES","NO","NO"),2),T2=rep(c("YES","NO","YES","NO"),2),Freq=c(12,5,0,7,24,1,0,0))

What do you think of such data? Can I use any statistical method to test my hypothesis? Any advice?

Thanks,

Etienne Toffin

-------------------------------------------------------------------
Etienne Toffin, PhD Student
Unit of Social Ecology
Université Libre de Bruxelles, CP 231
Boulevard du Triomphe
B-1050 Brussels
Belgium

Tel: +32(0)2/650.55.30
Fax: +32(0)2/650.57.67
http://www.ulb.ac.be/sciences/use/toffin.html
Charles C. Berry
2009-Oct-20 16:09 UTC
[R] 2x2 Contingency table with much sampling zeroes
On Tue, 20 Oct 2009, Etienne Toffin wrote:

> Hi,
>
> I'm analyzing experimental results where two different events ("T1" and "T2")
> can occur or not during an experiment. I made my experiments with one factor
> ("Substrate") with two levels ("Sand" and "Clay").
> I would like to know whether or not "Substrate" affects the occurrence
> probability of the two events.

It is not clear to me what you mean by 'affects the occurrence ...'.

This sounds like 'Independence of Substrate from the two other variables', which is a 3 degree of freedom hypothesis (at least in the example you give). Is that what you are after, or are only some of those contrasts interesting?

> Moreover, for each condition I would like to
> test the heterogeneity of my experimental contingency table with a
> theoretical one (from simulations).

Do you mean you have some prior values for the counts or proportions? If so, a standard goodness of fit test should do. If not, you need to describe the problem in more detail.

> However, my problem is that several cells have sampling zeroes. My
> experiments can't be done again to fill these cells. Thus Chi-square
> requirements are not fulfilled and I have to find another statistical method.

Sampling zeroes in the cells are not a problem as long as the marginal tables do not have such zeroes. Depending on the hypotheses you want to test, the marginal tables may be OK.

'Substrate' is OK and so is 'T1 by T2', so you can do the 3 degree of freedom test implied by those margins.

> After spending hours searching for a solution, I thought I could use
> loglinear model to answer my questions, but :
> - I'm not sure I can use loglinear model = do I fulfill the required
> conditions ?

Have you studied the Agresti reference listed in the help page? I'll bet it addresses 'the required conditions', which go to the sampling distribution of the counts.

> - would this method answer to my hypothesis ?
> - I not sure to really understand how I have to use loglin()?

Run

    example(loglin)

and reread

    ?loglin

The example is the same setup as you have here (albeit with more degrees of freedom), so you might emulate it.

> Here is the data frame of my results.
>
> DF<-data.frame(Subs=c(rep("Sand",4),rep("Clay",4)),T1=rep(c("YES","YES","NO","NO"),2),T2=rep(c("YES","NO","YES","NO"),2),Freq=c(12,5,0,7,24,1,0,0))
>
> What do you think of such data ? Can I use any statistical method to test my
> hypothesis ? Any advice ?

Recruit a statistician to your committee.

Questions like these are better hashed out in front of a blackboard than over the internet.

HTH,

Chuck

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
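For readers following the thread, here is a minimal sketch of the model the reply seems to describe: Substrate independent of the joint T1 by T2 classification, fitted with loglin() on the posted counts. The object names (tab, subs_margin, t1t2_margin, fit, p_value) are illustrative and not from the original messages, and the chi-square p-value is only the usual large-sample approximation. One caveat is worth checking when running it: with these counts the T1 by T2 margin itself contains a zero cell (T1 = NO, T2 = YES), loglin() does not adjust its degrees of freedom for zeros, and its Pearson statistic may be undefined in that case, so the sketch reports only the likelihood-ratio statistic.

```r
# Data frame from the original post
DF <- data.frame(
  Subs = c(rep("Sand", 4), rep("Clay", 4)),
  T1   = rep(c("YES", "YES", "NO", "NO"), 2),
  T2   = rep(c("YES", "NO", "YES", "NO"), 2),
  Freq = c(12, 5, 0, 7, 24, 1, 0, 0)
)

# 2 x 2 x 2 table of counts with dimensions Subs, T1, T2
tab <- xtabs(Freq ~ Subs + T1 + T2, data = DF)

# Marginal tables that the model fits exactly;
# inspect them for zeroes before trusting the test
subs_margin <- margin.table(tab, 1)        # Substrate totals
t1t2_margin <- margin.table(tab, c(2, 3))  # T1 by T2, summed over Substrate
print(subs_margin)
print(t1t2_margin)

# Model [Subs][T1 T2]: Substrate independent of (T1, T2) jointly
fit <- loglin(tab, margin = list(1, c(2, 3)), fit = TRUE, print = FALSE)

# Likelihood-ratio statistic, loglin's unadjusted degrees of freedom,
# and the asymptotic chi-square p-value (an approximation with small counts)
p_value <- pchisq(fit$lrt, df = fit$df, lower.tail = FALSE)
cat("LRT =", fit$lrt, " df =", fit$df, " p =", p_value, "\n")
```

The margin argument lists the marginal tables to be fitted: 1 is Substrate and c(2, 3) is the T1 by T2 table, following the dimension order of tab. Whether the reported degrees of freedom should be reduced because of the zero in the T1 by T2 margin is a judgment call that the Agresti reference mentioned in ?loglin discusses.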