Hello-

I have been digging around in the FAQ's and online looking for an answer to my questions, and perhaps someone here can help me.

For a statistical experiment, I need to run 3,000,000 ANOVAs, which is taking me a very long time. As a result, I have recoded my analyses in C. However, I cannot find the formula to calculate either the type I or type III sums of squares (in the case of my model, the two are equivalent). I know that the formula must be in the R source code, as they are able to calculate it, but I am not sure where. Does anyone know where I can find the explicit procedure for calculating this? A mathematical formula or the source code would be equally helpful. I am aware of the formula in matrix algebra, but is there a formulation that does not use matrix algebra?

thanks very much in advance,
Paul Litvak
Department of Human Genetics
University of Michigan
Paul -

Your question is best answered by a textbook reference, because that will supply all the context needed to fully answer your question.

A good, basic reference is:

George W. Snedecor and William G. Cochran (1980) Statistical Methods, 7th edition. Iowa State Univ. Press.
ISBN: 0-8138-1560-6; LC: QA 276.12 .S591 1980
(I have the Taubman copy already checked out - others in the Science Library.)

A more advanced reference is:

George A. Milliken and Dallas E. Johnson (1984) Analysis of messy data (2 vols.) Van Nostrand Reinhold, NY
ISBN: 0-534-02713-x; LC: QA 279 .M481 1984
(Science library only, more recent edition in Public Health library.)

The terms "type I" and "type III" are specific to SAS software. Their precise definitions are given in the SAS documentation. I don't have a copy handy. George Milliken was a contributor to the SAS software, so his definitions will coincide with SAS's.

HTH

- tom blackwell - program in bioinformatics and department of human genetics - u michigan medical school - ann arbor -

On Mon, 18 Aug 2003, Paul Litvak wrote:

> I have been digging around in the FAQ's and online looking for an answer
> to my questions, and perhaps someone here can help me.
>
> For a statistical experiment, I need to run 3,000,000 ANOVAs, which is
> taking me a very long time. As a result, I have recoded my analyses in
> C. However, I cannot find the formula to calculate either the type I or
> type III sums of squares (in the case of my model, the two are
> equivalent). I know that the formula must be in the R source code, as
> they are able to calculate it, but I am not sure where. Does anyone know
> where I can find the explicit procedure for calculating this? A
> mathematical formula or the source code would be equally helpful. I am
> aware of the formula in matrix algebra, but is there a formulation that
> does not use matrix algebra?
>
> thanks very much in advance,
> Paul Litvak
> Department of Human Genetics
> University of Michigan
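As an illustration of a formulation that avoids matrix algebra, here is a minimal C sketch for the simplest case only: a one-way layout (a single factor), where the sequential (type I) and type III sums of squares for the factor are the same quantity. Whether this matches the model in the original question is an assumption, since the thread does not describe the design. The names `anova1_result` and `anova_oneway` are hypothetical and do not come from R or SAS.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical result type for a one-way ANOVA (illustrative only). */
typedef struct {
    double ss_between;  /* factor sum of squares, k - 1 df   */
    double ss_within;   /* residual sum of squares, n - k df */
    double f_stat;      /* ratio of the two mean squares     */
} anova1_result;

/* One-way ANOVA from group means, no matrix algebra.
 * y[i] is the response and group[i] in 0..k-1 is the factor level
 * of observation i. Returns 0 on success, -1 on bad input
 * (fewer than 2 groups, no residual df, an empty group) or on
 * allocation failure. */
int anova_oneway(const double *y, const int *group, int n, int k,
                 anova1_result *out)
{
    if (k < 2 || n <= k)
        return -1;

    double *mean = calloc((size_t)k, sizeof *mean);
    int *cnt = calloc((size_t)k, sizeof *cnt);
    if (!mean || !cnt) {
        free(mean);
        free(cnt);
        return -1;
    }

    /* First pass: per-group sums, counts, and the grand total. */
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        assert(group[i] >= 0 && group[i] < k);
        mean[group[i]] += y[i];
        cnt[group[i]]++;
        total += y[i];
    }
    double grand = total / n;

    /* SS_between = sum over groups of n_g * (mean_g - grand mean)^2 */
    int status = 0;
    double ssb = 0.0;
    for (int g = 0; g < k; g++) {
        if (cnt[g] == 0) {
            status = -1;
            break;
        }
        mean[g] /= cnt[g];
        double d = mean[g] - grand;
        ssb += cnt[g] * d * d;
    }

    if (status == 0) {
        /* Second pass: SS_within = sum of (y_i - mean of its group)^2 */
        double ssw = 0.0;
        for (int i = 0; i < n; i++) {
            double d = y[i] - mean[group[i]];
            ssw += d * d;
        }
        out->ss_between = ssb;
        out->ss_within = ssw;
        /* Not finite when ssw is exactly zero. */
        out->f_stat = (ssb / (k - 1)) / (ssw / (n - k));
    }

    free(mean);
    free(cnt);
    return status;
}
```

The two-pass form (means first, then squared deviations) is used here because it tends to be numerically safer than the one-pass "sum of squares minus correction term" shortcut. Designs with more than one factor, and unbalanced ones in particular, need the fuller treatment given in the references above.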
Not knowing any more details about your experiment and data, we can only speculate.

If the reason (or part of the reason) that you need to run ANOVA 3 million times is that you have that many responses collected from the same experiment (or several experiments, but not 3 million different experiments), you should be able to do the ANOVA computation in R very efficiently. E.g., assuming you actually have one experiment with 3 million responses, you can compute the hat matrix once and apply it to the response matrix, rather than computing the same hat matrix 3 million times.

Just a thought. HTH.

Andy

> -----Original Message-----
> From: Paul Litvak [mailto:plitwak at umich.edu]
> Sent: Monday, August 18, 2003 2:18 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] type I and type III sums of squares
>
> Hello-
>
> I have been digging around in the FAQ's and online looking for an answer
> to my questions, and perhaps someone here can help me.
>
> For a statistical experiment, I need to run 3,000,000 ANOVAs, which is
> taking me a very long time. As a result, I have recoded my analyses in
> C. However, I cannot find the formula to calculate either the type I or
> type III sums of squares (in the case of my model, the two are
> equivalent). I know that the formula must be in the R source code, as
> they are able to calculate it, but I am not sure where. Does anyone know
> where I can find the explicit procedure for calculating this? A
> mathematical formula or the source code would be equally helpful. I am
> aware of the formula in matrix algebra, but is there a formulation that
> does not use matrix algebra?
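The suggestion above, to compute the hat matrix once and reuse it for every response, can be sketched in C as follows. This is only an illustration under a strong assumption: a one-way layout, where the hat matrix has a closed form (entry (i, j) is 1/n_g when observations i and j are both in group g, and 0 otherwise), so no matrix inversion is needed. The function names `hat_oneway` and `resid_ss` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Build the n x n hat matrix H for a one-way layout, row-major.
 * group[i] in 0..k-1 is the factor level of observation i.
 * H[i*n + j] = 1/n_g if i and j share group g, else 0.
 * Returns NULL on allocation failure; the caller frees the result. */
double *hat_oneway(const int *group, int n, int k)
{
    int *cnt = calloc((size_t)k, sizeof *cnt);
    double *H = calloc((size_t)n * (size_t)n, sizeof *H);
    if (!cnt || !H) {
        free(cnt);
        free(H);
        return NULL;
    }

    /* Count the observations in each group. */
    for (int i = 0; i < n; i++) {
        assert(group[i] >= 0 && group[i] < k);
        cnt[group[i]]++;
    }

    /* Fill in the nonzero entries; the rest stay 0 from calloc. */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (group[i] == group[j])
                H[(size_t)i * n + j] = 1.0 / cnt[group[i]];

    free(cnt);
    return H;
}

/* Residual sum of squares for one response vector y, reusing H.
 * The fitted value for observation i is row i of H times y. */
double resid_ss(const double *H, const double *y, int n)
{
    double rss = 0.0;
    for (int i = 0; i < n; i++) {
        double fit = 0.0;
        for (int j = 0; j < n; j++)
            fit += H[(size_t)i * n + j] * y[j];
        double r = y[i] - fit;
        rss += r * r;
    }
    return rss;
}
```

In use, `hat_oneway` would be called once for the shared design and `resid_ss` once per response vector. For a one-way layout, applying H is the same as replacing each observation by its group mean, so computing the means directly is cheaper; the explicit matrix is shown only because it carries over to other designs, where H would have to come from a least-squares routine rather than this closed form.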