John Sorkin
2006-Aug-03 17:51 UTC
[R] Looking for transformation to overcome heterogeneity of variances
Peter,

Your question is difficult to answer without more information about the distribution of your residuals. Different residual patterns call for different transformations to stabilize the variance. One very common form of heteroscedasticity is variance that increases with the values of an independent predictor, i.e. the variance of the residuals from a regression of y on x increases as x increases. In this case a log transformation of some, or all, of the independent variables helps. Please also note the comment by Bert Gunter (included below), which raises some important points, particularly about extreme values.

If you want more help, please describe the pattern of your residuals.

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsorkin at grecc.umaryland.edu

>>> Berton Gunter <gunter.berton at gene.com> 8/3/2006 11:56:28 AM >>>
I know I'm coming late to this, but ...

> > Is someone able to suggest to me a transformation to overcome the
> > problem of heteroscedasticity?

It is not usually useful to worry about this. In my experience, the gain in efficiency from using an essentially ideal weighted analysis vs. an approximate unweighted one is usually small and unimportant (transformation to simplify a model is another issue ...). Of far greater importance usually is the loss in efficiency due to the presence of a few "unusual" extreme values; have you carefully checked to make sure that none of the large sample variances you have are due merely to the presence of a small number of highly discrepant values?
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
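
The pattern described in the message above (residual variance growing with x) and the effect of a log transformation can be illustrated with a small simulation. This is only a rough sketch under stated assumptions: the data are simulated with multiplicative noise rather than taken from the poster's data set, the helper names `ols_residuals` and `variance_ratio` are made up for illustration, and both the response and the predictor are logged here, which goes slightly beyond what the message suggests. It is written in plain Python so that it runs without extra packages; in R the same idea would use `lm()` and `log()`.

```python
import math
import random
import statistics


def ols_residuals(x, y):
    """Residuals from a simple least-squares line y = a + b*x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def variance_ratio(x, resid):
    """Residual variance in the upper half of x divided by that in the lower half."""
    cut = statistics.median(x)
    low = [r for xi, r in zip(x, resid) if xi <= cut]
    high = [r for xi, r in zip(x, resid) if xi > cut]
    return statistics.pvariance(high) / statistics.pvariance(low)


# Simulated data (an assumption, not the poster's data): multiplicative
# noise, so the spread of y grows in proportion to x.
rng = random.Random(1)
x = [rng.uniform(1.0, 10.0) for _ in range(2000)]
y = [2.0 * xi * math.exp(rng.gauss(0.0, 0.3)) for xi in x]

# Raw scale: a ratio well above 1 indicates variance increasing with x.
raw_ratio = variance_ratio(x, ols_residuals(x, y))

# Log scale: the noise becomes additive, so the ratio should be near 1.
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]
log_ratio = variance_ratio(log_x, ols_residuals(log_x, log_y))

print(f"raw scale: {raw_ratio:.2f}")
print(f"log scale: {log_ratio:.2f}")
```

A ratio far from 1 on the raw scale and close to 1 after the transformation is what one would hope to see when the noise really is multiplicative; with other residual patterns a different transformation, or none, may be more appropriate.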
Peter Dalgaard
2006-Aug-03 18:43 UTC
[R] Looking for transformation to overcome heterogeneity of variances
[Resending -- recipient list length issue]

"John Sorkin" <jsorkin at grecc.umaryland.edu> writes:

> Peter

Erm, that was Paul's question, not mine! If you want to help, please look at the pattern of residuals which he put up on the web at my request.

> Your question is difficult to answer without more information about the
> distribution of your residuals. Different residual patterns call for
> different transformations to stabilize the variance. One very common
> form of heteroscedasticity is increasing variance with increasing values
> of an independent predictor, i.e. the variance of the residuals from a
> regression of y on x increases as x increases. In this case a log
> transformation of some, or all, of the independent variables helps.
> Please also note the comment by Bert Gunter (included below) in which
> some important points are raised, particularly about extreme values.
>
> If you want more help, please describe the pattern of your residuals.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                FAX: (+45) 35327907
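
Bert Gunter's question earlier in the thread, whether large sample variances are due merely to a few highly discrepant values, can be checked roughly by comparing each group's standard deviation with a robust measure of spread. The sketch below is an illustration only: the two groups are invented numbers, `mad_sd` is a hypothetical helper name, and the cutoff of 2 is an arbitrary assumption rather than an established rule.

```python
import statistics


def mad_sd(values):
    """Median absolute deviation, scaled to estimate the SD under normality."""
    med = statistics.median(values)
    abs_dev = [abs(v - med) for v in values]
    return 1.4826 * statistics.median(abs_dev)


# Invented example data: group B equals a well-behaved sample plus one
# highly discrepant value (25.0).
groups = {
    "A": [9.8, 10.1, 10.4, 9.9, 10.2, 10.0, 9.7, 10.3],
    "B": [9.9, 10.2, 10.1, 9.8, 10.0, 10.3, 9.7, 25.0],
}

# Arbitrary cutoff (an assumption): flag a group when its ordinary SD is
# more than twice its robust spread.
CUTOFF = 2.0

flagged = []
for name, values in groups.items():
    sd = statistics.stdev(values)
    robust = mad_sd(values)
    if sd / robust > CUTOFF:
        flagged.append(name)
    print(f"{name}: sd={sd:.2f} robust={robust:.2f}")

print("groups whose variance may be driven by extreme values:", flagged)
```

If the groups with the largest variances are also the ones flagged here, the apparent heterogeneity may come from a handful of points rather than from the bulk of the data, and those points deserve a look before any transformation is chosen.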