Michael Haenlein
2011-Dec-06 13:08 UTC
[R] Regression model when dependent variable can only take positive values
Dear all, I would like to run a regression of the form lm(y ~ x1+x2) where the dependent variable y can only take positive values. Assume, for example, that y is the height of a person (measured in cm), x1 is the gender (measured as a binary indicator with 0=male and 1=female) and x2 is the age of the person (measured in years). When I run a simple lm(y ~ x1+x2), I obtain an intercept value that is negative. I interpret that in a way that a person who is male (x1=0) and just born (x2=0), has a negative height. This evidently does not make sense. I therefore assume that my estimates might be biased and that I need to use some other form of estimation that takes account of the fact that y>0 for all observations. Could anybody please tell me which type of regression would be most recommendable for this type of analysis? Thanks very much in advance, Michael [[alternative HTML version deleted]]
ONKELINX, Thierry
2011-Dec-06 13:15 UTC
[R] Regression model when dependent variable can only take positive values
Dear Michael, Did you measure newborns? If not center age to a value that makes sense in relation with the range of age in your dataset. Then the intercept will be the height at the reference age. And most likeli non-negative. Best regards, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Oorspronkelijk bericht----- Van: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] Namens Michael Haenlein Verzonden: dinsdag 6 december 2011 14:09 Aan: r-help at r-project.org Onderwerp: [R] Regression model when dependent variable can only take positive values Dear all, I would like to run a regression of the form lm(y ~ x1+x2) where the dependent variable y can only take positive values. Assume, for example, that y is the height of a person (measured in cm), x1 is the gender (measured as a binary indicator with 0=male and 1=female) and x2 is the age of the person (measured in years). When I run a simple lm(y ~ x1+x2), I obtain an intercept value that is negative. I interpret that in a way that a person who is male (x1=0) and just born (x2=0), has a negative height. This evidently does not make sense. I therefore assume that my estimates might be biased and that I need to use some other form of estimation that takes account of the fact that y>0 for all observations. Could anybody please tell me which type of regression would be most recommendable for this type of analysis? Thanks very much in advance, Michael [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.