pigpigmeow
2011-Jun-19 07:39 UTC
[R] please help! what are the different using log-link function and log transformation?
I'm a new R user and I need to use the gam function.

y <- gam(a ~ s(b), family = gaussian(link = log), data)
y <- gam(loga ~ s(b), family = gaussian(link = identity), data)

Why do these two commands give different results? I expected them to be the same, but they are actually different. Why?

-- View this message in context: http://r.789695.n4.nabble.com/please-help-what-are-the-different-using-log-link-function-and-log-transformation-tp3608931p3608931.html Sent from the R help mailing list archive at Nabble.com.
Rubén Roa
2011-Jun-19 11:19 UTC
[R] please help! what are the different using log-link function and log transformation?
The problem is not that you are new to R. This is a conceptual issue.

Let y be the response variable and let {x_i} be a set of predictors. Your first model (identity response and log link) says that

y = f(x_1)f(x_2)...f(x_n) + e, e ~ Normal(0, sigma)

i.e. it is an additive observation-error model with constant variance. Your second model (log response and identity link) says that

y = f(x_1)f(x_2)...f(x_n)u, u = exp(e), e ~ Normal(0, sigma)

i.e. it is a multiplicative observation-error model whose standard deviation is proportional to the mean (so the variance is proportional to the square of the mean).

Plot the response against the predictors and check visually whether you have heteroscedasticity. If you do, use your second model; otherwise use the first one.

One key to understanding this kind of dichotomy is to realize that statistical models are composed of a process part and an observation part. In your models the process part is deterministic and multiplicative, but after that you still have two choices: make the random observation part additive (your first model) or multiplicative (your second model).

Needless to say (but I am saying it anyway), these two models will give different results, at the very least because one assumes constant variance (your first model) whereas the other assumes a variance that grows with the mean. In my experience with multiplicative process models, the random observation part should usually be multiplicative as well, because of heteroscedasticity.

HTH
Rubén

-----------------------------------------
Rubén H. Roa-Ureta, Ph. D.
AZTI Tecnalia, Txatxarramendi Ugartea z/g, Sukarrieta, Bizkaia, SPAIN
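The two observation models described above can be simulated directly. What follows is a minimal sketch in base R, not code from the thread: the mean curve `mu`, the value of `sigma`, the sample size and all variable names are illustrative assumptions. It compares the spread of the errors in the lowest and highest thirds of the mean, which is roughly what a visual check for heteroscedasticity looks for.

```r
# Illustrative simulation; mu, sigma and n are assumed values, not from the thread.
set.seed(1)
n     <- 5000
x     <- runif(n, 0, 2)
mu    <- exp(1 + x)    # deterministic, multiplicative process part
sigma <- 0.3

# Model 1: additive observation error, constant variance on the original scale.
y_add  <- mu + rnorm(n, mean = 0, sd = sigma)

# Model 2: multiplicative observation error, u = exp(e).
y_mult <- mu * exp(rnorm(n, mean = 0, sd = sigma))

# Compare the error spread where the mean is low with where it is high.
lo <- mu < quantile(mu, 1/3)
hi <- mu > quantile(mu, 2/3)
sd_ratio_add  <- sd((y_add  - mu)[hi]) / sd((y_add  - mu)[lo])
sd_ratio_mult <- sd((y_mult - mu)[hi]) / sd((y_mult - mu)[lo])

print(c(additive = sd_ratio_add, multiplicative = sd_ratio_mult))
```

Under the additive model the ratio should stay close to 1, while under the multiplicative model the spread should grow with the mean. With real data the true mean is unknown, so the same comparison would have to be made on residuals from a fitted model.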
Bill.Venables at csiro.au
2011-Jun-19 12:36 UTC
[R] please help! what are the different using log-link function and log transformation?
The two commands you give are certain to lead to very different results, because they are fitting very different models.

The first is a Gaussian model for the response with a log link and constant variance.

The second is a Gaussian model for a log-transformed response with an identity link. On the original scale this model would imply a constant coefficient of variation, and hence a variance that is proportional to the square of the mean rather than constant.

Your problem is not particularly an R issue, but a difficulty with understanding generalized linear models (and hence generalized additive models, which are based on them).

Bill Venables.
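To see the difference between the two fits in practice, here is a small sketch that assumes the `gam` in question is the one from the mgcv package (the thread does not say which package is used). The data are simulated with a multiplicative error; the data frame `dat`, the mean curve and the noise level are hypothetical choices made only for illustration.

```r
library(mgcv)  # assumption: gam() from mgcv

# Simulated, strictly positive response with multiplicative (log-normal) error.
set.seed(42)
n   <- 2000
dat <- data.frame(b = runif(n, 0, 2))
dat$a <- exp(1 + sin(2 * dat$b)) * exp(rnorm(n, mean = 0, sd = 0.5))

# Log link: models log(E[a | b]) and assumes constant variance for a.
fit_link  <- gam(a ~ s(b), family = gaussian(link = "log"), data = dat)

# Log transformation: models E[log(a) | b] and assumes constant variance for log(a).
fit_trans <- gam(log(a) ~ s(b), family = gaussian(link = "identity"), data = dat)

# Bring both fits back to the original scale of a.
pred_link  <- predict(fit_link, type = "response")  # estimate of E[a | b]
pred_trans <- exp(predict(fit_trans))               # estimate of exp(E[log(a) | b])

mean_ratio <- mean(pred_link / pred_trans)
print(mean_ratio)
```

The two sets of predictions target different quantities: the log-link fit estimates the mean of `a`, while the back-transformed fit estimates the exponential of the mean of `log(a)`, which for log-normal errors is the median of `a` and lies below the mean. For that reason the ratio printed here should come out above 1 for data simulated this way.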