pigpigmeow
2011-Jun-19 07:39 UTC
[R] please help! what are the different using log-link function and log transformation?
I'm a new R user and I need to use the gam function.
y <- gam(a ~ s(b), family = gaussian(link = log), data)
y <- gam(log(a) ~ s(b), family = gaussian(link = identity), data)
Why do these two commands give different results? I expected them to give the same results, but actually they differ. Why?
--
View this message in context: http://r.789695.n4.nabble.com/please-help-what-are-the-different-using-log-link-function-and-log-transformation-tp3608931p3608931.html
Sent from the R help mailing list archive at Nabble.com.
Rubén Roa
2011-Jun-19 11:19 UTC
[R] please help! what are the different using log-link function and log transformation?
The problem is not that you are new to R. This is a conceptual issue.
Let y be the response variable and let {x_i} be a set of predictors. Your first
model (identity response and log-link) is saying that
y=f(x_1)f(x_2)...f(x_n) + e, e~Normal(0,sigma)
i.e. this is an additive observation-error model with constant variance.
Your second model (log-response and identity link) is saying that
y=f(x_1)f(x_2)...f(x_n)u, u=exp(e), e~Normal(0,sigma)
i.e. this is a multiplicative observation-error model with variance proportional to
the square of the mean (constant coefficient of variation).
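
As a rough illustration of the two observation models above, the following base-R sketch simulates both from the same process part and compares the error spread across quartiles of the mean. The curve exp(0.5 + 0.8 x), the value sigma = 0.3, and all variable names are assumptions made up for this example; they do not come from the original data.

```r
# Sketch under assumed values: one predictor, a made-up multiplicative mean curve.
set.seed(1)
n     <- 2000
x     <- runif(n, 0, 3)
mu    <- exp(0.5 + 0.8 * x)   # hypothetical process part f(x)
sigma <- 0.3                  # hypothetical error scale

y_add  <- mu + rnorm(n, 0, sigma)        # first model: additive error
y_mult <- mu * exp(rnorm(n, 0, sigma))   # second model: multiplicative error

# Standard deviation of the observation error within quartiles of the mean
bins    <- cut(mu, quantile(mu, 0:4 / 4), include.lowest = TRUE)
sd_add  <- tapply(y_add  - mu, bins, sd)
sd_mult <- tapply(y_mult - mu, bins, sd)
print(round(cbind(mean_mu = tapply(mu, bins, mean), sd_add, sd_mult), 3))
```

In the additive case the spread should stay close to sigma in every bin, while in the multiplicative case it should grow roughly in step with the mean.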
Plot the response against the predictor (or the residuals against the fitted
values) and visually examine whether you have heteroscedasticity. If you do, use
your second model; otherwise use the first one.
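
One possible way to carry out this check is sketched below with the mgcv package. The data frame dat and its columns a and b are simulated stand-ins (with multiplicative errors deliberately built in), not the poster's data, and the rank correlation is only a crude numeric companion to the plot.

```r
library(mgcv)

# Simulated stand-in data with multiplicative errors (an assumption for this sketch)
set.seed(2)
n   <- 500
dat <- data.frame(b = runif(n, 0, 3))
dat$a <- exp(1 + sin(2 * dat$b)) * exp(rnorm(n, 0, 0.3))

# Fit the first model (log link, additive error) and inspect its residuals
fit <- gam(a ~ s(b), family = gaussian(link = "log"), data = dat)
res <- residuals(fit, type = "response")
fv  <- fitted(fit)

plot(fv, res, xlab = "Fitted mean", ylab = "Response residual")
abline(h = 0, lty = 2)

# Crude numeric check: does the absolute residual grow with the fitted mean?
spread_trend <- cor(abs(res), fv, method = "spearman")
print(spread_trend)
```

A funnel shape in the plot, or a clearly positive spread_trend, would point towards the multiplicative-error model.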
One key to understanding this kind of dichotomy is to realize that statistical
models are composed of a process part and an observation part. In your models
the process part is deterministic and multiplicative but after that, you still
have two choices, make the random observation part additive (your first model)
or multiplicative (your second model).
Needless to say (but I am saying it anyway), these two models will give
different results, at the very least because one assumes constant variance (your
first model) whereas the other assumes a variance proportional to the square of
the mean.
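
To see the difference concretely, here is a small sketch that fits both of the original commands to the same simulated data. The data are invented for the example (lognormal errors with sigma = 0.5). With such errors the log-link fit targets the conditional mean of a, whereas back-transforming the log-response fit with exp() targets the conditional median, so the two curves should differ by a factor of roughly exp(sigma^2 / 2).

```r
library(mgcv)

# Simulated data with lognormal (multiplicative) errors; all values are assumptions
set.seed(3)
n   <- 1000
sig <- 0.5
dat <- data.frame(b = runif(n, 0, 3))
dat$a <- exp(1 + sin(2 * dat$b)) * exp(rnorm(n, 0, sig))

m1 <- gam(a ~ s(b),      family = gaussian(link = "log"),      data = dat)
m2 <- gam(log(a) ~ s(b), family = gaussian(link = "identity"), data = dat)

p1 <- predict(m1, type = "response")  # estimate of the mean of a given b
p2 <- exp(predict(m2))                # back-transformed: median of a given b

ratio <- mean(p1 / p2)
print(ratio)             # average ratio of the two fitted curves
print(exp(sig^2 / 2))    # factor expected under lognormal errors
```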
In my experience with multiplicative process models, the random observation part
should usually be multiplicative as well because of heteroscedasticity.
HTH
Rubén
-----------------------------------------
Rubén H. Roa-Ureta, Ph. D.
AZTI Tecnalia, Txatxarramendi Ugartea z/g,
Sukarrieta, Bizkaia, SPAIN
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Bill.Venables at csiro.au
2011-Jun-19 12:36 UTC
[R] please help! what are the different using log-link function and log transformation?
The two commands you give are certain to lead to very different results, because they are fitting very different models.
The first is a gaussian model for the response with a log link, and constant variance.
The second is a gaussian model for a log-transformed response and identity link. On the original scale this model would imply a constant coefficient of variation and hence a variance proportional to the square of the mean, and not constant.
Your problem is not particularly an R issue, but a difficulty with understanding generalized linear models (and hence generalized additive models, which are based on them).
Bill Venables.
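
The constant coefficient of variation mentioned above follows from the standard lognormal moments. As a short worked derivation, write the second model as log y = eta + e with e ~ Normal(0, sigma^2), where eta stands for the smooth predictor s(b); the symbols eta and sigma are notation chosen for this sketch.

```latex
\begin{aligned}
y &= e^{\eta + e}, \qquad e \sim \mathcal{N}(0, \sigma^2) \\
\operatorname{E}[y] &= e^{\eta + \sigma^2/2} \\
\operatorname{Var}[y] &= \left(e^{\sigma^2} - 1\right) e^{2\eta + \sigma^2}
                       = \left(e^{\sigma^2} - 1\right) \operatorname{E}[y]^2 \\
\operatorname{CV}[y] &= \frac{\sqrt{\operatorname{Var}[y]}}{\operatorname{E}[y]}
                      = \sqrt{e^{\sigma^2} - 1}
\end{aligned}
```

The coefficient of variation does not depend on eta, so the variance on the original scale is proportional to the square of the mean, whereas the log-link model keeps the variance fixed at sigma^2 for every value of the mean.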