I'm just starting to learn about GAM models.

When using the lm function in R, any factors I have in my data set are automatically converted into a series of binary indicator variables.

For example, if I have a data.frame with a column named color and values "red", "green", "blue", the lm function automatically replaces it with 3 variables colorred, colorgreen, colorblue, which are binary {0,1}.

When I use the gam function, R doesn't do this, so I get an error.

1) Is there a way to ask the gam function to do this conversion for me?
2) If not, is there some other tool or utility to make this data transformation easy?
3) Last option - can I use lm to transform the data and then extract it into a new data.frame to then pass to gam?

Thanks!!!
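For reference, the expansion described in the question can be inspected directly. This is a minimal sketch using base R's model.matrix(), which builds the design matrix that lm() uses; the toy data frame `d` and its values are made up for illustration. Note that with an intercept and the default treatment contrasts, R creates indicators for all levels except the first (the reference level), so three levels give two indicator columns; removing the intercept gives one column per level.

```r
# Hypothetical toy data; the column name follows the question.
d <- data.frame(
  color = factor(c("red", "green", "blue", "red")),
  y     = c(1.0, 2.0, 3.0, 1.5)
)

# Design matrix as lm() would build it (intercept + treatment contrasts).
# Levels are sorted alphabetically, so "blue" is the reference level.
X <- model.matrix(y ~ color, data = d)
print(colnames(X))       # "(Intercept)" "colorgreen" "colorred"

# Without an intercept there is one 0/1 indicator column per level.
X_full <- model.matrix(y ~ color - 1, data = d)
print(colnames(X_full))  # "colorblue" "colorgreen" "colorred"

# If the indicators are wanted as a plain data.frame (question 3):
indicators <- as.data.frame(X_full)
```

This covers question 3 without fitting an lm first, although passing the factor itself in the model formula is usually simpler than building the indicator columns by hand.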
Hi Noah,

GAM models were developed to assess the functional form of the relationship between continuous predictor variables and the response, so they weren't really meant to handle factor variables as predictors. GAMs are of the form

    E(Y | X1, X2, ...) = S0 + S(X1) + S(X2) + ...

where S(X) is a smooth function of X. Hence you might want to rethink why you'd want a factor variable as a predictor in a GAM. This is why the gam machinery doesn't just convert factors to indicator variables as lm does.

HTH

Steven McKinney

________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Noah Silverman [noah at smartmediacorp.com]
Sent: March 19, 2010 12:54 PM
To: r-help at r-project.org
Subject: [R] Factor variables with GAM models

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
It doesn't usually make much sense to *smooth* over a factor variable (in the cases where it does, you should treat the factor as a random effect), but there is no problem in including factor variables in a GAM. `gam' lets you mix factor and continuous variables in a number of ways. Suppose that `a' is a factor, `x' is a continuous (or just metric) variable and `y' is a response. Then

    y ~ a + s(x)

will fit a model where `a' is treated exactly as a factor variable is treated by `lm', while `x' is smoothed over. In mgcv::gam,

    y ~ s(x,by=a)

would create a `smooth-factor interaction': a separate smooth of `x' for each level of `a'. Likewise,

    y ~ s(x,by=a,id=1)

would do the same, but would insist on each of the smooths of `x' having the same smoothing parameter. ?gam.models gives some more detail.

best,
Simon

On Friday 19 March 2010 19:54, Noah Silverman wrote:
> 1) Is there a way to ask the gam function to do this conversion for me?
> 2) If not, is there some other tool or utility to make this data
> transformation easy?
> 3) Last option - can I use lm to transform the data and then extract it
> into a new data.frame to then pass to gam?
-- 
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603 www.maths.bath.ac.uk/~sw283
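The three model formulas in Simon's reply can be tried out roughly as follows. This is only a sketch on simulated data: the data frame `dat`, the seed, and the shape of the simulated response are assumptions made for illustration, not part of the thread. One deliberate difference from the formulas as quoted: the factor main effect `a` is added alongside `s(x, by = a)`, since ?gam.models notes that smooths with a factor `by` variable are centred, so the factor usually needs to appear as a main effect as well.

```r
library(mgcv)  # provides gam() and s()

set.seed(1)
n <- 300
# Simulated data (an assumption for illustration only)
dat <- data.frame(
  a = factor(sample(c("red", "green", "blue"), n, replace = TRUE)),
  x = runif(n)
)
dat$y <- sin(2 * pi * dat$x) + as.numeric(dat$a) + rnorm(n, sd = 0.3)

# Factor treated as in lm() (parametric level effects), x smoothed
m1 <- gam(y ~ a + s(x), data = dat)

# Smooth-factor interaction: a separate smooth of x for each level of a
m2 <- gam(y ~ a + s(x, by = a), data = dat)

# Same structure, but the smooths are linked through a common id so that
# they share one smoothing parameter
m3 <- gam(y ~ a + s(x, by = a, id = 1), data = dat)

# The parametric part of the summary shows the lm-style factor coefficients
print(summary(m1))
```

No manual conversion of `a` to indicator columns is needed in any of these fits; gam() expands the factor in the parametric part of the formula just as lm() does.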