Antonio P. Ramos
2013-Mar-26 01:09 UTC
[R] GAM model with interactions between continuous variables and factors
Hi all,

I am not sure how to handle interactions with categorical predictors in GAM models. For example, what is the difference between the two models below? Tests indicate that they are different, but their predictions are essentially the same.

Thanks a bunch,

> gam.1 <- gam(mortality.under.2~ maternal_age_c+ I(maternal_age_c^2)+
+ s(birth_year,by=wealth) +
+ + wealth + sex +
+ residence+ maternal_educ + birth_order,
+ ,data=rwanda2,family="binomial")
>
> gam.2 <- gam(mortality.under.2~ maternal_age_c+ I(maternal_age_c^2)+
+ s(birth_year,by=wealth) +
+ + sex +
+ residence+ maternal_educ + birth_order,
+ ,data=rwanda2,family="binomial")
>
> anova(gam.1,gam.2,test="Chi")
Analysis of Deviance Table

Model 1: mortality.under.2 ~ maternal_age_c + I(maternal_age_c^2) + s(birth_year,
    by = wealth) + +wealth + sex + residence + maternal_educ +
    birth_order
Model 2: mortality.under.2 ~ maternal_age_c + I(maternal_age_c^2) + s(birth_year,
    by = wealth) + +sex + residence + maternal_educ + birth_order
  Resid. Df Resid. Dev      Df Deviance  Pr(>Chi)
1     28986      24175
2     28989      24196 -3.6952  -21.378 0.0001938 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> str(rwanda2)
'data.frame': 29027 obs. of 18 variables:
 $ CASEID            : Factor w/ 10718 levels " 1 5 2",..: 289 2243 7475 9982 6689 10137 7426 428 8415 10426 ...
 $ mortality.under.2 : int 0 1 0 0 0 0 0 0 1 0 ...
 $ maternal_age_disct: Factor w/ 3 levels "-25","+35","25-35": 1 1 1 1 1 1 3 1 3 1 ...
 $ maternal_age      : int 18 21 21 23 21 22 26 18 27 21 ...
 $ time              : int 3 3 3 3 3 3 3 3 3 3 ...
 $ child_mortality   : num 0.232 0.232 0.232 0.232 0.232 ...
 $ democracy         : Factor w/ 1 level "dictatorship": 1 1 1 1 1 1 1 1 1 1 ...
 $ wealth            : Factor w/ 5 levels "Lowest quintile",..: 2 4 1 4 5 1 4 1 4 5 ...
 $ birth_year        : int 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 ...
 $ residence         : Factor w/ 2 levels "Rural","Urban": 1 1 1 1 2 1 1 1 1 2 ...
 $ birth_order       : int 1 2 2 5 1 1 3 1 2 2 ...
 $ maternal_educ     : Factor w/ 4 levels "Higher","No education",..: 3 2 2 3 4 2 3 2 2 2 ...
 $ sex               : Factor w/ 2 levels "Female","Male": 1 1 2 2 1 1 2 2 2 2 ...
 $ quinquennium      : Factor w/ 7 levels "00-5's","70-4",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ time.1            : int 3 3 3 3 3 3 3 3 3 3 ...
 $ new_time          : int 0 0 0 0 0 0 0 0 0 0 ...
 $ maternal_age_c    : num -6.12 -3.12 -3.12 -1.12 -3.12 ...
 $ birth_year_c      : num -14.8 -14.8 -14.8 -14.8 -14.8 ...
Antonio P. Ramos
2013-Mar-26 01:12 UTC
[R] GAM model with interactions between continuous variables and factors
Just to clarify: gam.1 has wealth both inside the smooths and as a fixed-effect predictor, while gam.2 has wealth only inside the smooths.

Thanks
Joshua Wiley
2013-Mar-26 01:18 UTC
[R] GAM model with interactions between continuous variables and factors
Hi Antonio,

If wealth is a factor variable, you should include its main effect in the model, because the smooths will be centered and so cannot capture differences in the average level between wealth groups.

Cheers,

Josh

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com
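The point about centering can be illustrated with a minimal sketch on simulated data. Nothing here comes from the rwanda2 data: the variables g, x and y, the group offsets and the Gaussian response are all made up for illustration, and the thread's binomial models may behave somewhat differently. The sketch fits one model with the factor as a parametric term and one without, mirroring gam.1 and gam.2, and compares them.

```r
library(mgcv)  # recommended package, shipped with standard R installations

set.seed(1)
n <- 600

# Hypothetical data: three groups that differ in average level (0, 2, 4)
# but share the same wiggly trend in x.
g <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
x <- runif(n)
level_shift <- c(a = 0, b = 2, c = 4)[as.character(g)]
y <- level_shift + sin(2 * pi * x) + rnorm(n, sd = 0.3)
dat <- data.frame(y = y, x = x, g = g)

# Analogue of gam.1: factor main effect plus one smooth per factor level.
m_with <- gam(y ~ g + s(x, by = g), data = dat, method = "REML")

# Analogue of gam.2: per-level smooths only. Each smooth is subject to a
# sum-to-zero (centering) constraint, so it cannot absorb the group offsets.
m_without <- gam(y ~ s(x, by = g), data = dat, method = "REML")

# The model with the main effect has (number of levels - 1) extra coefficients.
n_extra <- length(coef(m_with)) - length(coef(m_without))

# Without the main effect, the group-level differences end up in the residuals.
dev_with    <- deviance(m_with)
dev_without <- deviance(m_without)

print(c(extra_coefs = n_extra, dev_with = dev_with, dev_without = dev_without))
print(AIC(m_with, m_without))
```

In this simulated setting the group offsets are large, so dropping the main effect visibly hurts the fit. When the groups differ little in average level, as may be the case for the wealth quintiles in the thread, the two models can give similar predictions even though a test still detects a difference.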