Szumiloski, John
2011-Jun-24 14:59 UTC
[R] mgcv:gamm: predict to reflect random s() effects?
Dear useRs, I am using the gamm function in the mgcv package to model a smooth relationship between a covariate and my dependent variable, while allowing for quantification of the subjectwise variability in the smooths. What I would like to do is to make subjectwise predictions for plotting purposes which account for the random smooth components of the fit. An example. (sessionInfo() is at bottom of message) My model is analogous to > out.gamm <- gamm( Y ~ Group + s(X, by=Group), random = list(Subject=~1) ) Y and X are numeric, Group is an unordered factor with 5 levels, and Subject is an unordered factor with ~70 levels Now the output from gamm is a list with an lme component and a gam component. If I make a data frame "newdat" like this: > newdat X Group Subject 5 g1 s1 5 g1 s2 5 g1 s3 6 g1 s1 6 g1 s2 6 g1 s3 I can get the fixed effects prediction of the smooth by > predict(out.gamm$gam, newdata=newdat) Which gives 1 1.1 1.2 2 2.1 2.2 3.573210 3.573210 3.573210 3.553694 3.553694 3.553694 But I note that the predictions are identical across different values of Subject. So this accounts for only the fixed effects part of the model, and not any random smooth effects. If I try to extract predictions from the lme component: > predict(out.gamm$lme, newdata=newdat) I get the following error message: Error in predict.lme(out.gamm$lme, newdata = newdat) : Cannot evaluate groups for desired levels on "newdata" So, is there a way to get subjectwise predictions which include the random effect contributions of the smooths? Thanks, John --------- ### session info follows> sessionInfo()R version 2.13.0 Patched (2011-06-20 r56188) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] grDevices datasets splines grid graphics utils stats methods base other attached packages: [1] mgcv_1.7-6 gmodels_2.15.1 car_2.0-10 nnet_7.3-1 MASS_7.3-13 nlme_3.1-101 [7] rms_3.3-1 Hmisc_3.8-3 survival_2.36-9 lattice_0.19-26 loaded via a namespace (and not attached): [1] cluster_1.14.0 gdata_2.8.2 gtools_2.6.2 Matrix_0.999375-50 tools_2.13.0 John Szumiloski, Ph.D. Senior Biometrician Biometrics Research WP53B-120 Merck Research Laboratories P.O. Box 0004 West Point, PA 19486-0004 USA (215) 652-7346 (PH) (215) 993-1835 (FAX) john<dot>szumiloski<at>merck<dot>com ___________________________________________________ These opinions are my own and do not necessarily reflect that of Merck & Co., Inc. Notice: This e-mail message, together with any attachme...{{dropped:14}}
Given that you don't have huge numbers of subjects you could fit the model with `gam' rather than `gamm', using out.gamm <- gam( Y ~ Group + s(X, by=Group) + s(Subject,bs="re"), method="REML") Then your predictions will differ by subject (see e.g. ?random.effects for a bit more information on simple random effects in mgcv:gam). A further trick allows you to choose whether to predict with the subject effects at their predicted values, or zero. Let dum be a vector of 1's... out.gamm <- gam( Y ~ Group + s(X, by=Group) + s(Subject,bs="re",by=dum), method="REML") Predicting with dum set to 1 gives the predictions that you want. Setting dum to 0 gives predictions with the prediction Subject effects set to zero. The reason that trying to predict with the gamm lme object is tricky relates to how gamm works. It takes the GAMM specification, and then sets up a corresponding `working mixed model' which is estimated using lme. The working mixed model uses working variable names set within gamm. If you try to predict using the working model lme object then predict.lme looks for these internal working variable names, not the variable names that you supplied.... Basically gamm treats all random effects as 'part of the noise' in the model specification, and adjusts the variance estimates for the smooths and fixed effects to reflect this. It isn't set up to predict easily at different random effect grouping levels, in the way that lme is. best, Simon On 24/06/11 15:59, Szumiloski, John wrote:> Dear useRs, > I am using the gamm function in the mgcv package to model a smooth > relationship between a covariate and my dependent variable, while > allowing for quantification of the subjectwise variability in the > smooths. What I would like to do is to make subjectwise predictions for > plotting purposes which account for the random smooth components of the fit. > An example. (sessionInfo() is at bottom of message) My model is > analogous to > > out.gamm <- gamm( Y ~ Group + s(X, by=Group), random = list(Subject=~1) ) > Y and X are numeric, Group is an unordered factor with 5 levels, and > Subject is an unordered factor with ~70 levels > Now the output from gamm is a list with an lme component and a gam > component. If I make a data frame "newdat" like this: > > newdat > X Group Subject > 5 g1 s1 > 5 g1 s2 > 5 g1 s3 > 6 g1 s1 > 6 g1 s2 > 6 g1 s3 > I can get the fixed effects prediction of the smooth by > > predict(out.gamm$gam, newdata=newdat) > Which gives > 1 1.1 1.2 2 2.1 2.2 > 3.573210 3.573210 3.573210 3.553694 3.553694 3.553694 > But I note that the predictions are identical across different values of > Subject. So this accounts for only the fixed effects part of the model, > and not any random smooth effects. > If I try to extract predictions from the lme component: > > predict(out.gamm$lme, newdata=newdat) > I get the following error message: > Error in predict.lme(out.gamm$lme, newdata = newdat) : > Cannot evaluate groups for desired levels on "newdata" > So, is there a way to get subjectwise predictions which include the > random effect contributions of the smooths? > Thanks, John > --------- > ### session info follows > > sessionInfo() > R version 2.13.0 Patched (2011-06-20 r56188) > Platform: i386-pc-mingw32/i386 (32-bit) > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > attached base packages: > [1] grDevices datasets splines grid graphics utils stats methods base > other attached packages: > [1] mgcv_1.7-6 gmodels_2.15.1 car_2.0-10 nnet_7.3-1 MASS_7.3-13 > nlme_3.1-101 > [7] rms_3.3-1 Hmisc_3.8-3 survival_2.36-9 lattice_0.19-26 > loaded via a namespace (and not attached): > [1] cluster_1.14.0 gdata_2.8.2 gtools_2.6.2 Matrix_0.999375-50 tools_2.13.0 > John Szumiloski, Ph.D. > Senior Biometrician > Biometrics Research > WP53B-120 > Merck Research Laboratories > P.O. Box 0004 > West Point, PA 19486-0004 > USA > (215) 652-7346 (PH) > (215) 993-1835 (FAX) > john<dot>szumiloski<at>merck<dot>com > ___________________________________________________ > These opinions are my own and do not necessarily reflect that of > Merck & Co., Inc. > > Notice: This e-mail message, together with any attach...{{dropped:16}}
> In your dummy variable trick, I see how subjectwise predictions are > obtained. But to get the Group mean predictions, isn't the s term > with the 0 dummy by variable the same as just omitting the s term > with Subject altogether? I would think so but want to check.- Yes it's the same as omitting the term altogether for prediction (of means), but that's not the same as having omitted it for fitting, of course.> In the spirit of handling subjectwise variabilities, I could imagine > using the term > > s(Subject, bs="re", by=X) > > and trying to emulate lme-like behavior and fitting a random effect > slope to each subject. First question is, is that indeed what it > does? If so, would this term include an intercept? I would assume > it doesn't but again want to check.- s(Subject,X,bs="re") is a slightly better way of doing this, as it generalises a bit more easily. To include random intercepts as well use s(Subject,bs="re") + s(Subject,X,bs="re") ...exactly how these terms work is explained in the details section of ?smooth.construct.re.smooth.spec. best, Simon> Thanks again, John > > > > -----Original Message----- From: Simon Wood > [mailto:s.wood at bath.ac.uk] Sent: Friday, 24 June, 2011 11:43 AM To: > Szumiloski, John Cc: r-help at r-project.org Subject: Re: mgcv:gamm: > predict to reflect random s() effects? > > Given that you don't have huge numbers of subjects you could fit the > model with `gam' rather than `gamm', using > > out.gamm<- gam( Y ~ Group + s(X, by=Group) + s(Subject,bs="re"), > method="REML") > > Then your predictions will differ by subject (see e.g. > ?random.effects for a bit more information on simple random effects > in mgcv:gam). > > A further trick allows you to choose whether to predict with the > subject effects at their predicted values, or zero. > > Let dum be a vector of 1's... > > out.gamm<- gam( Y ~ Group + s(X, by=Group) + > s(Subject,bs="re",by=dum), method="REML") > > Predicting with dum set to 1 gives the predictions that you want. > Setting dum to 0 gives predictions with the prediction Subject > effects set to zero. > > The reason that trying to predict with the gamm lme object is tricky > relates to how gamm works. It takes the GAMM specification, and then > sets up a corresponding `working mixed model' which is estimated > using lme. The working mixed model uses working variable names set > within gamm. If you try to predict using the working model lme > object then predict.lme looks for these internal working variable > names, not the variable names that you supplied.... > > Basically gamm treats all random effects as 'part of the noise' in > the model specification, and adjusts the variance estimates for the > smooths and fixed effects to reflect this. It isn't set up to > predict easily at different random effect grouping levels, in the way > that lme is. > > best, Simon > > On 24/06/11 15:59, Szumiloski, John wrote: >> Dear useRs, I am using the gamm function in the mgcv package to >> model a smooth relationship between a covariate and my dependent >> variable, while allowing for quantification of the subjectwise >> variability in the smooths. What I would like to do is to make >> subjectwise predictions for plotting purposes which account for >> the random smooth components of the fit. An example. (sessionInfo() >> is at bottom of message) My model is analogous to> out.gamm<- >> gamm( Y ~ Group + s(X, by=Group), random = list(Subject=~1) ) Y and >> X are numeric, Group is an unordered factor with 5 levels, and >> Subject is an unordered factor with ~70 levels Now the output from >> gamm is a list with an lme component and a gam component. If I make >> a data frame "newdat" like this: >>> newdat >> X Group Subject 5 g1 s1 5 g1 s2 5 g1 s3 6 g1 s1 6 g1 s2 6 g1 s3 I >> can get the fixed effects prediction of the smooth by> >> predict(out.gamm$gam, newdata=newdat) Which gives 1 1.1 1.2 2 2.1 >> 2.2 3.573210 3.573210 3.573210 3.553694 3.553694 3.553694 But I >> note that the predictions are identical across different values of >> Subject. So this accounts for only the fixed effects part of the >> model, and not any random smooth effects. If I try to extract >> predictions from the lme component: >>> predict(out.gamm$lme, newdata=newdat) I get the following error >> message: Error in predict.lme(out.gamm$lme, newdata = newdat) : >> Cannot evaluate groups for desired levels on "newdata" So, is >> there a way to get subjectwise predictions which include the >> random effect contributions of the smooths? Thanks, John --------- >> ### session info follows >>> sessionInfo() >> R version 2.13.0 Patched (2011-06-20 r56188) Platform: >> i386-pc-mingw32/i386 (32-bit) locale: [1] >> LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >> States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C >> [5] LC_TIME=English_United States.1252 attached base packages: [1] >> grDevices datasets splines grid graphics utils stats methods base >> other attached packages: [1] mgcv_1.7-6 gmodels_2.15.1 car_2.0-10 >> nnet_7.3-1 MASS_7.3-13 nlme_3.1-101 [7] rms_3.3-1 Hmisc_3.8-3 >> survival_2.36-9 lattice_0.19-26 loaded via a namespace (and not >> attached): [1] cluster_1.14.0 gdata_2.8.2 gtools_2.6.2 >> Matrix_0.999375-50 tools_2.13.0 John Szumiloski, Ph.D. > [snip] > > > -- Simon Wood, Mathematical Science, University of Bath BA2 7AY UK > +44 (0)1225 386603 http://people.bath.ac.uk/sw283 > Notice: This e-mail message, together with any attach...{{dropped:18}}