ronggui
2005-Apr-02 03:19 UTC
[R] using GAM to assess the linearity in logistic regression
As Agresti (2002) points out, we should screen the data to check whether logit(pi) and the predictor have a linear relationship before fitting a logistic regression. I found some material on this in MASS and in the S-PLUS reference, but it is rather brief, and I have not fully grasped how to assess linearity in logistic regression. Can anyone suggest further material? I am not familiar with GAMs, but I think there may be material that would let me use a GAM to assess linearity in logistic regression without mastering the GAM model itself. Is that right? Thank you!
Wensui Liu
2005-Apr-02 04:37 UTC
[R] using GAM to assess the linearity in logistic regression
I am a little confused about what you asked. If you want to assess the linearity in logistic regression, why do you want to use GAM instead of GLM? As far as I understand, GAM is used to capture nonlinearity rather than linearity. Am I right here?

--
WenSui Liu, MS MA
Senior Decision Support Analyst
Division of Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center
ronggui
2005-Apr-02 06:12 UTC
[R] using GAM to assess the linearity in logistic regression
Maybe the idea is simple, but the details are beyond me. You are right that a GAM can capture non-linearity. But if the results from the GAM show little evidence of non-linearity, then we can assume that linearity holds. Am I right?

From Agresti (2002):

... Before fitting the model and making such interpretations, look at the data to check that the logistic regression model is appropriate. Since Y takes only values 0 and 1, it is difficult to check this by plotting Y against x. It can be helpful to plot sample proportions or logits against x. ... When X is continuous and all n_i = 1, or when it is essentially continuous and all n_i are small, this is unsatisfactory. One could group the data with nearby x values into categories before calculating sample proportions and sample logits. A better approach that does not require choosing arbitrary categories uses a smoothing mechanism to reveal trends. One such smoothing approach fits a generalized additive model (Section 4.8), which replaces the linear predictor of a GLM by a smooth function. Inspect a plot of the fit to see if severe discrepancies occur from the S-shaped trend predicted by logistic regression.

From "S-PLUS (and R) Manual to Accompany Agresti's Categorical Data Analysis (2002)" (2nd edition, Laura A. Thompson, 2005):

Prior to fitting a logistic regression model to data, one should check the assumption of a logistic relationship between the response and explanatory variables. A simple way to do this is to use the linear relationship between the logit and the explanatory variable. The values of the explanatory variable can be plotted against the sample logits (p. 168, Agresti) at those values. The plot should look roughly linear for a logistic model to be appropriate. If there are not enough response data at each unique x value (and categorizing x values is undesirable), then the technique of the last section in Chapter 4 can be used (i.e., GAM).
There, we saw that a sigmoidal (or S-shaped) trend appeared in the plot of the response by predictor (Figure 4.7, Agresti).

From MASS:

... Residuals are not always very informative with binary responses but at least none are particularly large here. An alternative approach is to predict the actual live birth weight and later threshold at 2.5 kilograms. This is left as an exercise for the reader; surprisingly it produces somewhat worse predictions with around 52 errors. We can examine the linearity in age and mother's weight more flexibly using generalized additive models. These stand in the same relationship to additive models (Section 8.8) as generalized linear models do to regression models; replace the linear predictor in a GLM by an additive model, the sum of linear and smooth terms in the explanatory variables. We use function gam from S-PLUS. (R has a somewhat different function gam in package mgcv by Simon Wood.)

    > attach(bwt)
    > age1 <- age*(ftv=="1"); age2 <- age*(ftv=="2+")
    > birthwt.gam <- gam(low ~ s(age) + s(lwt) + smoke + ptd +
          ht + ui + ftv + s(age1) + s(age2) + smoke:ui,
          binomial, bwt, bf.maxit=25)
    > summary(birthwt.gam)
    Residual Deviance: 170.35 on 165.18 degrees of freedom

    DF for Terms and Chi-squares for Nonparametric Effects
            Df Npar Df Npar Chisq  P(Chi)
    s(age)   1     3.0     3.1089 0.37230
    s(lwt)   1     2.9     2.3392 0.48532
    s(age1)  1     3.0     3.2504 0.34655
    s(age2)  1     3.0     3.1472 0.36829

    > table(low, predict(birthwt.gam) > 0)
        FALSE TRUE
      0   115   15
      1    28   31
    > plot(birthwt.gam, ask = T, se = T)

Creating the variables age1 and age2 allows us to fit smooth terms for the difference in having one or more visits in the first trimester. Both the summary and the plots show no evidence of non-linearity. The convergence of the fitting algorithm is slow in this example, so we increased the control parameter bf.maxit from 10 to 25. The parameter ask = T allows us to choose plots from a menu. Our choice of plots is shown in Figure 7.2. See Chambers and Hastie (1992) for more details on gam.
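The MASS excerpt uses the S-PLUS gam function. A rough R counterpart with mgcv might look like the sketch below. It is only an illustration under stated assumptions: it uses the raw MASS::birthwt data rather than the recoded bwt data frame from the book, keeps a reduced set of predictors, and the object names fit_glm and fit_gam are made up here. The idea is to fit the same logit model once with linear terms and once with smooth terms, then see whether the smooths depart from straight lines.

```r
# Minimal sketch (assumptions: raw MASS::birthwt data, reduced predictor set).
library(MASS)   # provides the birthwt data
library(mgcv)   # provides gam() and s()

data(birthwt)

# Linear-logit model: age and lwt enter linearly.
fit_glm <- glm(low ~ age + lwt + smoke + ht,
               family = binomial, data = birthwt)

# Same model, but with penalized smooth terms for the continuous predictors.
fit_gam <- gam(low ~ s(age) + s(lwt) + smoke + ht,
               family = binomial, data = birthwt, method = "REML")

# Effective degrees of freedom (edf) close to 1 indicate that the
# estimated smooth is essentially a straight line on the logit scale.
print(summary(fit_gam)$s.table)

# Informal comparison of the two fits.
print(AIC(fit_glm, fit_gam))

# Estimated smooths on the logit scale with standard-error bands;
# roughly straight curves give no evidence of non-linearity.
plot(fit_gam, pages = 1, se = TRUE)
```

If the edf values stay near 1 and the plotted curves look straight within their bands, the linear-logit model is a reasonable working assumption for those predictors; this is a visual and informal check rather than a formal test.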
John Fox
2005-Apr-02 13:08 UTC
[R] using GAM to assess the linearity in logistic regression
Dear ronggui,

There are several approaches you can take, one of which is to fit a GAM and simply look to see whether the relationships appear linear on the logit scale. As well, you could compare the fit of the GAM with semiparametric models in which each smooth term in turn is replaced by a linear term; see ?anova.gam in the mgcv or gam package and the on-line appendix on nonparametric regression to my R and S-PLUS Companion to Applied Regression (at http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix-nonparametric-regression.pdf, and slightly out of date).

Another approach is to fit the linear logit model with glm() and examine component+residual (partial-residual) plots via the cr.plots() function or the ceres.plots() function, both in the car package.

If nonlinearity in, say, x is correctable by a power transformation, you can get an approximate score test for the need to transform x by adding the "constructed variable" I(x*log(x)) to the model and examining its Wald statistic; an added-variable plot (av.plots in car) for the constructed variable shows leverage and influence on the decision to transform x. You can also compute a suggested power transformation as p = 1 + g/b, where b is the coefficient of x in the *original* model and g that of the constructed variable. Details are in the R and S-PLUS Companion. Some further examples are in lecture notes at http://socserv.socsci.mcmaster.ca/jfox/Courses/soc740/lecture-11.pdf.

If x is quantitative but discrete, refitting the logit model replacing x with as.factor(x) and comparing via anova() to the original model gives a test of nonlinearity.
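Two of the checks described above can be sketched in base R as follows. This is an illustration only: the choice of lwt from MASS::birthwt, the simulated discrete predictor, and all object names (fit0, fit1, p_hat, nl_test, and so on) are assumptions made for the example. The power estimate uses the usual first-order form 1 + g/b, which follows from the expansion x^p ≈ x + (p - 1) x log(x) around p = 1.

```r
# Minimal sketch of the constructed-variable test and the discrete-x test.
library(MASS)   # provides the birthwt data

data(birthwt)

## 1. Constructed-variable (score-type) test for transforming x = lwt.
# lwt is strictly positive, so log(lwt) is defined.
# The explicit column is equivalent to adding I(lwt * log(lwt)) to the formula.
birthwt$lwt_cv <- birthwt$lwt * log(birthwt$lwt)

fit0 <- glm(low ~ lwt + age + smoke, family = binomial, data = birthwt)
fit1 <- update(fit0, . ~ . + lwt_cv)

# Wald statistic of the constructed variable: a small p-value suggests
# that a power transformation of lwt would improve the fit.
print(coef(summary(fit1))["lwt_cv", ])

# One-step suggested power: b from the original model, g from the augmented one.
b <- coef(fit0)["lwt"]
g <- coef(fit1)["lwt_cv"]
p_hat <- unname(1 + g / b)
print(p_hat)

## 2. Test of nonlinearity for a quantitative but discrete x (simulated data).
set.seed(1)
x <- sample(1:5, 400, replace = TRUE)
y <- rbinom(400, 1, plogis(-1 + 0.4 * x))   # truly linear on the logit scale

fit_lin <- glm(y ~ x, family = binomial)
fit_fac <- glm(y ~ as.factor(x), family = binomial)

# Likelihood-ratio test: the factor model spends 4 df on x, the linear one 1.
nl_test <- anova(fit_lin, fit_fac, test = "Chisq")
print(nl_test)
```

A value of p_hat near 1 means no transformation is indicated; the one-step estimate can be unstable when b is close to zero, so it is best read alongside the Wald test rather than on its own.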
I hope this helps,
John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------