Kiyoshi Sasaki
2010-Jul-07 03:46 UTC
[R] Why do <none>s appear in the list of predictor variables in logistic regression using 'step' or 'stepAIC' function?
Would anyone help me solve my problem with R, please? I am very new to R. I am doing a logistic regression analysis on the presence/absence of salamanders using several predictor variables, as shown below. I have checked my data and did not find any NA values or empty cells. When I use step() or stepAIC() to select significant predictor variables, <none> appears in places where predictor variables are listed (please see the bottom part of the code I used and its output). Does anyone know what is going on? Just in case, I have copied the data I am using at the end of the output. Thank you in advance for your time and help!

Kiyoshi

> # Step 2: Includes all of the variables identified from Step 1.
> logit<-glm(Presence~AreaOfCover+CoverCharac+Ivy, data=rbs.no.NA.rows, family=binomial(link=logit), na.action=na.exclude, control=list(epsilon = 0.0001, maxit = 50, trace = F))
> summary(logit)

Call:
glm(formula = Presence ~ AreaOfCover + CoverCharac + Ivy, family = binomial(link = logit),
    data = rbs.no.NA.rows, na.action = na.exclude, control = list(epsilon = 1e-04,
        maxit = 50, trace = F))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.7596  -0.8397   0.6524   0.8468   2.1962

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.0425418  0.5109583   2.040 0.041314 *
AreaOfCover -0.0002055  0.0001530  -1.343 0.179173
CoverCharac  0.1141122  0.1748377   0.653 0.513966
Ivy         -0.0351472  0.0104595  -3.360 0.000779 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 125.97  on 91  degrees of freedom
Residual deviance: 100.56  on 88  degrees of freedom
AIC: 108.56

Number of Fisher Scoring iterations: 4

> res<-step(logit, direction="backward", scope=list(upper=logit$formula))
Start:  AIC=108.56
Presence ~ AreaOfCover + CoverCharac + Ivy

              Df Deviance    AIC
- CoverCharac  1   101.00 107.00
- AreaOfCover  1   102.50 108.50
<none>             100.56 108.56
- Ivy          1   118.05 124.05

Step:  AIC=107
Presence ~ AreaOfCover + Ivy

              Df Deviance    AIC
<none>             101.00 107.00
- AreaOfCover  1   103.01 107.01
- Ivy          1   118.86 122.86

Below is my data:

> rbs.no.NA.rows
Presence AreaOfCover CoverCharac Ivy DOS DUS DSHRB HSHRUB HVEG LEAF WSTEM VEG
1 1 2200.0 2 0 2.0 0.2 5.0 1.0 1.00 5 5 1
2 0 4000.0 2 0 2.0 0.5 4.0 1.0 1.00 0 10 0
3 0 2880.0 2 0 3.0 1.0 2.0 1.0 0.50 0 5 0
4 0 2200.0 5 0 2.0 2.0 4.0 1.0 0.20 10 5 1
5 0 625.0 5 5 2.0 1.0 3.0 1.0 0.10 5 5 10
6 1 1740.0 5 0 1.0 0.8 3.0 1.0 0.10 10 10 10
7 1 5000.0 5 0 2.0 1.0 5.0 1.0 0.10 10 0 1
8 0 2400.0 2 0 1.5 1.5 5.0 1.0 0.50 10 20 1
9 1 45.0 2 1 0.0 2.0 0.1 1.5 0.00 20 20 20
10 1 280.0 1 30 0.8 1.0 2.5 1.6 0.10 40 20 30
11 1 250.0 1 0 2.0 2.5 3.0 0.5 0.10 70 10 50
12 0 32.0 2 90 2.0 1.5 2.5 1.0 0.10 40 10 40
13 1 28.0 1 20 1.5 0.5 0.5 1.6 0.10 30 25 70
14 1 1032.0 5 20 3.0 1.0 1.5 1.8 0.10 30 80 40
15 1 1032.0 5 20 3.0 1.0 1.5 1.8 0.10 30 80 40
16 0 2880.0 1 5 0.1 1.2 1.4 1.0 0.10 60 20 40
19 0 800.0 5 100 1.0 2.0 2.0 1.0 0.20 40 0 30
20 1 400.0 5 0 1.0 1.0 0.1 1.7 0.10 40 10 40
21 1 315.0 2 10 1.0 1.5 1.3 1.7 0.10 30 1 30
22 0 600.0 2 0 0.5 0.6 1.0 1.0 0.50 1 20 10
23 1 1400.0 6 0 1.5 1.0 3.0 1.0 0.10 10 5 10
24 0 190.0 1 60 1.0 1.0 3.0 1.5 0.10 70 10 70
25 1 1100.0 3 19 1.0 1.5 3.0 1.0 0.10 50 25 10
26 1 484.5 1 0 3.0 1.0 10.0 0.5 0.10 70 15 10
27 1 300.0 4 0 3.0 0.5 3.0 0.3 0.10 50 10 10
28 0 598.0 1 30 2.0 1.5 2.0 2.0 0.50 10 5 80
29 0 1750.0 1 100 1.0 1.0 0.8 2.0 0.60 100 20 100
30 1 476.0 1 0 3.0 2.0 4.0 1.0 0.20 0 0 30
31 1 272.0 2 0 2.0 1.0 30.0 0.0 0.10 90 5 10
32 1 2000.0 1 1 2.0 0.5 30.0 0.0 0.10 10 5 5
33 1 1908.0 1 0 0.5 1.0 30.0 1.0 0.10 0 2 10
34 1 1802.0 1 10 1.0 1.0 30.0 0.8 0.10 40 5 60
35 1 570.0 1 0 2.0 1.5 30.0 0.0 0.10 80 5 5
36 0 656.0 1 0 4.0 1.5 30.0 0.3 0.10 100 5 1
37 1 850.0 5 0 1.0 1.0 30.0 0.3 0.10 0 5 0
38 1 1536.0 1 0 1.5 2.0 30.0 0.0 0.10 90 5 0
39 1 1536.0 1 0 1.5 3.0 30.0 0.0 1.10 90 5 0
40 1 600.0 1 0 0.5 1.5 30.0 0.2 0.20 95 20 5
42 1 6500.0 2 0 3.0 2.5 1.0 2.5 0.15 10 5 10
43 1 600.0 2 2 3.0 2.0 1.5 2.5 0.20 10 5 30
44 0 3150.0 1 70 2.0 3.0 20.0 2.0 0.20 60 3 10
45 1 3000.0 1 30 2.0 3.0 20.0 2.0 20.00 60 5 13
46 0 1620.0 1 70 1.0 1.5 2.0 2.0 0.20 0 5 45
47 0 1008.0 1 2 3.0 2.0 2.0 1.0 0.20 0 2 92
48 0 980.0 1 2 1.0 1.0 3.0 2.0 0.20 25 5 70
49 1 686.0 1 2 1.0 30.0 5.0 0.8 0.50 60 0 70
50 1 686.0 1 3 1.0 30.0 6.0 0.8 0.50 40 0 80
53 1 1680.0 1 0 3.5 0.1 1.5 0.8 0.50 60 0 20
54 0 4620.0 1 20 2.5 20.0 30.0 0.1 0.10 20 0 95
55 0 1827.0 2 30 0.3 1.0 0.5 0.3 0.10 80 10 10
56 1 495.0 2 0 0.2 0.5 0.8 3.0 0.00 95 30 10
57 1 495.0 2 0 0.2 0.5 0.8 3.0 0.00 95 30 10
58 0 5565.0 2 50 0.8 1.5 0.8 0.3 0.00 60 10 20
59 0 1440.0 1 100 3.0 1.0 30.0 0.2 0.10 60 0 70
60 1 800.0 2 0 2.0 1.5 30.0 0.0 0.10 90 5 10
61 0 2150.0 2 0 0.2 1.0 30.0 0.3 0.10 90 0 50
64 0 799.0 2 0 1.0 5.0 30.0 0.0 0.05 20 0 80
66 0 740.0 2 0 5.0 7.0 30.0 0.0 0.05 90 10 40
67 1 720.0 6 0 4.0 1.0 30.0 0.0 0.05 40 0 80
68 0 938.0 1 0 25.0 10.0 0.0 0.0 0.05 30 30 80
72 0 750.0 2 0 1.0 5.0 30.0 1.0 0.06 5 30 100
73 1 840.0 1 30 2.0 1.0 30.0 0.0 0.10 30 10 100
75 0 2250.0 2 75 5.0 2.0 30.0 1.2 0.60 30 30 100
76 0 7150.0 2 100 2.0 1.0 2.5 2.0 0.30 25 5 100
77 1 3420.0 2 20 2.0 1.0 30.0 0.0 0.05 90 10 70
78 0 2028.0 2 10 1.0 2.0 30.0 0.0 0.05 80 5 70
79 0 770.0 2 20 1.0 1.0 30.0 0.0 0.05 60 0 70
80 1 448.0 5 0 0.0 1.0 30.0 0.0 1.00 60 0 90
81 1 448.0 5 0 0.0 1.0 30.0 0.0 1.00 60 0 90
82 1 448.0 5 0 0.0 1.0 30.0 0.0 1.00 60 0 90
83 1 564.0 1 0 15.0 1.0 30.0 0.0 0.05 100 0 30
85 0 1150.0 3 100 2.0 2.5 2.0 0.5 0.20 70 10 100
86 1 450.0 1 0 4.0 2.0 8.0 0.3 0.10 40 10 70
87 0 1600.0 1 15 1.5 2.0 1.0 3.0 0.15 10 0 80
88 1 1274.0 1 0 3.0 3.0 2.0 0.8 0.20 100 0 5
89 0 1800.0 3 100 1.0 1.0 30.0 0.0 0.20 20 10 100
90 1 2088.0 1 0 0.1 0.1 1.0 2.0 0.30 20 0 100
91 0 6750.0 2 100 1.5 2.0 3.0 1.2 2.00 70 10 100
92 0 17500.0 2 100 0.1 4.0 1.0 1.5 0.30 80 10 100
93 0 1500.0 2 0 2.0 1.0 0.2 1.2 0.20 25 10 100
94 0 4000.0 1 100 2.0 1.0 1.0 1.2 0.20 70 10 100
95 1 450.0 1 30 4.0 2.0 2.5 1.2 0.08 90 0 25
96 1 450.0 1 30 4.0 2.0 2.5 1.2 0.08 90 0 25
97 1 390.0 1 10 2.0 2.0 30.0 2.0 0.70 70 10 50
98 1 560.0 1 0 2.0 2.0 2.0 2.5 0.20 90 5 80
99 1 2070.0 2 90 2.0 2.0 2.0 1.2 0.20 50 20 100
100 1 1820.0 1 20 3.0 0.5 5.0 0.0 0.20 60 0 100
101 0 4200.0 1 40 3.0 1.5 5.0 0.6 0.20 80 5 80
102 0 2000.0 1 20 2.5 1.0 4.0 0.2 0.20 100 5 50
103 0 2200.0 2 60 0.8 0.5 3.0 2.0 0.20 100 5 90
104 1 8800.0 1 0 1.5 2.0 2.0 1.5 0.20 60 35 1
105 1 124.0 1 0 3.0 1.0 30.0 0.0 0.10 90 5 15
Gabor Grothendieck
2010-Jul-07 04:11 UTC
[R] Why do <none>s appear in the list of predictor variables in logistic regression using 'step' or 'stepAIC' function?
On Tue, Jul 6, 2010 at 11:46 PM, Kiyoshi Sasaki <skiyoshi2001 at yahoo.com> wrote:
> Would anyone help me solve my problem with R, please? I am very new to R. I am doing logistic regression analysis on the presence/absence of salamanders using several predictor variables, as shown below. I have checked my data, but I didn't find any 'NA' or empty cells. When I used step() or stepAIC to select significant predictor variables, <none>s appear to places where predictor variables are listed (please see the bottom part of the codes I used and their output. Could anyone know what is going on? Just in cases, I copied the data I am using at the end of the output.

<none> refers to the row in which no variable is dropped, so that row shows the AIC of the model as it stood when you entered that step. Every other row shows the AIC that results from dropping the indicated variable. The rows are sorted by AIC, so if a variable appears above the <none> row, dropping it lowers the AIC, and if it appears below the <none> row, dropping it increases the AIC. <none> is not related to missing values.
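
The point above can be checked with a small sketch. It uses simulated data (the variable names x1, x2, x3 and y are hypothetical, not the poster's salamander data) and drop1(), which builds the same kind of table that step() prints at each step. It also checks the arithmetic visible in the posted output: for a 0/1 response, AIC equals the residual deviance plus twice the number of coefficients, e.g. 100.56 + 2*4 = 108.56 for the starting model and 101.00 + 2*3 = 107.00 after dropping CoverCharac.

```r
# Simulated presence/absence data (hypothetical; not the poster's data set)
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(0.5 + 1.0 * dat$x1))

fit <- glm(y ~ x1 + x2 + x3, data = dat, family = binomial)

# Single-term deletion table: one row per droppable term, plus a <none> row
tab <- drop1(fit)
print(tab)

# The <none> row is the current model with nothing dropped,
# so its AIC is simply the AIC of the fitted model
none_aic <- tab["<none>", "AIC"]
stopifnot(isTRUE(all.equal(none_aic, AIC(fit))))

# For a 0/1 response, AIC = residual deviance + 2 * number of coefficients
stopifnot(isTRUE(all.equal(AIC(fit), deviance(fit) + 2 * length(coef(fit)))))

# Terms whose removal would lower the AIC; step() prints these above <none>
better <- rownames(tab)[tab$AIC < none_aic]
print(better)
```

If `better` is empty, no single deletion improves on the current model, which is the situation in the final step of the posted output, where <none> is the first row.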