That depends on whether the IV could have significant interactions with
other IVs not considered in the bivariate analysis. E.g.,
> iv <- expand.grid(-2:2, -2:2)
> y <- 3 + iv[,1] * iv[,2] + rnorm(nrow(iv), sd=0.1)
> summary(lm(y ~ iv[,1]))

Call:
lm(formula = y ~ iv[, 1])

Residuals:
     Min       1Q   Median       3Q      Max
-4.06259 -1.06048 -0.02377  1.05901  4.04315

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.01908    0.41482   7.278 2.09e-07 ***
iv[, 1]      0.01417    0.29332   0.048    0.962
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.074 on 23 degrees of freedom
Multiple R-Squared: 0.0001014, Adjusted R-squared: -0.04337
F-statistic: 0.002333 on 1 and 23 DF, p-value: 0.9619

> summary(lm(y ~ iv[,1] * iv[,2]))

Call:
lm(formula = y ~ iv[, 1] * iv[, 2])

Residuals:
     Min       1Q   Median       3Q      Max
-0.22390 -0.08894 -0.01279  0.13525  0.17608

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)      3.019083   0.026330 114.665   <2e-16 ***
iv[, 1]          0.014167   0.018618   0.761    0.455
iv[, 2]         -0.005486   0.018618  -0.295    0.771
iv[, 1]:iv[, 2]  0.992865   0.013165  75.418   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1316 on 21 degrees of freedom
Multiple R-Squared: 0.9963, Adjusted R-squared: 0.9958
F-statistic: 1896 on 3 and 21 DF, p-value: < 2.2e-16

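
The same point can be checked a bit more formally with an F test between
the two nested fits. What follows is only a sketch added for illustration,
not part of the session above: the seed, the column names (x1, x2) and the
object names are assumptions, so the numbers will differ slightly from the
output shown.

```r
# Reproducible variant of the example; seed and names are illustrative.
set.seed(1)
d <- expand.grid(x1 = -2:2, x2 = -2:2)
d$y <- 3 + d$x1 * d$x2 + rnorm(nrow(d), sd = 0.1)

# Bivariate fit: x1 alone explains essentially nothing.
fit_bivariate <- lm(y ~ x1, data = d)

# Full fit: x1, x2 and their interaction.
fit_full <- lm(y ~ x1 * x2, data = d)

# F test for the terms added in the larger model (x2 and x1:x2).
model_comparison <- anova(fit_bivariate, fit_full)
print(model_comparison)

# R-squared of each fit, side by side.
r2 <- c(bivariate = summary(fit_bivariate)$r.squared,
        full      = summary(fit_full)$r.squared)
print(r2)
```

On this balanced grid x1 is orthogonal to the product x1 * x2, which is
why the bivariate fit sees no relationship even though x1 matters a great
deal once x2 is in the model.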
Andy
> From: Wensui Liu
> Dear Lister,
>
> I have a question about variable selection for regression.
>
> If the IV is not significantly related to the DV in the bivariate
> analysis, does it make sense to include this IV in the full model
> with multiple IVs?
>
> Thank you so much!
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help