Hi everyone,

Running the vif() function from the car package like

----------------------------------------------------
> reg2 <- lm(CARsPur~Delay_max10+LawChange+MarketTrend_20d+MultiTrade,
             data=data.frame(VarVecPur))
> vif(reg2)
    Delay_max10       LawChange MarketTrend_20d      MultiTrade
       1.010572        1.009874        1.004278        1.003351
----------------------------------------------------

gives a useful result. But using the right-hand variables as a matrix in
the following way doesn't work with the vif() function:

----------------------------------------------------
> reg <- lm(CARsPur~VarVecPur)
> summary(reg)

Call:
lm(formula = CARsPur ~ VarVecPur)

Residuals:
     Min       1Q   Median       3Q      Max
-0.72885 -0.06461  0.00493  0.06873  0.74936

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)              -0.037860   0.006175  -6.131 9.25e-10 ***
VarVecPurDelay_max10      0.003661   0.001593   2.298   0.0216 *
VarVecPurLawChange        0.004679   0.006185   0.757   0.4493
VarVecPurMarketTrend_20d  0.019015   0.001409  13.493  < 2e-16 ***
VarVecPurMultiTrade      -0.005081   0.003129  -1.624   0.1045
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1229 on 6272 degrees of freedom
Multiple R-squared: 0.03021, Adjusted R-squared: 0.02959
F-statistic: 48.84 on 4 and 6272 DF, p-value: < 2.2e-16

> vif(reg)
Error in vif.lm(reg) : model contains fewer than 2 terms
----------------------------------------------------

Is there a solution or a way to work around?

Thank you very much in advance.

--
Kind Regards,

Martin H. Schmidt
Humboldt University Berlin

Dear Martin,

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Martin H. Schmidt
> Sent: Thursday, September 20, 2012 8:52 AM
> To: r-help at r-project.org
> Subject: [R] Variance Inflation Factor VIC() with a matrix
>
> > reg <- lm(CARsPur~VarVecPur)
> > vif(reg)
> Error in vif.lm(reg) : model contains fewer than 2 terms
>
> Is there a solution or a way to work around?

Not with vif() in the car package, which wants to compute generalized variance inflation factors (GVIFs) for multi-df terms in the model.
Single-df VIFs are pretty simple, so you could just write your own function. Alternatively, there are other packages on CRAN, such as DAAG, that compute VIFs, so you might try one of these.

I hope this helps,
John

-----------------------------------------------
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

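For readers who want to try the "write your own function" route, here is a minimal sketch. It assumes a model with an intercept in which every coefficient is a single-df term, and it relies on the identity that the VIF of predictor j, 1/(1 - R_j^2), equals the j-th diagonal element of the inverse correlation matrix of the predictors. The name simple_vif is hypothetical (it is not part of car or DAAG), and the data are simulated stand-ins, not the poster's data.

```r
# Hypothetical helper, not part of any package: single-df VIFs for a fitted lm.
# Works on the columns of the model matrix, so it does not matter whether the
# predictors entered the formula as separate terms or as one matrix.
simple_vif <- function(model) {
  X <- model.matrix(model)
  # Drop the intercept column, if the model has one
  X <- X[, colnames(X) != "(Intercept)", drop = FALSE]
  if (ncol(X) < 2) stop("need at least two predictor columns")
  # VIF_j is the j-th diagonal element of the inverse correlation matrix
  v <- diag(solve(cor(X)))
  names(v) <- colnames(X)
  v
}

# Simulated stand-in data (assumption: four numeric predictors in a matrix)
set.seed(1)
X <- matrix(rnorm(400), ncol = 4,
            dimnames = list(NULL, c("Delay_max10", "LawChange",
                                    "MarketTrend_20d", "MultiTrade")))
y <- drop(X %*% c(0.5, 0, 1, -0.2)) + rnorm(100)

fit <- lm(y ~ X)   # matrix on the right-hand side, as in the question
simple_vif(fit)
```

Because the helper ignores term structure, it would give misleading numbers for factors or polynomial terms, which is the case the generalized VIFs in car are meant to handle.
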
You've stumbled across the answer to your question: while lm() supports y ~ X formulas without a data= argument and y ~ X1+X2+X3 formulas with one, you can't depend on all contributed functions to do the same. As John pointed out, the advantage of car::vif over other implementations is that it correctly handles the cases of factors, polynomial terms, etc., for which generalized VIF is more useful, and this is most easily accommodated with the formula interface. The matrix interface takes less typing, but sometimes leaves you wondering later what you actually had in VarVecPur.

-Michael

On 9/20/2012 8:52 AM, Martin H. Schmidt wrote:
> > reg <- lm(CARsPur~VarVecPur)
> > vif(reg)
> Error in vif.lm(reg) : model contains fewer than 2 terms
>
> Is there a solution or a way to work around?

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA
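The difference between the two interfaces can be seen by counting model terms, which is what the "fewer than 2 terms" error refers to. The sketch below uses simulated stand-ins for CARsPur and VarVecPur (an assumption, not the poster's data) and shows that a matrix on the right-hand side is a single term, while a data frame with a `.` formula gives one term per column. The call to car::vif is guarded because car may not be installed.

```r
# Simulated stand-ins for the objects in the question (assumption)
set.seed(1)
VarVecPur <- matrix(rnorm(400), ncol = 4,
                    dimnames = list(NULL, c("Delay_max10", "LawChange",
                                            "MarketTrend_20d", "MultiTrade")))
CARsPur <- rnorm(100)

# Matrix interface: the whole matrix counts as one model term
reg <- lm(CARsPur ~ VarVecPur)
attr(terms(reg), "term.labels")

# Formula interface: convert the matrix to a data frame and let "." expand
# to every column, so each predictor becomes its own term without retyping
dat  <- data.frame(CARsPur, VarVecPur)
reg2 <- lm(CARsPur ~ ., data = dat)
attr(terms(reg2), "term.labels")

# With four separate terms, car::vif has what it needs
if (requireNamespace("car", quietly = TRUE)) {
  print(car::vif(reg2))
}
```

Keeping the predictors in a data frame also addresses the last point above: the column names stay attached to the model, so it is clear later what went into the fit.
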