tmp <- data.frame(x=c(1,1),
                  y=c(1,2))

tmp.lm <- lm(y ~ x, data=tmp)
summary(tmp.lm)

coef(summary(tmp.lm))

## I consider this to be a bug.  Since summary(tmp.lm) gives
## two rows for the coefficients, I believe the coef() function
## should also give two rows.

> summary(tmp.lm)

Call:
lm(formula = y ~ x, data = tmp)

Residuals:
   1    2
-0.5  0.5

Coefficients: (1 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)      1.5        0.5       3    0.205
x                 NA         NA      NA       NA

Residual standard error: 0.7071 on 1 degrees of freedom

> coef(summary(tmp.lm))
            Estimate Std. Error t value  Pr(>|t|)
(Intercept)      1.5        0.5       3 0.2048328
>
> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          4.0
year           2006
month          10
day            03
svn rev        39566
language       R
version.string R version 2.4.0 (2006-10-03)
>

## this is a related problem

tmp <- data.frame(x=c(1,2),
                  y=c(1,2))

tmp.lm <- lm(y ~ x, data=tmp)
summary(tmp.lm)

coef(summary(tmp.lm))

## Here the summary() give NA for the values that can't be
## calculated and the coef() function gives NaN.  I think both
## functions should return the same result.

> summary(tmp.lm)

Call:
lm(formula = y ~ x, data = tmp)

Residuals:
ALL 2 residuals are 0: no residual degrees of freedom!

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)        0         NA      NA       NA
x                  1         NA      NA       NA

Residual standard error: NaN on 0 degrees of freedom
Multiple R-Squared:     1,      Adjusted R-squared:   NaN
F-statistic:   NaN on 1 and 0 DF,  p-value: NA

> coef(summary(tmp.lm))
            Estimate Std. Error t value Pr(>|t|)
(Intercept)        0        NaN     NaN      NaN
x                  1        NaN     NaN      NaN
>
It doesn't appear to be a bug to me, given that one of your coefficients is NA due to linear dependencies in your design matrix. I prefer to think of it as a feature :-) (show only the coefficients for the variables that do not show linear dependencies).

x=1:5
y=c(1:3, 7, 6)
fit=lm(y~x)
coef(fit)
coef(summary(fit))

b

On Nov 12, 2006, at 11:28 PM, rmh at temple.edu wrote:

> tmp <- data.frame(x=c(1,1),
>                   y=c(1,2))
>
> tmp.lm <- lm(y ~ x, data=tmp)
> summary(tmp.lm)
>
> coef(summary(tmp.lm))
>
> ## I consider this to be a bug.  Since summary(tmp.lm) gives
> ## two rows for the coefficients, I believe the coef() function
> ## should also give two rows.
ripley at stats.ox.ac.uk
2006-Nov-13 08:06 UTC
[Rd] inconsistency or bug in coef() (PR#9358)
On Mon, 13 Nov 2006, rmh at temple.edu wrote:

> tmp <- data.frame(x=c(1,1),
>                   y=c(1,2))
>
> tmp.lm <- lm(y ~ x, data=tmp)
> summary(tmp.lm)
>
> coef(summary(tmp.lm))
>
> ## I consider this to be a bug.  Since summary(tmp.lm) gives
> ## two rows for the coefficients, I believe the coef() function
> ## should also give two rows.

That claim is false: it is print.summary.lm that is giving two lines, not the result of summary.lm: try

    unclass(summary(tmp.lm))

This is also clear from the Value section of ?summary.lm, whose See Also says

    Function 'coef' will extract the matrix of coefficients with
    standard errors, t-statistics and p-values.

The point is that the print method is making use of both the $coefficients and the $aliased components. I really do think this is clear from reading the help page: did you actually cross-check before sending a bug report?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
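
The distinction between the stored summary object and its printed form can be checked directly. The following is a minimal sketch using the data from the original report; the variable name s is introduced here only for illustration, and the behaviour is that of summary.lm as described on its help page, which may differ in detail between R versions.

```r
# Data from the original report: x is constant, so it is collinear
# with the intercept and its coefficient cannot be estimated.
tmp <- data.frame(x = c(1, 1), y = c(1, 2))
tmp.lm <- lm(y ~ x, data = tmp)

s <- summary(tmp.lm)

# The stored coefficient matrix holds only the estimable terms.
rownames(s$coefficients)   # "(Intercept)"

# The aliased component records which terms were dropped.
s$aliased                  # named logical vector, TRUE for x

# coef() on the summary returns the stored matrix unchanged.
identical(coef(s), s$coefficients)
```

The two-row display comes from the print method combining these two components, not from the object that coef() extracts.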
I am fascinated. I can accept the argument that suppressing aliased values is a valid behavior of the program. The defense of the inconsistency between the two displays surprises me.

Yes, Brian, I checked. That's why I offered the option of an inconsistency. I do believe strongly that the coef section of summary and the coef function should give identical results. I still think the inconsistency is a bug.

I think the author of the print.summary.lm did the right thing by showing the requested coef for the x variable and giving it a missing value.

Rich
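
For readers who want a coefficient table with one row per model term, including the aliased ones, the padding can be done by hand. This is only a sketch: full_coef_table is a hypothetical helper, not part of R, and it assumes the summary object carries the $coefficients and $aliased components mentioned earlier in the thread.

```r
# Hypothetical helper (not part of R): expand the coefficient matrix of
# a summary.lm object so that aliased terms appear as rows filled with NA.
full_coef_table <- function(fit) {
  s <- summary(fit)
  aliased <- s$aliased
  out <- matrix(NA_real_,
                nrow = length(aliased),
                ncol = ncol(s$coefficients),
                dimnames = list(names(aliased), colnames(s$coefficients)))
  # Copy the estimable rows into place; aliased rows stay NA.
  out[!aliased, ] <- s$coefficients
  out
}

tmp <- data.frame(x = c(1, 1), y = c(1, 2))
tmp.lm <- lm(y ~ x, data = tmp)
tab <- full_coef_table(tmp.lm)
tab
```

The result has the same columns as coef(summary(tmp.lm)), with an extra all-NA row for x.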