Hi All

I am running a linear regression using the lm object.

In the event that my independent variable is the same across all
observations the regression slope is returned as an NA.

For example, if I have the following

y=c(10,12,17)
x=c(5,5,5)

lm = lm(y~x)

produces the following

Coefficients:
(Intercept)            x
         13           NA

Other than post-processing the results, is there a way to output the
slope as 0 rather than NA?

Thanks
Pete
That comes out as an NA because X'X is not invertible: X is not of full
rank (one column is a linear combination of the other), and that means
there is no unique solution to the system.

y=c(10,12,17)
x=c(5,5,5)
X=cbind(1,x)
X
t(X)%*%X
solve(t(X)%*%X)   # fails, because t(X)%*%X is singular

Therefore, nope, there is no way to make this come out as a zero,
because the data violate the full-rank assumption of regression analysis.

HTH,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Brecknock, Peter
Sent: Friday, October 09, 2009 5:12 PM
To: r-help at r-project.org
Subject: [R] lm output

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
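The rank argument in the reply above can be checked directly. The following is a minimal sketch using the same y and x; the names XtX, rank_X, det_XtX and inv are introduced here for illustration only and are not part of the original posts.

```r
y <- c(10, 12, 17)
x <- c(5, 5, 5)
X <- cbind(1, x)                 # design matrix: intercept column plus x

XtX <- crossprod(X)              # same matrix as t(X) %*% X

rank_X  <- qr(X)$rank            # numerical rank of X; full rank would be 2
det_XtX <- det(XtX)              # determinant of X'X; zero means singular

# solve() raises an error on a singular matrix, so trap it instead of stopping
inv <- tryCatch(solve(XtX), error = function(e) NULL)

print(rank_X)
print(det_XtX)
print(is.null(inv))              # TRUE when no inverse could be computed
```

Because the x column is just 5 times the intercept column, the two columns carry the same information, which is why lm() reports the second coefficient as NA rather than choosing a value for it.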
On 09-Oct-09 21:12:18, Brecknock, Peter wrote:
> Hi All
> I am running a linear regression using the lm object.
>
> In the event that my independent variable is the same across all
> observations the regression slope is returned as an NA.
>
> For example, if I have the following
>
> y=c(10,12,17)
> x=c(5,5,5)
>
> lm = lm(y~x)
> produces the following
>
> Coefficients:
> (Intercept)            x
>          13           NA
>
> Other than post-processing the results, is there a way to output the
> slope as 0 rather than NA?
>
> Thanks
> Pete

You should post-process! To incorporate such a thing into the function
lm() itself would be an arbitrary resolution of an indeterminate
situation, and would not be appropriate for a general-purpose function.

Your situation is, graphically,

  17 +    *
     |
  15 +
     |
     |
  12 +    *
     |
  10 +    *
     |
     |
     |
     |
   5 +----+----+
     0    5   10

Any line whatever through the point (5,13) will fit the three points
as well as any other, so the slope is indeterminate.[*] You have
decided that you would like it to be zero, but that is an arbitrary
decision!

[*] Note that a vertical line (with slope +/- Inf) does not resolve
the indeterminacy, since, although it goes through all three points,
you do not know which point[s] on the line to use in evaluating the
error -- the difference between a point on the line with given value
of x and "the" y-value on the line corresponding to x -- because all
points on the line qualify as potential y values, since they all have
the same value of x.

Since the slope is indeterminate, it is NA (that is what NA means).
It arises numerically because the calculation of the slope is

  (N*sum(x*y) - sum(x)*sum(y))/(N*sum(x^2) - sum(x)^2)

which is 0/0 -- likewise indeterminate (R evaluates 0/0 as NaN).

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 09-Oct-09                                       Time: 22:41:37
------------------------------ XFMail ------------------------------
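Both points in the reply above, the 0/0 in the slope formula and the advice to post-process, can be sketched in a few lines of R. This is only an illustration under the thread's own data: the names fit, b, b0, num, den and slope are introduced here, and replacing NA with 0 remains the arbitrary choice Ted describes, not something lm() endorses.

```r
y <- c(10, 12, 17)
x <- c(5, 5, 5)

fit <- lm(y ~ x)                 # avoid naming the result "lm", which masks the function
b   <- coef(fit)                 # the x coefficient is NA because x is constant

# Post-processing: substitute 0 for any coefficient lm() could not estimate.
# This is a deliberate, arbitrary convention chosen by the user.
b0 <- b
b0[is.na(b0)] <- 0

# The closed-form slope from the reply, evaluated term by term
N     <- length(x)
num   <- N * sum(x * y) - sum(x) * sum(y)   # numerator
den   <- N * sum(x^2) - sum(x)^2            # denominator
slope <- num / den                          # 0/0 in floating point

print(b)
print(b0)
print(c(num = num, den = den, slope = slope))
```

Keeping the substitution outside lm() makes the convention visible at the point of use, so a reader of the downstream code can see that the zero was imposed rather than estimated.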