Following some old advice on this list, I have been reading the code for summary.lm to understand the computation of R-squared from a weighted regression. Usually weights in lm are applied to squared residuals, but I see that the weighted mean of the observations is calculated as if the weights are on the original scale:

    [...]
    f <- z$fitted.values
    w <- z$weights
    [...]
    m <- sum(w * f/sum(w))
    [mss <-] sum(w * (f - m)^2)
    [...]

This seems inconsistent to me. What am I missing?

Murray Efford
Do you mean w <- z$residuals? Type names(z) to see the list of items in your model. I ran your code on an lm fit and it works fine. You don't need the brackets around mss <-

Michael Long

On 04/07/2016 02:21 PM, Murray Efford wrote:
> This seems inconsistent to me. What am I missing?
>
> Murray Efford
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Perhaps I did not make clear that this is not my code - it is the code in summary.lm. I used square brackets to try to make my edited version of the text intelligible - print summary.lm and you will see where my version fits in.

________________________________________
From: R-help <r-help-bounces at r-project.org> on behalf of hd625b <hd625b at gmail.com>
Sent: Friday, 8 April 2016 9:51 a.m.
To: r-help at r-project.org
Subject: Re: [R] R.squared in summary.lm with weights

Do you mean w <- z$residuals ?
On 07/04/2016 5:21 PM, Murray Efford wrote:
> [...]
> m <- sum(w * f/sum(w))
> [mss <-] sum(w * (f - m)^2)
> [...]
>
> This seems inconsistent to me. What am I missing?

I think you are expecting consistency where there needn't be any. Why do you see an inconsistency here? Those are different calculations. You get expressions like these if you assume observations have variance sigma^2/w, and you're trying to estimate sigma^2.

Duncan Murdoch
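To make the sigma^2/w assumption concrete, here is a minimal sketch on simulated data. The sample size, weights, coefficients and variable names are all made up for illustration and are not from the thread. It checks that the residual standard error reported by summary() is the square root of sum(w * r^2) over the residual degrees of freedom, and that the weighted fit matches an ordinary least-squares fit after scaling every column by sqrt(w).

```r
# Sketch with simulated data; n, x, w, sigma and the coefficients are made up.
set.seed(1)
n     <- 200
x     <- runif(n)
w     <- runif(n, 0.5, 5)                             # known, strictly positive weights
sigma <- 2
y     <- 1 + 3 * x + rnorm(n, sd = sigma / sqrt(w))   # Var(y_i) = sigma^2 / w_i

fit <- lm(y ~ x, weights = w)

# Estimate of sigma: weighted residual sum of squares over residual df
r         <- residuals(fit)                           # raw (unweighted) residuals
sigma_hat <- sqrt(sum(w * r^2) / df.residual(fit))

# The same model as ordinary least squares after scaling each column by sqrt(w)
sw   <- sqrt(w)
fit0 <- lm(I(sw * y) ~ 0 + sw + I(sw * x))

all.equal(sigma_hat, summary(fit)$sigma)
all.equal(unname(coef(fit)), unname(coef(fit0)))
all.equal(summary(fit)$sigma, summary(fit0)$sigma)
```

Under these assumptions the weights enter only through the squared (scaled) residuals, which is why weighted sums such as sum(w * r^2) appear throughout summary.lm.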
On 08 Apr 2016, at 12:57 , Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> I think you are expecting consistency where there needn't be any. Why do you see an inconsistency here? Those are different calculations. You get expressions like these if you assume observations have variance sigma^2/w, and you're trying to estimate sigma^2.

It's also perfectly consistent that m is the minimizer of mss, viewed as a function of m:

    d/dm sum(w*(f-m)^2) = -2 sum(w*(f-m)) = 0  =>  m = sum(w*f) / sum(w)

However, beware the distinction between inverse variance weights, replication weights, and sampling weights.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
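The minimizer argument can be checked numerically. The following is a rough sketch on simulated data (the data and the model y ~ x are invented for illustration). It rebuilds R-squared by hand from the quantities quoted from summary.lm, for a model with an intercept, and confirms that m satisfies the first-order condition.

```r
# Sketch with simulated data; the data and model are made up.
set.seed(2)
n <- 100
x <- runif(n)
w <- runif(n, 0.5, 5)
y <- 1 + 2 * x + rnorm(n, sd = 1 / sqrt(w))
z <- lm(y ~ x, weights = w)

f <- z$fitted.values
w <- z$weights
r <- z$residuals

m   <- sum(w * f / sum(w))     # weighted mean of the fitted values
mss <- sum(w * (f - m)^2)      # weighted model sum of squares
rss <- sum(w * r^2)            # weighted residual sum of squares
r2  <- mss / (mss + rss)

all.equal(m, weighted.mean(f, w))
all.equal(r2, summary(z)$r.squared)

# First-order condition: the derivative -2 * sum(w * (f - m)) vanishes at m
abs(sum(w * (f - m))) < 1e-8

# Moving m in either direction increases the weighted sum of squares
wss <- function(m0) sum(w * (f - m0)^2)
wss(m - 0.1) > mss && wss(m + 0.1) > mss
```

The same weights appear in m, mss and rss, so the weighted mean is the centre that goes with the weighted sums of squares. Whether those weights are the right ones for a given analysis still depends on which kind of weights they are.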