thr3ads.net - R help - [R] Question About lm() [Feb 2022]

Bromaghin, Jeffrey F

2022-Feb-09 22:00 UTC

[R] Question About lm()

Hello,

I was constructing a simple linear model with one categorical (3-level) and one
quantitative predictor variable for a colleague. I estimated model parameters
with and without an intercept, sometimes called reference cell coding and cell
means coding, respectively.

Model 1: yResp ~ -1 + xCat + xCont
Model 2: yResp ~ xCat + xCont

These models are equivalent and the estimated coefficients come out fine, but
the R-squared and F statistics returned by summary() differ markedly. I spent
some time looking at the code for both lm() and summary.lm() but did not find
the source of the difference. aov() and anova() results also differ, so I
suspect the issue involves how the sums of squares are being computed. I've
also spent some time trying to search online for information on this, without
success. I haven't used lm() for quite a while, but my memory is that these
differences didn't occur in the distant past when I was teaching.
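
A minimal reproducible sketch of the situation described above, using simulated data. The data frame `dat`, the seed, and the coefficients used to generate `yResp` are illustrative assumptions, not the original data. It checks that the two parameterizations give the same fitted values while summary() reports different R-squared values:

```r
# Simulated stand-in for the real data (assumption: 3-level factor, one covariate)
set.seed(1)
n <- 60
dat <- data.frame(
  xCat  = factor(rep(c("a", "b", "c"), each = n / 3)),
  xCont = runif(n)
)
# Response with a mean well away from zero
dat$yResp <- 10 + 2 * as.integer(dat$xCat) + 3 * dat$xCont + rnorm(n)

fit1 <- lm(yResp ~ -1 + xCat + xCont, data = dat)  # Model 1: cell means coding
fit2 <- lm(yResp ~ xCat + xCont, data = dat)       # Model 2: reference cell coding

# Same fitted values (up to numerical tolerance), hence same residuals
same_fit <- isTRUE(all.equal(fitted(fit1), fitted(fit2)))

# R-squared as reported by summary() for each model
r2 <- c(model1 = summary(fit1)$r.squared,
        model2 = summary(fit2)$r.squared)

print(same_fit)
print(r2)
```

With a response mean far from zero, the no-intercept model should report the larger R-squared even though the residuals are identical.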

Thanks in advance for any insights you might have,
Jeff

Jeffrey F. Bromaghin
Research Statistician
USGS Alaska Science Center
907-786-7086
Jeffrey Bromaghin, Ph.D. | U.S. Geological Survey
(usgs.gov)<https://www.usgs.gov/staff-profiles/jeffrey-bromaghin>
Ecosystems Analytics | U.S. Geological Survey
(usgs.gov)<https://www.usgs.gov/centers/alaska-science-center/science/ecosystems-analytics>



David Winsemius

2022-Feb-10 07:16 UTC

[R] Question About lm()

The models are NOT equivalent. Why would you think they were?

-- 
David

Sent from my iPhone

Ivan Krylov

2022-Feb-10 07:20 UTC

[R] Question About lm()

On Wed, 9 Feb 2022 22:00:40 +0000
"Bromaghin, Jeffrey F via R-help" <r-help at r-project.org>
wrote:
> These models are equivalent and the estimated coefficients come out
> fine, but the R-squared and F statistics returned by summary() differ
> markedly.
Is the mean of yResp far from zero? Here's what summary.lm says about
that:
>> r.squared: R^2, the 'fraction of variance explained by the model',
>> 
>>               R^2 = 1 - Sum(R[i]^2) / Sum((y[i] - y*)^2),
>> 
>>            where y* is the mean of y[i] if there is an intercept and
>>            zero otherwise.
-- 
Best regards,
Ivan

PIKAL Petr

2022-Feb-10 07:28 UTC

[R] Question About lm()

Hi

Is this enough of an explanation?

https://stats.stackexchange.com/questions/26176/removal-of-statistically-significant-intercept-term-increases-r2-in-linear-mo

https://stackoverflow.com/questions/57415793/r-squared-in-lm-for-zero-intercept-model

Cheers
Petr
