Jan-Henrik Pötter
2010-Jan-08 16:50 UTC
[R] how to get perfect fit of lm if response is constant
Hello,

Suppose the response variable of a data.frame df is constant, so an analytically perfect fit of the linear model is expected. Fitting a regression line with lm() yields residuals, a slope, and standard errors that are not exactly zero, which is acceptable in a way, but still erroneous. Worse, summary.lm shows unacceptable error propagation into the t value and the corresponding p-value for the slope, as well as into R-squared: just look at the adjusted R-squared of 0.6788! The result is the same whatever mode the input vectors have. Is there any way to get the perfectly fitted regression line from lm and prevent this error propagation? Rounding all values of the lm object afterwards to some precision strikes me as a bad idea, and unfortunately lm has no option for calculation precision.

> df <- data.frame(x = 1:10, y = 1)
> myl <- lm(y ~ x, data = df)
> myl

Call:
lm(formula = y ~ x, data = df)

Coefficients:
(Intercept)            x
  1.000e+00    9.463e-18

> summary(myl)

Call:
lm(formula = y ~ x, data = df)

Residuals:
       Min         1Q     Median         3Q        Max
-1.136e-16 -1.341e-17  7.886e-18  2.918e-17  5.047e-17

Coefficients:
             Estimate Std. Error   t value Pr(>|t|)
(Intercept) 1.000e+00  3.390e-17 2.950e+16   <2e-16 ***
x           9.463e-18  5.463e-18 1.732e+00    0.122
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.962e-17 on 8 degrees of freedom
Multiple R-squared: 0.7145,     Adjusted R-squared: 0.6788
F-statistic: 20.02 on 1 and 8 DF,  p-value: 0.002071
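Since lm() has no precision option, one workaround is to test whether y is numerically constant before fitting and to skip the regression when it is. A minimal sketch in base R; the helper is_constant() and its tolerance are illustrative choices, not features of lm():

# Treat y as numerically constant when its standard deviation is tiny
# relative to its magnitude; the tolerance is an arbitrary choice.
is_constant <- function(y, tol = sqrt(.Machine$double.eps)) {
  sd(y) < tol * max(1, mean(abs(y)))
}

df <- data.frame(x = 1:10, y = 1)
if (is_constant(df$y)) {
  # Nothing to regress: the least-squares fit is exactly y = mean(y),
  # and the t, F, and R-squared statistics are 0/0 artifacts.
  fit <- list(intercept = mean(df$y), slope = 0)
} else {
  fit <- lm(y ~ x, data = df)
}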
Peter Ehlers
2010-Jan-08 18:43 UTC
[R] how to get perfect fit of lm if response is constant
You need to review the assumptions of linear models: y is assumed to be the realization of a random variable, not a constant (or, more precisely: there are assumed to be deviations that are N(0, sigma^2)).

If you 'know' that y is a constant, then you have two options:

1. don't do the regression, because it makes no sense;
2. if you want to test lm()'s handling of the data:

   fm <- lm(y ~ x, data = df, offset = rep(1, nrow(df)))

   (or use: offset = y)

-Peter Ehlers

--
Peter Ehlers
University of Calgary
403.202.3921
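Peter's offset variant in runnable form (a sketch; with the known constant part of y removed, the response lm() actually fits is exactly zero, so the coefficients come out as exact zeros and summary() reports NaN test statistics instead of spurious significance):

# The offset subtracts the known constant 1 from y before fitting,
# so lm() regresses exact zeros on x and cannot manufacture a
# significant slope out of rounding noise.
df <- data.frame(x = 1:10, y = 1)
fm <- lm(y ~ x, data = df, offset = rep(1, nrow(df)))
coef(fm)      # intercept and slope: exactly 0
summary(fm)   # t, F, and R-squared are NaN (0/0), not misleadingly significant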
Just to clarify this point: I don't think the problem is that y is "perfectly fittable", but that it is constant. Since the variance of a constant is zero, there is no variance to explain.

-Ista

On Fri, Jan 8, 2010 at 2:32 PM, Jan-Henrik Pötter <henrik.poetter at gmx.de> wrote:
> Thanks for the answer.
> The situation is that I don't know anything about y a priori. Of course I
> would not do a regression on constant y's then, but isn't it a problem of
> the stability of the algorithm if I get an adjusted R-squared of 0.6788
> for a least-squares fit on this kind of data? I think lm should give a
> correct result even when y is perfectly fittable, because I never know in
> advance whether my data might turn out that way. If I have to offset y in
> this case, the question becomes: how noisy do my y's have to be before I
> can rely on the lm result when I specify the formula y ~ x without an
> offset? What if my y's become nearly linear (or nearly perfectly fittable
> by some other linear model)? So my question is now "how do I rely on lm's
> result if the formula is specified as y ~ x without an offset?", or "how
> do I prevent the result from becoming numerically incorrect if I may get
> nearly perfectly fittable y's?".
>
> Greetings
>
> Henrik
--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
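Henrik's remaining question ("how noisy do my y's have to be before I can rely on the result?") can be answered mechanically: compare the residual standard error of the fit to the floating-point noise floor of y; when the two are of the same order, the reported R-squared, t, and F values are rounding artifacts. A sketch; the safety factor of 100 is an arbitrary choice, not anything lm() provides:

# Flag a fit whose residual standard error sits at the level of
# floating-point rounding noise in y; its test statistics are then
# artifacts of near-0/0 cancellation rather than evidence.
degenerate_fit <- function(fit, y) {
  noise_floor <- .Machine$double.eps * max(abs(y), 1)
  summary(fit)$sigma < 100 * noise_floor
}

df <- data.frame(x = 1:10, y = 1)
myl <- lm(y ~ x, data = df)
degenerate_fit(myl, df$y)   # TRUE: the adjusted R-squared of 0.6788 is noise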