thr3ads.net - R help - [R] How to do linear regression with errors in x and y? [Jun 2000]

Dan E. Kelley

2000-Jun-03 15:32 UTC

[R] How to do linear regression with errors in x and y?

QUESTION: how should I do a linear regression in which there are
errors in x as well as y?

SUPPLEMENT: I've seen folks approach this problem by computing
eigenvectors of the covariance matrix, and that makes sense to me.
But I'm wondering if this has a "pedigree" (i.e. if it makes sense to
folks on this list, and if it's something that has been published, so
I can refer to it.)

BACKGROUND: (I'm providing this for interest of readers, since I
personally find such ancillary comments on this list to be quite
intriguing.)  My problem is something that comes up all the time in
physics (in this case, fluid mechanics).  I have measured variables,
let's call them X and Y, and dimensional analysis suggests that these
be scaled by Lx and Ly say, so the Buckingham Pi theorem says that we
must have

	Y/Ly = f(X/Lx, ...)

where the ... is a list of nondimensional parameters of the problem.
(As an aside, the X is depth below the ocean surface, Lx is the RMS
height of waves on the surface, Y is a measure of the turbulence in
the ocean, and Ly is related to the wind stress on the water surface.
The ... is a list of parameters that includes how long the wind has
been blowing; sailors will know that waves take a while to build up.)

A power-law dependence, i.e.

	Y/Ly = (X/Lx)^alpha

seems justified by theory, but the value of alpha is contentious and
we seek to determine it empirically.  (Engineers reading this will
recognize that alpha=-1 is the so-called "law of the wall" for the
decay of turbulence away from a frictional wall.)

Thus, my approach is to try to fit a line like

	log(Y/Ly) ~ log(X/Lx)

but since there are errors in (X,Y,Lx,Ly) (all of which rely on
measurement), we emphatically have errors in both the dependent and
independent variable.  If our scaling is correct, X/Lx and Y/Ly are
roughly of order unity.  The data suggest log(X/Lx) and log(Y/Ly) have
roughly comparable scatter.

Thus, I'd be happy to state that the errors in the dependent and
independent variables are comparable.  And so my question becomes, on
this assumption, how to fit a line through data in which both "x" and
"y" have (equal) uncertainty.  I'm thinking the eigenvector approach
is fine.  Comments?
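
The eigenvector approach described above can be sketched as follows. This
is a minimal illustration in plain Python, not code from the thread: the
function name major_axis_fit and the sample values are hypothetical, and
the data are noise-free so that the recovered exponent is known in
advance. The slope is taken from the leading eigenvector of the 2x2
covariance matrix of log(X/Lx) and log(Y/Ly), which treats the scatter in
both variables as equal.

```python
import math

def major_axis_fit(u, v):
    """Fit v = a + b*u along the leading eigenvector of the 2x2 covariance matrix."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((a - mu) ** 2 for a in u) / (n - 1)
    svv = sum((b - mv) ** 2 for b in v) / (n - 1)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)
    if suv == 0:
        raise ValueError("zero covariance: major axis is horizontal, vertical, or undefined")
    # Largest eigenvalue of [[suu, suv], [suv, svv]].
    lam = 0.5 * (suu + svv + math.hypot(suu - svv, 2.0 * suv))
    # The matching eigenvector is proportional to (suv, lam - suu).
    slope = (lam - suu) / suv
    intercept = mv - slope * mu  # the fitted line passes through the centroid
    return slope, intercept

# Hypothetical, noise-free values that follow Y/Ly = (X/Lx)^(-1) exactly.
x_scaled = [0.5, 1.0, 2.0, 4.0]
y_scaled = [1.0 / x for x in x_scaled]

alpha, log_offset = major_axis_fit([math.log(x) for x in x_scaled],
                                   [math.log(y) for y in y_scaled])
print(alpha, log_offset)
```

With real measurements the slope estimates alpha, and the intercept
estimates the log of any multiplicative constant in the power law.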

-- 
Dan E. Kelley                                         phone:(902)494-1694
Oceanography Department, Dalhousie University           fax:(902)494-2885
Halifax, Nova Scotia                             mailto:Dan.Kelley at Dal.CA 
Canada B3H 4J1       http://www.phys.ocean.dal.ca/~kelley/Kelley_Dan.html

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Jan de Leeuw

2000-Jun-03 17:49 UTC

[R] How to do linear regression with errors in x and y?

Distinguished pedigree.

Karl Pearson
On Lines and Planes of Closest Fit to Systems of Points in Space
Phil. Mag. 2, 1901, 559-572.

Goes back even further (to Adcock, around 1875).


At 12:32 -0300 06/03/2000, Dan E. Kelley wrote:
>QUESTION: how should I do a linear regression in which there are
>errors in x as well as y?
-- 
==Jan de Leeuw; Professor and Chair, UCLA Department of Statistics;
US mail: 8142 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550;  fax (310)-206-5658;  email: deleeuw at stat.ucla.edu
    http://www.stat.ucla.edu/~deleeuw and http://home1.gte.net/datamine/
===========================================================================     
No matter where you go, there you are. --- Buckaroo Banzai
                   http://webdev.stat.ucla.edu/sounds/nomatter.au
===========================================================================

Marc R. Feldesman

2000-Jun-03 18:06 UTC

[R] How to do linear regression with errors in x and y?

This is referred to in *my* trade as a Model II regression and is fit by
finding either the major axis slope or the reduced major axis (RMA) slope.
We find the major axis slope using principal components analysis of the
covariance matrix - the ratio of the y and x components of the first
eigenvector gives the major axis slope; we get the reduced major axis
slope by dividing the linear regression slope by the correlation
coefficient for x & y.
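
The two Model II slopes described above can be compared side by side. This
is only a sketch in plain Python: the function name model2_slopes and the
five data points are hypothetical. The RMA slope is computed exactly as
stated, as the ordinary regression slope divided by r, which is
algebraically the same as sign(r) * sd(y) / sd(x).

```python
import math

def model2_slopes(x, y):
    """Return OLS, major axis (MA) and reduced major axis (RMA) slopes of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    ols = sxy / sxx                     # ordinary regression of y on x
    r = sxy / math.sqrt(sxx * syy)      # correlation coefficient
    rma = ols / r                       # reduced major axis slope
    # Major axis: slope of the leading eigenvector of the covariance matrix.
    ma = (syy - sxx + math.hypot(syy - sxx, 2.0 * sxy)) / (2.0 * sxy)
    return {"ols": ols, "r": r, "ma": ma, "rma": rma}

# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
slopes = model2_slopes(xs, ys)
print(slopes)
```

Because |r| <= 1, the RMA slope is never smaller in magnitude than the
ordinary regression slope. The sketch fails with a division by zero when
the covariance is exactly zero.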

The original approach to this type of regression traces to at least Haldane 
and Kermack in 1950.

At 12:32 PM 6/3/00 -0300, Dan E. Kelley wrote:
 >QUESTION: how should I do a linear regression in which there are
 >errors in x as well as y?
 >

Dr. Marc R. Feldesman
email:  feldesmanm at pdx.edu
email:  feldesman at ibm.net
fax:    503-725-3905

"Don't know where I'm going.
Don't like where I've been.
There may be no exit.
But hell, I'm going in."  Jimmy Buffett

Powered by Superchoerus - the 700 MHz Coppermine Box


Hans Ehrbar

2000-Jun-03 18:06 UTC

[R] How to do linear regression with errors in x and y?

Hello Dan,

There are extensive sections about errors in variables
in my on-line econometrics class notes at
http://www.econ.utah.edu/ehrbar/ecmet.pdf
(this is a 5 MB pdf file).  Maybe this has
something interesting for you.

Hans Ehrbar.

-- 
Hans G. Ehrbar                               ehrbar at econ.utah.edu
Economics Department, University of Utah     (801) 581 7797 (my office)
1645 Campus Center Dr., Rm 308               (801) 581 7481 (econ office)
Salt Lake City    UT 84112-9300              (801) 585 5649 (FAX)


Prof Brian D Ripley

2000-Jun-04 05:38 UTC

[R] How to do linear regression with errors in x and y?

On Sat, 3 Jun 2000, Dan E. Kelley wrote:
> QUESTION: how should I do a linear regression in which there are
> errors in x as well as y?
By definition, that is not a linear *regression*.  More precisely,
what you should do depends critically on the assumptions and purpose
of the analysis.  For example, for a calibration problem regression
of x on y (that is least-squares fitting) is still a good idea. And it
depends on whether the observed x values were controlled or the
true values or if this is a random sample of (x,y)'s.

In what I think you want there is a true linear relationship and
both x and y are measured with error, and you are interested in the
relationship.  That's called a linear functional relationship model.
(Econometricians use structural models, the random-sample version.)
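
How strongly the fitted slope depends on the error assumptions can be
seen in a small sketch. The technique used here is Deming regression,
which is not named in this thread: it assumes the ratio delta =
var(error in y) / var(error in x) is known. The function name
deming_slope and the data are hypothetical. With delta = 1 it gives the
eigenvector (orthogonal) fit; as delta grows it approaches ordinary
regression of y on x, and as delta shrinks it approaches the inverted
regression of x on y.

```python
import math

def deming_slope(x, y, delta=1.0):
    """Slope of y against x when var(err_y) / var(err_x) = delta is assumed known."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    d = syy - delta * sxx
    return (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)

# Hypothetical data, for illustration only.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_obs = [1.2, 1.9, 3.2, 3.8, 5.1]

for delta in (1e-6, 1.0, 1e6):
    # Tiny delta: nearly all error in x. Huge delta: nearly all error in y.
    print(delta, deming_slope(x_obs, y_obs, delta))
```

The three printed slopes bracket the range of answers that different
error assumptions would give for the same data.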

[...]
> Thus, I'd be happy to state that the errors in the dependent and
> independent variables are comparable.  And so my question becomes, on
> this assumption, how to fit a line through data in which both "x" and
> "y" have (equal) uncertainty.  I'm thinking the eigenvector approach
> is fine.  Comments?
As Jan de Leeuw has already commented, this is an extremely well
re-discovered result, going back to Adcock ca 1872.  But minor
variations still seem unknown (and I once wrote a paper on the
variation in which the uncertainty in x and y depend on the true
value, as occurs in analytical chemistry).

There is a whole book on this and related ideas:

@Book{Fuller.87,
  author       = "Fuller, W.",
  title        = "Measurement Error Models",
  publisher    = "Wiley",
  year         = "1987",
}

and you will find treatments in a few linear models books, AFAIR
those by G.A.F. Seber and P. Sprent especially.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

