Hi Folks,

I am dealing with data which have been presented as

  at each x_i, mean m_i of the y-values at x_i,
               sd s_i of the y-values at x_i
               number n_i of the y-values at x_i

and I want to linearly regress y on x.

There does not seem to be an option to 'lm' which can
deal with such data directly, though the regression
problem could be algebraically expressed in these terms.

One way of fudging it would be to replace each m_i by
a set of n_i numbers Y_i constructed as

  u_i <- rnorm(n_i)

  Y_i <- m_i + s_i*(u_i - mean(u_i))/sd(u_i)

and associate these with X_i <- rep(x_i,n_i), thereby
constructing a regression-equivalent set of pseudo "raw data"
which could be fed to lm(Y~X). However, this strikes me as
cumbersome, at least, and even ugly!

Is there a direct way to go from {(n_i,m_i,s_i)} to the
fitted regression, with summaries and all (and use of 'predict')?

With thanks,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 18-Apr-04                                       Time: 02:27:50
------------------------------ XFMail ------------------------------
The short answer is no, as there is no way to recover the fitted values
and residuals so you can't get a proper fit object of class "lm" (and
hence get `summaries and all').

Your pseudo-data method needs to fix the u_i to be mean zero, variance
one in the sample. That is probably the quickest method. The elegant
one is to create a new class "groupedlm" and write a constructor etc
for it ....

On Sun, 18 Apr 2004 Ted.Harding at nessie.mcc.ac.uk wrote:

> One way of fudging it would be to replace each m_i by
> a set of n_i numbers Y_i constructed as
>
>   u_i <- rnorm(ni)
>
>   Y_i <- m_i + s_i*(u_i - mean(u_i))/sd(u_i)
>
> and associate these with X_i <- rep(x_i,n_i), thereby
> constructing a regression-equivalent set of pseudo "raw data"
> which could be fed to lm(Y~X). However, this strikes me as
> cumbersome, at least, and even ugly!
>
> Is there a direct way to go from {(n_i,m_i,s_i)} to the
> fitted regression, with summaries and all (and use of 'predict')?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
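The pseudo-data route can be sketched in a few lines of R. This is only an illustration: the vectors x, m, s and n hold made-up summary values, and make_group is a hypothetical helper name, neither of which comes from the thread. Each group needs n_i >= 2, since sd() of a single value is NA.

```r
# Hypothetical grouped summaries (illustrative values, not from the thread)
x <- c(1, 2, 3, 4)          # design points x_i
m <- c(2.1, 3.9, 6.2, 7.8)  # group means m_i
s <- c(0.5, 0.7, 0.6, 0.9)  # group standard deviations s_i
n <- c(5, 8, 6, 7)          # group sizes n_i (each must be >= 2)

# Build n_i pseudo-observations whose sample mean is m_i and sample sd is s_i
# (make_group is a hypothetical helper, not an existing R function)
make_group <- function(m_i, s_i, n_i) {
  u <- rnorm(n_i)
  m_i + s_i * (u - mean(u)) / sd(u)
}

set.seed(1)  # only so the pseudo-data themselves are reproducible
Y <- unlist(Map(make_group, m, s, n))
X <- rep(x, times = n)

fit <- lm(Y ~ X)
summary(fit)
predict(fit, newdata = data.frame(X = 2.5), interval = "confidence")
```

The coefficient table and predictions depend on the data only through the n_i, m_i and s_i, so they should not change with the random draw. The individual residuals do change, so residual plots from such a fit carry no information.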
On 18 Apr 2004 at 2:27, Ted Harding wrote:

> Hi Folks,
>
> I am dealing with data which have been presented as
>
>   at each x_i, mean m_i of the y-values at x_i,
>                sd s_i of the y-values at x_i
>                number n_i of the y-values at x_i
>
> and I want to linearly regress y on x.

You need weighted regression, so lm with the argument weights.
Assuming constant variance of the errors, you can calculate

  w[i] <- n_i

and put the means m_i of the y-values at x_i into y, the x_i
values into x, and the n_i values into n. Then

  lm(y ~ x, weights = n)

Kjetil Halvorsen
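A weighted fit on the group means gives the same coefficients as a fit to the raw data would, but its residual variance uses only the scatter of the means about the line, on k - 2 degrees of freedom for k groups, and ignores the s_i. A minimal sketch of one way to fold the s_i back in follows, assuming a simple linear model with constant error variance. The data values and the variable names (wfit, rss_between, rss_within and so on) are made up for illustration and do not come from the thread.

```r
# Hypothetical grouped summaries (illustrative values, not from the thread)
x <- c(1, 2, 3, 4)          # design points x_i
m <- c(2.1, 3.9, 6.2, 7.8)  # group means m_i
s <- c(0.5, 0.7, 0.6, 0.9)  # group standard deviations s_i
n <- c(5, 8, 6, 7)          # group sizes n_i

# Weighted regression of the means on x, weights n_i
wfit <- lm(m ~ x, weights = n)

# Residual sum of squares the raw data would have produced:
# scatter of the means about the line plus scatter within each group
rss_between <- sum(n * (m - fitted(wfit))^2)
rss_within  <- sum((n - 1) * s^2)
df_resid    <- sum(n) - length(coef(wfit))
sigma2      <- (rss_between + rss_within) / df_resid

# Rescale the unscaled covariance (X'WX)^{-1} by the pooled variance
vc   <- sigma2 * summary(wfit)$cov.unscaled
se   <- sqrt(diag(vc))
tval <- coef(wfit) / se
pval <- 2 * pt(abs(tval), df_resid, lower.tail = FALSE)

cbind(Estimate = coef(wfit), "Std. Error" = se,
      "t value" = tval, "Pr(>|t|)" = pval)
```

Point predictions from predict(wfit, newdata) are unaffected, but interval estimates taken directly from wfit still use its own residual variance and degrees of freedom, so they would need the same rescaling with sigma2 and df_resid.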