Ranjan Maitra
2007-Feb-18 02:23 UTC
[R] applying lm on an array of observations with common design matrix
Dear list, I have a 4-dimensional array Y of dimension 330 x 67 x 35 x 51. I have a design matrix X of dimension 330 x 4. I want to fit a linear regression of each lm( Y[, i, j, k] ~ X). for each i, j, k. Can I do it in one shot without a loop? Actually, I am also interested in getting the p-values of some of the coefficients -- lets say the coefficient corresponding to the second column of the design matrix. Can the same be done using array-based operations? I would be happy to clarify if anything is unclear. Many thanks and best wishes, Ranjan
Prof Brian Ripley
2007-Feb-18 07:46 UTC
[R] applying lm on an array of observations with common design matrix
On Sat, 17 Feb 2007, Ranjan Maitra wrote:> Dear list, > > I have a 4-dimensional array Y of dimension 330 x 67 x 35 x 51. I have a > design matrix X of dimension 330 x 4. I want to fit a linear regression > of each > > lm( Y[, i, j, k] ~ X). for each i, j, k. > > Can I do it in one shot without a loop?Yes. YY <- YY dim(YY) <- c(330, 67*35*51) fit <- lm(YY ~ X)> Actually, I am also interested in getting the p-values of some of the > coefficients -- lets say the coefficient corresponding to the second > column of the design matrix. Can the same be done using array-based > operations?Use lapply(summary(fit), function(x) coef(x)[3,4]) (since there is a intercept, you want the third coefficient). Note that this will give a vector, so set its dimension to c(67,35,51) to relate to the original array. I have not BTW looked into the memory requirements here, and you might want to do this on slices of the array for that reason. -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Ranjan Maitra
2007-Feb-22 05:15 UTC
[R] applying lm on an array of observations with common design matrix
On Sun, 18 Feb 2007 07:46:56 +0000 (GMT) Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:> On Sat, 17 Feb 2007, Ranjan Maitra wrote: > > > Dear list, > > > > I have a 4-dimensional array Y of dimension 330 x 67 x 35 x 51. I have a > > design matrix X of dimension 330 x 4. I want to fit a linear regression > > of each > > > > lm( Y[, i, j, k] ~ X). for each i, j, k. > > > > Can I do it in one shot without a loop? > > Yes. > > YY <- YY > dim(YY) <- c(330, 67*35*51) > fit <- lm(YY ~ X) > > > Actually, I am also interested in getting the p-values of some of the > > coefficients -- lets say the coefficient corresponding to the second > > column of the design matrix. Can the same be done using array-based > > operations? > > Use lapply(summary(fit), function(x) coef(x)[3,4]) (since there is a > intercept, you want the third coefficient).In this context, can one also get the variance-covariance matrix of the coefficients? Thank you, and best wishes! Ranjan> Note that this will give a vector, so set its dimension to c(67,35,51) to > relate to the original array. > > I have not BTW looked into the memory requirements here, and you might > want to do this on slices of the array for that reason. > > -- > Brian D. Ripley, ripley at stats.ox.ac.uk > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UK Fax: +44 1865 272595 >
Petr Klasterecky
2007-Feb-22 07:04 UTC
[R] applying lm on an array of observations with common design matrix
Ranjan Maitra napsal(a):> On Sun, 18 Feb 2007 07:46:56 +0000 (GMT) Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote: > >> On Sat, 17 Feb 2007, Ranjan Maitra wrote: >> >>> Dear list, >>> >>> I have a 4-dimensional array Y of dimension 330 x 67 x 35 x 51. I have a >>> design matrix X of dimension 330 x 4. I want to fit a linear regression >>> of each >>> >>> lm( Y[, i, j, k] ~ X). for each i, j, k. >>> >>> Can I do it in one shot without a loop? >> Yes. >> >> YY <- YY >> dim(YY) <- c(330, 67*35*51) >> fit <- lm(YY ~ X) >> >>> Actually, I am also interested in getting the p-values of some of the >>> coefficients -- lets say the coefficient corresponding to the second >>> column of the design matrix. Can the same be done using array-based >>> operations? >> Use lapply(summary(fit), function(x) coef(x)[3,4]) (since there is a >> intercept, you want the third coefficient). > > In this context, can one also get the variance-covariance matrix of the coefficients?Sure: lapply(summary(fit), function(x) {"$"(x,cov.unscaled)}) Add indexing if you do not want the whole matrix. You can extract whatever you want, just take a look at ?summary.lm, section Value. Petr -- Petr Klasterecky Dept. of Probability and Statistics Charles University in Prague Czech Republic
Prof Brian Ripley
2007-Feb-22 08:17 UTC
[R] applying lm on an array of observations with common design matrix
On Thu, 22 Feb 2007, Petr Klasterecky wrote:> Ranjan Maitra napsal(a): >> On Sun, 18 Feb 2007 07:46:56 +0000 (GMT) Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote: >> >>> On Sat, 17 Feb 2007, Ranjan Maitra wrote: >>> >>>> Dear list, >>>> >>>> I have a 4-dimensional array Y of dimension 330 x 67 x 35 x 51. I have a >>>> design matrix X of dimension 330 x 4. I want to fit a linear regression >>>> of each >>>> >>>> lm( Y[, i, j, k] ~ X). for each i, j, k. >>>> >>>> Can I do it in one shot without a loop? >>> Yes. >>> >>> YY <- YY >>> dim(YY) <- c(330, 67*35*51) >>> fit <- lm(YY ~ X) >>> >>>> Actually, I am also interested in getting the p-values of some of the >>>> coefficients -- lets say the coefficient corresponding to the second >>>> column of the design matrix. Can the same be done using array-based >>>> operations? >>> Use lapply(summary(fit), function(x) coef(x)[3,4]) (since there is a >>> intercept, you want the third coefficient). >> >> In this context, can one also get the variance-covariance matrix of the >> coefficients? > > Sure: > > lapply(summary(fit), function(x) {"$"(x,cov.unscaled)})But that is not the variance-covariance matrix (and it is an unusual way to write x$cov.unscaled)!> Add indexing if you do not want the whole matrix. You can extract > whatever you want, just take a look at ?summary.lm, section Value.It is unclear to me what the questioner expects: the estimated coefficients for different responses are independent. For a list of matrices applying to each response one could mimic vcov.lm and do lapply(summary(fit, corr=FALSE), function(so) so$sigma^2 * so$cov.unscaled) -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Ranjan Maitra
2007-Feb-22 19:21 UTC
[R] applying lm on an array of observations with common design matrix
On Thu, 22 Feb 2007 08:17:38 +0000 (GMT) Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:> On Thu, 22 Feb 2007, Petr Klasterecky wrote: > > > Ranjan Maitra napsal(a): > >> On Sun, 18 Feb 2007 07:46:56 +0000 (GMT) Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote: > >> > >>> On Sat, 17 Feb 2007, Ranjan Maitra wrote: > >>> > >>>> Dear list, > >>>> > >>>> I have a 4-dimensional array Y of dimension 330 x 67 x 35 x 51. I have a > >>>> design matrix X of dimension 330 x 4. I want to fit a linear regression > >>>> of each > >>>> > >>>> lm( Y[, i, j, k] ~ X). for each i, j, k. > >>>> > >>>> Can I do it in one shot without a loop? > >>> Yes. > >>> > >>> YY <- YY > >>> dim(YY) <- c(330, 67*35*51) > >>> fit <- lm(YY ~ X) > >>> > >>>> Actually, I am also interested in getting the p-values of some of the > >>>> coefficients -- lets say the coefficient corresponding to the second > >>>> column of the design matrix. Can the same be done using array-based > >>>> operations? > >>> Use lapply(summary(fit), function(x) coef(x)[3,4]) (since there is a > >>> intercept, you want the third coefficient). > >> > >> In this context, can one also get the variance-covariance matrix of the > >> coefficients? > > > > Sure: > > > > lapply(summary(fit), function(x) {"$"(x,cov.unscaled)}) > > But that is not the variance-covariance matrix (and it is an unusual way > to write x$cov.unscaled)! > > > Add indexing if you do not want the whole matrix. You can extract > > whatever you want, just take a look at ?summary.lm, section Value. > > It is unclear to me what the questioner expects: the estimated > coefficients for different responses are independent. For a list of > matrices applying to each response one could mimic vcov.lm and do > > lapply(summary(fit, corr=FALSE), > function(so) so$sigma^2 * so$cov.unscaled)Thanks! Actually, I am really looking to compare the coefficients (let us say second and the third) beta2 - beta4 = 0 for each regression. Basically, get the two-sided p-value for the test statistic for each regression. One way of doing that is to get the dispersion matrix of each regression and then to compute the t-statistic and the p-value. That is the genesis of the question above. Is there a better way? Many thanks and best wishes, Ranjan