li li
2017-Mar-16 18:26 UTC
[R] standard error for regression coefficients corresponding to factor levels
Hi all, I have the following data called "data1". After fitting the ancova model with different slopes and intercepts for each region, I calculated the regression coefficients and the corresponding standard error. The standard error (for intercept or for slope) are all the same for different regions. Is there something wrong? I know the SE is related to (X^T X)^-1, where X is design matrix. So does this happen whenever each factor level has the same set of values for "week"? Thanks. Hanna> mod <- lm(response ~ region*week, data1)> tmp <- coef(summary(mod))> res <- matrix(NA, 5,4)> res[1,1:2] <- tmp[1,1:2]> res[2:5,1] <- tmp[1,1]+tmp[2:5,1]> res[2:5,2] <- sqrt(tmp[2:5,2]^2-tmp[1,2]^2)> res[1,3:4] <- tmp[6,1:2]> res[2:5,3] <- tmp[6,1]+tmp[7:10,1]> res[2:5,4] <- sqrt(tmp[7:10,2]^2-tmp[6,2]^2)> colnames(res) <- c("intercept", "intercept SE", "slope", "slope SE")> rownames(res) <- letters[1:5]> res intercept intercept SE slope slope SEa 0.18404464 0.08976301 -0.018629310 0.01385073 b 0.17605666 0.08976301 -0.022393789 0.01385073 c 0.16754130 0.08976301 -0.022367770 0.01385073 d 0.12554452 0.08976301 -0.017464385 0.01385073 e 0.06153256 0.08976301 0.007714685 0.01385073> data1 week region response5 3 c 0.057325067 6 6 c 0.066723632 7 9 c -0.025317808 12 3 d 0.024692613 13 6 d 0.021761492 14 9 d -0.099820335 19 3 c 0.119559235 20 6 c -0.054456186 21 9 c 0.078811180 26 3 d 0.091667189 27 6 d -0.053400777 28 9 d 0.090754363 33 3 c 0.163818085 34 6 c 0.008959741 35 9 c -0.115410852 40 3 d 0.193920693 41 6 d -0.087738914 42 9 d 0.004987542 47 3 a 0.121332285 48 6 a -0.020202707 49 9 a 0.037295785 54 3 b 0.214304603 55 6 b -0.052346480 56 9 b 0.082501222 61 3 a 0.053540767 62 6 a -0.019182819 63 9 a -0.057629113 68 3 b 0.068592791 69 6 b -0.123298216 70 9 b -0.230671818 75 3 a 0.330741562 76 6 a 0.013902905 77 9 a 0.190620360 82 3 b 0.151002874 83 6 b 0.086177696 84 9 b 0.178982656 89 3 e 0.062974799 90 6 e 0.062035391 91 9 e 0.206200831 96 3 e 0.123102197 97 6 e 0.040181790 98 9 e 0.121332285 103 3 e 0.147557564 104 6 e 0.062035391 105 9 e 0.144965770 [[alternative HTML version deleted]]
Rolf Turner
2017-Mar-16 23:41 UTC
[R] [FORGED] standard error for regression coefficients corresponding to factor levels
You have been posting to the R-help list long enough so that you should have learned by now *not* to post in html. Your code is mangled so as to be unreadable. A few comments: (1) Your data frame "data1" seems to have a mysterious (and irrelevant?) column named "data1" as well. (2) The covariance matrix of your coefficient estimates is indeed (as you hint) a constant multiple of (X^T X)^{-1}. So do: X <- model.matrix(~response*week,data=data1) S <- solve(t(X)%*%X) print(S) and you will see the same pattern of constancy that your results exhibit. (3) You could get the results you want much more easily, without all the fooling around buried in your (illegible) code, by doing: mod <- lm(response ~ (region - 1)/week,data=data1) summary(mod) cheers, Rolf Turner -- Technical Editor ANZJS Department of Statistics University of Auckland Phone: +64-9-373-7599 ext. 88276 On 17/03/17 07:26, li li wrote:> Hi all, > I have the following data called "data1". After fitting the ancova model > with different slopes and intercepts for each region, I calculated the > regression coefficients and the corresponding standard error. The standard > error (for intercept or for slope) are all the same for different regions. > Is there something wrong? > I know the SE is related to (X^T X)^-1, where X is design matrix. So does > this happen whenever each factor level has the same set of values for > "week"? > Thanks. > Hanna > > > >> mod <- lm(response ~ region*week, data1)> tmp <- coef(summary(mod))> res <- matrix(NA, 5,4)> res[1,1:2] <- tmp[1,1:2]> res[2:5,1] <- tmp[1,1]+tmp[2:5,1]> res[2:5,2] <- sqrt(tmp[2:5,2]^2-tmp[1,2]^2)> res[1,3:4] <- tmp[6,1:2]> res[2:5,3] <- tmp[6,1]+tmp[7:10,1]> res[2:5,4] <- sqrt(tmp[7:10,2]^2-tmp[6,2]^2) > >> colnames(res) <- c("intercept", "intercept SE", "slope", "slope SE")> rownames(res) <- letters[1:5]> res intercept intercept SE slope slope SE > a 0.18404464 0.08976301 -0.018629310 0.01385073 > b 0.17605666 0.08976301 -0.022393789 0.01385073 > c 0.16754130 0.08976301 -0.022367770 0.01385073 > d 0.12554452 0.08976301 -0.017464385 0.01385073 > e 0.06153256 0.08976301 0.007714685 0.01385073 > > > > > > > >> data1 week region response > 5 3 c 0.057325067 > 6 6 c 0.066723632 > 7 9 c -0.025317808 > 12 3 d 0.024692613 > 13 6 d 0.021761492 > 14 9 d -0.099820335 > 19 3 c 0.119559235 > 20 6 c -0.054456186 > 21 9 c 0.078811180 > 26 3 d 0.091667189 > 27 6 d -0.053400777 > 28 9 d 0.090754363 > 33 3 c 0.163818085 > 34 6 c 0.008959741 > 35 9 c -0.115410852 > 40 3 d 0.193920693 > 41 6 d -0.087738914 > 42 9 d 0.004987542 > 47 3 a 0.121332285 > 48 6 a -0.020202707 > 49 9 a 0.037295785 > 54 3 b 0.214304603 > 55 6 b -0.052346480 > 56 9 b 0.082501222 > 61 3 a 0.053540767 > 62 6 a -0.019182819 > 63 9 a -0.057629113 > 68 3 b 0.068592791 > 69 6 b -0.123298216 > 70 9 b -0.230671818 > 75 3 a 0.330741562 > 76 6 a 0.013902905 > 77 9 a 0.190620360 > 82 3 b 0.151002874 > 83 6 b 0.086177696 > 84 9 b 0.178982656 > 89 3 e 0.062974799 > 90 6 e 0.062035391 > 91 9 e 0.206200831 > 96 3 e 0.123102197 > 97 6 e 0.040181790 > 98 9 e 0.121332285 > 103 3 e 0.147557564 > 104 6 e 0.062035391 > 105 9 e 0.144965770 > > [[alternative HTML version deleted]]
Doran, Harold
2017-Mar-17 09:05 UTC
[R] [FORGED] standard error for regression coefficients corresponding to factor levels
A slightly more ?R-ish? way of doing S <- solve(t(X)%*%X) Is to instead use S <- solve(crossprod(X)) And the idea idea of inverting the SSCP matrix only and not actually solving the linear system is not so great, which is why it is better to do as Rolf is suggesting and get all things you need from lm, which uses decompositions and not the algebraic representations for the solution to the linear system. On 3/16/17, 7:41 PM, "Rolf Turner" <r.turner at auckland.ac.nz> wrote:> >You have been posting to the R-help list long enough so that you should >have learned by now *not* to post in html. Your code is mangled so as >to be unreadable. > >A few comments: > >(1) Your data frame "data1" seems to have a mysterious (and irrelevant?) >column named "data1" as well. > >(2) The covariance matrix of your coefficient estimates is indeed (as >you hint) a constant multiple of (X^T X)^{-1}. So do: > > X <- model.matrix(~response*week,data=data1) > S <- solve(t(X)%*%X) > print(S) > >and you will see the same pattern of constancy that your results exhibit. > >(3) You could get the results you want much more easily, without all the >fooling around buried in your (illegible) code, by doing: > > mod <- lm(response ~ (region - 1)/week,data=data1) > summary(mod) > >cheers, > >Rolf Turner > >-- >Technical Editor ANZJS >Department of Statistics >University of Auckland >Phone: +64-9-373-7599 ext. 88276 > >On 17/03/17 07:26, li li wrote: >> Hi all, >> I have the following data called "data1". After fitting the ancova >>model >> with different slopes and intercepts for each region, I calculated the >> regression coefficients and the corresponding standard error. The >>standard >> error (for intercept or for slope) are all the same for different >>regions. >> Is there something wrong? >> I know the SE is related to (X^T X)^-1, where X is design matrix. So >>does >> this happen whenever each factor level has the same set of values for >> "week"? >> Thanks. >> Hanna >> >> >> >>> mod <- lm(response ~ region*week, data1)> tmp <- coef(summary(mod))> >>>res <- matrix(NA, 5,4)> res[1,1:2] <- tmp[1,1:2]> res[2:5,1] <- >>>tmp[1,1]+tmp[2:5,1]> res[2:5,2] <- sqrt(tmp[2:5,2]^2-tmp[1,2]^2)> >>>res[1,3:4] <- tmp[6,1:2]> res[2:5,3] <- tmp[6,1]+tmp[7:10,1]> >>>res[2:5,4] <- sqrt(tmp[7:10,2]^2-tmp[6,2]^2) >> >>> colnames(res) <- c("intercept", "intercept SE", "slope", "slope SE")> >>>rownames(res) <- letters[1:5]> res intercept intercept SE >>>slope slope SE >> a 0.18404464 0.08976301 -0.018629310 0.01385073 >> b 0.17605666 0.08976301 -0.022393789 0.01385073 >> c 0.16754130 0.08976301 -0.022367770 0.01385073 >> d 0.12554452 0.08976301 -0.017464385 0.01385073 >> e 0.06153256 0.08976301 0.007714685 0.01385073 >> >> >> >> >> >> >> >>> data1 week region response >> 5 3 c 0.057325067 >> 6 6 c 0.066723632 >> 7 9 c -0.025317808 >> 12 3 d 0.024692613 >> 13 6 d 0.021761492 >> 14 9 d -0.099820335 >> 19 3 c 0.119559235 >> 20 6 c -0.054456186 >> 21 9 c 0.078811180 >> 26 3 d 0.091667189 >> 27 6 d -0.053400777 >> 28 9 d 0.090754363 >> 33 3 c 0.163818085 >> 34 6 c 0.008959741 >> 35 9 c -0.115410852 >> 40 3 d 0.193920693 >> 41 6 d -0.087738914 >> 42 9 d 0.004987542 >> 47 3 a 0.121332285 >> 48 6 a -0.020202707 >> 49 9 a 0.037295785 >> 54 3 b 0.214304603 >> 55 6 b -0.052346480 >> 56 9 b 0.082501222 >> 61 3 a 0.053540767 >> 62 6 a -0.019182819 >> 63 9 a -0.057629113 >> 68 3 b 0.068592791 >> 69 6 b -0.123298216 >> 70 9 b -0.230671818 >> 75 3 a 0.330741562 >> 76 6 a 0.013902905 >> 77 9 a 0.190620360 >> 82 3 b 0.151002874 >> 83 6 b 0.086177696 >> 84 9 b 0.178982656 >> 89 3 e 0.062974799 >> 90 6 e 0.062035391 >> 91 9 e 0.206200831 >> 96 3 e 0.123102197 >> 97 6 e 0.040181790 >> 98 9 e 0.121332285 >> 103 3 e 0.147557564 >> 104 6 e 0.062035391 >> 105 9 e 0.144965770 >> >> [[alternative HTML version deleted]] > >______________________________________________ >R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code.