John Sorkin
2006-Oct-14 20:26 UTC
[R] regression analyses using a vector of means and a variance-covariance matrix
R 2.2.0 windows XP How can I perform a regression analyses using a vector of means, a variance-covariance matrix? I looked at the help screen for lm and did not see any option for using the afore mentioned structures as input to lm. Thanks, John John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics Baltimore VA Medical Center GRECC, University of Maryland School of Medicine Claude D. Pepper OAIC, University of Maryland Clinical Nutrition Research Unit, and Baltimore VA Center Stroke of Excellence University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) jsorkin at grecc.umaryland.edu Confidentiality Statement: This email message, including any attachments, is for the so...{{dropped}}
John Fox
2006-Oct-14 21:15 UTC
[R] regression analyses using a vector of means anda variance-covariance matrix
Dear John, I'm not aware of an existing function that will do what you want conveniently, but it's not hard to write one: reg <- function(V, means, y, x, n){ # V: covariance matrix of variables # means: vector of means # y: index of response in V # x: indices of regressors in V SSP <- (n - 1)*V + n*outer(means, means) SSP <- cbind(n*means, SSP) SSP <- rbind(c(n, n*means), SSP) XtXinv <- solve(SSP[c(1,x+1), c(1,x+1)]) b <- as.vector(XtXinv %*% SSP[c(1,x+1), y+1]) R2 <- as.vector(V[y,x] %*% solve(V[x,x]) %*% V[x,y])/V[y,y] s2 <- (1 - R2)*V[y,y]*(n - 1)/(n - length(x) - 1) Vb <- s2*XtXinv list(coef=b, vcov=Vb, R2=R2, s2=s2, SSP=SSP) } For example,> library(car) > data <- Duncan[,c("prestige", "income", "education")] > reg(var(data), colMeans(data), 1, 2:3, nrow(data))$coef [1] -6.0646629 0.5987328 0.5458339 $vcov income education 18.2494814 -0.15184501 -0.150706025 income -0.1518450 0.01432027 -0.008518551 education -0.1507060 -0.00851855 0.009653582 $R2 [1] 0.8281734 $s2 [1] 178.7309 $SSP prestige income education 45 2146 1884 2365 prestige 2146 146028 118229 147936 income 1884 118229 105148 122197 education 2365 147936 122197 163265> summary(lm(prestige ~ income + education, data=Duncan)) # checkCall: lm(formula = prestige ~ income + education, data = Duncan) Residuals: Min 1Q Median 3Q Max -29.5380 -6.4174 0.6546 6.6051 34.6412 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -6.06466 4.27194 -1.420 0.163 income 0.59873 0.11967 5.003 1.05e-05 *** education 0.54583 0.09825 5.555 1.73e-06 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 13.37 on 42 degrees of freedom Multiple R-Squared: 0.8282, Adjusted R-squared: 0.82 F-statistic: 101.2 on 2 and 42 DF, p-value: < 2.2e-16 Some caveats: This function will work OK for well conditioned data, since I'm inverting the matrix of sums of squares and products; one could take a more sophisticated approach. As well, there's a bit of redundancy in the computation of R2, but I didn't have time to figure out how to avoid it. Finally, there's an obvious advantage to computing on the original data -- e.g., one can get residuals. I hope this helps, John -------------------------------- John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox --------------------------------> -----Original Message----- > From: r-help-bounces at stat.math.ethz.ch > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of John Sorkin > Sent: Saturday, October 14, 2006 3:27 PM > To: r-help at stat.math.ethz.ch > Subject: [R] regression analyses using a vector of means anda > variance-covariance matrix > > R 2.2.0 > windows XP > > How can I perform a regression analyses using a vector of > means, a variance-covariance matrix? I looked at the help > screen for lm and did not see any option for using the afore > mentioned structures as input to lm. > Thanks, > John > > John Sorkin M.D., Ph.D. > Chief, Biostatistics and Informatics > Baltimore VA Medical Center GRECC, > University of Maryland School of Medicine Claude D. Pepper > OAIC, University of Maryland Clinical Nutrition Research > Unit, and Baltimore VA Center Stroke of Excellence > > University of Maryland School of Medicine Division of > Gerontology Baltimore VA Medical Center 10 North Greene > Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 > > (Phone) 410-605-7119 > (Fax) 410-605-7913 (Please call phone number above prior to > faxing) jsorkin at grecc.umaryland.edu > > Confidentiality Statement: > This email message, including any attachments, is for the\...{{dropped}}
Gabor Grothendieck
2006-Oct-14 22:32 UTC
[R] regression analyses using a vector of means and a variance-covariance matrix
Here is another approach using the same data as in John Fox's reply. His is probably superior but this does have the advantage that its very simple. Note that it gives the same coefficients and R squared to several decimal places. We just simulate a data set with the given means and variance covariance matrix> library(car) # for Duncan > library(MASS) # for mvrnorm > set.seed(1) > DF <- mvrnorm(10000, colMeans(data), var(data), empirical = TRUE) > summary(lm(prestige ~ income + education, as.data.frame(DF)))Call: lm(formula = prestige ~ income + education, data = as.data.frame(DF)) Residuals: Min 1Q Median 3Q Max -49.0377 -8.7332 0.2068 8.8342 50.2737 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -6.064663 0.277576 -21.85 <2e-16 *** income 0.598733 0.007756 77.19 <2e-16 *** education 0.545834 0.006368 85.71 <2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 13.06 on 9997 degrees of freedom Multiple R-Squared: 0.8282, Adjusted R-squared: 0.8281 F-statistic: 2.409e+04 on 2 and 9997 DF, p-value: < 2.2e-16 On 10/14/06, John Sorkin <jsorkin at grecc.umaryland.edu> wrote:> R 2.2.0 > windows XP > > How can I perform a regression analyses using a vector of means, a > variance-covariance matrix? I looked at the help screen for lm and did > not see any option for using the afore mentioned structures as input to > lm. > Thanks, > John > > John Sorkin M.D., Ph.D. > Chief, Biostatistics and Informatics > Baltimore VA Medical Center GRECC, > University of Maryland School of Medicine Claude D. Pepper OAIC, > University of Maryland Clinical Nutrition Research Unit, and > Baltimore VA Center Stroke of Excellence > > University of Maryland School of Medicine > Division of Gerontology > Baltimore VA Medical Center > 10 North Greene Street > GRECC (BT/18/GR) > Baltimore, MD 21201-1524 > > (Phone) 410-605-7119 > (Fax) 410-605-7913 (Please call phone number above prior to faxing) > jsorkin at grecc.umaryland.edu > > Confidentiality Statement: > This email message, including any attachments, is for the so...{{dropped}} > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >