Woolner, Keith
2008-Jun-09 14:27 UTC
[R] Systemfit (was RE: How to force two regression coefficients to be equal but opposite in sign?)
Thank you, Greg, and also to Scott Ellison, who replied privately. I am
in the process of trying out both suggestions.

After I sent my initial message, I came across the Systemfit package,
which allows specification of constraints on parameters. In theory, this
should solve my problem perfectly. However, I was not able to get it to
work with my data, as every attempt yielded the following error:

    Error in dimnames(x) <- dn :
      length of 'dimnames' [2] not equal to array extent

I suspect that it is related to some of my variables being factors
rather than numeric. Is systemfit able to deal with factors as
independent variables having constraints, and if so, is there some trick
in formulating the problem? I searched through the package
documentation, but did not see mention of factors being either supported
or unsupported.

Thank you, again.

Keith

> Date: Fri, 6 Jun 2008 11:39:27 -0600
> From: "Greg Snow" <Greg.Snow at imail.org>
> Subject: Re: [R] How to force two regression coefficients to be equal
>   but opposite in sign?
> To: "Woolner, Keith" <kwoolner at indians.com>, "r-help at r-project.org"
>   <r-help at r-project.org>
> Message-ID:
>   <B37C0A15B8FB3C468B5BC7EBC7DA14CC60F685895B at LP-EXMBVS10.CO.IHC.COM>
> Content-Type: text/plain; charset=us-ascii
>
> One simple way is to do something like:
>
> > fit <- lm(y ~ I(x1-x2) + x3, data=mydata)
>
> The first coefficient (after the intercept) will be the slope for x1;
> the slope for x2 will be the negative of that. This model is nested in
> the fuller model with x1 and x2 fit separately, and you can therefore
> test for differences.
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.snow at imail.org
> (801) 408-8111
>
> > -----Original Message-----
> > From: r-help-bounces at r-project.org
> > [mailto:r-help-bounces at r-project.org] On Behalf Of Woolner, Keith
> > Sent: Friday, June 06, 2008 10:07 AM
> > To: r-help at r-project.org
> > Subject: [R] How to force two regression coefficients to be
> > equal but opposite in sign?
> >
> > Is there a way to set up a regression in R that forces two
> > coefficients to be equal but opposite in sign?
> >
> > I'm trying to set up a model where a subject appears in a pair of
> > environments where a measurement X is made. There are a total of 5
> > environments, one of which is a baseline. But each observation is for
> > a subject in only two of them, and not all subjects will appear in
> > each environment.
> >
> > Each of the environments has an effect on the variable X. I want to
> > measure the relative effects of each environment E on X with a model.
> >
> > Xj = Xi * Ei / Ej
> >
> > Ei of the baseline model is set equal to 1.
> >
> > With a log transform, a linear-looking regression can be written as:
> >
> > log(Xj) = log(Xi) + log(Ei) - log(Ej)
> >
> > My data looks like:
> >
> > #   E1  X1   E2  X2
> > 1   A   .20  B   .25
> >
> > What I've tried in R:
> >
> > env <- c("A","B","C","D","E")
> >
> > # Note: data is made up just for this example
> >
> > df <- data.frame(
> >   X1 = c(.20,.10,.40,.05,.10,.24,.30,.70,.48,.22,.87,.29,.24,.19,.92),
> >   X2 = c(.25,.12,.45,.01,.19,.50,.30,.40,.50,.40,.68,.30,.16,.02,.70),
> >   E1 = c("A","A","A","B","B","B","C","C","C","D","D","D","E","E","E"),
> >   E2 = c("B","C","D","A","D","E","A","B","E","B","C","E","A","B","C")
> > )
> >
> > model <- lm(log(X2) ~ log(X1) + E1 + E2, data = df)
> >
> > summary(model)
> >
> > Call:
> > lm(formula = log(X2) ~ log(X1) + E1 + E2, data = df)
> >
> > Residuals:
> >       1       2       3       4       5
> >  0.3240  0.2621 -0.5861 -1.0283  0.5861
> >       6       7       8       9      10
> >  0.4422  0.3831 -0.2608 -0.1222  0.9002
> >      11      12      13      14      15
> > -0.5802 -0.3200  0.6452 -0.9634  0.3182
> >
> > Coefficients:
> >             Estimate Std. Error t value Pr(>|t|)
> > (Intercept)  0.54563    1.71558   0.318    0.763
> > log(X1)      1.29745    0.57295   2.265    0.073 .
> > E1B         -0.23571    0.95738  -0.246    0.815
> > E1C         -0.57057    1.20490  -0.474    0.656
> > E1D         -0.22988    0.98274  -0.234    0.824
> > E1E         -1.17181    1.02918  -1.139    0.306
> > E2B         -0.16775    0.87803  -0.191    0.856
> > E2C          0.05952    1.12779   0.053    0.960
> > E2D          0.43077    1.19485   0.361    0.733
> > E2E          0.40633    0.98289   0.413    0.696
> > ---
> > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >
> > Residual standard error: 1.004 on 5 degrees of freedom
> > Multiple R-squared: 0.7622, Adjusted R-squared: 0.3343
> > F-statistic: 1.781 on 9 and 5 DF, p-value: 0.2721
> >
> > ----
> >
> > What I need to do is force the corresponding environment coefficients
> > to be equal in absolute value, but opposite in sign.
> > That is:
> >
> > E1B = -E2B
> > E1C = -E2C
> > E1D = -E2D
> > E1E = -E2E
> >
> > In essence, E1 and E2 are the "same" variable, but can play two
> > different roles in the model depending on whether it's the first part
> > of the observation or the second part.
> >
> > I searched the archive, and the closest thing I found to my situation
> > was:
> >
> > http://tolstoy.newcastle.edu.au/R/e4/help/08/03/6773.html
> >
> > But the response to that thread didn't seem to be applicable to my
> > situation.
> >
> > Any pointers would be appreciated.
> >
> > Thanks,
> > Keith
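
One way Greg's I(x1-x2) idea might carry over to the factor case is to subtract the dummy columns of E2 from those of E1, so that each environment gets a single coefficient that enters with a plus sign in the first role and a minus sign in the second. The sketch below uses only base R and the made-up data from the post; the names d1, d2 and D are illustrative and do not appear in the thread, and declaring both factors with the same explicit levels is an assumption made so that the dummy columns line up.

```r
# Made-up example data from the original post
env <- c("A", "B", "C", "D", "E")
df <- data.frame(
  X1 = c(.20,.10,.40,.05,.10,.24,.30,.70,.48,.22,.87,.29,.24,.19,.92),
  X2 = c(.25,.12,.45,.01,.19,.50,.30,.40,.50,.40,.68,.30,.16,.02,.70),
  E1 = factor(c("A","A","A","B","B","B","C","C","C","D","D","D","E","E","E"),
              levels = env),
  E2 = factor(c("B","C","D","A","D","E","A","B","E","B","C","E","A","B","C"),
              levels = env)
)

# Dummy columns for each role; [, -1] drops the intercept column,
# leaving one column per non-baseline environment (B, C, D, E)
d1 <- model.matrix(~ E1, data = df)[, -1]
d2 <- model.matrix(~ E2, data = df)[, -1]

# Signed indicators (hypothetical helper matrix):
# +1 if the environment is in the E1 role, -1 if in the E2 role, 0 otherwise
D <- d1 - d2
colnames(D) <- paste0("env", env[-1])

# One coefficient per environment; the opposite sign for the second role
# is built into D rather than imposed as a constraint
fit <- lm(log(X2) ~ log(X1) + D, data = df)
print(coef(fit))
```

Under these assumptions the coefficient on each column of D plays the part of E1B, E1C, and so on, and the matching E2 coefficient is its negative by construction. Like Greg's example, this model is nested in the unrestricted lm(log(X2) ~ log(X1) + E1 + E2), so the two fits can be compared with anova().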
Arne Henningsen
2008-Jun-10 06:33 UTC
[R] Systemfit (was RE: How to force two regression coefficients to be equal but opposite in sign?)
Hi Keith!

On Monday 09 June 2008 16:27, Woolner, Keith wrote:
> [...]
> After I sent my initial message, I came across the Systemfit package,
> which allows specification of constraints on parameters. In theory,
> this should solve my problem perfectly. However, I was not able to get
> it to work with my data, as every attempt yielded the following error:
>
> Error in dimnames(x) <- dn :
>   length of 'dimnames' [2] not equal to array extent
>
> I suspect that it is related to some of my variables being factors
> rather than numeric.

Yes and no (see below).

> library(systemfit)
>
> # create data frame - X1, X2, X3, X4 are numeric. E1 and E2 are factors
> df <- data.frame(
>   X1 = c(.20,.10,.40,.05,.10,.24,.30,.70,.48,.22,.87,.29,.24,.19,.92),
>   X2 = c(.25,.12,.45,.01,.19,.50,.30,.40,.50,.40,.68,.30,.16,.02,.70),
>   E1 = c("A","A","A","B","B","B","C","C","C","D","D","D","E","E","E"),
>   E2 = c("B","C","D","A","D","E","A","B","E","B","C","E","A","B","C")
> )
>
> df$X3 <- sqrt(df$X1)+runif(1)
> df$X4 <- sqrt(df$X2)+runif(1)
>
> # Create constraint matrix such that the last two variables must be
> # equal but opposite in sign

No. I guess that you mean that the *coefficients* (and not the
variables) must be equal but opposite in sign.

> tx <- matrix(0,nrow=1,ncol=4)
> tx[1,3]<- 1
> tx[1,4]<- 1
>
> # Run systemfit with only numeric variables (works)
> systemfit(X2 ~ X1 + X3 + X4,"OLS", data=df, restrict.matrix=tx)
>
> # Run systemfit with factors but not constraints (works)
> systemfit(X2 ~ X1 + E1 + E2,"OLS", data=df)
>
> # Run systemfit with factors and constraints (this returns an error)
> systemfit(X2 ~ X1 + E1 + E2,"OLS", data=df, restrict.matrix=tx)

Run this regression without constraints (works):

  systemfit(X2 ~ X1 + E1 + E2,"OLS", data=df )

Take a look at the coefficients: we have *10* coefficients now (because
"E1" and "E2" are factors). Hence, your restriction matrix must have
*10* columns.
For instance, if you want to restrict the coefficients of the "B"s in
"E1" and "E2" (third and seventh coefficient, respectively) to be equal
but opposite in sign, you could do the following:

  tx2 <- matrix(0,nrow=1,ncol=10)
  tx2[1,3]<- 1
  tx2[1,7]<- 1
  systemfit(X2 ~ X1 + E1 + E2,"OLS", data=df, restrict.matrix=tx2)

> Is systemfit able to deal with factors as independent variables having
> constraints, and if so, is there some trick in formulating the problem?
> I searched through the package documentation, but did not see mention of
> factors being either supported or unsupported.

Until now, I thought that it was not necessary to say anything about
factors, because they should work in systemfit as in other R functions
(e.g. lm). The documentation says that "restrict.matrix" must be a
j x k matrix, where k is the number of all parameters (NOT the number of
all regressors, which differs if some regressors are factors). Hence, I
think that the documentation is clear enough. However, please tell me if
you have any suggestions for improving the documentation.

Best wishes,
Arne

--
Arne Henningsen
http://www.arne-henningsen.name
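
The single-row example above restricts only the "B" pair. A sketch of how it might be extended to all four pairs (B, C, D, E) follows: one row per environment, with a 1 in the E1 column and a 1 in the E2 column, so that each row states that the two coefficients sum to zero. The names cn, lv and tx_all are illustrative and not from the thread. Column positions are looked up from the column names of model.matrix() rather than hard-coded, on the assumption that systemfit orders the coefficients of a single equation the same way lm does, as the third/seventh positions above suggest. The systemfit call is guarded so the block still runs where the package is not installed.

```r
# Made-up example data from the original post, with E1 and E2 as factors
env <- c("A", "B", "C", "D", "E")
df <- data.frame(
  X1 = c(.20,.10,.40,.05,.10,.24,.30,.70,.48,.22,.87,.29,.24,.19,.92),
  X2 = c(.25,.12,.45,.01,.19,.50,.30,.40,.50,.40,.68,.30,.16,.02,.70),
  E1 = factor(c("A","A","A","B","B","B","C","C","C","D","D","D","E","E","E"),
              levels = env),
  E2 = factor(c("B","C","D","A","D","E","A","B","E","B","C","E","A","B","C"),
              levels = env)
)

# Coefficient names in model order: (Intercept), X1, E1B..E1E, E2B..E2E
cn <- colnames(model.matrix(X2 ~ X1 + E1 + E2, data = df))
lv <- levels(df$E1)[-1]   # non-baseline environments

# One restriction row per environment: coef(E1k) + coef(E2k) = 0
tx_all <- matrix(0, nrow = length(lv), ncol = length(cn),
                 dimnames = list(lv, cn))
for (k in lv) {
  tx_all[k, paste0("E1", k)] <- 1
  tx_all[k, paste0("E2", k)] <- 1
}
print(tx_all)

# Fit with all four restrictions, if systemfit is available.
# unname() passes a plain numeric matrix, as in the thread's examples.
if (requireNamespace("systemfit", quietly = TRUE)) {
  fit <- systemfit::systemfit(X2 ~ X1 + E1 + E2, "OLS", data = df,
                              restrict.matrix = unname(tx_all))
  print(coef(fit))
}
```

Row "B" of tx_all reproduces tx2 from the reply (ones in columns 3 and 7); the other three rows follow the same pattern for C, D and E.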