Dear all,

I am doing linear regression with robust errors to find the effect of one variable (x) on another variable (y). When I run the command I find a positive trend. But if I check the effect of several variables (x, x1, x2, x3) on the same y, the positive effect shown by x turns negative. Please help me with this situation.

Barjesh Kochar
Research scholar
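The post does not show the command that was used, so the following is only a minimal sketch of one common setup for "regression with robust errors" in R. The use of the sandwich and lmtest packages and the simulated data are assumptions, chosen just to reproduce the kind of sign flip described above.

# Sketch only: "robust error" is assumed to mean heteroskedasticity-consistent
# (HC) standard errors via sandwich + lmtest; the data are simulated.
library(sandwich)
library(lmtest)

set.seed(1)
x  <- rnorm(100)
x1 <- x + rnorm(100)            # a second predictor, correlated with x
y  <- -x + 3*x1 + rnorm(100)    # x alone looks positive, but its true effect is -1

fit_marginal <- lm(y ~ x)       # y on x only: coefficient on x comes out positive
fit_joint    <- lm(y ~ x + x1)  # y on x and x1: coefficient on x turns negative

coeftest(fit_marginal, vcov = vcovHC(fit_marginal, type = "HC1"))
coeftest(fit_joint,    vcov = vcovHC(fit_joint,    type = "HC1"))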
> Date: Fri, 10 Jun 2011 09:53:20 +0530
> From: bkkochar at gmail.com
> To: r-help at r-project.org
> Subject: [R] Linear multivariate regression with Robust error
>
> Dear all,
>
> I am doing linear regression with robust errors to find the effect of one
> variable (x) on another variable (y). When I run the command I find a
> positive trend. But if I check the effect of several variables (x, x1, x2,
> x3) on the same y, the positive effect shown by x turns negative. Please
> help me with this situation.
>
> Barjesh Kochar
> Research scholar

Take y as "goodness" and suppose x and x1 have something to do with a product. The first analysis is from company A, the second from company B, and the underlying relationship is given with some noise. (LOL, I'm still on my first cup of coffee; this was the first example to come to mind, as these questions keep coming up here every day.)

> x <- 1:100
> x1 <- x*x
> y <- x - x1 + runif(100)
> lm(y ~ x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
       1718         -100

> lm(y ~ x + x1)

Call:
lm(formula = y ~ x + x1)

Coefficients:
(Intercept)            x           x1
     0.5253       1.0024      -1.0000
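For what it is worth, the -100 in the first fit can be derived directly: the marginal slope of y on x is cov(x, y)/var(x), and with y essentially equal to x - x^2 this is 1 - cov(x, x^2)/var(x) = 1 - 101 = -100 for x = 1:100. A quick check:

x  <- 1:100
x1 <- x * x
# marginal slope of (x - x1) on x; works out to 1 - 101 = -100 for x = 1:100
cov(x, x - x1) / var(x)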
Michael Friendly
2011-Jun-10 15:50 UTC
[R] Linear multivariate regression with Robust error
On 6/10/2011 12:23 AM, Barjesh Kochar wrote:
> Dear all,
>
> I am doing linear regression with robust errors to find the effect of one
> variable (x) on another variable (y). When I run the command I find a
> positive trend. But if I check the effect of several variables (x, x1, x2,
> x3) on the same y, the positive effect shown by x turns negative. Please
> help me with this situation.
>
> Barjesh Kochar
> Research scholar

You don't give any data or provide any code (as the posting guide requests), so I have to guess that you have just rediscovered Simpson's paradox -- the coefficient of a variable in a marginal regression can have the opposite sign to its coefficient in a joint model with other predictors. I have no idea what you mean by "robust error".

One remedy is an added-variable plot, which will show you the partial contribution of each predictor in the joint model, as well as whether there are any influential observations driving the estimated coefficients.

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249  Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
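A minimal sketch of the added-variable plot suggestion, assuming the car package and using simulated data in place of the poster's (unshown) data and model:

# Sketch only: car::avPlots on a toy joint model with two correlated predictors.
library(car)

set.seed(42)
x  <- rnorm(100)
x1 <- x + rnorm(100)           # correlated with x
y  <- x - 2*x1 + rnorm(100)

fit <- lm(y ~ x + x1)
avPlots(fit)   # partial contribution of each predictor given the others,
               # and a view of any influential points driving the slopes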
I am with Michael. It is almost impossible to figure out what you are trying to do. However, I assume, like Michael, that you regress y on x2 and find, say, a negative effect, but when you regress y on x1 and x2 you find a positive effect of x2. The short answer to your question is that in this case your restricted model (the one containing only x2) suffers from omitted-variable bias. Here is an example.

Let's assume you are interested in the effect of x2. Say we have 100 observations and that y depends on x1 and x2. Furthermore, assume that x1 and x2 are positively correlated.

x1 = rnorm(100)
x2 = x1 + rnorm(100)    # x2 is correlated with x1
e  = rnorm(100)         # random error term
y  = -3*x1 + x2 + e     # dependent variable

Note that x1 has a negative relationship with y while x2 has a positive one, and that the effect of x1 on y is larger in size (minus 3) than the effect of x2 on y (plus 1).

Now let's run some regressions. First, regress y on x1 only. An unbiased estimate should reproduce the coefficient of -3 within the confidence interval. However, the estimate for x1 is noticeably smaller in magnitude than -3. The reason is that because we omit x2, and x1 and x2 are correlated, x1 picks up some of the (positive) effect of x2; the coefficient for x1 is diluted.

reg1 <- lm(y ~ x1)
summary(reg1)

Now regress y on x2. An unbiased estimate should reproduce the coefficient of 1 within the confidence interval. However, the estimated effect of x2 is negative and significant; the estimate for x2 is severely biased. The reasons are the following. First, x2 correlates with x1, so when you regress y on x2 alone, the coefficient picks up some of the effect of x1 on y; this generally biases the estimate for x2. The coefficient ends up with the opposite sign from the one it is supposed to have (rather than being only slightly biased, like the coefficient on x1 in the previous regression) because (1) x1 and x2 are positively correlated, (2) x1 has a negative effect on y while x2 has a positive one (opposite signs), and (3) the effect of x1 is much larger in size than the effect of x2.

reg2 <- lm(y ~ x2)
summary(reg2)

Hence, if we include both x1 and x2 in our regression for y, both coefficients should be estimated consistently, because we no longer omit an important predictor of y that is correlated with the included one.

reg3 <- lm(y ~ x1 + x2)
summary(reg3)

Ta-dah. Problem (most likely) solved. So the answer to your question is that the "correct" coefficient is probably the one from the model that includes the other control variables. You should read up on "omitted variable bias." If that is not the problem, you will have to give us more information/reproducible code.

Hope that helps,
Daniel
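As a follow-up sketch (the seed and exact numbers below are illustrative, not from Daniel's run): the size of the bias in reg2 can be predicted from the standard omitted-variable-bias formula. The short-regression coefficient on x2 converges to its true value plus the true coefficient on the omitted x1 times the slope from regressing x1 on x2.

# Sketch verifying the omitted-variable-bias formula on the simulated data above.
set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100)                          # x2 correlated with x1
y  <- -3*x1 + x2 + rnorm(100)

short     <- coef(lm(y ~ x2))[2]               # biased estimate from y ~ x2 alone
predicted <- 1 + (-3) * coef(lm(x1 ~ x2))[2]   # true beta2 + beta1 * slope(x1 ~ x2)
c(short = short, predicted = predicted)        # both should be negative, far from the true +1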
Filipe Leme Botelho
2011-Jun-10 19:50 UTC
[R] RES: Linear multivariate regression with Robust error
An embedded message was scrubbed...
From: "Filipe Leme Botelho" <filipe.botelho at vpar.com.br>
Subject: RES: [R] Linear multivariate regression with Robust error
Date: Fri, 10 Jun 2011 16:50:24 -0300
Size: 3804
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110610/1553a221/attachment.mht>