Hello Thierry, Thank you for your quick response. Sorry, but I am not sure if I follow what you said. I get the following outputs from the two models:> coef(lmer(Reaction ~ Days + (1| Subject), sleepstudy))Subject (Intercept) Days 308 292.1888 10.46729 309 173.5556 10.46729 310 188.2965 10.46729 330 255.8115 10.46729 331 261.6213 10.46729 332 259.6263 10.46729 333 267.9056 10.46729 334 248.4081 10.46729 335 206.1230 10.46729 337 323.5878 10.46729 349 230.2089 10.46729 350 265.5165 10.46729 351 243.5429 10.46729 352 287.7835 10.46729 369 258.4415 10.46729 370 245.0424 10.46729 371 248.1108 10.46729 372 269.5209 10.46729> coef(lm(Reaction ~ Days + Subject, sleepstudy))(Intercept) 295.03104 Days 10.46729 Subject309 -126.90085 Subject310 -111.13256 Subject330 -38.91241 Subject331 -32.69778 Subject332 -34.83176 Subject333 -25.97552 Subject334 -46.83178 Subject335 -92.06379 Subject337 33.58718 Subject349 -66.29936 Subject350 -28.53115 Subject351 -52.03608 Subject352 -4.71229 Subject369 -36.09919 Subject370 -50.43206 Subject371 -47.14979 Subject372 -24.24770 Now, what I expected is the following: - 'Intercept' of model-2 to match with Intercept of Subject-308 of model-1 - 'Intercept+Subject309' of model-2 to match with Intercept of Subject-309 of model-1 - and so on... What am I missing here? If it is difficult to explain this, can you alternately answer the following: "Is it possible to define the 'lm' and 'lmer' models above so they produce the same results (at least in terms of predictions)?" Thanks again. Utkarsh Singhal 91.96508.54333 On 12 July 2016 at 19:15, Thierry Onkelinx <thierry.onkelinx at inbo.be> wrote:> The parametrisation is different. > > The intercept in model 1 is the effect of the "average" subject at days => 0. > The intercept in model 2 is the effect of the first subject at days == 0. > > ir. Thierry Onkelinx > Instituut voor natuur- en bosonderzoek / Research Institute for Nature and > Forest > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance > Kliniekstraat 25 > 1070 Anderlecht > Belgium > > To call in the statistician after the experiment is done may be no more > than asking him to perform a post-mortem examination: he may be able to say > what the experiment died of. ~ Sir Ronald Aylmer Fisher > The plural of anecdote is not data. ~ Roger Brinner > The combination of some data and an aching desire for an answer does not > ensure that a reasonable answer can be extracted from a given body of data. > ~ John Tukey > > 2016-07-12 15:35 GMT+02:00 Utkarsh Singhal <utkarsh.iit at gmail.com>: > >> Hi experts, >> >> While the slope is coming out to be identical in the two methods below, >> the >> intercepts are not. As far as I understand, both are formulations are >> identical in the sense that these are asking for a slope corresponding to >> 'Days' and a separate intercept term for each Subject. >> >> # Model-1 >> library(lmer) >> coef(lmer(Reaction ~ Days + (1| Subject), sleepstudy)) >> >> # Model-2 >> coef(lm(Reaction ~ Days + Subject, sleepstudy)) >> >> Can somebody tell me the reason? Are the above formulations actually >> different or is it due to different optimization method used? >> >> Thank you. >> >> Utkarsh Singhal >> 91.96508.54333 >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > >[[alternative HTML version deleted]]
Your lm() estimates are using the default contrasts of contr.treatment, providing an intercept corresponding to your subject 308 and the other subject* estimates are differences from subject 308 intercept. You could have specified this with contrasts as contr.sum and the estimates would be more easily compared to the lmer() model estimates. They are close but will never be identical as the lmer() model estimates are based on assuming a normal distribution with specified variance. They rarely would be identical. Brian Brian S. Cade, PhD U. S. Geological Survey Fort Collins Science Center 2150 Centre Ave., Bldg. C Fort Collins, CO 80526-8818 email: cadeb at usgs.gov <brian_cade at usgs.gov> tel: 970 226-9326 On Tue, Jul 12, 2016 at 12:10 PM, Utkarsh Singhal <utkarsh.iit at gmail.com> wrote:> Hello Thierry, > > Thank you for your quick response. Sorry, but I am not sure if I follow > what you said. I get the following outputs from the two models: > > coef(lmer(Reaction ~ Days + (1| Subject), sleepstudy)) > Subject (Intercept) Days > 308 292.1888 10.46729 > 309 173.5556 10.46729 > 310 188.2965 10.46729 > 330 255.8115 10.46729 > 331 261.6213 10.46729 > 332 259.6263 10.46729 > 333 267.9056 10.46729 > 334 248.4081 10.46729 > 335 206.1230 10.46729 > 337 323.5878 10.46729 > 349 230.2089 10.46729 > 350 265.5165 10.46729 > 351 243.5429 10.46729 > 352 287.7835 10.46729 > 369 258.4415 10.46729 > 370 245.0424 10.46729 > 371 248.1108 10.46729 > 372 269.5209 10.46729 > > > coef(lm(Reaction ~ Days + Subject, sleepstudy)) > (Intercept) 295.03104 > Days 10.46729 > Subject309 -126.90085 > Subject310 -111.13256 > Subject330 -38.91241 > Subject331 -32.69778 > Subject332 -34.83176 > Subject333 -25.97552 > Subject334 -46.83178 > Subject335 -92.06379 > Subject337 33.58718 > Subject349 -66.29936 > Subject350 -28.53115 > Subject351 -52.03608 > Subject352 -4.71229 > Subject369 -36.09919 > Subject370 -50.43206 > Subject371 -47.14979 > Subject372 -24.24770 > > Now, what I expected is the following: > > - 'Intercept' of model-2 to match with Intercept of Subject-308 of > model-1 > - 'Intercept+Subject309' of model-2 to match with Intercept of > Subject-309 of model-1 > - and so on... > > What am I missing here? > > If it is difficult to explain this, can you alternately answer the > following: "Is it possible to define the 'lm' and 'lmer' models above so > they produce the same results (at least in terms of predictions)?" > > Thanks again. > > Utkarsh Singhal > 91.96508.54333 > > > On 12 July 2016 at 19:15, Thierry Onkelinx <thierry.onkelinx at inbo.be> > wrote: > > > The parametrisation is different. > > > > The intercept in model 1 is the effect of the "average" subject at days > => > 0. > > The intercept in model 2 is the effect of the first subject at days == 0. > > > > ir. Thierry Onkelinx > > Instituut voor natuur- en bosonderzoek / Research Institute for Nature > and > > Forest > > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance > > Kliniekstraat 25 > > 1070 Anderlecht > > Belgium > > > > To call in the statistician after the experiment is done may be no more > > than asking him to perform a post-mortem examination: he may be able to > say > > what the experiment died of. ~ Sir Ronald Aylmer Fisher > > The plural of anecdote is not data. ~ Roger Brinner > > The combination of some data and an aching desire for an answer does not > > ensure that a reasonable answer can be extracted from a given body of > data. > > ~ John Tukey > > > > 2016-07-12 15:35 GMT+02:00 Utkarsh Singhal <utkarsh.iit at gmail.com>: > > > >> Hi experts, > >> > >> While the slope is coming out to be identical in the two methods below, > >> the > >> intercepts are not. As far as I understand, both are formulations are > >> identical in the sense that these are asking for a slope corresponding > to > >> 'Days' and a separate intercept term for each Subject. > >> > >> # Model-1 > >> library(lmer) > >> coef(lmer(Reaction ~ Days + (1| Subject), sleepstudy)) > >> > >> # Model-2 > >> coef(lm(Reaction ~ Days + Subject, sleepstudy)) > >> > >> Can somebody tell me the reason? Are the above formulations actually > >> different or is it due to different optimization method used? > >> > >> Thank you. > >> > >> Utkarsh Singhal > >> 91.96508.54333 > >> > >> [[alternative HTML version deleted]] > >> > >> ______________________________________________ > >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > >> > > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
Hi Brian, This makes some sense to me theoretically, but doesn't pan out with my experiment. The contrasts default was the following as you said:> options("contrasts")$contrasts unordered ordered "contr.treatment" "contr.poly" I changed it as follows:> options(contracts=c(unordered="contr.sum", ordered="contr.poly"))But the output of 'lm' model was exactly the same as before. Then I tried the following:> coef(lm(Reaction ~ Days + Subject, sleepstudy,contrasts=list(Subject="contr.sum"))) The model output appears slightly different, but the 'Subject* + Intercept' is still exactly the same as the default 'lm' model. Also, as you can verify below, the predictions from two 'lm' models are exactly the same> fit_lm = lm(Reaction ~ Days + Subject, sleepstudy); > fit_lm2 = lm(Reaction ~ Days + Subject, sleepstudy,contrasts=list(Subject="contr.sum"))> > head(predict(fit_lm))1 2 3 4 5 6 295.0310 305.4983 315.9656 326.4329 336.9002 347.3675> head(predict(fit_lm2))1 2 3 4 5 6 295.0310 305.4983 315.9656 326.4329 336.9002 347.3675 And these are quite different from the mixedmodel:> fit_lmer = lmer(Reaction ~ Days + (1| Subject), sleepstudy) > head(predict(fit_lmer))1 2 3 4 5 6 292.1888 302.6561 313.1234 323.5907 334.0580 344.5252 Any idea? Utkarsh Singhal 91.96508.54333 On 13 July 2016 at 00:18, Cade, Brian <cadeb at usgs.gov> wrote:> Your lm() estimates are using the default contrasts of contr.treatment, > providing an intercept corresponding to your subject 308 and the other > subject* estimates are differences from subject 308 intercept. You could > have specified this with contrasts as contr.sum and the estimates would be > more easily compared to the lmer() model estimates. They are close but > will never be identical as the lmer() model estimates are based on assuming > a normal distribution with specified variance. They rarely would be > identical. > > Brian > > Brian S. Cade, PhD > > U. S. Geological Survey > Fort Collins Science Center > 2150 Centre Ave., Bldg. C > Fort Collins, CO 80526-8818 > > email: cadeb at usgs.gov <brian_cade at usgs.gov> > tel: 970 226-9326 > > > On Tue, Jul 12, 2016 at 12:10 PM, Utkarsh Singhal <utkarsh.iit at gmail.com> > wrote: > >> Hello Thierry, >> >> Thank you for your quick response. Sorry, but I am not sure if I follow >> what you said. I get the following outputs from the two models: >> > coef(lmer(Reaction ~ Days + (1| Subject), sleepstudy)) >> Subject (Intercept) Days >> 308 292.1888 10.46729 >> 309 173.5556 10.46729 >> 310 188.2965 10.46729 >> 330 255.8115 10.46729 >> 331 261.6213 10.46729 >> 332 259.6263 10.46729 >> 333 267.9056 10.46729 >> 334 248.4081 10.46729 >> 335 206.1230 10.46729 >> 337 323.5878 10.46729 >> 349 230.2089 10.46729 >> 350 265.5165 10.46729 >> 351 243.5429 10.46729 >> 352 287.7835 10.46729 >> 369 258.4415 10.46729 >> 370 245.0424 10.46729 >> 371 248.1108 10.46729 >> 372 269.5209 10.46729 >> >> > coef(lm(Reaction ~ Days + Subject, sleepstudy)) >> (Intercept) 295.03104 >> Days 10.46729 >> Subject309 -126.90085 >> Subject310 -111.13256 >> Subject330 -38.91241 >> Subject331 -32.69778 >> Subject332 -34.83176 >> Subject333 -25.97552 >> Subject334 -46.83178 >> Subject335 -92.06379 >> Subject337 33.58718 >> Subject349 -66.29936 >> Subject350 -28.53115 >> Subject351 -52.03608 >> Subject352 -4.71229 >> Subject369 -36.09919 >> Subject370 -50.43206 >> Subject371 -47.14979 >> Subject372 -24.24770 >> >> Now, what I expected is the following: >> >> - 'Intercept' of model-2 to match with Intercept of Subject-308 of >> model-1 >> - 'Intercept+Subject309' of model-2 to match with Intercept of >> Subject-309 of model-1 >> - and so on... >> >> >> What am I missing here? >> >> If it is difficult to explain this, can you alternately answer the >> following: "Is it possible to define the 'lm' and 'lmer' models above so >> they produce the same results (at least in terms of predictions)?" >> >> Thanks again. >> >> Utkarsh Singhal >> 91.96508.54333 >> >> >> On 12 July 2016 at 19:15, Thierry Onkelinx <thierry.onkelinx at inbo.be> >> wrote: >> >> > The parametrisation is different. >> > >> > The intercept in model 1 is the effect of the "average" subject at days >> =>> > 0. >> > The intercept in model 2 is the effect of the first subject at days =>> 0. >> > >> > ir. Thierry Onkelinx >> > Instituut voor natuur- en bosonderzoek / Research Institute for Nature >> and >> > Forest >> > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance >> > Kliniekstraat 25 >> > 1070 Anderlecht >> > Belgium >> > >> > To call in the statistician after the experiment is done may be no more >> > than asking him to perform a post-mortem examination: he may be able to >> say >> > what the experiment died of. ~ Sir Ronald Aylmer Fisher >> > The plural of anecdote is not data. ~ Roger Brinner >> > The combination of some data and an aching desire for an answer does not >> > ensure that a reasonable answer can be extracted from a given body of >> data. >> > ~ John Tukey >> > >> > 2016-07-12 15:35 GMT+02:00 Utkarsh Singhal <utkarsh.iit at gmail.com>: >> > >> >> Hi experts, >> >> >> >> While the slope is coming out to be identical in the two methods below, >> >> the >> >> intercepts are not. As far as I understand, both are formulations are >> >> identical in the sense that these are asking for a slope corresponding >> to >> >> 'Days' and a separate intercept term for each Subject. >> >> >> >> # Model-1 >> >> library(lmer) >> >> coef(lmer(Reaction ~ Days + (1| Subject), sleepstudy)) >> >> >> >> # Model-2 >> >> coef(lm(Reaction ~ Days + Subject, sleepstudy)) >> >> >> >> Can somebody tell me the reason? Are the above formulations actually >> >> different or is it due to different optimization method used? >> >> >> >> Thank you. >> >> >> >> Utkarsh Singhal >> >> 91.96508.54333 >> >> >> >> [[alternative HTML version deleted]] >> >> >> >> ______________________________________________ >> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> >> https://stat.ethz.ch/mailman/listinfo/r-help >> >> PLEASE do read the posting guide >> >> http://www.R-project.org/posting-guide.html >> >> and provide commented, minimal, self-contained, reproducible code. >> >> >> > >> > >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > >[[alternative HTML version deleted]]