Dragonwalker
2012-Mar-28 02:55 UTC
[R] Urgent - I really need some help lme4 model avg Estimates
Hello all,

If someone could take a little time to help me, I would be very grateful.

I studied piping plovers last summer. I watched each chick within a brood for 5 minutes and recorded behaviour, habitat use and foraging rate. There were two sites, the first with 4 broods and the second with 3 broods.

As the data within a brood are non-independent and there were so few broods, conventional statistical tests were of little use. I therefore spent a couple of months looking at mixed models so that I could use all the data for each day, with (1|Brood) as a random effect.

At first I struggled with what the models meant, but last week they 'sort of' clicked and I realised how to run them and how to weigh which models were the best (using AICc). As I had a number of factors/covariates that I wanted to look at, I learned to use the dredge command in the MuMIn package, starting from an a priori global model, and decided to model-average the models with a delta < 2.

I have two main questions.

I was looking at similar research that also used model selection and reported model-averaged estimates and CIs for each variable and factor. They ended up with one table showing the top models with their AICc, delta and weights, and another table showing the model-averaged estimates and CIs for each factor and covariate, and also the intercept. Each category within each variable was shown (I have attached an image of the table, although its heading does not seem to match what is shown):

http://r.789695.n4.nabble.com/file/n4511178/Table_PP_Maslo_et_al.png

Their explanation of the variables was as follows:

"A second model including these variables and wind speed reported a ΔAICc score <2; therefore, we model-averaged the parameter estimates included in these 2 best models (Table 3). Of the 5 habitats in which we observed plovers feeding, effect size was highest at artificial tidal ponds (5.52), followed by the intertidal zone (3.97). Positive effects of ephemeral pools (2.65) and bay shores (2.32) on adult foraging rates were 48% and 42% lower than artificial ponds, respectively. Conversely, sand flats (-2.30) had an equal but opposite effect on foraging rate, when compared to bay shores. The results also indicated that foraging rate was highest for adults during the post-breeding stage. In addition, vehicles had a 2.3 times larger effect on foraging adults than people. Finally, foraging rates during low tide were higher than at high tide by a factor of 2.5, as would be expected."

As you can see, their explanation seems to suggest that all values are comparable, e.g. vehicles and people.

When I ran the model average I also got an intercept estimate, but for each categorical factor only the estimates for the second and later categories were shown (e.g. if one factor was high tide/low tide, then only the estimate for low tide was shown, obviously an estimate of the difference between the two).

I asked on stats.stackexchange and they suggested just adding -1 to the end of the model. This worked, but the estimates became much bigger to compensate for there being no intercept. Although the differences between the estimates within a factor stayed the same, the differences among factors seemed to change (they became bigger), along with the p-values for each group. In addition there was, of course, no intercept.

I am therefore wondering whether anyone knows how I can preserve the initial estimates but still get the missing values (the other researchers seem to have done this, as they still have an intercept and comparable estimates).

This is my most important issue right now, but if someone has a moment, could you also tell me whether I should use the p-values as well, or whether I should just stick with explaining the magnitude of the effects, their direction and their relative importance? I want to keep it at a level that I can understand.

Thank you in advance. I know everyone is busy but I would be very grateful for a prompt response if at all possible.

Sincerely.
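A minimal sketch of why only the second and later levels of a factor get an estimate, and what adding -1 does. It assumes the default treatment (dummy) coding and uses plain least squares on a made-up, noise-free additive data set rather than a mixed model; the factor levels, the effect sizes and the helper `solve_least_squares` are all hypothetical and chosen only to make the arithmetic visible.

```python
from itertools import product


def solve_least_squares(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved with Gauss-Jordan elimination (fine for tiny, full-rank designs)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting: move the largest remaining entry onto the diagonal
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


def indicator(value, level):
    return 1.0 if value == level else 0.0


# Hypothetical data: one observation per tide x habitat cell, no noise.
tides = ["high", "low"]
habitats = ["pond", "pool", "flat"]
true_effect = {"high": 0.0, "low": 5.0, "pond": 0.0, "pool": 2.0, "flat": -3.0}
cells = list(product(tides, habitats))
y = [10.0 + true_effect[t] + true_effect[h] for t, h in cells]

# Treatment coding: intercept plus one column per non-reference level.
X_treat = [
    [1.0, indicator(t, "low"), indicator(h, "pool"), indicator(h, "flat")]
    for t, h in cells
]
# No intercept ("-1"): the FIRST factor gets a column for every level,
# the second factor is still coded as differences from its reference level.
X_noint = [
    [indicator(t, "high"), indicator(t, "low"), indicator(h, "pool"), indicator(h, "flat")]
    for t, h in cells
]

treatment = dict(zip(
    ["(Intercept)", "tide:low", "habitat:pool", "habitat:flat"],
    solve_least_squares(X_treat, y),
))
no_intercept = dict(zip(
    ["tide:high", "tide:low", "habitat:pool", "habitat:flat"],
    solve_least_squares(X_noint, y),
))

for label, fit in (("with intercept", treatment), ("without intercept", no_intercept)):
    print(label, {k: round(v, 6) for k, v in fit.items()})
```

Under these assumptions the intercept is the fitted value for the reference cell (high tide, pond) and every other estimate is a difference from it, so the "missing" reference levels are zero by construction. Removing the intercept only re-expresses the first factor as cell values, while the second factor is still reported as differences, which is one plausible reason the comparisons among factors looked different after adding -1.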
Bert Gunter
2012-Mar-28 04:43 UTC
[R] Urgent - I really need some help lme4 model avg Estimates
You've got to be kidding! You are requesting extensive statistical consulting from the R-Help list. That is not the purpose of this list, nor is it reasonable to expect remote statisticians unfamiliar with your work or state of understanding (which appears to be rather sketchy) to provide reliable or perhaps even relevant advice.

Instead, I suggest that you spend some serious time with local statistical experts (who may or may not be statisticians). Or take a much less complicated approach (perhaps with graphics) to your analysis -- this might actually be better because you will have a better understanding of the results and what they mean about the underlying scientific issues. Although I understand that, alas and alack, embellishing your work with dazzling statistical ornamentation may be a prerequisite to publication, so you're stuck.

Cheers,
Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Bert Gunter
2012-Mar-28 05:03 UTC
[R] Urgent - I really need some help lme4 model avg Estimates
... perhaps also worth mentioning:

"The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data."
-- John Tukey

-- Bert
Dragonwalker
2012-Mar-28 14:50 UTC
[R] Urgent - I really need some help lme4 model avg Estimates
Thank you Mitchell, I will try that.

So I presume that the initial paper, where they showed the estimates AND the intercept from a model averaging procedure, may have been done using a different method?

Would it still be prudent to use a global model and then show the top models, perhaps those with a delta < 2, along with their weights? Would it also be okay to just do a model average and then show the weights of each covariate and factor within these models to show their relative importance?

The paper presented the results of extremely similar research using only models of the form A+B+C+(1|D) etc., followed by model averaging, and was able to report an intercept along with much smaller, comparable estimates. That made me think this was probably the correct way to present the results, and that getting these values must be something I just didn't know how to code. They were even able to compare the differences in estimates among the variables, whereas when I used -1 to remove the intercept the differences between variables changed (although the differences within a factor stayed the same).

Thank you again for your kind reply.

Rachel
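The quantities mentioned here (delta, model weights, relative importance and model-averaged estimates) come down to a few lines of arithmetic. The following is a sketch under stated assumptions, not the output of dredge or any MuMIn function: the three models, their AICc values and their coefficients are invented for illustration, and the names `akaike_weights`, `full_average` and `conditional_average` are hypothetical. It uses the standard Akaike-weight formula and shows two common conventions for averaging a term that is absent from some models.

```python
import math

# Hypothetical candidate models with made-up AICc values and coefficients.
models = [
    {"name": "tide", "aicc": 100.0,
     "coefs": {"(Intercept)": 10.0, "tide:low": 5.0}},
    {"name": "tide + wind", "aicc": 101.0,
     "coefs": {"(Intercept)": 9.0, "tide:low": 4.0, "wind": -0.5}},
    {"name": "wind", "aicc": 104.0,
     "coefs": {"(Intercept)": 12.0, "wind": -0.8}},
]


def akaike_weights(aicc_values):
    """Return (deltas, weights): delta_i = AICc_i - min(AICc),
    w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2)."""
    best = min(aicc_values)
    deltas = [a - best for a in aicc_values]
    relative_likelihood = [math.exp(-0.5 * d) for d in deltas]
    total = sum(relative_likelihood)
    return deltas, [r / total for r in relative_likelihood]


# Keep the models with delta < 2, then recompute weights within that set.
deltas, _ = akaike_weights([m["aicc"] for m in models])
top = [m for m, d in zip(models, deltas) if d < 2]
_, weights = akaike_weights([m["aicc"] for m in top])

terms = sorted({term for m in top for term in m["coefs"]})

# Relative importance: summed weight of the models that contain the term.
importance = {
    t: sum(w for m, w in zip(top, weights) if t in m["coefs"]) for t in terms
}
# "Full" average: a term counts as 0 in models that do not contain it.
full_average = {
    t: sum(w * m["coefs"].get(t, 0.0) for m, w in zip(top, weights)) for t in terms
}
# "Conditional" average: average only over the models that contain the term.
conditional_average = {t: full_average[t] / importance[t] for t in terms}

for t in terms:
    print(t, round(importance[t], 3), round(full_average[t], 3),
          round(conditional_average[t], 3))
```

In this made-up example the wind term appears in only one of the two retained models, so its full average is shrunk toward zero while its conditional average is unchanged; which convention a given paper used is worth checking before comparing estimates across studies.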