Luis Reino
2013-Feb-02 12:53 UTC
[R] Mixed Models: Contribution of random variable to final estimate
Dear all,

We want to test whether invasiveStatus is predicted by the number (quant) of animals of a given species (taxonid) arriving in a country. We are using lmer to fit the model:

    lmer(invasiveStatus ~ I(log(quant+1)) + I(log(inDegree+1)) +
             (1|taxonid) + (1|country),
         family=binomial, data=td)

where invasiveStatus is a binary variable, quant and inDegree are integer variables, and taxonid and country are factors.

The fixef output is

    (Intercept)  I(log(quant + 1))  I(log(inDegree + 1))
    -15.6338288          0.3198074             2.1566502

and the ranef output, sorted from highest to lowest and showing only the first 10 rows, is

    $taxonid
    T16   9.51
    T258  8.36
    T388  8.24
    T961  7.98
    T76   7.48
    T470  7.46
    T108  7.17
    T84   7.15
    T292  6.91
    T189  6.65
    ...

    $country
    US  3.23
    JP  2.45
    ES  2.35
    IT  2.14
    BM  1.63
    IL  1.41
    SI  1.39
    LB  1.06
    FR  1.05
    VE  0.996
    ...

Our problem is that the random effects contribute more to the final estimate of invasiveStatus than the fixed effects do. We think this results from confounding between quant and the country and taxonid effects. In other words, the higher the number of animals of a given species (taxonid) arriving in a given country, the higher the probability that other species arrive in the same country.

Are we formulating the model correctly? Is there a way to keep the random effects from being the largest contribution to the final estimate?

Thanks,
Luis Reino
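For readers following the thread, the way these estimates combine on the log-odds scale can be sketched in base R. The coefficients and the T16 and US intercepts are the values quoted in the post. The predictor values quant = 100 and inDegree = 5 are made up purely for illustration and are not from the poster's data.

```r
# Fixed-effect estimates as reported by fixef() in the post
beta <- c(intercept    = -15.6338288,
          log_quant    =   0.3198074,
          log_inDegree =   2.1566502)

# Random intercepts as reported by ranef() for one taxon and one country
u_taxon   <- 9.51   # T16
u_country <- 3.23   # US

# Hypothetical predictor values (assumption, NOT from the post)
quant    <- 100
inDegree <- 5

# Fixed-effect part of the linear predictor, in log-odds units
fixed_part <- unname(beta["intercept"] +
                     beta["log_quant"]    * log(quant + 1) +
                     beta["log_inDegree"] * log(inDegree + 1))

# Random-effect part: the two random intercepts simply add
random_part <- u_taxon + u_country

# Total linear predictor and its back-transform to a probability
eta <- fixed_part + random_part
p   <- plogis(eta)

print(c(fixed = fixed_part, random = random_part, eta = eta, p = p))
```

Both parts are sums on the same log-odds scale, so a large positive random intercept can offset a strongly negative fixed intercept for that particular taxon and country.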
Ben Bolker
2013-Feb-02 14:35 UTC
[R] Mixed Models: Contribution of random variable to final estimate
Luis Reino <luisreino <at> isa.utl.pt> writes:

> Dear all,
> We want to test if the invasiveStatus is predicted by the amount
> (quant) of animals arriving to a country of a certain species
> (taxonid). We are using lmer to perform the model.

In general lmer questions belong on r-sig-mixed-models at r-project.org, but I think this

> The model is:
> lmer(invasiveStatus~I(log(quant+1))+I(log(inDegree+1))+
> (1|taxonid)+(1|country),
> family=binomial,data=td)

You don't need I() around those terms -- you only need it to protect expressions such as x^2 that would be interpreted differently in the formula context.

> where invasiveStatus is a binary variable, quant and inDegree are
> integer variables, and taxonid and country are factor variables.
> The fixef output is
> (Intercept) I(log(quant + 1)) I(log(inDegree + 1))
> -15.6338288 0.3198074 2.1566502
> and the ranef output is, sorted from higher to lower, and showing
> only the first 10 lines,
> $taxonid
> T16 9.51
> T258 8.36

[snip]

> $country
> US 3.23
> JP 2.45
> ES 2.35

[snip]

> Our problem is that the coefficients to the final estimate of
> invasiveStatus are higher for the random variables than the fixed
> ones. We think this is a result of the confound effect between
> quant, and country and taxonid. In other words, the higher the
> number of animals of a given species(taxonid) arriving to given
> country, the higher the probability of other species to arrive to
> the same country. Are we formulating the model correctly? Is there
> a way to avoid that the contribution of the random variables is the
> most contributing part to the final estimate? Thanks, Luis Reino

This might be an issue of parameter scaling. The idea is that your coefficients measure the effect of the predictors *per unit*. Thus the random effects are measured in log-odds units, while the effects of quant and inDegree are measured in units of log-odds change **per log-unit change in the variable**, i.e. each coefficient is the expected change in the log-odds of the outcome when (variable + 1) is multiplied by e. You might try scaling your variables (see e.g. Schielzeth 2010 Methods in Ecology & Evolution). (Of course, you can make the fixed effects look as big as you want by scaling the predictor appropriately ...)

It worries me a little that your intercept is so small (i.e. so strongly negative) -- it suggests that the average fraction invasive when quant=0 and inDegree=0 is roughly 2 x 10^{-7} ...

Follow-ups to r-sig-mixed-models
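The two numerical points in the reply can be checked with a short base-R sketch. The intercept is the value quoted in the thread. Everything in the second half (the simulated predictor x, the true coefficients, and the use of plain glm() instead of a mixed model) is an assumption made only to keep the example free of extra packages.

```r
# Intercept from the original post: log-odds when both logged predictors are 0
b0 <- -15.6338288

# The inverse logit gives the implied baseline probability (of order 10^-7)
p0 <- plogis(b0)
print(p0)

# Slopes depend on the units of the predictor.
# Simulated data, for illustration only (not the poster's data).
set.seed(1)
n <- 500
x <- log(rpois(n, lambda = 20) + 1)   # stand-in for log(quant + 1)
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.5 * x))

# Same model fitted on the raw and on the standardised predictor
fit_raw    <- glm(y ~ x, family = binomial)
x_std      <- as.numeric(scale(x))    # centre and divide by sd(x)
fit_scaled <- glm(y ~ x_std, family = binomial)

# The standardised slope is the raw slope multiplied by sd(x)
slope_raw_rescaled <- unname(coef(fit_raw)["x"] * sd(x))
slope_scaled       <- unname(coef(fit_scaled)["x_std"])
print(c(rescaled_raw = slope_raw_rescaled, scaled = slope_scaled))
```

Standardising changes the size of the coefficient but not the fitted probabilities, which is why the apparent magnitude of a fixed effect relative to the random effects depends on the units chosen for the predictor.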
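The remark about I() can be illustrated with model.matrix() in base R. The toy data frame and its column x are hypothetical. The sketch only shows how formulas treat function calls versus the ^ operator.

```r
# Toy data; the column name x is hypothetical
d <- data.frame(x = c(0, 1, 4, 9))

# A function call such as log() is evaluated arithmetically inside a formula,
# so wrapping it in I() changes the column name but not the values
m_plain   <- model.matrix(~ log(x + 1), data = d)
m_wrapped <- model.matrix(~ I(log(x + 1)), data = d)
same_values <- isTRUE(all.equal(unname(m_plain[, 2]), unname(m_wrapped[, 2])))
print(same_values)

# ^ is a formula operator (crossing to a given degree), so a bare x^2
# reduces to x; I() is needed to get the squared term
m_pow_bare <- model.matrix(~ x^2, data = d)
m_pow_I    <- model.matrix(~ I(x^2), data = d)
print(colnames(m_pow_bare))
print(colnames(m_pow_I))
```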