Angelo D'Ambrosio
2018-Apr-29 15:52 UTC
[R] Compare global and between group variability of 2 mixed effect models
Hello,

We are comparing some features of our product against a competitor's. Since the product is produced in lots, and we have evidence that ambient temperature affects how it functions, we used a mixed effect model (MEM) structured this way (NB: R lme4 notation):

`out ~ Brand * Temperature + (Temperature | Lot)`

with `out` being each of the various outcomes, a random intercept on Lot and a random slope on Temperature.

We need to understand which brand has more variability between single items and between lots. We thought of using the random intercept and residual standard deviations from the MEM for each brand. The problem now is how to compare them.

We tried using the F distribution to build confidence intervals and hypothesis tests, but for most of the outcomes we evaluated, one or both models had zero standard deviation on the random intercept, so it's impossible to take the ratio (lots of zeros, infinities, NaN).

We also investigated the variance-gamma distribution, which can describe the difference of two standard deviations (so zeros are not a problem), but we don't actually know how to pass the right parameters to describe our data.

As a third alternative, we tried modeling the ratio between standard deviations with a Bayes-regularized GLM with a Poisson family:

`bayesglm(c(sd1, sd2) ~ c('mod1', 'mod2'), weights = c(n1, n2), family = poisson())`

But we are not sure whether this is a meaningful approximation. We thought of a Poisson distribution because it can natively cope with zeros, and of the Bayes regularization to avoid overly extreme estimates (e.g. 0 to Inf confidence intervals).

Finally, we thought of a much simpler approach:

- first, use a Levene test to compare raw variances between brands;
- then, use ANOVA to test for between-lot variability within each brand.

Although much simpler, this methodology doesn't use the information on the clustered structure of the data or the effect of temperature.
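To make the simpler approach concrete, here is a minimal base-R sketch on simulated data. All names and numbers (`d`, `absdev`, the lot and residual SDs) are made up for illustration and are not our real data. The Levene-type test is written by hand as a one-way ANOVA on absolute deviations from each brand's median (the Brown-Forsythe variant); `car::leveneTest` should be the packaged equivalent.

```r
# Simulated data: 2 brands, 6 lots per brand, 8 items per lot (illustrative only).
set.seed(1)
d <- expand.grid(item = 1:8, Lot = paste0("L", 1:6), Brand = c("A", "B"))
d$Lot <- interaction(d$Brand, d$Lot, drop = TRUE)  # lots are nested within brands
lot_eff <- rnorm(nlevels(d$Lot), sd = 0.5)         # assumed between-lot SD
d$out <- 10 + lot_eff[as.integer(d$Lot)] +
  rnorm(nrow(d), sd = ifelse(d$Brand == "A", 1, 2))  # assumed residual SDs

# Step 1: Levene-type (Brown-Forsythe) test between brands, i.e. a one-way
# ANOVA on the absolute deviations from each brand's median.
d$absdev <- abs(d$out - ave(d$out, d$Brand, FUN = median))
levene <- anova(lm(absdev ~ Brand, data = d))

# Step 2: one-way ANOVA on Lot, fitted separately for each brand.
lot_anova <- lapply(split(d, d$Brand), function(x) {
  anova(lm(out ~ Lot, data = droplevels(x)))
})

levene
lot_anova
```

This yields one F test for the brand difference in raw spread and one F test per brand for the lot effect, which is exactly why the two brands' between-lot variability are never compared directly.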
Furthermore, this simpler approach doesn't formally compare between-lot variability between brands; it just produces two F statistics.

Any suggestions?

Thanks
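For reference, a minimal sketch of how the per-brand standard deviations mentioned above can be extracted with lme4, again on simulated data. The per-brand formula `out ~ Temperature + (Temperature | Lot)`, the data frame `d` and all numeric values are assumptions made for illustration; with data like these, lme4 may well report a singular fit, which is the zero-SD situation described above.

```r
library(lme4)

# Simulated data; all names and numbers are illustrative assumptions.
set.seed(42)
d <- expand.grid(rep = 1:5, Temperature = c(5, 15, 25, 35),
                 Lot = 1:8, Brand = c("A", "B"))
d$Lot <- factor(paste(d$Brand, d$Lot, sep = "_"))  # lots nested within brand
lot_int <- rnorm(nlevels(d$Lot), sd = 1)           # assumed between-lot SD
d$out <- 20 + 0.1 * d$Temperature + lot_int[as.integer(d$Lot)] +
  rnorm(nrow(d), sd = ifelse(d$Brand == "A", 1, 1.5))

# One model per brand, so each brand gets its own variance components.
fits <- lapply(split(d, d$Brand), function(x) {
  lmer(out ~ Temperature + (Temperature | Lot), data = droplevels(x))
})

# SD of the random intercept on Lot and residual SD, one column per brand.
sds <- sapply(fits, function(m) {
  vc <- as.data.frame(VarCorr(m))
  is_int <- vc$grp == "Lot" & vc$var1 == "(Intercept)" & is.na(vc$var2)
  c(lot_intercept = vc$sdcor[is_int],
    residual      = vc$sdcor[vc$grp == "Residual"])
})
sds
```

The resulting 2 x 2 matrix `sds` holds the quantities we would like to compare between brands; the open question is which test or interval is appropriate for them.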