Margaret Gardiner-Garden
2007-Aug-16 06:45 UTC
[R] residual plots for lmer in lme4 package
Hi,

I was wondering if I might be able to ask some advice about doing residual plots for the lmer function in the lme4 package.

Our group's aim is to find if the expression staining of a particular gene in a sample (or "core") is related to the pathology of the core. To do this, we used the lmer function to perform a logistic mixed model below. I apologise in advance for the lack of subscripts.

logit P(y_ij = 1) = beta0 + U_i + beta1 * Pathol_ij, where U_i ~ N(0, sigma_u^2),

i indexes patient, j indexes measurement, Pathol is an indicator variable (0,1) for benign epithelium versus cancer and y_ij is the staining indicator (0,1) for each core, where y_ij equals 1 if the core stains positive and 0 otherwise.

(I have inserted some example R code at the end of this message.)

I was wondering if you knew how I could test that the errors U_i are normally distributed in my fit. I am not familiar with how to do residual plots for a mixed logistic regression (or even for any logistic regression!).

Any advice would be greatly appreciated!

Thanks and Regards
Marg

Example code:

lmer(Intensity.over2.hyp.canc ~ Pathology + (1|Patient.ID), data = HSD17beta4.hyp.canc, family = "binomial", na.action = "na.omit")

# Family: binomial(logit link)
#      AIC      BIC    logLik deviance
# 414.1101 431.4147 -203.0550 406.1101
# Random effects:
#  Groups     Name        Variance Std.Dev.
#  Patient.ID (Intercept) 4.9558   2.2262
# of obs: 559, groups: Patient.ID, 177
# Estimated scale (compare to 1) 0.6782544
# Fixed effects:
#                      Estimate Std. Error z value  Pr(>|z|)
# (Intercept)          -2.05734    0.24881 -8.2686 < 2.2e-16 ***
# PathologyHyperplasia -1.76627    0.44909 -3.9330 8.389e-05 ***

NB. Intensity.over2.hyp.canc is the staining of the core (i.e. 0 or 1).
Pathology is Hyperplasia or Cancer.

Dr Margaret Gardiner-Garden
Garvan Institute of Medical Research
384 Victoria Street
Darlinghurst Sydney NSW 2010
Australia
Phone: 61 2 9295 8348
Fax: 61 2 9295 8321
Hi Margaret,

Have a look at qqmath in the lattice package.

?qqmath

Hank

Dr. Hank Stevens, Associate Professor
338 Pearson Hall, Botany Department
Miami University, Oxford, OH 45056
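As a rough illustration of the qqmath suggestion, the sketch below draws a normal quantile-quantile plot with lattice. The vector `u_hat` is a hypothetical stand-in for estimated patient effects, simulated here because the poster's data are not available; the sample size of 177 and the standard deviation of 2.2 only echo the printed model summary and are not results.

```r
library(lattice)

set.seed(1)
# Hypothetical stand-in for 177 estimated patient effects (NOT the poster's data)
u_hat <- rnorm(177, mean = 0, sd = 2.2)

# Normal Q-Q plot: points lying close to the reference line are
# consistent with a normal distribution
p <- qqmath(~ u_hat,
            xlab = "Standard normal quantiles",
            ylab = "Estimated patient effect",
            panel = function(x, ...) {
              panel.qqmathline(x, ...)  # reference line through the quartiles
              panel.qqmath(x, ...)      # the sample quantiles themselves
            })
print(p)
```

In practice `u_hat` would be replaced by the effects extracted from the fitted model, so the plot is only as informative as those estimates are.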
Margaret Gardiner-Garden
2007-Aug-17 06:36 UTC
[R] residual plots for lmer in lme4 package
Hi,

I was wondering if I might be able to ask some advice about doing residual plots for the lmer function in the lme4 package. (Apologies to anyone who has received this message twice. I have had problems with embedded text.)

Our group's aim is to find if the expression staining of a particular gene in a sample (or "core") is related to the pathology of the core. To do this, we used the lmer function to perform a logistic mixed model below.

logit P(y_ij = 1) = beta0 + U_i + beta1 * Pathol_ij, where U_i ~ N(0, sigma_u^2),

i indexes patient, j indexes measurement, Pathol is an indicator variable (0,1) for benign epithelium versus cancer and y_ij is the staining indicator (0,1) for each core, where y_ij equals 1 if the core stains positive and 0 otherwise.

(I have inserted some example R code at the end of this message.)

I was wondering if you knew how I could test that the errors U_i are normally distributed in my fit. I am not familiar with how to do residual plots for a mixed logistic regression.

Any advice would be greatly appreciated!

Thanks and Regards
Marg

Example code:

lmer(Intensity.over2.hyp.canc ~ Pathology + (1|Patient.ID), data = HSD17beta4.hyp.canc, family = "binomial", na.action = "na.omit")

# Family: binomial(logit link)
#      AIC      BIC    logLik deviance
# 414.1101 431.4147 -203.0550 406.1101
# Random effects:
#  Groups     Name        Variance Std.Dev.
#  Patient.ID (Intercept) 4.9558   2.2262
# of obs: 559, groups: Patient.ID, 177
# Estimated scale (compare to 1) 0.6782544
# Fixed effects:
#                      Estimate Std. Error z value  Pr(>|z|)
# (Intercept)          -2.05734    0.24881 -8.2686 < 2.2e-16 ***
# PathologyHyperplasia -1.76627    0.44909 -3.9330 8.389e-05 ***

NB. Intensity.over2.hyp.canc is the staining of the core (i.e. 0 or 1).
Pathology is Hyperplasia or Cancer.

Dr Margaret Gardiner-Garden
Garvan Institute of Medical Research
384 Victoria Street
Darlinghurst Sydney NSW 2010
Australia
Phone: 61 2 9295 8348
Fax: 61 2 9295 8321
I am doubtful whether standard residual plots are very useful in this context. One wants the theoretical effects Ui to have a normal distribution. If there are similar amounts of information on each patient, maybe it will not be too bad to extract the estimated effects and check them for normality. I don't think you can use residuals() to extract them, as glmer() does not have the notion of levels. Maybe they can be extracted using ranef(), but I do not see any examples for use with glmer() on the help pages.

The issue of checking for normality of effects in multi-level models has not been very much researched, as far as I can tell. The function residuals() gives residuals that adjust for all except the "highest" level of random effects. Depending on the relative magnitudes of the variance components, whether or not these "residuals" are anywhere near normal may not be of much or any consequence.

John Maindonald
email: john.maindonald at anu.edu.au
phone: +61 2 (6125)3473
fax: +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
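The ranef() idea above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated (the names `sim`, `stain`, `pathology` and `patient`, and all parameter values, are hypothetical), and the model is fitted with glmer(), which is how current lme4 versions fit binomial mixed models.

```r
library(lme4)

set.seed(42)
n_patients <- 60          # hypothetical number of patients
cores_per_patient <- 4    # hypothetical number of cores per patient

# Simulated stand-in for the poster's data (all values are assumptions)
sim <- data.frame(
  patient   = factor(rep(seq_len(n_patients), each = cores_per_patient)),
  pathology = factor(sample(c("Cancer", "Hyperplasia"),
                            n_patients * cores_per_patient, replace = TRUE))
)
u   <- rnorm(n_patients, sd = 1.5)   # true patient effects, normal by construction
eta <- -1 + 1 * (sim$pathology == "Cancer") + u[as.integer(sim$patient)]
sim$stain <- rbinom(nrow(sim), size = 1, prob = plogis(eta))

# Logistic mixed model with a random intercept per patient
fit <- glmer(stain ~ pathology + (1 | patient), data = sim, family = binomial)

# Estimated patient effects (conditional modes of the random intercepts)
u_hat <- ranef(fit)$patient[["(Intercept)"]]

# Normal Q-Q plot of the estimated effects
qqnorm(u_hat, main = "Estimated patient effects")
qqline(u_hat)
```

With a binary response and only a few cores per patient, the estimated effects are shrunk towards zero and can take only a limited set of values, so the plot may look non-normal even when the true effects are normal. That is consistent with the caution above about needing similar amounts of information on each patient.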
John Maindonald <john.maindonald <at> anu.edu.au> writes:
...
> The issue of checking for normality of effects in multi-level
> models has not been very much researched, as far as I can
> tell. The function residuals() gives residuals that adjust for
> all except the "highest" level of random effects. Depending
> on the relative magnitudes of the variance components,
> whether or not these "residuals" are anywhere near normal
> may not be of much or any consequence.

For what it is worth, I came across this paper just recently:

http://www3.interscience.wiley.com/cgi-bin/abstract/114280441

Gregor