Mikael Olai Milhøj
2015-Feb-02 22:42 UTC
[R] Plot residuals against standard normal distribution
Hi, I'm having trouble trying to plot the density of the residuals against the standard normal distribution N(0,1). (I'm trying to see if my residuals are well-behaved). I know hwo to calculate the standardized residuals (I guess that there may be a simple way using a R function) and then plot this by using the density function y<-(model$residuals-mean(model$residuals))/sd(model$residuals) plot(density(y)) But I don't know how to add the N(0,1) curve. Any suggestions? Thanks in advance /Mikael [[alternative HTML version deleted]]
peter dalgaard
2015-Feb-02 23:18 UTC
[R] Plot residuals against standard normal distribution
> On 02 Feb 2015, at 23:42 , Mikael Olai Milh?j <mikaelmilhoj at gmail.com> wrote: > > Hi, > > I'm having trouble trying to plot the density of the residuals against the > standard normal distribution N(0,1). (I'm trying to see if my residuals are > well-behaved). > > I know hwo to calculate the standardized residuals (I guess that there may > be a simple way using a R function) and then plot this by using the density > function > > y<-(model$residuals-mean(model$residuals))/sd(model$residuals) > plot(density(y)) > > But I don't know how to add the N(0,1) curve. Any suggestions? Thanks in > advanceI'd try curve(dnorm(x), add=TRUE) Some diddling of ylim= is usually required. I'd usually prefer qqnorm() for normality checks, though; it is pretty hard to assess the tails of density plots. Also notice rstandard(), rstudent(). -pd> > > /Mikael > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Since you are plotting densities, check out the sm package. It has been over a year or so since I've used it, but there was a setting on the univariate densities to check the data against a normal distribution. Best, MEH On Mon, Feb 2, 2015 at 2:42 PM, Mikael Olai Milh?j <mikaelmilhoj at gmail.com> wrote:> Hi, > > I'm having trouble trying to plot the density of the residuals against the > standard normal distribution N(0,1). (I'm trying to see if my residuals are > well-behaved). > > I know hwo to calculate the standardized residuals (I guess that there may > be a simple way using a R function) and then plot this by using the density > function > > y<-(model$residuals-mean(model$residuals))/sd(model$residuals) > plot(density(y)) > > But I don't know how to add the N(0,1) curve. Any suggestions? Thanks in > advance > > > /Mikael > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Mark E. Hall, PhD Assistant Field Manager, Black Rock Field Office Winnemucca District Office 775-623-1529. [[alternative HTML version deleted]]
Mikael Olai Milhøj
2015-Feb-03 20:13 UTC
[R] Plot residuals against standard normal distribution
Thx - this was exactly what I needed. Yes I also prefer the qqnorm but I like the distrubution chart as well. Thx ! /Mikael On Tue, Feb 3, 2015 at 12:18 AM, peter dalgaard <pdalgd at gmail.com> wrote:> > > On 02 Feb 2015, at 23:42 , Mikael Olai Milh?j <mikaelmilhoj at gmail.com> > wrote: > > > > Hi, > > > > I'm having trouble trying to plot the density of the residuals against > the > > standard normal distribution N(0,1). (I'm trying to see if my residuals > are > > well-behaved). > > > > I know hwo to calculate the standardized residuals (I guess that there > may > > be a simple way using a R function) and then plot this by using the > density > > function > > > > y<-(model$residuals-mean(model$residuals))/sd(model$residuals) > > plot(density(y)) > > > > But I don't know how to add the N(0,1) curve. Any suggestions? Thanks in > > advance > > I'd try > > curve(dnorm(x), add=TRUE) > > Some diddling of ylim= is usually required. > > > I'd usually prefer qqnorm() for normality checks, though; it is pretty > hard to assess the tails of density plots. > > Also notice rstandard(), rstudent(). > > -pd > > > > > > > /Mikael > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > -- > Peter Dalgaard, Professor, > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com > > > > > > > > >[[alternative HTML version deleted]]
On Mon, 2 Feb 2015, Mikael Olai Milh?j wrote:> I'm having trouble trying to plot the density of the residuals against > the standard normal distribution N(0,1). (I'm trying to see if my > residuals are well-behaved). > > I know hwo to calculate the standardized residuals (I guess that there > may be a simple way using a R function) and then plot this by using the > density function > > y<-(model$residuals-mean(model$residuals))/sd(model$residuals) > plot(density(y)) > > But I don't know how to add the N(0,1) curve. Any suggestions? Thanks in > advanceThis isn't good for you? qqnorm( residuals( model ) ) Residuals usually have a zero mean, but I guess you can center, anyway, and if there could be NAs, you will need to deal with them: res <- residuals( model ) resStd <- ( res - mean( res, na.rm=TRUE ) ) / sd( res, na.rm=TRUE ) qqnorm( resStd ) Another issue is how to make the theoretical quantiles for the normal distribution. There are a few methods: https://www.statsdirect.com/help/data_preparation/normal_scores.htm I usually use Blom: r <- rank( resStd ) c <- 3/8 N <- sum( !is.na( resStd ) ) resNorm <- qnorm( ( r - c ) / ( N - 2*c + 1 ) ) resNorm[ is.nan( resNorm ) ] <- NA Then you could plot it directly: plot(resNorm, resStd) When we use qqnorm() in R, it looks like R is using a Blom method with c=1/2 instead of c=3/8. I believe Blom recommended 3/8 and programs that offer Blom normal scores use c=3/8. Best, Mike -- Michael B. Miller, Ph.D. Minnesota Center for Twin and Family Research Department of Psychology University of Minnesota http://scholar.google.com/citations?user=EV_phq4AAAAJ
On Sat, 7 Feb 2015, Mike Miller wrote:> res <- residuals( model ) > > resStd <- ( res - mean( res, na.rm=TRUE ) ) / sd( res, na.rm=TRUE ) > > Another issue is how to make the theoretical quantiles for the normal > distribution. There are a few methods: > > https://www.statsdirect.com/help/data_preparation/normal_scores.htm > > I usually use Blom: > > r <- rank( resStd ) > c <- 3/8 > N <- sum( !is.na( resStd ) ) > resNorm <- qnorm( ( r - c ) / ( N - 2*c + 1 ) ) > resNorm[ is.nan( resNorm ) ] <- NA > > Then you could plot it directly: > > plot(resNorm, resStd) > > When we use qqnorm() in R, it looks like R is using a Blom method with > c=1/2 instead of c=3/8. I believe Blom recommended 3/8 and programs > that offer Blom normal scores use c=3/8.I don't know if that was off-track because the OP was asking about density, but he also was asking about checking the distribution of residuals, so maybe this is appropriate. I should add, if you don't mind using R's c=1/2, you can get the normal scores very quickly this way: resNorm <- qqnorm( residuals( model ), plot.it=FALSE )$x Apparently, 11 years ago R was using c=3/8 in qqnorm(), so I guess it changed. Nordheim, Clayton and Yandell wrote about it in this document dated September 9, 2003: https://www.stat.wisc.edu/~yandell/st571/R/append8.pdf It is definitely using c=1/2 today. I don't know where that is documented. When I do a QQ-plot of uniform p-values, I like to add a confidence region with these lmits: qb95 <- qbeta(.95,1:N,N+1-(1:N)) qb05 <- qbeta(.05,1:N,N+1-(1:N)) If we have N observations from a normal distribution with unknown mean and variance, can we create some kind of analogous region on a qqnorm() kind of plot? It seems like there should be a way to get at least an approximate result using beta and t distributions, probably building on the qbeta() code above. Mike