Hi: sorry to bother you all again. I am running a simple lm(y~x+z) regression in which some of the observations are missing. Unfortunately, the residuals vector from the lm object omits all the missing values, which means that I cannot simply do residual diagnostics (e.g., plot(y,x)). Would it not make more sense to have the residuals propagate the missing values, so that the residuals are guaranteed to have the same length as the variables?

Alternatively, maybe the residuals() function could do this instead. But the documentation is not clear:

    Methods can make use of 'naresid' methods to compensate for the
    omission of missing values. The default method does.

How? I have figured out how to write my own function to do what I need (using the names of the residuals object), so this is more a "how to properly do this?" question and/or a "suggestion for improved documentation" than a desperate need of mine.

sincerely,

/iaw
Dear Ivo,

The default na.action is na.omit, which behaves as you describe. Setting options(na.action="na.exclude"), or specifying the argument na.action="na.exclude" in the call to lm(), will produce residuals and other case statistics that have NA for omitted observations. See ?na.exclude and ?lm for details.

I hope that this helps,
John

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of ivo welch
> Sent: Sunday, March 28, 2004 11:25 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] residuals with missing values
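As a minimal sketch of the difference described above, the following fits the same model under both settings. The data frame `d` and its values are made up purely for illustration; only lm(), residuals(), coef(), na.omit and na.exclude come from R itself.

```r
# Toy data (hypothetical): row 2 is missing y, row 3 is missing x.
d <- data.frame(x = c(1, 2, NA, 4, 5, 6, 7, 8),
                z = c(2, 1, 4, 3, 6, 5, 8, 7),
                y = c(1.1, NA, 3.2, 3.9, 5.3, 6.1, 6.8, 8.2))

# Same model, two ways of handling the incomplete rows.
fit_omit    <- lm(y ~ x + z, data = d, na.action = na.omit)
fit_exclude <- lm(y ~ x + z, data = d, na.action = na.exclude)

# The fitted coefficients are the same; only the padding differs.
all.equal(coef(fit_omit), coef(fit_exclude))

res_omit    <- residuals(fit_omit)     # length 6: incomplete rows dropped
res_exclude <- residuals(fit_exclude)  # length 8: NA in rows 2 and 3

length(res_omit)
length(res_exclude)
which(is.na(res_exclude))
```

With the na.exclude fit, the residuals line up with the original rows, so a diagnostic such as plot(d$x, residuals(fit_exclude)) works without any manual re-indexing.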
I had a similar situation, and Brian Ripley said to me:

    If you have missing data in your data frame and want residuals for
    all observations, you need to use na.action=na.exclude, not the
    default na.omit.

-- 
Ajay Shah                                   Consultant
ajayshah at mayin.org                       Department of Economic Affairs
http://www.mayin.org/ajayshah               Ministry of Finance, New Delhi
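On the "How?" in the original question: as far as I understand it, the fitted object stores residuals for the complete cases only, plus a record of the dropped rows in its na.action component, and residuals() hands both to naresid(). The sketch below makes that step explicit; the data frame `d` is hypothetical and the comments reflect my reading of the behaviour rather than a statement from the documentation.

```r
# Toy data (hypothetical): row 3 is missing x, row 4 is missing y.
d <- data.frame(x = c(1, 2, NA, 4, 5, 6),
                y = c(1.2, 1.9, 3.1, NA, 5.2, 5.8))

fit <- lm(y ~ x, data = d, na.action = na.exclude)

# The stored residuals cover only the four complete cases.
raw <- fit$residuals
length(raw)

# The dropped row indices are kept here, tagged with class "exclude".
class(fit$na.action)

# naresid() dispatches on that class and pads the result with NA.
padded <- naresid(fit$na.action, raw)
length(padded)

# This should match what residuals() returns for the same fit.
all.equal(padded, residuals(fit))
```

Under the default na.omit, the na.action component has class "omit" instead, and naresid() returns its input unchanged, which is why the residuals come back shorter than the data.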