Let me try to be more specific.

The x and y coordinates have different lengths because of NAs in the dataset. In this analysis, a set of hat values (a measure of influence in regression) is given for each observation. The regression that was run to get these hat values had a sample size of 1164 (one observation removed due to NA). The length of the data set is 1165. If I remove the NA from the data set, I can get identify() to run. What I would like to know is whether there is a way to get identify() to ignore the NAs.

Thanks in advance,

--
David Kaplan, Ph.D.
Professor
Department of Educational Psychology
University of Wisconsin - Madison
Educational Sciences, Room 1061
1025 W. Johnson Street
Madison, WI 53706

email: dkaplan at education.wisc.edu
homepage: http://www.education.wisc.edu/edpsych/default.aspx?content=kaplan.html
Phone: 608-262-0836

2008/11/20 David Kaplan <dkaplan at education.wisc.edu>:
> [...] What I would like to know is if there is a way to get
> identify to ignore the NAs?

Still not clear. Your failing example was:

    identify(1:1165, hatvalues(scireg3), row.names(sciach))

So are you saying that hatvalues(scireg3) is of length 1164?

What you really want is for hatvalues to return NA in the places where you have missing data. identify is quite happy with NA values - try:

    x = 1:10
    y = runif(10); y[5] = NA
    plot(x, y)
    identify(x, y)

If you can't change hatvalues to do this, then you'll just have to remove the corresponding values of 1:1165 so that it is of length 1164. So something like:

    okdata = !is.na(dataset)   # dataset: the variable that contains the NA
    plot((1:1165)[okdata], hatvalues(scireg3))

Barry
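
Barry's second suggestion (dropping the positions that the model dropped, so that x and y have the same length) can be sketched as follows. This is only an illustration on simulated data: `dat`, `fit`, `dropped` and `used` are made-up names, not objects from the thread, and it assumes the default na.action (na.omit) is in effect.

```r
# Simulated stand-in for the thread's data: 20 rows, one NA in the predictor.
set.seed(1)
dat <- data.frame(x = rnorm(20), y = rnorm(20))
dat$x[5] <- NA

fit <- lm(y ~ x, data = dat)   # default na.action (na.omit) drops row 5
h <- hatvalues(fit)            # one value per row actually used: length 19

# na.action() records which rows were omitted; keep every other row index.
dropped <- na.action(fit)
used <- setdiff(seq_len(nrow(dat)), dropped)

plot(used, h, xlab = "row in original data", ylab = "hat value")
# Interactive, so run it at the console rather than in a script:
# identify(used, h, labels = row.names(dat)[used])
```

Taking the omitted rows from the fitted model rather than from is.na() on the data avoids having to know which variable held the NA.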

Reading between the lines a little, maybe you want

    lm(..., na.action = na.exclude)

That should return missing values for the influence statistics when a predictor or the response is missing in the input.

Hadley

On Thu, Nov 20, 2008 at 4:19 PM, David Kaplan <dkaplan at education.wisc.edu> wrote:
> [...] What I would like to know is if there is a way to get
> identify to ignore the NAs?

--
http://had.co.nz/
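
A minimal sketch of the na.exclude route, again on simulated data (`dat`, `fit_ex`, `h_ex` and `miss` are illustrative names, not from the thread). With na.exclude the influence statistics are padded back to the length of the original data. The padded residual is NA, but the padded hat value may come back as 0 rather than NA depending on the R version, so the sketch sets it to NA explicitly instead of relying on either behaviour.

```r
set.seed(1)
dat <- data.frame(x = rnorm(20), y = rnorm(20))
dat$x[5] <- NA

fit_ex <- lm(y ~ x, data = dat, na.action = na.exclude)
h_ex <- hatvalues(fit_ex)          # padded to length 20 again

# residuals() is NA exactly where a row was excluded; use that as the mask.
miss <- is.na(residuals(fit_ex))
h_ex[miss] <- NA                   # force the placeholder to NA, whatever it was

plot(seq_along(h_ex), h_ex, xlab = "row", ylab = "hat value")
# Interactive, so run it at the console rather than in a script:
# identify(seq_along(h_ex), h_ex, labels = row.names(dat))
```

Because x and y now both have one entry per row of the data, the original call of the form identify(1:n, hatvalues(model), row.names(data)) lines up, and the NA point is simply not drawn.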