justin jarvis
2011-Sep-20 00:49 UTC
[R] Tabulating Baseline Characteristics on specific observations
I have a data set with many missing observations. When I run a regression, R of course discards the observations (the whole row) that have "NA". I want to tabulate some baseline characteristics (column means) but only for the observations that R used for the regression.

I tried to recreate this data frame by using na.omit on the original data frame, but this will not work as this will discard an observation with an "NA" in any column, and not just in the covariates.

In summary, I only want to remove observations that have an "NA" in the covariate columns. Something like Stata's e(sample), as far as I can tell.

Justin Jarvis
PhD student, University of California, Irvine
David Winsemius
2011-Sep-20 11:54 UTC
[R] Tabulating Baseline Characteristics on specific observations
On Sep 19, 2011, at 8:49 PM, justin jarvis wrote:

> I have a data set with many missing observations. When I run a
> regression, R of course discards the observations (the whole row) that
> have "NA". I want to tabulate some baseline characteristics (column
> means) but only for the observations that R used for the regression.
> I tried to recreate this data frame by using na.omit on the original
> data frame, but this will not work as this will discard an observation
> with an "NA" in any column, and not just in the covariates.
>
> In summary, I only want to remove observations that have an "NA" in
> the covariate columns. Something like Stata's e(sample), as far as I
> can tell.

na.omit(subset(dfrm, select = <covariate-vector>))  # or equivalent

> Justin Jarvis
> PhD student, University of California, Irvine

David Winsemius, MD
West Hartford, CT
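
The one-liner above can be sketched on a toy data set as follows. This is only an illustration: the data frame contents and the column names y, x1, x2 and other are made up, and the model is assumed to be a plain lm() call with no subset, weights or transformed terms. The response is included in the selected columns because lm() also drops rows where the response is missing.

```r
# Hypothetical toy data; column names and values are invented for illustration.
dfrm <- data.frame(
  y     = c(1.2, 2.3, NA, 4.1, 5.0, 6.2),
  x1    = c(1, 2, 3, NA, 5, 6),
  x2    = c(0.5, 0.1, 0.9, 0.4, 0.7, 0.3),
  other = c(NA, 10, 20, 30, NA, 60)   # not part of the model
)

fit <- lm(y ~ x1 + x2, data = dfrm)

# Names of the variables in the model formula (response and covariates).
model_vars <- all.vars(formula(fit))

# Approach 1: na.omit() on the model columns only, as suggested above.
used1 <- na.omit(subset(dfrm, select = model_vars))

# Approach 2: a logical index of the rows that are complete in the model
# columns, which keeps every column of the original data frame.
in_sample <- complete.cases(dfrm[, model_vars])
used2 <- dfrm[in_sample, ]

# Baseline characteristics (column means) for the estimation sample.
colMeans(used1)
colMeans(used2, na.rm = TRUE)   # "other" may still contain NA

# Sanity check: the row count should match what lm() actually used.
nrow(used1) == nobs(fit)
```

The second approach is the closer analogue of Stata's e(sample), since in_sample is a row indicator that can be reused on the full data frame. If the model call is more involved, rownames(model.frame(fit)) is another way to recover the rows that were used.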