dear R wizards: an operation I execute often is the deletion of all
observations (in a matrix or data set) that have at least one NA. (I now
need this operation for kde2d, because its internal quantile call
complains; could this be considered a buglet?) usually, my data sets
are small enough for speed not to matter, and there I do not care
whether my method is pretty inefficient (ok, I admit it: I use the sum()
function and test whether the result is NA)---but now I have some bigger
data sets. Is there a recommended method of doing NA elimination most
efficiently? sincerely, /iaw
---
ivo welch
professor of finance and economics
brown / nber / yale

I find complete.cases() to be very useful for this kind of stuff (and
very fast). As in,
> d <- data.frame(x = c(1,2,3,NA,5), y = c(1,NA,3,4,5))
> d
   x  y
1  1  1
2  2 NA
3  3  3
4 NA  4
5  5  5
> complete.cases(d)
[1]  TRUE FALSE  TRUE FALSE  TRUE
> use <- complete.cases(d)
> d[use, ]
  x y
1 1 1
3 3 3
5 5 5
>
-roger
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/
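
For the kde2d case raised in the question, a minimal sketch of how complete.cases() could be applied to two separate vectors, and to a matrix, before the density call. The simulated data, the variable names, and n = 25 are illustrative assumptions, not taken from the thread.

```r
library(MASS)  # provides kde2d(); MASS ships with standard R installations

set.seed(1)                      # illustrative data, not from the thread
x <- rnorm(200)
y <- rnorm(200)
x[c(3, 50)] <- NA                # sprinkle in some missing values
y[c(50, 120)] <- NA

# complete.cases() accepts several vectors and returns one logical per observation
ok <- complete.cases(x, y)

# Only the NA-free pairs reach kde2d(); n = 25 asks for a 25 x 25 grid
dens <- kde2d(x[ok], y[ok], n = 25)

# The same kind of logical index works for the rows of a matrix;
# drop = FALSE keeps the result a matrix even if a single row survives
m <- cbind(x, y)
m_clean <- m[complete.cases(m), , drop = FALSE]
```

Building the index once and reusing it keeps x and y aligned, which filtering each vector separately with !is.na() would not.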
On Wed, 2004-07-07 at 09:35, ivo welch wrote:
> [...] Is there a recommended method of doing NA elimination
> most efficiently? sincerely, /iaw

Take a look at ?complete.cases

HTH,

Marc Schwartz

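
To make the comparison with the sum()-based test from the question concrete, here is a rough sketch on a simulated matrix. The apply() call is only a guess at what the original row-by-row method looks like, and the matrix size and names are assumptions.

```r
set.seed(42)                          # illustrative data, not from the thread
m <- matrix(rnorm(1e4 * 5), ncol = 5)
m[sample(length(m), 500)] <- NA       # scatter NAs through the matrix

# Row-by-row sum, roughly the approach described in the question
keep_apply <- !apply(m, 1, function(r) is.na(sum(r)))

# Vectorised alternatives: one call each, no R-level loop over rows
keep_rowsums <- !is.na(rowSums(m))
keep_cc <- complete.cases(m)

# Caveat for the sum-based tests: Inf + (-Inf) is NaN, and is.na(NaN) is TRUE,
# so a row holding both infinities is flagged although nothing is missing
z <- rbind(c(Inf, -Inf), c(1, 2))
sum_flags <- is.na(rowSums(z))        # flags the first row
cc_flags <- complete.cases(z)         # treats both rows as complete
```

On data without infinities the three indices agree; wrapping each line in system.time() on your own data shows how they compare for speed.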
Hi Ivo

Try ?na.omit

Example:

> d <- data.frame(x = c(1:5,NA), y = c(NA,3:7))
> d
   x  y
1  1 NA
2  2  3
3  3  4
4  4  5
5  5  6
6 NA  7
> do <- na.omit(d)
> do
  x y
2 2 3
3 3 4
4 4 5
5 5 6

I usually pass na.omit within the data argument of a function, i.e.
m <- lm(x~y, data=na.omit(d)). This way you don't have to store two
datasets.

I hope that this helps

Francisco

>From: Marc Schwartz <MSchwartz at MedAnalytics.com>
>Reply-To: MSchwartz at MedAnalytics.com
>To: ivo welch <ivo_welch at mailblocks.com>
>CC: R-Help <r-help at stat.math.ethz.ch>
>Subject: Re: [R] fast NA elimination ?
>Date: Wed, 07 Jul 2004 09:41:39 -0500
>
>Take a look at ?complete.cases
>
>HTH,
>
>Marc Schwartz
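
A small follow-up sketch on the na.omit() suggestion, reusing the data frame from Francisco's example. It shows where na.omit() records the rows it dropped and that lm() handles NAs through its own na.action argument, so the two fits should agree. The names d_clean, dropped, fit_raw and fit_clean are made up for illustration.

```r
d <- data.frame(x = c(1:5, NA), y = c(NA, 3:7))   # data from the example in this message

d_clean <- na.omit(d)

# na.omit() records which rows it dropped in the "na.action" attribute
dropped <- as.integer(attr(d_clean, "na.action"))

# lm() applies an na.action itself (na.omit under R's default options; it is
# passed explicitly here), so both calls fit the same four complete rows
fit_raw <- lm(x ~ y, data = d, na.action = na.omit)
fit_clean <- lm(x ~ y, data = d_clean)
```

For functions that do not take an na.action argument, such as kde2d, the cleaned copy or a complete.cases() index is still needed.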