Hi all,

How can one evaluate NAs in a numeric dataframe column? For example, I have
a dataframe (demo) with a column of numbers and several NAs. If I write
demo.df >= 10, numerals will return TRUE or FALSE, but if the value is
"NA", "NA" is returned. But if I write demo.df == "NA", it returns as "NA"
also. I know that I can remove NAs, but would like to keep the dataframe as
is without creating a subset. I basically want to add a line that evaluates
the NA in the demo dataframe.

As an example, I want to assign rows to classes based on values in
demo$Area. Some of the values in demo$Area are "NA".

  for (i in 1:nrow(demo)) {
    if (demo$Area[i] > 0 && demo$Area[i] < 10) {Class[i] <- "S01"}        ## 1-10 cm2
    if (demo$Area[i] >= 10 && demo$Area[i] < 25) {Class[i] <- "S02"}      ## 10-25 cm2
    if (demo$Area[i] >= 25 && demo$Area[i] < 50) {Class[i] <- "S03"}      ## 25-50 cm2
    if (demo$Area[i] >= 50 && demo$Area[i] < 100) {Class[i] <- "S04"}     ## 50-100 cm2
    if (demo$Area[i] >= 100 && demo$Area[i] < 200) {Class[i] <- "S05"}    ## 100-200 cm2
    if (demo$Area[i] >= 200 && demo$Area[i] < 400) {Class[i] <- "S06"}    ## 200-400 cm2
    if (demo$Area[i] >= 400 && demo$Area[i] < 800) {Class[i] <- "S07"}    ## 400-800 cm2
    if (demo$Area[i] >= 800 && demo$Area[i] < 1600) {Class[i] <- "S08"}   ## 800-1600 cm2
    if (demo$Area[i] >= 1600 && demo$Area[i] < 3200) {Class[i] <- "S09"}  ## 1600-3200 cm2
    if (demo$Area[i] >= 3200) {Class[i] <- "S10"}                         ## >3200 cm2
  }

What happens is that I get the message "Error in if (demo$Area[i] > 0 &&
demo$Area[i] < 10) { : missing value where TRUE/FALSE needed"

Thanks for any help

Wade

On 2010-12-08 12:10, Wade Wall wrote:
> Hi all,
>
> How can one evaluate NAs in a numeric dataframe column?
[...]
> What happens is that I get the message "Error in if (demo$Area[i] > 0 &&
> demo$Area[i] < 10) { : missing value where TRUE/FALSE needed"

You don't say what you want to have occur when x is NA. (I don't know
what 'evaluate NA' means.) But why not just use something like:

  for(....){
    if(!is.na(x[i])){
      .... your stuff, preferably replacing '&&' with '&' ....
    } else {....}
  }

Peter Ehlers
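
For anyone following along, here is a minimal sketch of why the comparisons in the original post come back as NA and how an is.na() guard avoids the error. The vector `area` and the result `cls` are made up for illustration and are not from the original post; only two of the ten class tests are shown to keep it short.

```r
# Made-up data for illustration; `area` is not from the original post.
area <- c(5, NA, 120)

area >= 10     # FALSE NA TRUE: a comparison with NA yields NA
area == "NA"   # FALSE NA FALSE: "NA" is an ordinary string, not a missing value
is.na(area)    # FALSE TRUE FALSE: the reliable test for missingness

# Guard the scalar tests so that if() never sees an NA condition.
cls <- character(length(area))
for (i in seq_along(area)) {
  if (!is.na(area[i])) {
    if (area[i] > 0 && area[i] < 10) cls[i] <- "S01"
    if (area[i] >= 100 && area[i] < 200) cls[i] <- "S05"
  } else {
    cls[i] <- NA_character_   # assumption: missing areas get a missing class
  }
}
cls
```

What to put in the else branch is a choice; NA_character_ is used here only as one plausible option.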

Hi!

> How can one evaluate NAs in a numeric dataframe column? For example, I have
> a dataframe (demo) with a column of numbers and several NAs. If I write
> demo.df >= 10, numerals will return TRUE or FALSE, but if the value is
> "NA", "NA" is returned. But if I write demo.df == "NA", it returns as "NA"

Sounds like you are looking for is.na:

  > is.na(c(1,NA,3))
  [1] FALSE  TRUE FALSE

> As an example, I want to assign rows to classes based on values in
> demo$Area. Some of the values in demo$Area are "NA"
>
> for (i in 1:nrow(demo)) {
> if (demo$Area[i] > 0 && demo$Area[i] < 10) {Class[i]<-"S01"} ## 1-10 cm2
> if (demo$Area[i] >= 10 && demo$Area[i] < 25) {Class[i] <- "S02"} ## 10-25cm2
[...]
> if (demo$Area[i] >=3200) {Class[i] <- "S10"} ## >3200 cm2
> }
>
> What happens is that I get the message "Error in if (demo$Area[i] > 0 &&
> demo$Area[i] < 10) { : missing value where TRUE/FALSE needed"

First of all, you don't need a loop here. Example:

  # make up some data
  foo <- data.frame(a=sample(1:20, 20, replace=TRUE))

  # assign to classes
  foo$class <- cut(foo$a, breaks=c(-1, 7, 13, 20),
                   labels=c('small', 'medium', 'large'))

This also works in the presence of NAs - but of course the class will
be NA in those cases which, at least in my opinion, is the correct
value.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
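
As a sketch of how the cut() idea might map onto the ten classes from the original post: the vector `area` and the names `breaks`, `labels` and `cls` are made up here, and right = FALSE is an assumption chosen to get half-open intervals such as [10, 25), matching the >= / < tests in the original loop.

```r
# Made-up areas, including a missing value and two boundary values.
area <- c(5, 10, NA, 3200, 99.9)

# Breakpoints taken from the class limits in the original post;
# Inf closes the open-ended top class.
breaks <- c(0, 10, 25, 50, 100, 200, 400, 800, 1600, 3200, Inf)
labels <- sprintf("S%02d", 1:10)   # "S01" ... "S10"

# right = FALSE makes each interval closed on the left, open on the right.
cls <- cut(area, breaks = breaks, labels = labels, right = FALSE)
as.character(cls)   # "S01" "S02" NA "S10" "S04"
```

One difference from the original loop: with these breaks an area of exactly 0 falls into S01, whereas the loop required Area > 0. Negative areas come back as NA.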

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Wade Wall
> Sent: Wednesday, December 08, 2010 12:11 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] evaluating NAs in a dataframe
>
> Hi all,
>
> How can one evaluate NAs in a numeric dataframe column?
[...]
> What happens is that I get the message "Error in if (demo$Area[i] > 0 &&
> demo$Area[i] < 10) { : missing value where TRUE/FALSE needed"
>
> Thanks for any help
>
> Wade

Wade,

As you have discovered, you need to test for NA first, and to do
that you need to use is.na(). Something like this should work:

  for (i in 1:nrow(demo)) {
    if (is.na(demo$Area[i])) Class[i] <- "Sna"
    else if (demo$Area[i] < 10) Class[i] <- "S01"
    else if (demo$Area[i] < 25) Class[i] <- "S02"
    else if (demo$Area[i] < 50) Class[i] <- "S03"
    else if (demo$Area[i] < 100) Class[i] <- "S04"
    else if (demo$Area[i] < 200) Class[i] <- "S05"
    else if (demo$Area[i] < 400) Class[i] <- "S06"
    else if (demo$Area[i] < 800) Class[i] <- "S07"
    else if (demo$Area[i] < 1600) Class[i] <- "S08"
    else if (demo$Area[i] < 3200) Class[i] <- "S09"
    else Class[i] <- "S10"
  }

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
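
One detail the loop leaves implicit is that `Class` has to exist before `Class[i] <- ...` can be evaluated. A minimal self-contained sketch, assuming a made-up `demo` data frame and a preallocated character vector (neither is from the original post):

```r
# Made-up data frame; only the column name Area comes from the thread.
demo <- data.frame(Area = c(5, NA, 30, 5000))

# Preallocate the result so the indexed assignments have a target.
Class <- character(nrow(demo))

for (i in seq_len(nrow(demo))) {
  a <- demo$Area[i]
  if (is.na(a)) {
    Class[i] <- "Sna"
  } else if (a < 10) {
    Class[i] <- "S01"
  } else if (a < 25) {
    Class[i] <- "S02"
  } else if (a < 50) {
    Class[i] <- "S03"
  } else if (a < 100) {
    Class[i] <- "S04"
  } else if (a < 200) {
    Class[i] <- "S05"
  } else if (a < 400) {
    Class[i] <- "S06"
  } else if (a < 800) {
    Class[i] <- "S07"
  } else if (a < 1600) {
    Class[i] <- "S08"
  } else if (a < 3200) {
    Class[i] <- "S09"
  } else {
    Class[i] <- "S10"
  }
}

demo$Class <- Class   # keeps the data frame intact, with NAs labelled "Sna"
```

Note that this chain, like the one above, also puts zero and negative areas into S01, which the original loop did not.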

On Dec 8, 2010, at 3:10 PM, Wade Wall wrote:

> Hi all,
>
> How can one evaluate NAs in a numeric dataframe column? For example, I have
> a dataframe (demo) with a column of numbers and several NAs. If I write
> demo.df >= 10, numerals will return TRUE or FALSE, but if the value is
> "NA", "NA" is returned. But if I write demo.df == "NA", it returns as "NA"
> also. I know that I can remove NAs, but would like to keep the dataframe as
> is without creating a subset. I basically want to add a line that evaluates
> the NA in the demo dataframe.
>
> As an example, I want to assign rows to classes based on values in
> demo$Area. Some of the values in demo$Area are "NA"
>
> for (i in 1:nrow(demo)) {
> if (demo$Area[i] > 0 && demo$Area[i] < 10) {Class[i]<-"S01"} ## 1-10 cm2
[...]
> if (demo$Area[i] >=3200) {Class[i] <- "S10"} ## >3200 cm2
> }

That looks really, really painful. Why not use the function
findInterval and then do a lookup in a character vector. Then you can
throw away that loopy construct completely.

  > demo <- data.frame(Area = runif(10, 0, 100))
  > demo$catarea <- findInterval(demo$Area, c(0,25,50,75,100))
  > demo
          Area catarea
  1  71.440401       3
  2   8.438097       1
  3  45.492178       2
  4  50.669996       3
  5  15.444114       1
  6  33.954948       2
  7  19.738747       1
  8  56.485654       3
  9  29.218921       2
  10 74.204611       3
  > demo$catname <- c("S01","S02", "S03","S04")[demo$catarea]
  > demo
          Area catarea catname
  1  71.440401       3     S03
  2   8.438097       1     S01
  3  45.492178       2     S02
  4  50.669996       3     S03
  5  15.444114       1     S01
  6  33.954948       2     S02
  7  19.738747       1     S01
  8  56.485654       3     S03
  9  29.218921       2     S02
  10 74.204611       3     S03

--
David.

> What happens is that I get the message "Error in if (demo$Area[i] > 0 &&
> demo$Area[i] < 10) { : missing value where TRUE/FALSE needed"
>
> Thanks for any help
>
> Wade

David Winsemius, MD
West Hartford, CT
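
The findInterval() lookup can be sketched with the breakpoints from the original post and with missing values present. The vector `area` and the names `idx` and `cls` are made up for illustration; mapping index 0 to NA is an assumption about what should happen to values below the first breakpoint.

```r
# Made-up areas: an ordinary value, a missing one, the top boundary,
# and a value below the first breakpoint.
area <- c(5, NA, 3200, -1)

# Lower limits of the ten classes from the original post.
breaks <- c(0, 10, 25, 50, 100, 200, 400, 800, 1600, 3200)
labels <- sprintf("S%02d", 1:10)   # "S01" ... "S10"

idx <- findInterval(area, breaks)  # 1 NA 10 0: NA in gives NA out

# An index of 0 would silently drop that element in the lookup below,
# so turn it into NA first (assumption: such values get no class).
idx[which(idx == 0)] <- NA

cls <- labels[idx]
cls   # "S01" NA "S10" NA
```

Without the index-0 step, the result would be shorter than the input, which makes it unsafe to assign back into a data frame column.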