Dear R-community,

I am using R (version 2.14.1) on Windows 7. I have a dataset which consists of 19 variables for 91 individuals (rows). Two of my variables are Age (adult/chick, with no NA values) and Sex (0 for females/1 for males, with quite a few NA values). The sex of many adult birds is unknown (entered as NA in the dataframe).

At some point in my analyses I need to work with only the adult males, so I tried subsetting the dataframe as follows (see code below). I get a new dataframe containing all the males, but also a lot of unneeded information, such as data in rows 1-7 (NAs), 13, 14, 19 and 21-30. I suspect this is caused by the NAs in the variable Sex, because everything goes fine (I get a dataframe containing adults) if I run the same code without the "& Data$Sex == 1" part.

How can I fix this problem? Is there a straightforward way of subsetting efficiently when NAs are present in the original dataset?

Thank you so much!

Luciano

adult.males <- Data[Data$Category == "Adult" & Data$Sex == 1,]
adult.males

         ID Category Sex  Beak   Head
NA     <NA>     <NA>  NA    NA     NA
NA.1   <NA>     <NA>  NA    NA     NA
NA.2   <NA>     <NA>  NA    NA     NA
NA.3   <NA>     <NA>  NA    NA     NA
NA.4   <NA>     <NA>  NA    NA     NA
NA.5   <NA>     <NA>  NA    NA     NA
NA.6   <NA>     <NA>  NA    NA     NA
9     LAA10    Adult   1 57.40 121.95
10    LAA11    Adult   1 56.40 113.00
11    LAA12    Adult   1 52.00 111.85
13    LAA14    Adult   1 56.55 124.85
15    LAA16    Adult   1 57.15 120.10
NA.7   <NA>     <NA>  NA    NA     NA
NA.8   <NA>     <NA>  NA    NA     NA
21    LAA22    Adult   1 56.85 117.35
22    LAA23    Adult   1 54.80 117.45
27    LAA28    Adult   1 59.00 116.75
28    LAA29    Adult   1 55.95 124.25
NA.9   <NA>     <NA>  NA    NA     NA
30    LAA31    Adult   1 57.70 112.80
NA.10  <NA>     <NA>  NA    NA     NA
NA.11  <NA>     <NA>  NA    NA     NA
NA.12  <NA>     <NA>  NA    NA     NA
NA.13  <NA>     <NA>  NA    NA     NA
NA.14  <NA>     <NA>  NA    NA     NA
NA.15  <NA>     <NA>  NA    NA     NA
NA.16  <NA>     <NA>  NA    NA     NA
NA.17  <NA>     <NA>  NA    NA     NA
NA.18  <NA>     <NA>  NA    NA     NA
NA.19  <NA>     <NA>  NA    NA     NA
Hello,

Try ?is.na. In the example below I've changed your first row name from NA to NA.0.

# na.strings makes read.table treat the printed <NA> entries as missing
# values rather than as the literal text "<NA>".
x <- read.table(text="
ID Category Sex Beak Head
NA.0 <NA> <NA> NA NA NA
NA.1 <NA> <NA> NA NA NA
NA.2 <NA> <NA> NA NA NA
NA.3 <NA> <NA> NA NA NA
NA.4 <NA> <NA> NA NA NA
NA.5 <NA> <NA> NA NA NA
NA.6 <NA> <NA> NA NA NA
9 LAA10 Adult 1 57.40 121.95
10 LAA11 Adult 1 56.40 113.00
11 LAA12 Adult 1 52.00 111.85
13 LAA14 Adult 1 56.55 124.85
15 LAA16 Adult 1 57.15 120.10
NA.7 <NA> <NA> NA NA NA
NA.8 <NA> <NA> NA NA NA
21 LAA22 Adult 1 56.85 117.35
22 LAA23 Adult 1 54.80 117.45
27 LAA28 Adult 1 59.00 116.75
28 LAA29 Adult 1 55.95 124.25
NA.9 <NA> <NA> NA NA NA
30 LAA31 Adult 1 57.70 112.80
NA.10 <NA> <NA> NA NA NA
NA.11 <NA> <NA> NA NA NA
NA.12 <NA> <NA> NA NA NA
NA.13 <NA> <NA> NA NA NA
NA.14 <NA> <NA> NA NA NA
NA.15 <NA> <NA> NA NA NA
NA.16 <NA> <NA> NA NA NA
NA.17 <NA> <NA> NA NA NA
NA.18 <NA> <NA> NA NA NA
NA.19 <NA> <NA> NA NA NA
", header=TRUE, na.strings=c("NA", "<NA>"))

rownames(x) <- seq.int(nrow(x))   # replace the NA.* row names with 1..n
head(x)

i1 <- is.na(x$Category)
i2 <- is.na(x$Sex)
x[!i1 & !i2, ]                    # keep rows where neither Category nor Sex is missing

Hope this helps,

Rui Barradas

--
View this message in context: http://r.789695.n4.nabble.com/Subsetting-dataframe-with-missing-values-tp4590819p4591052.html
Sent from the R help mailing list archive at Nabble.com.
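As background to the suggestion above: a comparison with NA returns NA, and a data frame indexed with a logical vector that contains NA gets one all-NA row for each NA. The sketch below shows this on a small made-up data frame; the name `birds` and all of its values are hypothetical and are not the poster's data.

```r
# Hypothetical toy data; the values are invented for illustration only.
birds <- data.frame(
  ID       = c("A1", "A2", "A3", "C1"),
  Category = c("Adult", "Adult", "Adult", "Chick"),
  Sex      = c(1, NA, 0, NA),
  stringsAsFactors = FALSE
)

# TRUE & NA is NA, while FALSE & NA is FALSE,
# so the index is TRUE, NA, FALSE, FALSE.
idx <- birds$Category == "Adult" & birds$Sex == 1

# Each NA in a logical index yields an all-NA row in the result.
with_na_rows <- birds[idx, ]

# Requiring a non-missing Sex turns those NAs into FALSE.
adult_males <- birds[birds$Category == "Adult" &
                     !is.na(birds$Sex) &
                     birds$Sex == 1, ]
```

Here `with_na_rows` contains the one real adult male plus one all-NA row, while `adult_males` contains only the real row.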
Hi

I usually do it in 2 lines:

selection <- which(Data$Category == "Adult" & Data$Sex == 1)
Data[selection, ]

This could be what you want. which() returns only the positions where the condition is TRUE, so the rows where the comparison evaluates to NA are left out.
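To illustrate the which() approach, here is a minimal sketch on a small made-up data frame. The name `birds` and its values are hypothetical, invented only for this example.

```r
# Hypothetical toy data; the values are invented for illustration only.
birds <- data.frame(
  ID       = c("A1", "A2", "A3", "C1"),
  Category = c("Adult", "Adult", "Adult", "Chick"),
  Sex      = c(1, NA, 0, NA),
  stringsAsFactors = FALSE
)

# The logical condition contains an NA for the adult of unknown sex.
cond <- birds$Category == "Adult" & birds$Sex == 1

# which() keeps only the positions that are TRUE; NA and FALSE are dropped.
keep <- which(cond)

# Indexing by integer positions therefore cannot produce all-NA rows.
adult_males <- birds[keep, ]
```

One caveat with this style: if no row matches, `which()` returns an empty vector, and negative indexing such as `birds[-keep, ]` would then return zero rows instead of all rows, so it is safer to use the positions only for positive selection as above.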
Or you can drop the NA rows from the result you already have:

adult.males <- adult.males[!is.na(adult.males$Sex),]

Regards
Petr

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.