Dear colleagues,

I do not understand the following behaviour:

> aa <- data.frame(a1= 1:10, a2= c(rep(NA, 5), 1:5) )
> aa
   a1 a2
1   1 NA
2   2 NA
3   3 NA
4   4 NA
5   5 NA
6   6  1
7   7  2
8   8  3
9   9  4
10 10  5
> aa[!aa$a2==1, ]  # removing rows with a2==1
     a1 a2
NA   NA NA
NA.1 NA NA
NA.2 NA NA
NA.3 NA NA
NA.4 NA NA
7     7  2
8     8  3
9     9  4
10   10  5

I didn't expect a1 to be affected.
Is aa[!aa$a2==1, ] an incorrect way to remove rows? Any other way?

(R 1.8.1. for Windows)
Thanks in advance

Juli
On Thu, 18 Dec 2003 15:03:04 +0100 juli g. pausas wrote:

> Dear colleagues,
> I do not understand the following behaviour:
>
> > aa <- data.frame(a1= 1:10, a2= c(rep(NA, 5), 1:5) )
> > aa
>    a1 a2
> 1   1 NA
> 2   2 NA
> 3   3 NA
> 4   4 NA
> 5   5 NA
> 6   6  1
> 7   7  2
> 8   8  3
> 9   9  4
> 10 10  5
> > aa[!aa$a2==1, ]  # removing rows with a2==1
>      a1 a2
> NA   NA NA
> NA.1 NA NA
> NA.2 NA NA
> NA.3 NA NA
> NA.4 NA NA
> 7     7  2
> 8     8  3
> 9     9  4
> 10   10  5
>
> I didn't expect a1 to be affected.
> Is aa[!aa$a2==1, ] an incorrect way to remove rows?

It leads to the behaviour above whenever there are NAs in the logical vector used for indexing:

R> !aa$a2==1
 [1]    NA    NA    NA    NA    NA FALSE  TRUE  TRUE  TRUE  TRUE

> Any other way?

Several other ways are conceivable, depending on how the NA rows should be treated. This precise problem is solved, e.g., by

R> aa[-which(aa$a2==1), ]
   a1 a2
1   1 NA
2   2 NA
3   3 NA
4   4 NA
5   5 NA
7   7  2
8   8  3
9   9  4
10 10  5

hth,
Z

> (R 1.8.1. for Windows)
> Thanks in advance
>
> Juli
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Take a look at what (aa$a2 == 1) returns and it may clear things up. Try

aa[-which(aa$a2 == 1), ]

or

subset(aa, a2 != 1 | is.na(a2))

HTH,
Sundar

juli g. pausas wrote:

> Dear colleagues,
> I do not understand the following behaviour:
>
> > aa <- data.frame(a1= 1:10, a2= c(rep(NA, 5), 1:5) )
> > aa
>    a1 a2
> 1   1 NA
> 2   2 NA
> 3   3 NA
> 4   4 NA
> 5   5 NA
> 6   6  1
> 7   7  2
> 8   8  3
> 9   9  4
> 10 10  5
> > aa[!aa$a2==1, ]  # removing rows with a2==1
>      a1 a2
> NA   NA NA
> NA.1 NA NA
> NA.2 NA NA
> NA.3 NA NA
> NA.4 NA NA
> 7     7  2
> 8     8  3
> 9     9  4
> 10   10  5
>
> I didn't expect a1 to be affected.
> Is aa[!aa$a2==1, ] an incorrect way to remove rows? Any other way?
>
> (R 1.8.1. for Windows)
> Thanks in advance
>
> Juli
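A small sketch comparing the two suggestions above, with one caveat that may be worth keeping in mind. Both forms drop only row 6 here, but -which() relies on at least one row matching: when nothing matches, which() returns integer(0), and a negative index built from an empty vector selects zero rows rather than all of them. The value 99 and the names by_which, by_subset, n_empty, keep and n_keep are assumptions made up for illustration.

```r
aa <- data.frame(a1 = 1:10, a2 = c(rep(NA, 5), 1:5))

# Both suggestions drop only the row with a2 == 1 and keep the NA rows.
by_which  <- aa[-which(aa$a2 == 1), ]
by_subset <- subset(aa, a2 != 1 | is.na(a2))

# Caveat: no row has a2 == 99, so which() returns integer(0),
# and aa[-integer(0), ] selects zero rows instead of all ten.
n_empty <- nrow(aa[-which(aa$a2 == 99), ])

# A logical index that handles NA explicitly works in both cases.
keep   <- is.na(aa$a2) | aa$a2 != 99
n_keep <- nrow(aa[keep, ])
```

If the value to remove might be absent from the column, the logical form with is.na() (or subset()) is probably the safer habit.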
On Thu, 18 Dec 2003, juli g. pausas wrote:

> Dear colleagues,
> I do not understand the following behaviour:
>
> > aa <- data.frame(a1= 1:10, a2= c(rep(NA, 5), 1:5) )
> > aa[!aa$a2==1, ]  # removing rows with a2==1
>      a1 a2
> NA   NA NA
> NA.1 NA NA
> NA.2 NA NA
> NA.3 NA NA
> NA.4 NA NA
> 7     7  2
> 8     8  3
> 9     9  4
> 10   10  5
>
> I didn't expect a1 to be affected.

You should think of NA as being pronounced "Don't Know". That is, you are asking for all rows where a2==1 to be removed, and you don't know whether to remove the first five rows. The result is that you don't know what the result is in the first five rows.

There is a good case for making this give an error or warning.

-thomas
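The "Don't Know" reading can be checked directly at the prompt. The following is only a minimal sketch: the three-row data frame and the name res are made up for illustration and are not part of the original example.

```r
# NA propagates through comparison and negation.
eq  <- NA == 1   # unknown whether it equals 1
neg <- !NA       # the negation of unknown is still unknown

# It resolves only when the other operand decides the outcome.
and_false <- NA & FALSE
or_true   <- NA | TRUE

# An NA in a logical row index yields a row of NAs, not a dropped row.
aa  <- data.frame(a1 = 1:3, a2 = c(NA, 1, 2))
res <- aa[c(NA, FALSE, TRUE), ]
```

Here res has two rows: one filled with NA (for the unknown index) and the original third row, which mirrors what happened to the first five rows in the question.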