apjaworski@mmm.com
2002-Mar-06 23:36 UTC
[R] Strange behavior when subsetting data frames with NAs
Here is what I get using R 1.4.1 on Win2k (using the precompiled version from CRAN) and RH 7.2 Linux (compiled from source):

> data.frame(a=c(1, 2, 3, NA, NA), b=c(3, 1, 3, NA, NA)) -> zz
> zz[zz[,2]>2, ]
     a  b
X1   1  3
X3   3  3
NA  NA NA
NA1 NA NA

(if there are more rows with NAs, I get consecutive labels NA2, NA3, ...)

> zz1 <- na.omit(zz)
> zz1[zz1[,2]>2, ]
  a b
1 1 3
3 3 3

also

> as.matrix(zz) -> zz
> zz[zz[,2]>2, ]
    a  b
1   1  3
3   3  3
NA NA NA
NA NA NA

I am not sure if this is a bug or a feature, so I am reporting it here.

Andy

__________________________________
Andy Jaworski
Engineering Systems Technology Center
3M Center, 518-1-01
St. Paul, MN 55144-1000
-----
E-mail: apjaworski at mmm.com
Tel: (651) 733-6092
Fax: (651) 736-3122

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Gary Collins
2002-Mar-07 07:02 UTC
[R] Re: Strange behavior when subsetting data frames with NAs
>Here is what I get using R 1.4.1 on Win2k (using precompiled version from
>CRAN) and RH 7.2 Linux (compiled from source):
>
> > data.frame(a=c(1, 2, 3, NA, NA), b=c(3, 1, 3, NA, NA)) -> zz
> > zz[zz[,2]>2, ]
>      a  b
> X1   1  3
> X3   3  3
> NA  NA NA
> NA1 NA NA
>
>(if there are more rows with NAs, I get consecutive labels NA2, NA3, ...)
>
> > zz1 <- na.omit(zz)
> > zz1[zz1[,2]>2, ]
>   a b
> 1 1 3
> 3 3 3
>
>also
>
> > as.matrix(zz) -> zz
> > zz[zz[,2]>2, ]
>     a  b
> 1   1  3
> 3   3  3
> NA NA NA
> NA NA NA

In the second case, you have not done an na.omit() operation, as you did in the case of the data.frame. So if you did

> na.omit(zz[zz[,2]>2,])
   a b
X1 1 3
X3 3 3

this compares to the data.frame operation.

> I am not sure if this is bug or a feature, so I am reporting it here.
>Andy

__________________________________________________
Gary S. Collins, PhD,
Statistics Research Fellow, Quality of Life Unit,
European Organisation for Research and Treatment of Cancer,
EORTC Data Center,
Avenue E. Mounier 83, bte. 11,
B-1200 Brussels, Belgium.
Tel: +32 2 774 1 606
Fax: +32 2 779 4 568
http://www.eortc.be/home/qol/
__________________________________________________
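A minimal sketch of the comparison suggested above, using the data from the thread. The variable names (zm, df_raw, mat_raw, df_clean, mat_clean) are illustrative only and do not appear in the original messages. It applies na.omit() after subsetting to both the data frame and the matrix, so the two results can be compared on equal terms.

```r
# Same data as in the thread.
zz <- data.frame(a = c(1, 2, 3, NA, NA), b = c(3, 1, 3, NA, NA))
zm <- as.matrix(zz)

# Subsetting alone keeps one all-NA row per NA in the condition.
df_raw  <- zz[zz[, 2] > 2, ]
mat_raw <- zm[zm[, 2] > 2, ]

# Applying na.omit() afterwards drops those rows in both cases.
df_clean  <- na.omit(df_raw)
mat_clean <- na.omit(mat_raw)

print(df_clean)
print(mat_clean)
```

Either way only the rows where b > 2 is actually TRUE should remain; the row labels that are printed may differ from the R 1.4.1 output quoted in the thread.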
Prof Brian D Ripley
2002-Mar-07 07:22 UTC
[R] Strange behavior when subsetting data frames with NAs
On Wed, 6 Mar 2002 apjaworski at mmm.com wrote:

> Here is what I get using R 1.4.1 on Win2k (using precompiled version from
> CRAN) and RH 7.2 Linux (compiled from source):
>
> > data.frame(a=c(1, 2, 3, NA, NA), b=c(3, 1, 3, NA, NA)) -> zz
> > zz[zz[,2]>2, ]
>      a  b
> X1   1  3
> X3   3  3
> NA  NA NA
> NA1 NA NA
>
> (if there are more rows with NAs, I get consecutive labels NA2, NA3, ...)
>
> > zz1 <- na.omit(zz)
> > zz1[zz1[,2]>2, ]
>   a b
> 1 1 3
> 3 3 3
>
> also
>
> > as.matrix(zz) -> zz
> > zz[zz[,2]>2, ]
>     a  b
> 1   1  3
> 3   3  3
> NA NA NA
> NA NA NA
>
> I am not sure if this is bug or a feature, so I am reporting it here.

What exactly do you find strange? It is the correct behaviour and replicates that of S. Remember that data frames have to have unique row names, and you asked for rows

> zz[,2]>2
[1]  TRUE FALSE  TRUE    NA    NA

so new row names have to be created. Matrices do not have to have unique dimnames.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
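To make the point about the row index concrete, here is a small sketch with the thread's data. The names idx, with_na, only_true and via_subset are made up for illustration. It shows the logical index containing NA, and two common ways to select only the rows where the condition is TRUE; neither was proposed in the thread itself.

```r
zz <- data.frame(a = c(1, 2, 3, NA, NA), b = c(3, 1, 3, NA, NA))

# The comparison is NA wherever b is NA.
idx <- zz[, 2] > 2
print(idx)

# An NA in a logical row index selects an all-NA row, so 4 rows come back.
with_na <- zz[idx, ]

# which() keeps only the positions that are TRUE, dropping the NAs.
only_true <- zz[which(idx), ]

# subset() also treats NA in the condition as FALSE.
via_subset <- subset(zz, b > 2)

print(only_true)
```

Because no NA rows are requested in the last two forms, no new row names need to be invented and the original labels 1 and 3 are kept.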
apjaworski@mmm.com
2002-Mar-07 15:38 UTC
[R] Strange behavior when subsetting data frames with NAs
I guess I did not expect the change in row names for rows without NAs, that is, the change from (1, 3) to (X1, X3) in my zz example.

Andy
Prof Brian D Ripley
2002-Mar-07 16:27 UTC
[R] Strange behavior when subsetting data frames with NAs
On Thu, 7 Mar 2002 apjaworski at mmm.com wrote:

> I guess I did not expect the change in row names for rows without NAs, that
> is the change from (1, 3) to (X1, X3) in my zz example.

It tells you this is a new set of row names, not just the old ones repeated.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
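The uniqueness requirement can also be seen without any NAs. The sketch below, with illustrative names df_sub, zm and mat_sub, selects row 1 twice: the data frame has to relabel the rows to keep the names unique, while a matrix keeps the duplicated label. The exact labels a data frame generates depend on the R version, so they need not match the X1/NA1 style shown for R 1.4.1 in the thread.

```r
zz <- data.frame(a = c(1, 2, 3, NA, NA), b = c(3, 1, 3, NA, NA))

# Selecting row 1 twice: the data frame must relabel to stay unique.
df_sub <- zz[c(1, 1, 3), ]
print(rownames(df_sub))

# A matrix with explicit row names keeps the duplicated label as is.
zm <- as.matrix(zz)
rownames(zm) <- as.character(1:5)
mat_sub <- zm[c(1, 1, 3), ]
print(rownames(mat_sub))
```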