apjaworski@mmm.com
2002-Mar-06 23:36 UTC
[R] Strange behavior when subsetting data frames with NAs
Here is what I get using R 1.4.1 on Win2k (using precompiled version from
CRAN) and RH 7.2 Linux (compiled from source):
> data.frame(a=c(1, 2, 3, NA, NA), b=c(3, 1, 3, NA, NA)) -> zz
> zz[zz[,2]>2, ]
a b
X1 1 3
X3 3 3
NA NA NA
NA1 NA NA
(if there are more rows with NAs, I get consecutive labels NA2, NA3, ...)
> zz1 <- na.omit(zz)
> zz1[zz1[,2]>2, ]
a b
1 1 3
3 3 3
also
> as.matrix(zz) -> zz
> zz[zz[,2]>2, ]
a b
1 1 3
3 3 3
NA NA NA
NA NA NA
I am not sure if this is a bug or a feature, so I am reporting it here.
Andy
__________________________________
Andy Jaworski
Engineering Systems Technology Center
3M Center, 518-1-01
St. Paul, MN 55144-1000
-----
E-mail: apjaworski at mmm.com
Tel: (651) 733-6092
Fax: (651) 736-3122
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Gary Collins
2002-Mar-07 07:02 UTC
[R] Re: Strange behavior when subsetting data frames with NAs
> Here is what I get using R 1.4.1 on Win2k (using precompiled version from
> CRAN) and RH 7.2 Linux (compiled from source):
>
> > data.frame(a=c(1, 2, 3, NA, NA), b=c(3, 1, 3, NA, NA)) -> zz
> > zz[zz[,2]>2, ]
> a b
> X1 1 3
> X3 3 3
> NA NA NA
> NA1 NA NA
> (if there are more rows with NAs, I get consecutive labels NA2, NA3, ...)
> > zz1 <- na.omit(zz)
> > zz1[zz1[,2]>2, ]
> a b
> 1 1 3
> 3 3 3
>
> also
>
> > as.matrix(zz) -> zz
> > zz[zz[,2]>2, ]
> a b
> 1 1 3
> 3 3 3
> NA NA NA
> NA NA NA

In the second case, you have not done the na.omit() operation that you did
in the case of the data.frame. So if you did

> na.omit(zz[zz[,2]>2,])
a b
X1 1 3
X3 3 3

this compares to the data.frame operation.

> I am not sure if this is a bug or a feature, so I am reporting it here.
> Andy

__________________________________________________
Gary S. Collins, PhD,
Statistics Research Fellow, Quality of Life Unit,
European Organisation for Research and Treatment of Cancer,
EORTC Data Center, Avenue E. Mounier 83, bte. 11,
B-1200 Brussels, Belgium.
Tel: +32 2 774 1 606
Fax: +32 2 779 4 568
http://www.eortc.be/home/qol/
__________________________________________________
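The suggestion above, applying na.omit() after subsetting, can be compared with excluding the NA comparisons in the index itself. What follows is only a minimal sketch, assuming a current R version (the row labels printed by R 1.4.1 in this thread may differ); the names m, after, keep, and upfront are illustrative and do not appear in the original posts.

```r
# Rebuild the example data from the original post.
zz <- data.frame(a = c(1, 2, 3, NA, NA), b = c(3, 1, 3, NA, NA))
m  <- as.matrix(zz)

# Option 1: subset first, then drop every row that contains an NA.
after <- na.omit(m[m[, 2] > 2, ])

# Option 2: exclude the NA comparisons in the index itself.
keep    <- !is.na(m[, 2]) & m[, 2] > 2
upfront <- m[keep, , drop = FALSE]

# Both approaches leave the same two complete rows here.
print(after)
print(upfront)
```

Note that na.omit() drops a row if any column is NA, whereas the second option only looks at the column used in the condition, so the two can differ on other data.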
Prof Brian D Ripley
2002-Mar-07 07:22 UTC
[R] Strange behavior when subsetting data frames with NAs
On Wed, 6 Mar 2002 apjaworski at mmm.com wrote:
> Here is what I get using R 1.4.1 on Win2k (using precompiled version from
> CRAN) and RH 7.2 Linux (compiled from source):
>
> > data.frame(a=c(1, 2, 3, NA, NA), b=c(3, 1, 3, NA, NA)) -> zz
> > zz[zz[,2]>2, ]
> a b
> X1 1 3
> X3 3 3
> NA NA NA
> NA1 NA NA
> (if there are more rows with NAs, I get consecutive labels NA2, NA3, ...)
> > zz1 <- na.omit(zz)
> > zz1[zz1[,2]>2, ]
> a b
> 1 1 3
> 3 3 3
>
> also
>
> > as.matrix(zz) -> zz
> > zz[zz[,2]>2, ]
> a b
> 1 1 3
> 3 3 3
> NA NA NA
> NA NA NA
>
> I am not sure if this is a bug or a feature, so I am reporting it here.

What exactly do you find strange? It is the correct behaviour and
replicates that of S. Remember that data frames have to have unique row
names, and you asked for rows

> zz[,2]>2
[1] TRUE FALSE TRUE NA NA

so new row names have to be created. Matrices do not have to have unique
dimnames.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
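The distinction drawn above between data frames and matrices can be checked directly. The sketch below assumes a current R version, in which assigning duplicate row names to a data frame raises an error while a matrix accepts them; df, m, df_error, and idx are illustrative names and not part of the thread.

```r
df <- data.frame(a = 1:2, b = 3:4)
m  <- as.matrix(df)

# A matrix accepts repeated row names.
rownames(m) <- c("r", "r")

# A data frame rejects them; capture the error instead of stopping.
df_error <- tryCatch(
  suppressWarnings({ row.names(df) <- c("r", "r"); FALSE }),
  error = function(e) TRUE
)

# The logical index from the thread contains NA, and each NA selects
# a row filled with NA that then needs a row name of its own.
idx <- c(3, 1, 3, NA, NA) > 2
print(idx)
print(df_error)   # TRUE when the data frame refused the duplicates
```

Because the NA entries in the index each produce a row, a data frame result needs distinct labels for them, whereas a matrix result can carry the same label more than once.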
apjaworski@mmm.com
2002-Mar-07 15:38 UTC
[R] Strange behavior when subsetting data frames with NAs
I guess I did not expect the change in row names for rows without NAs, that
is, the change from (1, 3) to (X1, X3) in my zz example.
Andy
Prof Brian D Ripley <ripley at stats.ox.ac.uk>
03/07/2002 01:22 AM
To: Andrzej P. Jaworski/US-Corporate/3M/US at 3M-Corporate
cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Strange behavior when subsetting data frames with NAs
On Wed, 6 Mar 2002 apjaworski at mmm.com wrote:
> Here is what I get using R 1.4.1 on Win2k (using precompiled version from
> CRAN) and RH 7.2 Linux (compiled from source):
>
> > data.frame(a=c(1, 2, 3, NA, NA), b=c(3, 1, 3, NA, NA)) -> zz
> > zz[zz[,2]>2, ]
> a b
> X1 1 3
> X3 3 3
> NA NA NA
> NA1 NA NA
> (if there are more rows with NAs, I get consecutive labels NA2, NA3, ...)
> > zz1 <- na.omit(zz)
> > zz1[zz1[,2]>2, ]
> a b
> 1 1 3
> 3 3 3
>
> also
>
> > as.matrix(zz) -> zz
> > zz[zz[,2]>2, ]
> a b
> 1 1 3
> 3 3 3
> NA NA NA
> NA NA NA
>
> I am not sure if this is a bug or a feature, so I am reporting it here.
What exactly do you find strange? It is the correct behaviour and
replicates that of S. Remember that data frames have to have unique row
names, and you asked for rows
> zz[,2]>2
[1] TRUE FALSE TRUE NA NA
so new row names have to be created. Matrices do not have to have unique
dimnames.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Prof Brian D Ripley
2002-Mar-07 16:27 UTC
[R] Strange behavior when subsetting data frames with NAs
On Thu, 7 Mar 2002 apjaworski at mmm.com wrote:
>
> I guess I did not expect the change in row names for rows without NAs, that
> is the change from (1, 3) to (X1, X3) in my zz example.

It tells you this is a new set of row names, not just the old ones repeated.

--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
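If the aim is to keep only the rows where the condition is known to be TRUE, so that no new row names have to be generated at all, the NA entries can be dropped from the index. This is a hedged sketch for a current R version, not something proposed in the thread; by_which and by_subset are illustrative names.

```r
zz <- data.frame(a = c(1, 2, 3, NA, NA), b = c(3, 1, 3, NA, NA))

# which() returns only the positions that are TRUE, dropping the NAs.
by_which <- zz[which(zz[, 2] > 2), ]

# subset() likewise treats an NA condition as "do not select".
by_subset <- subset(zz, b > 2)

print(by_which)
print(row.names(by_which))   # labels of the rows that were selected
```

Unlike na.omit(zz) followed by subsetting, this leaves rows with NA in other columns in place as long as the condition itself evaluates to TRUE.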