thr3ads.net - R devel - [Rd] Inconsistencies in subassignment (PR#7210) [Sep 2004]

tlumley@u.washington.edu

2004-Sep-04 01:47 UTC

[Rd] Inconsistencies in subassignment (PR#7210)

I have made the 3-d case do the same as the vector case, which is what the
C code clearly intended (a goto label was in the wrong place).

This leaves the bigger question of the right thing to do. I note that data
frames give an error when any indices are NA.

	-thomas

On Fri, 3 Sep 2004 ripley@stats.ox.ac.uk wrote:
> Apart from the inconsistencies, there are two clear bugs here:
>
> 1) miscalculating the number of values needed, in the matrix case.  E.g.
>
> > AA[idx, 1] <- B[1:4]
> Error in "[<-"(`*tmp*`, idx, 1, value = B[1:4]) :
>         number of items to replace is not a multiple of replacement length
>
> although only 4 values are replaced by AA[idx, 1] <- B.
>
> 2) the behaviour of the 3D case.
>
> ---------- Forwarded message ----------
> Date: Fri, 3 Sep 2004 16:40:24 +0100 (BST)
> From: Prof Brian Ripley <ripley@stats.ox.ac.uk>
> To: "Yao, Minghua" <myao@ou.edu>
> Cc: R Help <r-help@stat.math.ethz.ch>
> Subject: Re: [R] Different Index behaviors of Array and Matrix
>
> [I will copy a version of this to R-bugs: please be careful when you reply
> to only copy to R-bugs a version with a PR number in the subject.]
>
> On Fri, 3 Sep 2004, Yao, Minghua wrote:
>
> > I found a difference between the indexing of an array and that of a
> > matrix when there are NA's in the index array. The screen copy is as
> > follows.
> >
> > > A <- array(NA, dim=6)
> > > A
> > [1] NA NA NA NA NA NA
>
> > > idx <- c(1,NA,NA,4,5,6)
> > > B <- c(10,20,30,40,50,60)
> > > A[idx] <- B
> > > A
> > [1] 10 NA NA 40 50 60
> > > AA <- matrix(NA,6,1)
> > > AA
> >      [,1]
> > [1,]   NA
> > [2,]   NA
> > [3,]   NA
> > [4,]   NA
> > [5,]   NA
> > [6,]   NA
> > > AA[idx,1] <- B
> > > AA
> >      [,1]
> > [1,]   10
> > [2,]   NA
> > [3,]   NA
> > [4,]   20
> > [5,]   30
> > [6,]   40
> > >
> > In the case of an array, we miss the elements (20 and 30) in B
> > corresponding to the NA's in the index array. In the case of a matrix,
> > 20 and 30 are assigned to the elements indexed by the indexes following
> > the NA's. Is this reasonable behavior? Thanks in advance for an
> > explanation.
>
> A is a 1D array but it behaves just like a vector.
> Weirder things happen with multi-dimensional arrays:
>
> > A <- array(NA, dim=c(6,1,1))
> > A[idx,1,1] <- B
> > A
> , , 1
>
>      [,1]
> [1,]   10
> [2,]   NA
> [3,]   NA
> [4,]   NA
> [5,]   NA
> [6,]   NA
>
> One problem with what happens for matrices is that
>
> > idx <- c(1,4,5,6)
> > AA <- matrix(NA,6,1)
> > AA[idx,1] <- B
> Error in "[<-"(`*tmp*`, idx, 1, value = B) :
>         number of items to replace is not a multiple of replacement length
>
> is an error, so it is not counting the values consistently.
>
> The only discussion I could find (Blue Book p.103, which is also
> discussing LHS subscripting) just says
>
> 	If a subscript is NA, an NA is returned.
>
> S normally does not use up values when encountering an NA in an index set,
> although it does for logical matrix indexing of data frames.
>
> I can see two possible interpretations.
>
> 1) The NA indicates the value was lost after assignment. We don't know
> what index the first NA was, so 20 got assigned somewhere.  And as we
> don't know where, all the elements had better be NA. However, that is
> unless the NA was 0, when no assignment took place and no value was used.
>
> 2) The NA indicates the value was lost before assignment, so no assignment
> took place and no value was used.
>
> R does neither of those.  I suspect the correct course of action is to ban
> NAs in subscripted assignments.
>
>
> --
> Brian D. Ripley,                  ripley@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
> ______________________________________________
> R-devel@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
Thomas Lumley			Assoc. Professor, Biostatistics
tlumley@u.washington.edu	University of Washington, Seattle
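
The two behaviours reported in the quoted message (the vector case uses up a value of B for each NA index, the matrix case does not) can be made explicit by dropping the NA indices by hand. This is only a sketch: the helper names assign_consuming and assign_nonconsuming are hypothetical and are not part of R or of the C code discussed here.

```r
# Hypothetical helpers (not part of R): each removes the NA indices by hand,
# so the result does not depend on how "[<-" itself treats NA.
idx <- c(1, NA, NA, 4, 5, 6)
B   <- c(10, 20, 30, 40, 50, 60)

# Vector-style: an NA index still uses up the matching value of `value`.
assign_consuming <- function(x, idx, value) {
  ok <- !is.na(idx)
  x[idx[ok]] <- value[ok]
  x
}

# Matrix-style: an NA index is skipped and no value is used up.
assign_nonconsuming <- function(x, idx, value) {
  keep <- idx[!is.na(idx)]
  x[keep] <- value[seq_along(keep)]
  x
}

A <- rep(NA_real_, 6)
r1 <- assign_consuming(A, idx, B)     # 10 NA NA 40 50 60
r2 <- assign_nonconsuming(A, idx, B)  # 10 NA NA 20 30 40
```

The first result matches the A[idx] <- B output in the report, and the second matches the AA[idx,1] <- B output.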

Prof Brian Ripley

2004-Sep-07 10:16 UTC

[Rd] Inconsistencies in subassignment (PR#7210)

On Sat, 4 Sep 2004 tlumley@u.washington.edu wrote:
> I have made the 3-d case do the same as the vector case, which is what the
> C code clearly intended (a goto label was in the wrong place).
>
> This leaves the bigger question of the right thing to do. I note that data
> frames give an error when any indices are NA.

One case is unambiguous and common:

	x[ind] <- val

where `val' is of length one.  I've written code to ban all other 
subassignments involving NAs.  Once I fixed occurrences in R itself 
(notably in ifelse), only three problems remain in tests over the CRAN
packages:

< Running examples in ape-Ex.R failed.
< > ### * popsize
    area[a == 0] <- stepfunction[a == 0]

< Running examples in RandomFields-Ex.R failed.
< > ### * ShowModels
    expr[pmatch(covlist, namen)] <- exprlist

< Running examples in sm-Ex.R failed.
< > ### * sm.sphere
    z[xyzok < 0] <- (za - zb)[xyzok < 0]

The first and third are a typical usage, where R makes more sense than S.
[Worryingly, sm was written for S-PLUS and would seem to be incorrect 
there.]
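
For code of the first and third kind, where a logical index may contain NAs, one possible workaround is to convert the index with which(), which keeps only the TRUE positions. This is a sketch and is not taken from the packages above: the names a, area and stepfunction follow the ape example, and the data values are made up for illustration.

```r
# Illustrative data (assumed, not from ape): `a` contains an NA.
a            <- c(0, NA, 2, 0)
stepfunction <- c(1.5, 2.5, 3.5, 4.5)
area         <- numeric(4)

# a == 0 is TRUE NA FALSE TRUE; which() returns only the TRUE positions,
# so the index used for assignment contains no NA.
sel <- which(a == 0)
area[sel] <- stepfunction[sel]
area   # 1.5 0.0 0.0 4.5
```

Using the same sel on both sides keeps the replacement length equal to the number of elements replaced.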

So in R 2.0.0 we will have

\section{NAs in indexing}{
  When subscripting, a numerical, logical or character \code{NA} picks
  an unknown element and so returns \code{NA} in the corresponding
  element of a logical, integer, numeric, complex or character result,
  and \code{NULL} for a list.

  When replacing (that is using subscripting on the lhs of an
  assignment) \code{NA} does not select any element to be replaced.  As
  there is ambiguity as to whether an element of the rhs should
  be used or not (and \R handled this inconsistently prior to \R 2.0.0),
  this is only allowed if the rhs value is of length one (so the two
  interpretations would have the same outcome).  
}
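
A minimal sketch of what this section describes, assuming R 2.0.0 or later. The objects x, l and res are made up for illustration, and the error is caught with tryCatch() rather than shown, since its exact wording may differ between versions.

```r
x <- c(10, 20, 30)
l <- list("a", "b")

# Subscripting: an NA index gives NA for an atomic vector, NULL for a list.
x[c(1, NA)]       # 10 NA
l[NA_integer_]    # a list whose single element is NULL

# Replacing with a rhs of length one: the NA selects nothing, no ambiguity.
x[c(1, NA)] <- 0  # x is now 0 20 30

# Replacing with a longer rhs is ambiguous and is expected to be an error.
res <- tryCatch({
  x[c(1, NA)] <- c(1, 2)
  "no error"
}, error = function(e) "error")
res
```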

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
