thr3ads.net - R devel - [Rd] Definition of [[ [Mar 2009]

Stavros Macrakis

2009-Mar-15 18:31 UTC

[Rd] Definition of [[

The semantics of [ and [[ don't seem to be fully specified in the
Reference manual.  In particular, I can't find where the following
cases are covered:
> cc <- c(1); ll <- list(1)

> cc[3]
[1] NA
OK, RefMan says: If i is positive and exceeds length(x) then the
corresponding selection is NA.

> dput(ll[3])
list(NULL)
? i is positive and exceeds length(x); why isn't this list(NA)?

> ll[[3]]
Error in list(1)[[3]] : subscript out of bounds
? Why does this return NA for an atomic vector, but give an error for
a generic vector?

> cc[[3]] <- 34; dput(cc)
c(1, NA, 34)
OK

> ll[[3]] <- 34; dput(ll)
list(1, NULL, 34)
Why is second element NULL, not NA?
And why is it OK to set an undefined ll[[3]], but not to get it?

I assume that these are features, not bugs, but I can't find
documentation for them.

            -s

Duncan Murdoch

2009-Mar-15 20:43 UTC

On 15/03/2009 2:31 PM, Stavros Macrakis wrote:
> The semantics of [ and [[ don't seem to be fully specified in the
> Reference manual.  In particular, I can't find where the following
> cases are covered:
> 
>> cc <- c(1); ll <- list(1)
> 
>> cc[3]
> [1] NA
> OK, RefMan says: If i is positive and exceeds length(x) then the
> corresponding selection is NA.
> 
>> dput(ll[3])
> list(NULL)
> ? i is positive and exceeds length(x); why isn't this list(NA)?
Because the sentence you read was talking about "simple vectors", and ll
is presumably not a simple vector.  So what is a simple vector?  That is
not explicitly defined, and it probably should be.  I think it is
"atomic vectors, except those with a class that has a method for [".
> 
>> ll[[3]]
> Error in list(1)[[3]] : subscript out of bounds
> ? Why does this return NA for an atomic vector, but give an error for
> a generic vector?
> 
>> cc[[3]] <- 34; dput(cc)
> c(1, NA, 34)
> OK
> 
> ll[[3]] <- 34; dput(ll)
> list(1, NULL, 34)
> Why is second element NULL, not NA?
NA is a length 1 atomic vector with a specific type matching the type of
cc.  It makes more sense in this context to put in a NULL, and return a
list(NULL) for ll[3].
> And why is it OK to set an undefined ll[[3]], but not to get it?
Lots of code grows vectors by setting elements beyond the end of them, 
so whether or not that's a good idea, it's not likely to change.
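
The growing-by-assignment pattern mentioned above can be sketched as follows. This is only an illustration built from the two examples quoted in this thread, not a statement of documented guarantees: the gap left by assigning past the end is filled with NA in the atomic vector and with NULL in the list.

```r
# Assigning past the end of an atomic vector pads the gap with NA
# (here the double NA, since cc is a double vector).
cc <- c(1)
cc[[3]] <- 34

# Doing the same to a list pads the gap with NULL.
ll <- list(1)
ll[[3]] <- 34

dput(cc)
dput(ll)
```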

I think an argument could be made that ll[[toobig]] should return NULL 
rather than trigger an error, but on the other hand, the current 
behaviour allows the programmer to choose:  if you are assuming that a 
particular element exists, use ll[[element]], and R will tell you when 
your assumption is wrong.  If you aren't sure, use ll[element] and 
you'll get NA or list(NULL) if the element isn't there.
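
A minimal sketch of that choice for a list, assuming a reasonably recent R. The helper name get_or_default is hypothetical (it is not part of base R) and is only meant to show one way to make the "not sure the element exists" case explicit.

```r
ll <- list(1)

# `[` with an out-of-range positive index does not fail on a list;
# the missing slot comes back as NULL wrapped in a list.
loose <- ll[3]

# `[[` signals an error instead, which tryCatch() can intercept.
strict <- tryCatch(ll[[3]], error = function(e) "out of bounds")

# Hypothetical helper: return `default` when position i is not present.
get_or_default <- function(x, i, default = NULL) {
  if (i >= 1 && i <= length(x)) x[[i]] else default
}

first   <- get_or_default(ll, 1)   # element exists
missing <- get_or_default(ll, 3)   # falls back to the default (NULL)
```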
> I assume that these are features, not bugs, but I can't find
> documentation for them.
There is more documentation in the man page for Extract, but I think it 
is incomplete.  The most complete documentation is of course the source 
code, but it may not answer the question of what's intentional and 
what's accidental.

Duncan Murdoch

Thomas Lumley

2009-Mar-16 08:06 UTC

On Sun, 15 Mar 2009, Stavros Macrakis wrote:
> The semantics of [ and [[ don't seem to be fully specified in the
> Reference manual.  In particular, I can't find where the following
> cases are covered:
>
>> cc <- c(1); ll <- list(1)
>
>> cc[3]
> [1] NA
> OK, RefMan says: If i is positive and exceeds length(x) then the
> corresponding selection is NA.
>
>> dput(ll[3])
> list(NULL)
> ? i is positive and exceeds length(x); why isn't this list(NA)?
I think some of these are because there are only NAs for character, logical, and
the numeric types. There isn't an NA of list type.

This one shouldn't be list(NA) - which NA would it use?  It should be some
sort of list(_NA_list_) type, and list(NULL) is playing that role.
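
To illustrate the point about typed NAs, here is a short sketch using the NA constants that base R provides. The variable names are made up for the example. It shows that an out-of-range `[` on an atomic vector returns the NA of that vector's own type, while a bare NA placed in a list is just a logical value and not a "list NA".

```r
# Out-of-range `[` on an atomic vector yields the NA matching its type.
na_types <- c(
  logical   = typeof(c(TRUE)[3]),
  integer   = typeof(c(1L)[3]),
  double    = typeof(c(1)[3]),
  character = typeof(c("a")[3])
)
print(na_types)

# The typed NA constants themselves.
const_types <- c(typeof(NA), typeof(NA_integer_),
                 typeof(NA_real_), typeof(NA_character_))

# list(NA) holds a logical NA as its element; there is no NA of list type,
# so the out-of-range selection on a list uses NULL as the placeholder.
elem_type <- typeof(list(NA)[[1]])
placeholder <- list(1)[3]
```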

>> ll[[3]]
> Error in list(1)[[3]] : subscript out of bounds
> ? Why does this return NA for an atomic vector, but give an error for
> a generic vector?
Again, because there isn't an NA of generic vector type.
>> cc[[3]] <- 34; dput(cc)
> c(1, NA, 34)
> OK
>
> ll[[3]] <- 34; dput(ll)
> list(1, NULL, 34)
> Why is second element NULL, not NA?
> And why is it OK to set an undefined ll[[3]], but not to get it?
Same reason for NULL vs NA.  The fact that setting works may just be an
inconsistency -- as you can see from previous discussions, R often does not
effectively forbid code that shouldn't work -- or it may be
bug-compatibility with some version of S or S-PLUS.


      -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
