thr3ads.net - R devel - [Rd] Inconsistency when naming a vector [Apr 2015]

Hadley Wickham

2015-Apr-27 11:48 UTC

[Rd] Inconsistency when naming a vector

Sometimes the absence of a name is marked by an NA:

x <- 1:2
names(x)[[1]] <- "a"
names(x)
# [1] "a" NA

Whereas other times it's an empty string:

y <- c(a = 1, 2)
names(y)
# [1] "a" ""

Is this deliberate? The help for names() is a bit murky, but an
example shows the NA behaviour.

Hadley

-- 
http://had.co.nz/

Suzen, Mehmet

2015-Apr-27 13:08 UTC

There is no inconsistency. Documentation of `names` says "...value
should be a character vector of up to the same length as x..."
In the first definition your character vector is not the same length
as length of x, so you enforce NA by not defining value[2]

x <- 1:2
value<-c("a")
value[2]
[1] NA

whereas in the second case, R uses the default value "". From the `names`
documentation: "The name "" is special: it is used to indicate that
there is no name associated with an element." Since you defined the
first one, it internally assigns "" to the undefined names to match the
length of the vector.

Kevin Ushey

2015-Apr-27 13:30 UTC

In `?names`:

     If 'value' is shorter than 'x', it is extended by character 'NA's
     to the length of 'x'.

So it is as documented.

That said, it's somewhat surprising that both NA and "" serve as a
placeholder for a 'missing name'; I believe they're treated
identically by R under the hood (e.g. in subsetting operations) but
there may be some subtle cases where they're not.
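
To make that concrete, here is a small sketch (the variable names are
just for illustration). As far as I can tell from ?Extract, neither ""
nor NA ever matches when subsetting by name, so the two agree there;
they come apart under simple string tests such as == or nzchar():

```r
# Two vectors whose second element is "unnamed" in the two different ways.
x <- c(a = 1, 2)            # constructor: names are "a" and ""
y <- c(1, 2)
names(y) <- "a"             # assignment: names are "a" and NA

# Subsetting by name: neither "" nor NA matches any element (see ?Extract).
sub_empty <- unname(x[""])              # NA
sub_na    <- unname(y[NA_character_])   # NA

# Where the two placeholders differ: simple string tests.
empty_test <- names(x) == ""            # FALSE TRUE
na_test    <- names(y) == ""            # FALSE NA
nz_empty   <- nzchar(names(x))          # TRUE FALSE
nz_na      <- nzchar(names(y))          # TRUE TRUE (nzchar(NA) is TRUE by default)
```

So code that checks for missing names with only one of these tests can
silently miss the other kind.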



peter dalgaard

2015-Apr-27 13:33 UTC

> On 27 Apr 2015, at 13:48, Hadley Wickham <h.wickham at gmail.com> wrote:
> 
> Sometimes the absence of a name is marked by an NA:
> 
> x <- 1:2
> names(x)[[1]] <- "a"
> names(x)
> # [1] "a" NA
> 
> Whereas other times it's an empty string:
> 
> y <- c(a = 1, 2)
> names(y)
> # [1] "a" ""
> 
> Is this deliberate? The help for names() is a bit murky, but an
> example shows the NA behaviour.

I think it is

(a) impossible to change
(b) at least somewhat coherent

The situation is partially due to the fact that character-NA is a relative
latecomer to the language. In the beginning, there was no real distinction
between NA and "NA", causing issues when abbreviating Noradrenaline,
North America, Nelson Anderson, etc. At some point, it was decided to fix things
up, as far as possible in a backwards compatible way. Some common idioms were
retained but others were changed to comply with the rules for other vector
types.

We have the empty string convention on (AFAICT) all constructor usages:

c(a=1, 3) 
list(a=1, 3)
cbind(a=1, 3)

and also in the lists implied by argument matching
> f <- function(...) names(match.call(expand.dots=TRUE))
> f(a=1,3)
[1] ""  "a" ""

In contrast, assignment forms have the NA convention. This is consistent with
the general rules for complex assignment. E.g. we have
> a <- "a"
> a[[5]] <- "b"
> a
[1] "a" NA  NA  NA  "b"

and even
> a <- NULL
> a[[5]] <- "a"
> a
[1] NA  NA  NA  NA  "a"

also, we have
> l <- list(1,2,3)
> names(l) <- c("a","b")
> l
$a
[1] 1

$b
[1] 2

$<NA>
[1] 3

and we do want to obey general rules like

names(l)[[2]] <- "a" 

being (nearly) equivalent to

`*tmp*`<- names(l)
`*tmp*`[[2]] <- "a"
names(l) <- `*tmp*`
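
A runnable sketch of that equivalence, assuming a fresh unnamed list and
using an ordinary variable `tmp` in place of the internal `*tmp*`
(both names are only for illustration):

```r
# Direct replacement form.
l1 <- list(1, 2, 3)
names(l1)[[2]] <- "a"

# Hand-expanded form, with `tmp` standing in for `*tmp*`.
l2 <- list(1, 2, 3)
tmp <- names(l2)      # NULL, since l2 has no names yet
tmp[[2]] <- "a"       # grows to c(NA, "a")
names(l2) <- tmp      # padded with NA to the length of l2

same <- identical(names(l1), names(l2))
```

Both routes should leave the names as NA, "a", NA, which is where the NA
convention in the assignment forms comes from.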


- pd
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

Hadley Wickham

2015-Apr-28 16:24 UTC

Ah, that explanation makes sense. Thanks.

It would be helpful to have an isNamed function that abstracted over
all these differences:

isNamed <- function(x) {
  nms <- names(x)
  if (is.null(nms)) return(rep(FALSE, length(x)))

  # TRUE only where the name is neither NA nor the empty string
  !is.na(nms) & nms != ""
}
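
A quick check of how such a helper might behave on the two cases from
this thread, written as a self-contained sketch. The name `is_named` is
hypothetical, and the body tests the names (not the values) with the
vectorised `&` so that it returns one logical per element:

```r
# Hypothetical helper: TRUE where an element has a usable name.
is_named <- function(x) {
  nms <- names(x)
  if (is.null(nms)) return(rep(FALSE, length(x)))
  !is.na(nms) & nms != ""
}

# Constructor form: the missing name is "".
r_constructor <- is_named(c(a = 1, 2))    # TRUE FALSE

# Assignment form: the missing name is NA.
v <- 1:2
names(v)[[1]] <- "a"
r_assignment <- is_named(v)               # TRUE FALSE

# No names attribute at all.
r_unnamed <- is_named(1:3)                # FALSE FALSE FALSE
```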

Hadley


-- 
http://had.co.nz/
