"Jens Oehlschlägel"
2005-Sep-30 13:12 UTC
[Rd] Subscripting fails if name of element is "" (PR#8161)
Dear all,

I am resending this mail because it was blocked: I submitted a bug from the r-bug webpage, and hypatia seems to block mail that is sent from a different IP than the one usually associated with the email address. It looks like it is currently impossible to correctly submit bugs from the website. However, here is the original bug report (PR#8161):

Dear all,

The following shows cases where accessing elements via their name fails (if the name is a string of length zero).

Best regards

Jens Oehlschlägel

> p <- 1:3
> names(p) <- c("a","", as.character(NA))
> p
   a      <NA>
   1    2    3
>
> for (i in names(p))
+ print(p[[i]])
[1] 1
[1] 2
[1] 3
>
> # error 1: vector subscripting with "" fails in second element
> for (i in names(p))
+ print(p[i])
a
1
<NA>
  NA
<NA>
   3
>
> # error 2: print method for list shows no name for second element
> p <- as.list(p)
>
>
> for (i in names(p))
+ print(p[[i]])
[1] 1
[1] 2
[1] 3
>
> # error 3: list subscripting with "" fails in second element
> for (i in names(p))
+ print(p[i])
$a
[1] 1

$"NA"
NULL

$"NA"
[1] 3

>
> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    1.1
year     2005
month    06
day      20
language R

# -- replication code ----------------------------------

p <- 1:3
names(p) <- c("a","", as.character(NA))
p

for (i in names(p))
  print(p[[i]])

# error 1: vector subscripting with "" fails in second element
for (i in names(p))
  print(p[i])

# error 2: print method for list shows no name for second element
p <- as.list(p)

for (i in names(p))
  print(p[[i]])

# error 3: list subscripting with "" fails in second element
for (i in names(p))
  print(p[i])
Thomas Lumley
2005-Sep-30 15:47 UTC
[Rd] Subscripting fails if name of element is "" (PR#8161)
On Fri, 30 Sep 2005, "Jens Oehlschlägel" wrote:

> Dear all,
>
> The following shows cases where accessing elements via their name fails (if
> the name is a string of length zero).

This looks deliberate (there is a function NonNullStringMatch that does the matching). I assume this is because there is no other way to indicate that an element has no name.

If so, it is a documentation bug -- help(names) and FAQ 7.14 should specify this behaviour. Too late for 2.2.0, unfortunately.

	-thomas

>
> Best regards
>
> Jens Oehlschlägel
>
>> p <- 1:3
>> names(p) <- c("a","", as.character(NA))
>> p
>    a      <NA>
>    1    2    3
>>
>> for (i in names(p))
> + print(p[[i]])
> [1] 1
> [1] 2
> [1] 3
>>
>> # error 1: vector subscripting with "" fails in second element
>> for (i in names(p))
> + print(p[i])
> a
> 1
> <NA>
>   NA
> <NA>
>    3
>>
>> # error 2: print method for list shows no name for second element
>> p <- as.list(p)
>>
>>
>> for (i in names(p))
> + print(p[[i]])
> [1] 1
> [1] 2
> [1] 3
>>
>> # error 3: list subscripting with "" fails in second element
>> for (i in names(p))
> + print(p[i])
> $a
> [1] 1
>
> $"NA"
> NULL
>
> $"NA"
> [1] 3
>
>>
>> version
>          _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    2
> minor    1.1
> year     2005
> month    06
> day      20
> language R
>
> # -- replication code ----------------------------------
>
> p <- 1:3
> names(p) <- c("a","", as.character(NA))
> p
>
> for (i in names(p))
>   print(p[[i]])
>
> # error 1: vector subscripting with "" fails in second element
> for (i in names(p))
>   print(p[i])
>
> # error 2: print method for list shows no name for second element
> p <- as.list(p)
>
> for (i in names(p))
>   print(p[[i]])
>
> # error 3: list subscripting with "" fails in second element
> for (i in names(p))
>   print(p[i])
>
> --
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Thomas Lumley                   Assoc. Professor, Biostatistics
tlumley at u.washington.edu     University of Washington, Seattle
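The point that "" is the only available marker for "this element has no name" can be illustrated with a minimal sketch. It assumes nothing beyond base R; the variable names `x` and `has_name` are illustrative only and do not come from the thread.

```r
# A vector in which only the first element is given a name.
x <- c(a = 1, 2)

# R pads the names attribute with "" for the unnamed element.
names(x)

# Treat "" (and NA) as "no name": nzchar() alone is not enough here,
# because nzchar(NA) is TRUE by default, so NA is checked explicitly.
has_name <- !is.na(names(x)) & nzchar(names(x))

# Keep only the elements that carry a usable name.
x[has_name]
```

Under this reading, a lookup by "" has nothing well-defined to return, which is consistent with the matching being deliberate rather than accidental.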
"Jens Oehlschlägel"
2005-Oct-06 16:38 UTC
[Rd] Subscripting fails if name of element is "" (PR#8161)
Dear Thomas,

> This looks deliberate (there is a function NonNullStringMatch that does
> the matching). I assume this is because there is no other way to
> indicate that an element has no name.

> If so, it is a documentation bug -- help(names) and FAQ 7.14 should
> specify this behaviour. Too late for 2.2.0, unfortunately.

I respectfully disagree: the element has a name, and that name is an empty string. Of course "" is a doubtful name for an element, but as long as we allow this name when assigning with names()<-, we should also handle it like a name in subscripting. The alternative would be to disallow "" in names altogether. However, both alternatives look like code changes rather than documentation changes only.

Best regards

Jens Oehlschlägel
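The second alternative, disallowing "" in names, could be approximated at user level with a wrapper like the sketch below. `set_names_strict` is a hypothetical helper written for illustration; it is not part of R and is not what names()<- does.

```r
# Hypothetical helper (not part of base R): assign names, but refuse
# empty strings and NA so that every element can be selected by name.
set_names_strict <- function(x, nms) {
  nms <- as.character(nms)
  if (length(nms) != length(x))
    stop("need exactly one name per element")
  bad <- is.na(nms) | !nzchar(nms)
  if (any(bad))
    stop("empty or NA names at positions: ",
         paste(which(bad), collapse = ", "))
  names(x) <- nms
  x
}

ok <- set_names_strict(1:2, c("a", "b"))

# The names from the bug report are rejected.
res <- tryCatch(set_names_strict(1:3, c("a", "", NA)),
                error = function(e) conditionMessage(e))
```

This only guards one entry point; as noted in the thread, names can be set in several other ways, so a real fix would need changes in R itself.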
"Jens Oehlschlägel"
2005-Oct-07 12:09 UTC
[Rd] Subscripting fails if name of element is "" (PR#8161)
Dear Brian,

Thanks for picking this up. I think the critical point is that this is not a single isolated bug, and it would be a major effort to make this consistent, because it (and its implications) seems to be spread all over the code. The laudable efforts to properly sort out "NA" vs. as.character(NA) have not been fully successful yet, and "" is a similar issue. Please consider the following, sorry for the length:

# ERROR 1
# I agree that c() disallows "" and NA names
# it makes sense to discourage users from using such names
> c(as.character(NA)=1)
Error: syntax error in line "c(as.character(NA)="
> c("NA"=2, "a"=3)
NA  a
 2  3
> c(""=4)
Error: attempt to use a variable name of length 0

# however, "NA" must be expected as a legal name, e.g. when importing data
# and in your example specifying "no-name" in fact results in a "" name
> names(c(a=1, 2))
[1] "a" ""
>
# My interpretation is that the user specifies a mixture of elements with and without names,
# and therefore the no-names must be coerced to "" names, and in principle that's completely fine
# a character vector is defined to have either as.character(NA) OR "NA" OR "" or another positive length string
# (which is complicated enough)
# formally the names are an attribute (character vector) of an object and can be manipulated as such
> x <- 1:4
> names(x) <- c(NA, "NA", "a", "")
> names(x)
[1] NA   "NA" "a"  ""
> # and in principle all of those can be properly distinguished
> x[match(names(x), names(x))]
<NA>   NA    a
   1    2    3    4

# introducing a fifth non-name state that sometimes equals "" and sometimes not introduces inconsistency into the language
# e.g. the fact that elements can be selected by their name but not by their non-name
# Thus currently selecting by names is a mess from a consistency perspective
> x[names(x)]
<NA> <NA>    a <NA>
   1    1    3   NA

# in the following, subscripting with "" works, but not with "NA"
> for (i in names(x))
+ print(x[[i]])
[1] 1
[1] 1
[1] 3
[1] 4

# ERROR 1a: If failing on "NA" is not a bug, I switch from programming to Kafka
> x["NA"]
<NA>
   1
# ERROR 1b: clearly wrong
> x[["NA"]]
[1] 1
# ERROR 1c: and from my humble understanding failing on "" is a bug as well
> x[""]
<NA>
  NA
# whereas interestingly this is correct
> x[[""]]
[1] 4

# I think it is obvious how to remove these inconsistencies
# (as long as we do not disallow "" in names altogether,
# which is almost impossible, since every user can legally set the names vector in a variety of ways)

# these are not easy, but perfectly fine
> x[as.character(NA)]
<NA>
   1
> x[as.integer(NA)]
<NA>
  NA
# and these are really debatable, difficult ones
> x[NA]
<NA> <NA> <NA> <NA>
  NA   NA   NA   NA
> x[as.logical(NA)]
<NA> <NA> <NA> <NA>
  NA   NA   NA   NA

## ERROR 2+3: the above inconsistencies generalize to lists
lx <- as.list(x)
> lx
$"NA"     (ERROR 2a)
[1] 1

$"NA"
[1] 2

$a
[1] 3

[[4]]     (ERROR 2b)
[1] 4

# and should read
> lx
$NA       ( or $as.character(NA) for clarity and warning )
[1] 1

$"NA"
[1] 2

$a
[1] 3

$""
[1] 4

# Note that - except for printing - match works perfectly in
> lx[match(names(lx), names(lx))]
$"NA"
[1] 1

$"NA"
[1] 2

$a
[1] 3

[[4]]
[1] 4

# and also in
> for (i in match(names(lx), names(lx)))
+ print(lx[[i]])
[1] 1
[1] 2
[1] 3
[1] 4

# Of course I consider the following behaviour as inconsistent
> lx[names(lx)]
$"NA"
[1] 1

$"NA"
[1] 1     (ERROR 3a)

$a
[1] 3

$"NA"
NULL      (ERROR 3b)

# using [[ the second one fails
> for (i in names(lx))
+ print(lx[[i]])
[1] 1
[1] 1     (ERROR 3c)
[1] 3
[1] 4     (interestingly correct)

# finally note that this works
> eval(substitute(lx$y, list(y=as.character(NA))))
# but not this
> get("$")(lx, as.character(NA))
Error in get("$")(lx, as.character(NA)) : invalid subscript type
# and both go wrong with "NA"
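Since the transcript shows that match() distinguishes as.character(NA), "NA", "a" and "" correctly, name-based selection can be routed through positions as a workaround. The following is a minimal sketch; `select_by_name` is a hypothetical helper introduced here for illustration, not an existing R function and not a proposed patch.

```r
# Hypothetical helper: look up elements by exact name via match(), which
# treats "", "NA" and NA_character_ as three distinct values, and then
# subscript by the resulting integer positions.
select_by_name <- function(x, nm) {
  idx <- match(nm, names(x))  # NA position if the name is not present
  x[idx]
}

x <- 1:4
names(x) <- c(NA, "NA", "a", "")

by_empty  <- select_by_name(x, "")             # the element named ""
by_string <- select_by_name(x, "NA")           # the element named "NA"
by_na     <- select_by_name(x, NA_character_)  # the element whose name is NA

# The same helper works for lists, because [ with positions is unaffected.
lx <- as.list(x)
l_empty <- select_by_name(lx, "")
```

Note that match() returns only the first position for duplicated names, so this sketch shares that limitation with ordinary character subscripting.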