i wonder about the following examples showing incoherence in how type conversions are done in r:

    x = TRUE
    x[2] = as.raw(1)
    # Error in x[2] = as.raw(1) :
    #   incompatible types (from raw to logical) in subassignment type fix

it seems that there is an attempt to coerce the raw value to logical here, which fails, even though

    as.logical(as.raw(1))
    # TRUE

likewise,

    x[2] = 1L
    # the vector is silently coerced upwards to integer
    x[2] = as.raw(1)
    # Error in x[2] = as.raw(1) :
    #   incompatible types (from raw to integer) in subassignment type fix

even though

    as.integer(as.raw(1))
    # 1

and likewise for double and complex.

there's another incoherence:

    x = 1
    x[2] = 1i
    x
    # 1+0i 0+1i

    x = 1i
    x[2] = 1
    x
    # 0+1i 1+0i

in both cases, the higher type is used for the result; in the former case, the vector is coerced upwards, in the latter, the assigned value is coerced upwards. however:

    x = 1
    x[2] = as.raw(1)
    # error: incompatible types (from raw to double)

    x = as.raw(1)
    x[2] = 1
    # error: incompatible types (from double to raw)

leaving aside that

    as.double(as.raw(1))
    # 1
    as.raw(as.double(1))
    # 01

work just fine, in both cases there is an attempt to coerce the assigned value to the vector type, and not to the higher type (which would presumably be double, as in ?c), as in the previous example.

interestingly,

    c(1, as.raw(1))
    # error: type 'raw' is unimplemented in 'RealAnswer'

(note the 'real', not 'double'), whereas

    1 == as.raw(1)
    # TRUE

works just fine. furthermore,

    c('1', as.raw(1))
    # "1" "01"

whereas

    x = '1'
    x[2] = as.raw(1)
    # error: incompatible types (from raw to character)

yet another issue is that of indexing a raw vector with an out-of-bounds index. the r language definition, sec. 3.4.1 [1] says:

"
We shall discuss indexing of simple vectors first. For simplicity, assume that the expression is x[i]. (...) If i is positive and exceeds length(x) then the corresponding selection is NA.
"

it's probably correct to assume that 'simple vector' means 'atomic vector', though not all r core members seem to be quite sure [2]:

"
So what is a simple vector? That is not explicitly defined, and it probably should be. I think it is "atomic vectors, except those with a class that has a method for [".
"

it appears that raw vectors are atomic vectors:

    is.atomic(as.raw(1))
    # TRUE

so an index out of bounds (at least, a positive integer index exceeding the length of the vector) should (?) produce an NA; however,

    as.raw(1)[2]
    # 00

this is presumably because there is no raw NA, and an NA of whatever type is converted to 0 by as.raw:

    as.raw(NA)
    # 00
    # warning: out-of-range values treated as 0 in coercion to raw

but in this case there's a warning. why does out-of-bounds indexing of a raw vector not produce a warning? following the language definition, the fact that a raw vector is atomic, and the above informal statement on simple and atomic vectors, the selection should first produce an NA, which only subsequently is coerced to the raw 0 -- with a warning.

there's an analogous issue with out-of-bounds assignment:

    x = as.raw(1)
    x[3] = as.raw(3)
    x
    # 01 00 03

but

    x = 1
    x[3] = 3
    x
    # 1 NA 3
    as.raw(x)
    # warning: out-of-range values treated as 0 in coercion to raw

is all this an intended feature?

vQ

[1] http://cran.r-project.org/doc/manuals/R-lang.html#Indexing-by-vectors
[2] http://tolstoy.newcastle.edu.au/R/e6/devel/09/03/0954.html
Wacek Kusnierczyk wrote:
> interestingly,
>
>     c(1, as.raw(1))
>     # error: type 'raw' is unimplemented in 'RealAnswer'
>

three more comments.

(1)
the above is interesting in the light of what ?c says:

"
The output type is determined from the highest type of the components in the hierarchy NULL < raw < logical < integer < real < complex < character < list < expression.
"

which seems to suggest that raw components should be coerced to whatever the highest type among all arguments to c, which clearly doesn't happen:

    test = function(type)
        c(as.raw(1), get(sprintf('as.%s',type))(1))

    for (type in c('null', 'logical', 'integer', 'real', 'complex',
                   'character', 'list', 'expression'))
        tryCatch(test(type), error = function(e)
            cat(sprintf("raw won't coerce to %s type\n", type)))

which shows that raw won't coerce to the four first types in the 'hierarchy' (excluding NULL), but it will to character, list, and expression.

suggestion: improve the documentation, or adapt the implementation to a more coherent design.

(2)
incidentally, there's a bug somewhere there related to the condition system and printing:

    tryCatch(stop(), error=function(e) print(e))
    # works just fine

    tryCatch(stop(), error=function(e) sprintf('%s', e))
    # *** caught segfault ***
    # address (nil), cause 'memory not mapped'
    # Traceback:
    # 1: sprintf("%s", e)
    # 2: value[[3]](cond)
    # 3: tryCatchOne(expr, names, parentenv, handlers[[1]])
    # 4: tryCatchList(expr, classes, parentenv, handlers)
    # 5: tryCatch(stop(), error = function(e) sprintf("%s", e))
    # Possible actions:
    # 1: abort (with core dump, if enabled)
    # 2: normal R exit
    # 3: exit R without saving workspace
    # 4: exit R saving workspace
    # Selection:

interestingly, it is possible to stay in the session by typing ^C. the session seems to work, but if the tryCatch above is tried once again, a segfault causes r to crash immediately:

    # ^C
    tryCatch(stop(), error=function(e) sprintf('%s', e))
    # [whoever at wherever] $

however, this doesn't happen if some other code is evaluated first:

    # ^C
    x = 1:10^8
    tryCatch(stop(), error=function(e) sprintf('%s', e))
    # Error in sprintf("%s", e) : 'getEncChar' must be called on a CHARSXP

this can't be a feature. (tried in both 2.8.0 and r-devel; version info at the bottom.)

suggestion: trace down and fix the bug.

(3)
the error argument to tryCatch is used in two examples in ?tryCatch, but it is not explained anywhere in the help page. one can guess that the argument name corresponds to the class of conditions the handler will handle, but it would be helpful to have this stated explicitly. the help page simply says:

"
If a condition is signaled while evaluating 'expr' then established handlers are checked, starting with the most recently established ones, for one matching the class of the condition. When several handlers are supplied in a single 'tryCatch' then the first one is considered more recent than the second.
"

which is uninformative in this respect -- what does 'one matching the class' mean?

suggestion: improve the documentation.

vQ

> version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          2
minor          8.0
year           2008
month          10
day            20
svn rev        46754
language       R
version.string R version 2.8.0 (2008-10-20)

> version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status         Under development (unstable)
major          2
minor          9.0
year           2009
month          03
day            19
svn rev        48152
language       R
version.string R version 2.9.0 Under development (unstable) (2009-03-19 r48152)
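
To illustrate point (3), here is a minimal sketch of how handler argument names are matched against the class of the signaled condition. The class name `myCondition` is made up for the example. The message is extracted with `conditionMessage()` rather than by passing the condition object itself to `sprintf('%s', ...)`, which is the call that triggered the crash in point (2).

```r
# a handler named 'error' is selected because class(e) includes "error";
# the 'warning' handler is skipped for the same reason
res = tryCatch(
  stop("boom"),
  warning = function(w) "warning handler",
  error   = function(e) paste("error handler:", conditionMessage(e)))

# the same matching applies to user-defined condition classes
# ('myCondition' is a hypothetical class name)
cond = simpleCondition("hello")
class(cond) = c("myCondition", "condition")
res2 = tryCatch(
  signalCondition(cond),
  myCondition = function(cnd) class(cnd)[1])

print(res)
print(res2)
```

In other words, 'one matching the class' means that the handler's argument name is one of the elements of `class(cond)`.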
>>>>> "WK" == Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no>
>>>>>     on Thu, 19 Mar 2009 10:17:20 +0100 writes:

    WK> Wacek Kusnierczyk wrote:
    >> interestingly,
    >>
    >>     c(1, as.raw(1))
    >>     # error: type 'raw' is unimplemented in 'RealAnswer'
    >>

    WK> three more comments.

    WK> (1)
    WK> the above is interesting in the light of what ?c says:

    WK> "
    WK> The output type is determined from the highest type of the
    WK> components in the hierarchy NULL < raw < logical < integer < real
    WK> < complex < character < list < expression.
    WK> "

    WK> which seems to suggest that raw components should be coerced to whatever
    WK> the highest type among all arguments to c, which clearly doesn't happen:

    WK>     test = function(type)
    WK>         c(as.raw(1), get(sprintf('as.%s',type))(1))

    WK>     for (type in c('null', 'logical', 'integer', 'real', 'complex',
    WK>                    'character', 'list', 'expression'))
    WK>         tryCatch(test(type), error = function(e)
    WK>             cat(sprintf("raw won't coerce to %s type\n", type)))

    WK> which shows that raw won't coerce to the four first types in the
    WK> 'hierarchy' (excluding NULL), but it will to character, list, and
    WK> expression.

    WK> suggestion: improve the documentation, or adapt the implementation to
    WK> a more coherent design.

Thank you, Wacek.
I've decided to adapt the implementation such that all the above
c(<raw>, <type>) calls' implicit coercions will work.

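
Whether a given R build already coerces raw in c() can be checked with a short probe. This is a sketch only: `probe_c` is a hypothetical helper name, and sample values are built directly instead of via the as.<type> functions, on the assumption that 'as.real' may not be available in every R version. The result for the lower types depends on the version, so no output is shown.

```r
# hypothetical helper: combine a raw value with 'value' and report the
# type of the result, or the error message if c() refuses
probe_c = function(value) {
  tryCatch(typeof(c(as.raw(1), value)),
           error = function(e) paste("error:", conditionMessage(e)))
}

# one sample value per type in the ?c hierarchy above raw
values = list(logical = TRUE, integer = 1L, double = 1, complex = 1i,
              character = '1', list = list(1), expression = expression(1))

for (type in names(values))
  cat(sprintf("c(<raw>, <%s>): %s\n", type, probe_c(values[[type]])))
```
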
    WK> (2)
    WK> incidentally, there's a bug somewhere there related to the condition
    WK> system and printing:

    WK>     tryCatch(stop(), error=function(e) print(e))
    WK>     # works just fine

    WK>     tryCatch(stop(), error=function(e) sprintf('%s', e))
    WK>     # *** caught segfault ***
    WK>     # address (nil), cause 'memory not mapped'
    WK>     # Traceback:
    WK>     # 1: sprintf("%s", e)
    WK>     # 2: value[[3]](cond)
    WK>     # 3: tryCatchOne(expr, names, parentenv, handlers[[1]])
    WK>     # 4: tryCatchList(expr, classes, parentenv, handlers)
    WK>     # 5: tryCatch(stop(), error = function(e) sprintf("%s", e))
    WK>     # Possible actions:
    WK>     # 1: abort (with core dump, if enabled)
    WK>     # 2: normal R exit
    WK>     # 3: exit R without saving workspace
    WK>     # 4: exit R saving workspace
    WK>     # Selection:

    WK> interestingly, it is possible to stay in the session by typing ^C. the
    WK> session seems to work, but if the tryCatch above is tried once again, a
    WK> segfault causes r to crash immediately:

    WK>     # ^C
    WK>     tryCatch(stop(), error=function(e) sprintf('%s', e))
    WK>     # [whoever at wherever] $

    WK> however, this doesn't happen if some other code is evaluated first:

    WK>     # ^C
    WK>     x = 1:10^8
    WK>     tryCatch(stop(), error=function(e) sprintf('%s', e))
    WK>     # Error in sprintf("%s", e) : 'getEncChar' must be called on a CHARSXP

    WK> this can't be a feature. (tried in both 2.8.0 and r-devel; version
    WK> info at the bottom.)

    WK> suggestion: trace down and fix the bug.

[not me, at least not now.]

    WK> (3)
    WK> the error argument to tryCatch is used in two examples in ?tryCatch, but
    WK> it is not explained anywhere in the help page. one can guess that the
    WK> argument name corresponds to the class of conditions the handler will
    WK> handle, but it would be helpful to have this stated explicitly. the
    WK> help page simply says:

    WK> "
    WK> If a condition is signaled while evaluating 'expr' then
    WK> established handlers are checked, starting with the most recently
    WK> established ones, for one matching the class of the condition.
    WK> When several handlers are supplied in a single 'tryCatch' then the
    WK> first one is considered more recent than the second.
    WK> "

    WK> which is uninformative in this respect -- what does 'one matching the
    WK> class' mean?

    WK> suggestion: improve the documentation.

Patches to tryCatch.Rd are gladly accepted and quite possibly
applied to the sources without many changes.
Thanks in advance!

Martin Maechler, ETH Zurich