Martin Maechler
2003-Oct-23 14:35 UTC
[Rd] Re: ichar() function in R : 1st implementation, RFC
(RFC := Request For Comments)>>>>> "Tim" == Tim Keighley <Tim.Keighley@csiro.au> >>>>> on Thu, 23 Oct 2003 11:45:22 +1000 writes:Tim> Hi Martin, Tim> In October 2000 you wrote to r-help: >>> which reminds me that I've had a desire for something like >>> the old S function [from the blue book, and library(examples) I think] >>> ichar(ch) >>> which would return a vector of integers, each the (decimal) equivalent of >>> the (ISO-latin1) representation of the corresponding characters in ch. >>> >>> This should be easy enough (and be done in C). >>> Any volunteers? Tim> Did you get any volunteers? no. Thank you for reminding me! Tim> Is this function or an Tim> equivalent now available in R? I have been unsuccessful Tim> in my investigation but I do not have all the available Tim> CRAN packages, so it might be in one of them. I've searched myself (we have almost all of them installed), and didn't find anything. Tim> Cheers, Tim> Tim Keighley I now did a first cut, using R only code, (and realizing that most of chars8bit() should really happen in C). I'm proposing to add something like this to R-devel in the near future. Note that AsciiToInt() and ichar() ar for S-plus and "old S" compatibility, whereas I think we'd really want (equivalents) of the three functions digitsBase() chars8bit() strcodes() in R eventually. I'm very interested in feedback, particularly - function and arguments' naming - proposals for improvements - neat examples of usage. Martin Martin Maechler <maechler@stat.math.ethz.ch> http://stat.ethz.ch/~maechler/ Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-1-632-3408 fax: ...-1228 <>< -------------- next part -------------- ### This was digits.v() in library(sfsmisc): ### --> get it's help() file /u/maechler/R/Pkgs/sfsmisc/man/digits.Rd digitsBase <- function(x, base = 2, ndigits = 1 + floor(log(max(x),base))) { ## Purpose: Give the vector A of the base-_base_ representation of _n_: ## ------- n = sum_{k=0}^M A_{M-k} base ^ k , where M = length(a) - 1 ## Value: MATRIX M where M[,i] corresponds to x[i] ## c( result ) then contains the blocks in proper order .. ## Author: Martin Maechler, Date: Wed Dec 4 14:10:27 1991 ## ---------------------------------------------------------------- ## Arguments: x: vector of non-negative integers ## base: Base for representation ## ndigits: Number of digits/bits to use ## EXAMPLE: digitsBase(1:24, 8) #-- octal representation ## ---------------------------------------------------------------- if(any((x <- as.integer(x)) < 0)) stop("`x' must be non-negative integers") r <- matrix(0, nrow = ndigits, ncol = length(x)) if(ndigits >= 1) for (i in ndigits:1) { r[i,] <- x %% base if (i > 1) x <- x %/% base } r } ### This is an improved version of make.ASCII() in 1991's ~/S/Good-string.S ! chars8bit <- function(i = 0:255) { ## Purpose: Compute a character vector from its "ASCII" codes. ## We seem to have to use this complicated way thru text and parse. ## Author: Martin Maechler, Original date: Wed Dec 4, 1991 ## ---------------------------------------------------------------- i <- as.integer(i) if(any(i < 0 | i > 255)) stop("`i' must be in 0:255") i8 <- apply(digitsBase(i, base = 8), 2, paste, collapse="") c8 <- paste('"\\', i8, '"', sep="") eval(parse(text = paste("c(",paste(c8, collapse=","),")", sep=""))) } strcodes <- function(x, table = chars8bit(0:255)) { ## Purpose: R (code) implementation of old S's ichar() ## ---------------------------------------------------------------------- ## Arguments: x: character vector ## ---------------------------------------------------------------------- ## Author: Martin Maechler, Date: 23 Oct 2003, 12:42 lapply(strsplit(x, ""), match, table = table) } ## S-PLUS has AsciiToInt() officially, and ichar() in library(examples): AsciiToInt <- ichar <- function(strings) unname(unlist(strcodes(strings))) ## Examples: all8bit <- chars8bit(0:255) matrix(all8bit, 32, 8, byrow = TRUE) x <- c(a = "abc", bb = "BLA & blu", Person = "M?chler, Z?rich") strcodes(x) AsciiToInt(x)
Kurt Hornik
2003-Oct-23 17:15 UTC
[Rd] Re: ichar() function in R : 1st implementation, RFC
>>>>> Martin Maechler writes:> (RFC := Request For Comments)My preference would be that we start by adding a version of sscanf(). -k>>>>> "Tim" == Tim Keighley <Tim.Keighley@csiro.au> >>>>> on Thu, 23 Oct 2003 11:45:22 +1000 writes:Tim> Hi Martin, Tim> In October 2000 you wrote to r-help:>>>> which reminds me that I've had a desire for something like >>>> the old S function [from the blue book, and library(examples) I think] >>>> ichar(ch) >>>> which would return a vector of integers, each the (decimal) equivalent of >>>> the (ISO-latin1) representation of the corresponding characters in ch. >>>> >>>> This should be easy enough (and be done in C). >>>> Any volunteers?Tim> Did you get any volunteers?> no. > Thank you for reminding me!Tim> Is this function or an Tim> equivalent now available in R? I have been unsuccessful Tim> in my investigation but I do not have all the available Tim> CRAN packages, so it might be in one of them.> I've searched myself (we have almost all of them installed), and > didn't find anything.Tim> Cheers, Tim> Tim Keighley> I now did a first cut, using R only code, > (and realizing that most of chars8bit() should really happen in C).> I'm proposing to add something like this to R-devel in the near > future. > Note that AsciiToInt() and ichar() ar for S-plus and "old S" > compatibility, whereas I think we'd really want (equivalents) of > the three functions > digitsBase() > chars8bit() > strcodes() > in R eventually.> I'm very interested in feedback, > particularly > - function and arguments' naming > - proposals for improvements > - neat examples of usage.> Martin> Martin Maechler <maechler@stat.math.ethz.ch> http://stat.ethz.ch/~maechler/ > Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27 > ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND > phone: x-41-1-632-3408 fax: ...-1228 <><> ### This was digits.v() in library(sfsmisc): > ### --> get it's help() file /u/maechler/R/Pkgs/sfsmisc/man/digits.Rd > digitsBase <- function(x, base = 2, ndigits = 1 + floor(log(max(x),base))) > { > ## Purpose: Give the vector A of the base-_base_ representation of _n_: > ## ------- n = sum_{k=0}^M A_{M-k} base ^ k , where M = length(a) - 1 > ## Value: MATRIX M where M[,i] corresponds to x[i] > ## c( result ) then contains the blocks in proper order .. > ## Author: Martin Maechler, Date: Wed Dec 4 14:10:27 1991 > ## ---------------------------------------------------------------- > ## Arguments: x: vector of non-negative integers > ## base: Base for representation > ## ndigits: Number of digits/bits to use > ## EXAMPLE: digitsBase(1:24, 8) #-- octal representation > ## ---------------------------------------------------------------- > if(any((x <- as.integer(x)) < 0)) > stop("`x' must be non-negative integers") > r <- matrix(0, nrow = ndigits, ncol = length(x)) > if(ndigits >= 1) for (i in ndigits:1) { > r[i,] <- x %% base > if (i > 1) x <- x %/% base > } > r > }> ### This is an improved version of make.ASCII() in 1991's ~/S/Good-string.S ! > chars8bit <- function(i = 0:255) > { > ## Purpose: Compute a character vector from its "ASCII" codes. > ## We seem to have to use this complicated way thru text and parse.> ## Author: Martin Maechler, Original date: Wed Dec 4, 1991 > ## ---------------------------------------------------------------- > i <- as.integer(i) > if(any(i < 0 | i > 255)) stop("`i' must be in 0:255") > i8 <- apply(digitsBase(i, base = 8), 2, paste, collapse="") > c8 <- paste('"\\', i8, '"', sep="") > eval(parse(text = paste("c(",paste(c8, collapse=","),")", sep=""))) > }> strcodes <- function(x, table = chars8bit(0:255)) > { > ## Purpose: R (code) implementation of old S's ichar() > ## ---------------------------------------------------------------------- > ## Arguments: x: character vector > ## ---------------------------------------------------------------------- > ## Author: Martin Maechler, Date: 23 Oct 2003, 12:42> lapply(strsplit(x, ""), match, table = table) > }> ## S-PLUS has AsciiToInt() officially, and ichar() in library(examples): > AsciiToInt <- ichar <- function(strings) unname(unlist(strcodes(strings)))> ## Examples:> all8bit <- chars8bit(0:255) > matrix(all8bit, 32, 8, byrow = TRUE)> x <- c(a = "abc", bb = "BLA & blu", Person = "M?chler, Z?rich") > strcodes(x) > AsciiToInt(x) > ______________________________________________ > R-devel@stat.math.ethz.ch mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-devel