R pattern-matching and replacement functions are vectorized: they can operate on vectors of targets. However, they can only use one pattern and replacement. Here is code to apply a different pattern and replacement for every target. My question: can it be done better?

    sub2 <- function(pattern, replacement, x) {
        len <- length(x)
        if (length(pattern) == 1)
            pattern <- rep(pattern, len)
        if (length(replacement) == 1)
            replacement <- rep(replacement, len)
        FUN <- function(i, ...) {
            sub(pattern[i], replacement[i], x[i], fixed = TRUE)
        }
        idx <- 1:length(x)
        sapply(idx, FUN)
    }

    #Example
    X <- c("ab", "cd", "ef")
    patt <- c("b", "cd", "a")
    repl <- c("B", "CD", "A")
    sub2(patt, repl, X)

-John
John,

Try the following:

    > mapply(function(p, r, x) sub(p, r, x, fixed = TRUE), p=patt, r=repl, x=X)
       b   cd    a
    "aB" "CD" "ef"

-Christos

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Thaden, John J
> Sent: Tuesday, October 07, 2008 3:59 PM
> To: r-help at r-project.org
> Cc: jjthaden at flash.net
> Subject: [R] vectorized sub, gsub, grep, etc.
>
> [...]
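For reuse, the mapply() call above could be wrapped in a small function. This is only a sketch: the name sub_each is hypothetical and not from the thread. It relies on mapply() recycling shorter arguments, so the explicit rep() handling for length-1 patterns should not be needed, and it passes USE.NAMES = FALSE to drop the names that mapply() otherwise takes from the first character argument.

```r
# Hypothetical wrapper (name not from the thread): apply a different fixed
# pattern and replacement to each element of x.
sub_each <- function(pattern, replacement, x) {
  mapply(
    function(p, r, s) sub(p, r, s, fixed = TRUE),
    pattern, replacement, x,
    USE.NAMES = FALSE  # return a plain, unnamed character vector
  )
}

X    <- c("ab", "cd", "ef")
patt <- c("b", "cd", "a")
repl <- c("B", "CD", "A")

sub_each(patt, repl, X)          # one pattern/replacement per target
sub_each("a", "A", c("ab", "ca")) # a single pattern is recycled over x
```

If the names are wanted, as in the output above, leaving USE.NAMES at its default keeps them.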
Hello Christos,

To my surprise, vectorization actually hurt processing speed!

    #Example
    X <- c("ab", "cd", "ef")
    patt <- c("b", "cd", "a")
    repl <- c("B", "CD", "A")

    sub2 <- function(pattern, replacement, x) {
        len <- length(x)
        if (length(pattern) == 1)
            pattern <- rep(pattern, len)
        if (length(replacement) == 1)
            replacement <- rep(replacement, len)
        FUN <- function(i, ...) {
            sub(pattern[i], replacement[i], x[i], fixed = TRUE)
        }
        idx <- 1:length(x)
        sapply(idx, FUN)
    }

    system.time( for(i in 1:10000) sub2(patt, repl, X) )
       user  system elapsed
       1.18    0.07    1.26

    system.time( for(i in 1:10000)
        mapply(function(p, r, x) sub(p, r, x, fixed = TRUE), p=patt, r=repl, x=X) )
       user  system elapsed
       1.42    0.05    1.47

So much for avoiding loops.

John Thaden

======= At 2008-10-07, 14:58:10 Christos wrote: =======
> [...]
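One possible reading of these timings: sapply() and mapply() both call an R function once per element, so neither is vectorized in the compiled-code sense, and with three-element inputs repeated 10000 times the measurement is probably dominated by per-call overhead. Below is a sketch that times the variants on longer vectors instead. The helper names (sub_sapply, sub_mapply, sub_vapply) and the test size n are assumptions for illustration, and no timings are shown because they vary by machine and R version.

```r
# Hypothetical helpers (names not from the thread). All three assume that
# pattern, replacement and x have the same length.
sub_sapply <- function(pattern, replacement, x) {
  sapply(seq_along(x), function(i)
    sub(pattern[i], replacement[i], x[i], fixed = TRUE))
}

sub_mapply <- function(pattern, replacement, x) {
  mapply(function(p, r, s) sub(p, r, s, fixed = TRUE),
         pattern, replacement, x, USE.NAMES = FALSE)
}

sub_vapply <- function(pattern, replacement, x) {
  # vapply() checks that every result is a single string
  vapply(seq_along(x), function(i)
    sub(pattern[i], replacement[i], x[i], fixed = TRUE),
    character(1))
}

# Build longer inputs by repeating the thread's example (n is arbitrary).
n     <- 10000
big_x <- rep(c("ab", "cd", "ef"), length.out = n)
big_p <- rep(c("b", "cd", "a"),   length.out = n)
big_r <- rep(c("B", "CD", "A"),   length.out = n)

# One call on a long vector rather than many calls on a short one.
t_sapply <- system.time(res_sapply <- sub_sapply(big_p, big_r, big_x))
t_mapply <- system.time(res_mapply <- sub_mapply(big_p, big_r, big_x))
t_vapply <- system.time(res_vapply <- sub_vapply(big_p, big_r, big_x))

# The variants should agree before their timings are compared.
stopifnot(identical(res_sapply, res_mapply),
          identical(res_sapply, res_vapply))

print(rbind(sapply = t_sapply, mapply = t_mapply, vapply = t_vapply)[, 1:3])
```

Whichever variant comes out ahead, the differences are likely to be small next to the cost of calling sub() once per element.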