Liviu Andronic
2012-Jul-30 15:53 UTC
[R] replace values in vector from a replacement table
Dear all,
I've got stuck when trying to replace values in a vector by selecting
replacements from a replacement table. I'm trying to use only base
functions. Here's a dummy example:

> (x <- rep(letters,2))
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v"
[23] "w" "x" "y" "z" "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r"
[45] "s" "t" "u" "v" "w" "x" "y" "z"
> values <- c("aa", "a", "b", NA, "d", "zz")
> repl <- c("aa", "A", "B", NA, "D", "zz")
> (repl.tab <- cbind(values, repl))
     values repl
[1,] "aa"   "aa"
[2,] "a"    "A"
[3,] "b"    "B"
[4,] NA     NA
[5,] "d"    "D"
[6,] "zz"   "zz"

Now I can easily compute all four combinations of 'match' and '%in%':

> (ind <- match(x, repl.tab[ ,1]))
 [1]  2  3 NA  5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA  2  3 NA
[30]  5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> (ind <- match(repl.tab[ ,1], x))
[1] NA  1  2 NA  4 NA
> (ind <- x %in% repl.tab[ ,1])
 [1]  TRUE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[15] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
[29] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[43] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> (ind <- repl.tab[ ,1] %in% x)
[1] FALSE  TRUE  TRUE FALSE  TRUE FALSE

But how do I actually proceed to obtain the following vector? Can it
be done without an explicit apply() or loop?

> res
 [1] "A" "B" "c" "D" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v"
[23] "w" "x" "y" "z" "A" "B" "c" "D" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r"
[45] "s" "t" "u" "v" "w" "x" "y" "z"

Regards
Liviu
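A side note on the four calls in the question: `%in%` is a thin wrapper around `match()`, so the logical vectors carry no information beyond the position vectors. The sketch below checks this on a shortened input; the names `x_small` and `tab` are made up for illustration and do not appear in the thread.

```r
# Shortened stand-ins for the vectors in the post (hypothetical names).
x_small <- c("a", "b", "c", "d")
tab     <- c("aa", "a", "b", NA, "d", "zz")

# match() returns, for each element of x_small, its position in tab
# (NA when the element is absent).
pos <- match(x_small, tab)

# %in% reduces the same lookup to TRUE/FALSE.
hit <- x_small %in% tab

# The two agree: an element is "in" the table exactly when match()
# found a position for it.
stopifnot(identical(hit, !is.na(pos)))
stopifnot(identical(hit, match(x_small, tab, nomatch = 0L) > 0L))
```

For a replacement task the positions are the useful form, because they say which row of the table to take the new value from.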
try this:

> (x <- rep(letters,2))
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w"
[24] "x" "y" "z" "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
[47] "u" "v" "w" "x" "y" "z"
> values <- c("aa", "a", "b", NA, "d", "zz")
> repl <- c("aa", "A", "B", NA, "D", "zz")
> (repl.tab <- cbind(values, repl))
     values repl
[1,] "aa"   "aa"
[2,] "a"    "A"
[3,] "b"    "B"
[4,] NA     NA
[5,] "d"    "D"
[6,] "zz"   "zz"
> indx <- match(x, repl.tab[, 1], nomatch = 0)
> x[indx != 0] <- repl.tab[indx, 2]
> x
 [1] "A" "B" "c" "D" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w"
[24] "x" "y" "z" "A" "B" "c" "D" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
[47] "u" "v" "w" "x" "y" "z"

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
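The reason the `nomatch = 0` version works can be sketched on a four-element input. This is only an illustration of the indexing rule, not part of the original reply, and the short `x` and `tab` here are stand-ins for the thread's data.

```r
x   <- c("a", "b", "c", "d")
tab <- cbind(values = c("aa", "a", "b", NA, "d", "zz"),
             repl   = c("aa", "A", "B", NA, "D", "zz"))

# nomatch = 0 marks "no entry in the table" with 0 instead of NA.
indx <- match(x, tab[, "values"], nomatch = 0)   # 2 3 0 5

# Zero subscripts are silently dropped when indexing, so the right-hand
# side holds one replacement per matched element, in the same order as
# the TRUE positions selected on the left-hand side.
rhs <- tab[indx, "repl"]                         # "A" "B" "D"
x[indx != 0] <- rhs
x                                                # "A" "B" "c" "D"
```

With the default `nomatch = NA` the same assignment would need the NAs filtered out of the index first, which is why the zero marker keeps the code to two lines.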
Liviu Andronic
2012-Jul-30 16:29 UTC
[R] replace values in vector from a replacement table
On Mon, Jul 30, 2012 at 6:00 PM, jim holtman <jholtman at gmail.com> wrote:
> try this:
>
>> indx <- match(x, repl.tab[, 1], nomatch = 0)
>> x[indx != 0] <- repl.tab[indx, 2]
>> x
>  [1] "A" "B" "c" "D" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w"
> [24] "x" "y" "z" "A" "B" "c" "D" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
> [47] "u" "v" "w" "x" "y" "z"

This is excellent! Thank you
Liviu
Liviu Andronic
2012-Jul-31 10:44 UTC
[R] replace values in vector from a replacement table
On Mon, Jul 30, 2012 at 6:00 PM, jim holtman <jholtman at gmail.com> wrote:
> try this:
>
>> indx <- match(x, repl.tab[, 1], nomatch = 0)
>> x[indx != 0] <- repl.tab[indx, 2]

Based on this code I came up with the following function.

replace2 <- function(x, ind, repl){
    ## treat NA entries in the index as "no match"
    if(any(is.na(ind))) ind[is.na(ind)] <- 0
    if(is.vector(x) && is.vector(repl)) {
        ## vectors: replace the matched elements
        x[ind != 0] <- repl[ind]
        return(x)
    } else if(identical(ncol(x), ncol(repl))){
        ## matrices/data frames: replace the matched rows
        x[ind != 0, ] <- repl[ind, ]
        return(x)
    }
}

Whereas replace() can be used only on vectors of the same dimension,
replace2() can be used on vectors and matrices/data frames, and the
replacement data can have a different number of rows. It also works
with index vectors containing NAs.

> ##for vectors
> (indx <- match(x, repl.tab[, 1], nomatch = 0))
 [1] 2 3 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 3 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[46] 0 0 0 0 0 0 0
> head(replace2(x, indx, repl.tab[, 2]))
[1] "A" "B" "c" "D" "e" "f"
> (indx <- match(x, repl.tab[, 1])) ##index vector with NAs
 [1]  2  3 NA  5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA  2  3 NA  5
[31] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> head(replace2(x, indx, repl.tab[, 2]))
[1] "A" "B" "c" "D" "e" "f"

> ##for matrices/dataframes
> head(xx <- cbind(x, x))
     x   x
[1,] "a" "a"
[2,] "b" "b"
[3,] "c" "c"
[4,] "d" "d"
[5,] "e" "e"
[6,] "f" "f"
> (repl.tab2 <- cbind(repl.tab[, 2], repl.tab[, 2]))
     [,1] [,2]
[1,] "aa" "aa"
[2,] "A"  "A"
[3,] "B"  "B"
[4,] NA   NA
[5,] "D"  "D"
[6,] "zz" "zz"
> head(replace2(xx, indx, repl.tab2))
     x   x
[1,] "A" "A"
[2,] "B" "B"
[3,] "c" "c"
[4,] "D" "D"
[5,] "e" "e"
[6,] "f" "f"

Does this function have any generic value? Are there obvious
implementation mistakes?

Regards
Liviu
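One point a reader may want to check in replace2(): when `x` and `repl` are neither both vectors nor objects with the same number of columns, neither branch runs and the function returns NULL invisibly. The sketch below is a hypothetical variant that fails loudly instead. The name `replace2_strict` and the error message are illustrative assumptions, not something proposed in the thread.

```r
# Hypothetical variant of replace2(); name and error text are illustrative.
replace2_strict <- function(x, ind, repl) {
  ind[is.na(ind)] <- 0                      # NA index entries mean "leave alone"
  if (is.vector(x) && is.vector(repl)) {    # && gives a single TRUE/FALSE for if()
    x[ind != 0] <- repl[ind]                # vectors: replace matched elements
  } else if (identical(ncol(x), ncol(repl))) {
    x[ind != 0, ] <- repl[ind, ]            # matrices/data frames: replace matched rows
  } else {
    stop("'x' and 'repl' must both be vectors or have the same number of columns")
  }
  x
}

x    <- c("a", "b", "c", "d")
tab  <- cbind(values = c("a", "b", "d"), repl = c("A", "B", "D"))
indx <- match(x, tab[, "values"])           # 1 2 NA 3

out <- replace2_strict(x, indx, tab[, "repl"])
out                                         # "A" "B" "c" "D"

# Pairing a vector with a two-column matrix now raises an error
# rather than quietly returning NULL.
res <- tryCatch(replace2_strict(x, indx, tab), error = function(e) "error")
```

Whether an error or a warning is the better choice depends on how the function is meant to be used; the sketch only shows that the silent fall-through can be closed with one extra branch.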