Ram H. Sharma
2011-May-01 20:26 UTC
[R] quick help needed: split a number and "find and replace" type of function that works like in MS excel
Hi R experts,

I have a couple of quick questions.

Q1

# my data
set.seed(12341)
SN <- 1:100
pool <- c(12, 13, 14, 23, 24, 34)
CT1 <- sample(pool, 100, replace = TRUE)
set.seed(1242)
CT2 <- sample(pool, 100, replace = TRUE)
set.seed(142)
CT3 <- sample(pool, 100, replace = TRUE)
# the number of variables runs up to column 20000
mydf <- data.frame(SN, CT1, CT2, CT3)

First question: how can I split 12 into 1 2, 13 into 1 3, and 14 into 1 4? What I am trying to do is split each number into two and make separate variables CT1a and CT1b, CT2a and CT2b, CT3a and CT3b. I tried strsplit(), but I believe it works with characters only.

Q2

Is there any function that works in the same manner as the find-and-replace function in MS Excel? For example, I want to replace all 1s in the above data frame with "A" and all 2s with "B", so that the number 12 is converted to "AB". I tried with car, but it is very slow because I need to process a very large data frame.

Thanks;
--
Ram H
Steve Lianoglou
2011-May-01 21:03 UTC
[R] quick help needed: split a number and "find and replace" type of function that works like in MS excel
Hi,

There are a couple of ways to do what you want. I'll provide the fodder and let you finish the implementation.

On Sun, May 1, 2011 at 4:26 PM, Ram H. Sharma <sharma.ram.h at gmail.com> wrote:
> Hi R experts
>
> I have a couple of quick questions:
>
> Q1
> #my data
> set.seed(12341)
> SN <- 1:100
> pool <- c(12, 13, 14, 23, 24, 34)
> CT1 <- sample(pool, 100, replace = TRUE)
> set.seed(1242)
> CT2 <- sample(pool, 100, replace = TRUE)
> set.seed(142)
> CT3 <- sample(pool, 100, replace = TRUE)
> # the number of variables runs up to column 20000
> mydf <- data.frame(SN, CT1, CT2, CT3)
>
> First question: how can I split 12 into 1 2, 13 into 1 3, 14 into 1 4?
> What I am trying here is to split each number into two and make separate
> variables CT1a and CT1b, CT2a and CT2b, CT3a and CT3b.
>
> Tried with strsplit() but I believe this works with characters only

You can convert your numbers to characters, if you like. Using your dataset, consider:

R> ct1.char <- as.character(mydf$CT1)
R> ct1.char <- strsplit(ct1.char, '')
R> ct1a <- sapply(ct1.char, '[', 1)  ## "non-obvious" use of '[' as
R> ct1b <- sapply(ct1.char, '[', 2)  ## a function is intentional :-)
R> head(data.frame(ct1a=ct1a, ct1b=ct1b))
  ct1a ct1b
1    3    4
2    1    4
3    2    3
4    1    4
5    3    4
6    2    3

> Q2
> Is there any function that works in the same manner as find and replace
> function MS excel. Just for example, if I want to replace all 1s in the
> above data frame with "A", 2 with "B". Thus the number 12 will be converted
> to "AB". I tried with car but it very slow as I need to very large
> dataframe.

Try gsub:

R> head(ct1a)
[1] "3" "1" "2" "1" "3" "2"
R> head(gsub("1", "A", ct1a))
[1] "3" "A" "2" "A" "3" "2"

or you can use a "translation table":

R> xlate <- c('1'='A', '2'='B', '3'='C')
R> head(xlate[ct1a])
  3   1   2   1   3   2
"C" "A" "B" "A" "C" "B"

You might also consider not converting your original data into characters at all and splitting off the integers instead. You can use division and modulo arithmetic to get each digit, i.e.:

R> head(mydf$CT1)
[1] 34 14 23 14 34 23

## First digit
R> head(as.integer(mydf$CT1 / 10))
[1] 3 1 2 1 3 2

## Second digit
R> head(mydf$CT1 %% 10)
[1] 4 4 3 4 4 3

There's some food for thought ..

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
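The hints in this reply stop at a single column, so here is a minimal sketch of one way they might be extended to every CT column at once. It is not the poster's implementation: split_digits() is a hypothetical helper name, the 4 -> "D" mapping is an assumption (the thread only names A, B and C), and chartr() is swapped in for gsub() as a one-pass alternative. Values drawn by sample() for a given seed differ between R versions, so the numbers may not match those printed in the thread.

```r
# Rebuild the example data from the original post.
set.seed(12341)
SN <- 1:100
pool <- c(12, 13, 14, 23, 24, 34)
CT1 <- sample(pool, 100, replace = TRUE)
set.seed(1242)
CT2 <- sample(pool, 100, replace = TRUE)
set.seed(142)
CT3 <- sample(pool, 100, replace = TRUE)
mydf <- data.frame(SN, CT1, CT2, CT3)

# Columns to process: everything except the serial number.
ct.cols <- setdiff(names(mydf), "SN")

# Q1: split each two-digit column into its tens and units digits.
# %/% is integer division and %% is the remainder; both are vectorised.
# split_digits() is a hypothetical helper, not something from the thread.
split_digits <- function(df, cols) {
  parts <- lapply(cols, function(col) {
    out <- data.frame(df[[col]] %/% 10, df[[col]] %% 10)
    names(out) <- paste0(col, c("a", "b"))  # e.g. CT1a, CT1b
    out
  })
  do.call(cbind, parts)
}
digits <- split_digits(mydf, ct.cols)

# Q2: translate digits to letters in one pass with chartr(),
# which maps character by character ("1" -> "A", "2" -> "B", ...).
# The "4" -> "D" entry is an assumption added for illustration.
letters.df <- mydf
letters.df[ct.cols] <- lapply(mydf[ct.cols], function(x) {
  chartr("1234", "ABCD", as.character(x))
})

head(cbind(mydf["SN"], digits))
head(letters.df)
```

Looping over column names with lapply() keeps the same code usable when the data frame has thousands of CT columns, and the arithmetic split avoids building intermediate character vectors. Whether this is fast enough for the full 20000-column data is something to measure rather than assume.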