Hi guys,

I was wondering if anyone can help me with a problem that I have been stuck on for a long time. It involves replacing character strings with numbers. The character string can take only 3 possible values, for instance:

AA
AT
TT

I want R to replace AT with 0. Between AA and TT, I want to compare the frequencies of the two values; the one that occurs more often should be replaced with 1, and the other with -1. So using the same example, say I have

AA - 50
AT - 34
TT - 57

I would want R to substitute it in this way:

AA = -1
AT = 0
TT = 1

The strings are not necessarily AA, AT, or TT.

Any ideas?

Thanks!
Jeremy
Hi,

I'm not sure what your data look like. Maybe this helps:

    # Example data: one "label-count" string per row
    dat1 <- read.table(text="
    col1
    AA-50
    AT-34
    TT-57
    TT-45
    TA-42
    ",sep="",header=TRUE,stringsAsFactors=FALSE)

    # Keep only the label (everything before the hyphen)
    vec1 <- gsub("\\-.*","",dat1[,1])

    # Map AA to -1, AT to 0, TT to 1, anything else to NA
    vec2 <- ifelse(vec1=="AA",-1,ifelse(vec1=="AT",0, ifelse(vec1=="TT",1,NA)))

    library(stringr)
    # Absolute difference between each code and the count extracted from the string
    abs(vec2-as.numeric(unlist(str_extract_all(dat1[,1],"[0-9]+"))))
    #[1] 51 34 56 44 NA

A.K.

----- Original Message -----
From: Jeremy Ng <jeremy.ng.wk1990 at gmail.com>
To: r-help at r-project.org
Sent: Tuesday, July 2, 2013 8:31 AM
Subject: [R] Replacing strings to numbers

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
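If the input really is a set of "label-count" strings like the ones parsed in A.K.'s reply, the frequency rule from the question could be applied to the counts directly. The following is only a minimal sketch in base R under stated assumptions: the vector `raw` and the variable `zero_label` are made up for illustration, there is exactly one string per category, and the caller already knows which label should map to 0.

```r
# Hypothetical input: one "label-count" string per category,
# with or without spaces around the hyphen.
raw <- c("AA - 50", "AT-34", "TT- 57")

# Split each string into its label and its count (base R only).
label <- trimws(sub("-.*$", "", raw))
count <- as.numeric(trimws(sub("^[^-]*-", "", raw)))

# Assumption: the label that maps to 0 is known in advance.
zero_label <- "AT"

# Of the two remaining labels, the larger count gets 1 and the smaller -1.
others <- setdiff(label, zero_label)
n <- count[match(others, label)]
codes <- setNames(numeric(length(label)), label)
codes[others] <- if (n[1] > n[2]) c(1, -1) else c(-1, 1)  # on a tie, the second label gets 1
codes
```

With the counts from the question this gives AA = -1, AT = 0 and TT = 1. How ties should be handled is not specified in the thread, so the `>` comparison is an arbitrary choice.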
On 02/07/2013 13:31, Jeremy Ng wrote:
> The strings are not necessarily AA,AT, or TT.

If not, how are we to know which one is to be replaced by 0? And does 'more' mean 'greater than' or 'greater than or equal to'?

Adapt the following depending on your answers:

> set.seed(1)
> x <- sample(c(rep("AA", 2), "AT", rep("TT", 3)))
> fr <- table(x)
> recode <- if(fr[1] < fr[3]) c(-1, 0, 1) else c(1, 0, -1) # or <=
> x
[1] "AA" "TT" "AT" "TT" "AA" "TT"
> recode[match(x, names(fr))] # or however the strings are arranged.
[1] -1  1  0  1 -1  1

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
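The `table()` and lookup idea in the reply above relies on the zero category sorting into the middle position. One way to adapt it to arbitrary labels is to pass the zero category explicitly. The sketch below assumes exactly that: `recode_by_freq` and its `zero` argument are hypothetical names, not part of the thread or of any package, and the tie rule is an arbitrary choice that the two questions above would need to settle.

```r
# Hypothetical helper (not from the thread): `zero` becomes 0, the more
# frequent of the other two values becomes 1, the less frequent becomes -1.
recode_by_freq <- function(x, zero) {
  fr <- table(x)
  others <- setdiff(names(fr), zero)
  # Require the zero label plus exactly two other distinct values.
  stopifnot(zero %in% names(fr), length(others) == 2)
  # Assumption: on a tie, the value that sorts first gets 1.
  major <- others[which.max(fr[others])]
  minor <- setdiff(others, major)
  codes <- setNames(c(0, 1, -1), c(zero, major, minor))
  unname(codes[x])
}

# Same vector as printed in the session above.
x <- c("AA", "TT", "AT", "TT", "AA", "TT")
recode_by_freq(x, zero = "AT")

# The same helper with different labels.
recode_by_freq(c("CG", "GG", "CC", "CC"), zero = "CG")
```

For the first call the result matches the recoding shown in the session above (TT is the more frequent value, so it maps to 1). Looking codes up by name rather than by position means the result does not depend on how the labels happen to sort.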