Hi everyone,

I have a vector of strings, each made up of a different number of words. I want a new vector that contains only the first word of each string in the original vector. I came up with this:

str <- c('aaa bbb', 'cc', 'd eee aa', 'mmm o n')
str1 <- rep(1, length(str))
for (i in 1:length(str)) {
  str1[i] <- strsplit(str, " ")[[i]][1]
}
str1
'aaa' 'cc' 'd' 'mmm'

Is there a simpler way to do this?

Thanks,

Monica
Hi,

If you only want the first word, then this should do the trick:

R> sub( " +.*", "", str )
[1] "aaa" "cc" "d" "mmm"

Cheers,

Romain

-- 
Mango Solutions
One way is the following:

str <- c('aaa bbb', 'cc', 'd eee aa', 'mmm o n')
sapply(strsplit(str, " "), "[", 1)

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

----- Original Message -----
From: "Monica Pisica" <pisicandru at hotmail.com>
To: <r-help at stat.math.ethz.ch>
Sent: Thursday, December 13, 2007 2:41 PM
Subject: [R] spliting strings ...
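The `sapply(strsplit(...), "[", 1)` idiom works because `"["` is an ordinary function in R, so `sapply` calls it on each split piece with the extra argument `1`. Below is a minimal sketch of the same idea using `vapply`, which additionally checks that every result is a single string. The names `x`, `pieces` and `first` are illustrative and do not come from the thread.

```r
# Illustrative input, same values as in the original post.
x <- c("aaa bbb", "cc", "d eee aa", "mmm o n")

# Split once on a literal space; the result is a list holding one
# character vector of words per input string.
pieces <- strsplit(x, " ", fixed = TRUE)

# p[1] is the first word of each piece; vapply enforces a
# length-one character result for every element.
first <- vapply(pieces, function(p) p[1], character(1))

print(first)
```

One thing to keep in mind with this approach: an empty input string splits into a zero-length vector, so its "first word" comes back as `NA` rather than `""`.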
Good afternoon Monica,

You can rely on regular expressions: substitute the empty string "" for everything from the first space to the end of the "line" (the end being marked with a dollar sign):

str1 <- sub(" .*$", "", str)

Regards,

Sean
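To make the pieces of that pattern concrete, here is a small sketch, assuming strings without embedded newlines. It shows that the trailing `$` is optional because `.*` already runs to the end of the string, and that a literal space in the pattern does not cover tabs. The variable names are illustrative only.

```r
x <- c("aaa bbb", "cc", "d eee aa", "mmm o n")

# " .*$": a space, then any characters up to the end of the string.
with_dollar <- sub(" .*$", "", x)

# ".*" is greedy and already reaches the end, so "$" can be dropped.
without_dollar <- sub(" .*", "", x)

print(identical(with_dollar, without_dollar))

# A literal space in the pattern does not match a tab ...
tabbed <- "aaa\tbbb"
print(sub(" .*$", "", tabbed))

# ... whereas the POSIX class [[:space:]] matches any whitespace.
print(sub("[[:space:]].*$", "", tabbed))
```

If the input may be separated by tabs or other whitespace, the `[[:space:]]` variant is probably the safer choice.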
On Thu, 13 Dec 2007, Monica Pisica wrote:

> Hi everyone,
>
> I have a vector of strings, each string made up by different number of
> words.

You need to define 'word' and 'first'. Your solution says the first word of " aa" is "", which is not what most people would think.

> I want to get a new vector which has only the first word of each
> string in the first vector. I came up with this:
>
> str <- c('aaa bbb', 'cc', 'd eee aa', 'mmm o n')
> str1 <- rep(1, length(str))
> for (i in 1:length(str)) {
> str1[i] <- strsplit(str, " ")[[i]][1]
> }
> str1
> 'aaa' 'cc' 'd' 'mmm'
>
> Now, is there any way to do this simpler?

> sapply(strsplit(str, " "), `[`, 1)
[1] "aaa" "cc" "d" "mmm"

or

> sub("([^ ]+).*", "\\1", str)

I don't see how you got your answer: R does not print like that (and never has).

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

PLEASE do!

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford
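The point about defining 'word' and 'first' can be seen with a string that starts with a space. The sketch below is one possible reading, assuming a word means a run of non-whitespace characters and that leading whitespace should be ignored. The helper `first_word` is hypothetical and was not proposed in the thread.

```r
x <- c(" aa", "aaa bbb", "cc")

# Splitting on a space yields an empty first piece when the string
# starts with a space, so the "first word" of " aa" becomes "".
by_split <- sapply(strsplit(x, " "), `[`, 1)

# Hypothetical helper (not from the thread): skip leading whitespace,
# capture the first run of non-whitespace, and drop the rest.
first_word <- function(s) {
  sub("^[[:space:]]*([^[:space:]]+).*$", "\\1", s)
}

by_regex <- first_word(x)

print(by_split)
print(by_regex)
```

Under this definition a string that contains only whitespace does not match the pattern and is returned unchanged, so such inputs would need separate handling.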
Monica Pisica wrote:

> Now, is there any way to do this simpler?

> sapply(strsplit(str, " "), "[", 1)
[1] "aaa" "cc" "d" "mmm"

-- 
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen, Denmark
(p.dalgaard at biostat.ku.dk)
Hi Monica,

Try

sapply(as.list(str), function(x) unlist(strsplit(x, " "))[1])