farnoosh sheikhi
2014-Jul-08 19:11 UTC
Re: Seprate last name and first name into two columns
It actually worked perfectly. Thank you so much. I always learn a lot from you:).

Farnoosh

On Monday, July 7, 2014 6:51 PM, arun <smartpink111@yahoo.com> wrote:

Not sure this helps as you provided only very little info.

library(stringr)
str1 <- c("TE MA CRUZ ABEL","JOSE AN ANDRA","AL MIGUEL","ALV", "Farnoosh Sheikhi", "D Art", "Rob K Kim")
do.call(rbind,lapply(str_split(str_replace_all(str1,perl('(?<= )([A-Za-z]+)$'), ",\\1"),","),function(x) {x1 <- str_trim(x); if(length(x1)==1) c(NA, x1) else x1}))
     [,1]         [,2]
[1,] "TE MA CRUZ" "ABEL"
[2,] "JOSE AN"    "ANDRA"
[3,] "AL"         "MIGUEL"
[4,] NA           "ALV"
[5,] "Farnoosh"   "Sheikhi"
[6,] "D"          "Art"
[7,] "Rob K"      "Kim"

On Monday, July 7, 2014 3:39 PM, farnoosh sheikhi <farnoosh_81@yahoo.com> wrote:

Last names:
TE MA CRUZ
JOSE AN
AL
ALV

Farnoosh

On Monday, July 7, 2014 10:41 AM, arun <smartpink111@yahoo.com> wrote:

Can you specify what should be the last name in the cases you showed? I thought First name will be "Te Ma Cruz", "Jose An", AL, and what about "ALV".

On Monday, July 7, 2014 1:34 PM, farnoosh sheikhi <farnoosh_81@yahoo.com> wrote:

4: "TE MA CRUZ ABEL"
3: "JOSE AN ANDRA"
2: "AL MIGUEL"
1: "ALV"

There is only 4 last name with one pattern and the pattern is "LastName FirstName"

Thanks.
Farnoosh

On Monday, July 7, 2014 10:29 AM, arun <smartpink111@yahoo.com> wrote:

Please show all the five patterns. Also, in this case, which one is the last name?

On Monday, July 7, 2014 1:21 PM, farnoosh sheikhi <farnoosh_81@yahoo.com> wrote:

Something like this: "DE LI TRUZ ANGEL J.

Farnoosh

On Monday, July 7, 2014 10:16 AM, arun <smartpink111@yahoo.com> wrote:

Could you show the 5 patterns?

Arun

On Monday, July 7, 2014 12:44 PM, farnoosh sheikhi <farnoosh_81@yahoo.com> wrote:

Hi Arun,

Thanks for the email. It looks like I have 5 patterns, but most the names are 2 and 3 pattern. What is the best way to separate them?

Thanks tons.

On Friday, July 4, 2014 2:01 AM, arun <smartpink111@yahoo.com> wrote:

Hi Farnoosh,

You don't have to go through all the lines. You can try:

library(stringr)
vec1 <- c("Arun Kirshna Sasikala Appukuttan", "Farnoosh Sheikhi", "D Art")
sapply(str_match_all(vec1, '\\S+'),length)
#[1] 4 2 2
which(sapply(str_match_all(vec1, '\\S+'),length)>2) #gives the index of lines or strings with >2 words
#[1] 1

Arun

On Friday, July 4, 2014 2:39 AM, farnoosh sheikhi <farnoosh_81@yahoo.com> wrote:

Hi Arun,

I have about 20000 names, so it is hard to go through every line. I mostly have a space between first and last name. Is there a way to check that by writing code?

________________________________
Subject: Re: Seprate last name and first name into two columns
Sent: Fri, Jul 4, 2014 2:03:13 AM

Hi Farnoosh,

In cases with multiple first or last names, e.g. "Rob K Kim", the last name should be Kim. But for "Arun Kirshna Sasikala Appukuttan", the last name would be "Sasikala Appukuttan". Do you have any other pattern to identify the first and last names?

Arun

wrote:

Hi Arun,

Hope all is well. I have a data set in which last name and first name are recorded in one column with a space between them. Something like this:

data<-as.data.frame(c("Farnoosh Sheikhi", "D Art", "Rob K Kim"))
data
colnames(data)<-"Name"

I want to separate them into two columns.

Thanks a lot and happy 4th in advance.
Farnoosh
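
For readers without stringr, the same "split at the last space" idea from Arun's July 7 reply can be sketched in base R. This is not code from the thread: the helper name split_last_word and its column names are made up for illustration, and it assumes the final whitespace-separated word always goes in the second column, with NA in the first column when a name has only one word. Unlike the regex in the thread, it accepts any non-space characters in the last word, not just letters.

```r
# Hypothetical helper (not from the thread): split each name at its last space.
split_last_word <- function(x) {
  x <- trimws(x)                                   # drop leading/trailing blanks
  has_space <- grepl("\\s", x)                     # FALSE for one-word names
  # Everything before the last run of whitespace; NA when there is only one word
  head_part <- ifelse(has_space, sub("\\s+\\S+$", "", x), NA_character_)
  # The final word: strip everything up to and including the last whitespace
  last_word <- sub("^.*\\s", "", x)
  cbind(head_part, last_word)
}

str1 <- c("TE MA CRUZ ABEL", "JOSE AN ANDRA", "AL MIGUEL", "ALV",
          "Farnoosh Sheikhi", "D Art", "Rob K Kim")
res <- split_last_word(str1)
res
```

On the seven example names this should give the same two columns as the stringr version, which makes it an easy cross-check before running either approach over the full set of roughly 20000 names.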