RockO
2010-Dec-15 02:00 UTC
[R] How to apitalize leading letters & else of personal names?
Dear R world,

Do you know of a function that would correctly capitalize first and family names? In the cwhmisc package I found only the CapLeading function, but it does not do the job, since it capitalizes only the first letter of a word.

I am looking for a function that would recognize " |'|Mc|-" and capitalize the first letter following these characters.

An example:

    names <- c("jean-francois st-john", "helene o'donnel", "joe mcintyre")

Desired result:

    "Jean-Francois St-John" "Helene O'Donnel" "Joe McIntyre"

Thanks,
Rock
Ben Bolker
2010-Dec-15 15:07 UTC
[R] How to apitalize leading letters & else of personal names?
RockO <rock.ouimet <at> gmail.com> writes:

> Dear R world,
>
> Do you know about a function that would capitalize in the correct manner
> first and family names?
> I found in the cwhmisc only the CapLeading function, but it just does not do
> the job, taking care only to capitalize the first letter of a word.
>
> I am looking for a function that would recognize " |'|Mc|-" and capitalize
> the first letter following these characters.
>
> An example:
> names<-c("jean-francois st-john","helene o'donnel", "joe mcintyre")
>
> Desired result:
> "Jean-Francois St-John" "Helene O'Donnel" "Joe McIntyre"

This is pretty tricky. gsub() can do some pretty slick things, including replacing matches with capitalized versions, so you could probably write a gsub pattern to capitalize letters appearing at the beginning of words OR after non-alphabetic characters. (See the end of the examples in ?gsub ...)

"McIntyre" represents a whole other class of difficulty. Some Scots capitalize after "Mc", others don't. And what about all the rules about capitalization (or not) after de/du/van/von? What would you do with a Dutch name like "'t Hooft" ... ?
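The single-pattern idea mentioned above (capitalize a letter at the start of a word or after a non-alphabetic character) can be sketched as follows. This is only an illustration, not code from the thread: the variable names `people` and `capitalized` are made up here, and the pattern relies on the `\\b` word boundary and the `\\U` replacement escape, which require `perl = TRUE`.

```r
# Hypothetical sample data: the names from the original question.
people <- c("jean-francois st-john", "helene o'donnel", "joe mcintyre")

# \\b matches at a word boundary, so ([a-z]) captures the first letter of
# each run of word characters. \\U upper-cases the captured letter.
capitalized <- gsub("\\b([a-z])", "\\U\\1", people, perl = TRUE)

print(capitalized)
```

This handles the space, hyphen and apostrophe cases in one pass but leaves "Mcintyre" untouched. Because an apostrophe counts as a word boundary, it would also capitalize the letter after any apostrophe, which may not be wanted outside of surnames.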
David Winsemius
2010-Dec-15 18:17 UTC
[R] How to apitalize leading letters & else of personal names?
On Dec 14, 2010, at 9:00 PM, RockO wrote:

> Dear R world,
>
> Do you know about a function that would capitalize in the correct manner
> first and family names?
> I found in the cwhmisc only the CapLeading function, but it just does not do
> the job, taking care only to capitalize the first letter of a word.
>
> I am looking for a function that would recognize " |'|Mc|-" and capitalize
> the first letter following these characters.
>
> An example:
> names<-c("jean-francois st-john","helene o'donnel", "joe mcintyre")
>
> Desired result:
> "Jean-Francois St-John" "Helene O'Donnel" "Joe McIntyre"

Here are four individually crafted gsub calls that could be serially applied. Applied separately, each one handles a single case:

    > gsub("^([a-z])", "\\U\\1", names, perl=TRUE)
    [1] "Jean-francois st-john" "Helene o'donnel" "Joe mcintyre"
    > gsub(" ([a-z])", " \\U\\1", names, perl=TRUE)
    [1] "jean-francois St-john" "helene O'donnel" "joe Mcintyre"
    > gsub("\\-([a-z])", "-\\U\\1", names, perl=TRUE)
    [1] "jean-Francois st-John" "helene o'donnel" "joe mcintyre"
    > gsub("\\'([a-z])", "'\\U\\1", names, perl=TRUE)
    [1] "jean-francois st-john" "helene o'Donnel" "joe mcintyre"

Applied in sequence:

    > t2 <- gsub("^([a-z])", "\\U\\1", names, perl=TRUE)
    > t2 <- gsub(" ([a-z])", " \\U\\1", t2, perl=TRUE)
    > t2 <- gsub("\\-([a-z])", "-\\U\\1", t2, perl=TRUE)
    > t2 <- gsub("\\'([a-z])", "'\\U\\1", t2, perl=TRUE)
    > t2
    [1] "Jean-Francois St-John" "Helene O'Donnel" "Joe Mcintyre"

Oooops forgot the mc:

    > gsub("Mc([a-z])", "Mc\\U\\1", t2, perl=TRUE)
    [1] "Jean-Francois St-John" "Helene O'Donnel" "Joe McIntyre"

-- 
David.

David Winsemius, MD
West Hartford, CT
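For reuse, the serial gsub steps above could be folded into one helper. The sketch below is an assumption-laden illustration rather than anything posted in the thread: the function name `cap_names` is hypothetical, the separators are merged into a single character class, and a `tolower()` call is added on the assumption that the input may arrive in upper case. It does not attempt the de/du/van/von or "'t Hooft" cases raised earlier.

```r
# Hypothetical helper combining the steps from the thread.
cap_names <- function(x) {
  # Assumption: input may be upper case, so normalise to lower case first.
  x <- tolower(x)
  # Capitalize a letter at the start of the string or after a space,
  # apostrophe or hyphen. \\1 restores the separator, \\U upper-cases \\2.
  x <- gsub("(^|[ '-])([a-z])", "\\1\\U\\2", x, perl = TRUE)
  # Capitalize the letter following a word-initial "Mc".
  x <- gsub("\\bMc([a-z])", "Mc\\U\\1", x, perl = TRUE)
  x
}

# Hypothetical sample data: the names from the original question.
people <- c("jean-francois st-john", "helene o'donnel", "joe mcintyre")
result <- cap_names(people)
print(result)
```

Always capitalizing after "Mc" is a simplification, since not every bearer of such a name writes it that way. A lookup table of known exceptions would be one way to handle those cases.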
RockO
2010-Dec-15 18:26 UTC
[R] How to apitalize leading letters & else of personal names?
David,

Thank you very much! Indeed, capitalizing names is very tricky, particularly for people whose mother tongue is not English (as in my case). Hopefully, using your script will be much better than simply having the names in uppercase.

Happy Holidays!

Rock