R-help,

I have a data frame which contains a character string column that looks something like this:

II11
II18
II23
III1
III13
III16
III19
III2
III7
IV10
IV11
IV12
IX16
IX4
V12
V18
V2
V20
V23
V4
VII14
VII18
VII21
VII26
VII28
VII33
VII4
VII48
VII5
....
....
....

I want to apply a function (e.g., mean) by grouping according to the Roman part of the string, i.e.,

by I
by V
by VII
...
...
and so on.

I have looked at string manipulation functions (grep, pmatch, ...) but I can't really get it the way I want.
Can anyone help?

Thanks in advance.

Dear Luis,

How about

    gsub("[0-9]", "", x)

? This assumes that x contains the character data and not a factor, as would usually be the case in a data frame. If the variable is really a factor, then use as.character(x) in the call to gsub().

I hope this helps,
John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Luis Ridao Cruz
> Sent: Thursday, October 20, 2005 8:24 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] String manipulation

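A minimal sketch of the suggestion above applied to a data frame column stored as a factor. The data frame `dat`, its columns `id` and `value`, and the numbers in it are hypothetical and are not taken from the original post.

```r
# Hypothetical data: `dat`, `id`, and `value` are illustrative names only.
dat <- data.frame(
  id    = factor(c("II11", "II18", "III1", "III13", "V2", "V20")),
  value = c(1, 3, 10, 20, 5, 7)
)

# Convert the factor to character, then delete the digits,
# which leaves only the Roman-numeral part of each id.
dat$roman <- gsub("[0-9]", "", as.character(dat$id))

# Apply a function (here mean) to `value` within each Roman-numeral group.
group_means <- tapply(dat$value, dat$roman, mean)
group_means
```

Any other summary function (median, sum, length) can be passed to tapply() in place of mean.
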
You could use gsub(), i.e.,

    strg <- c("II11", "II18", "II23", "III1", "III13", "III16",
              "III19", "III2", "III7", "IV10", "IV11", "IV12")

    # numeric part: remove everything that is not a digit
    x <- as.numeric(gsub("[^0-9]", "", strg))
    # Roman part: remove the digits
    y <- gsub("[0-9]", "", strg)
    # mean of the numeric part within each Roman-numeral group
    tapply(x, y, mean)

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: "Luis Ridao Cruz" <Luisr at frs.fo>
To: <r-help at stat.math.ethz.ch>
Sent: Thursday, October 20, 2005 3:23 PM
Subject: [R] String manipulation
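
If the quantity to be summarised sits in a separate column rather than in the string itself, the same split can feed aggregate(). This is only a sketch under stated assumptions: the data frame `dat`, its columns `id` and `value`, and the pattern restricting the prefix to the letters I, V and X (the only ones in the sample posted) are illustrative and do not come from the thread.

```r
# Hypothetical data; names and values are made up for illustration.
dat <- data.frame(
  id    = c("II11", "II18", "IV10", "IV12", "IX4", "IX16"),
  value = c(2, 4, 1, 5, 10, 30),
  stringsAsFactors = FALSE
)

# Anchored pattern: a run of Roman-numeral letters followed by digits.
# Assumption: only I, V and X occur in the prefix.
pattern <- "^([IVX]+)([0-9]+)$"
stopifnot(all(grepl(pattern, dat$id)))  # fail early on unexpected ids

dat$roman  <- sub(pattern, "\\1", dat$id)              # e.g. "IV"
dat$number <- as.numeric(sub(pattern, "\\2", dat$id))  # e.g. 10

# Mean of `value` within each Roman-numeral group, returned as a data frame.
res <- aggregate(value ~ roman, data = dat, FUN = mean)
res
```

Anchoring the pattern and checking it up front means an id that does not fit the expected shape stops the script instead of silently landing in an odd group.
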