Hello,

I have a data.frame with a column that I would like to split into columns based on the delimiter ":". This is a useful feature in Excel, but I cannot work out the best way to do it in R. I am sure you need to use strsplit, but that returns a list. The problem is that some values in the column do not contain a ":", so they should have an NA in the second column of the result, and this makes a plain unlist a non-starter.

Any ideas?

Many thanks

Daniel Brewer

The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP.
Try the following:

strg <- c("123:abc", "qwe:789f", "abcde", "a:fd", "567")
sapply(strsplit(strg, ":"), function(x) {
    if (length(x) == 1) x <- c(x, NA)
    x
})

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: "Daniel Brewer" <daniel.brewer at icr.ac.uk>
To: <r-help at stat.math.ethz.ch>
Sent: Tuesday, March 04, 2008 2:54 PM
Subject: [R] Best way to strsplit a column
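One detail worth noting about the sapply() suggestion: it returns a matrix with one column per input string, so it has to be transposed before it lines up with the rows of a data.frame. The following is only a sketch of how that might look; the data.frame `df` and the column names `id`, `left` and `right` are made up for illustration and do not come from the thread.

```r
# Hypothetical data.frame; df, id, left and right are illustrative names only.
df <- data.frame(id = c("123:abc", "qwe:789f", "abcde", "a:fd", "567"),
                 stringsAsFactors = FALSE)

# Split each value on ":" and pad single-piece results with NA.
parts <- sapply(strsplit(df$id, ":"), function(x) {
  if (length(x) == 1) x <- c(x, NA)
  x
})

# sapply() returns a 2 x 5 matrix here (one column per input string),
# so transpose it to get one row per input before attaching the pieces.
parts <- t(parts)
df$left  <- parts[, 1]
df$right <- parts[, 2]
df
```

This assumes every value contains at most one ":"; values with more pieces would give vectors of unequal length and sapply() would then return a list rather than a matrix.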
Hi Daniel,

After using strsplit() you can call a user-written function to extend the length of each list element to a uniform value and then use do.call() with rbind. For instance,

txt <- "1:2
3:4
5
6:7"
x <- readLines(textConnection(txt))

# Pad every element with NA up to the longest length, then stack the rows
f <- function(x)
    do.call(rbind,
            lapply(x, function(a, n) c(a, rep(NA, n - length(a))),
                   n = max(sapply(x, length))))

f(strsplit(x, ":"))
#      [,1] [,2]
# [1,] "1"  "2"
# [2,] "3"  "4"
# [3,] "5"  NA
# [4,] "6"  "7"

Hope this helps,

ST
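The padding approach in f() is not tied to two pieces, because the number of columns is taken from the longest split. A small sketch of that, using a made-up input vector and a restated helper (the name `pad_split` is hypothetical, chosen only so the block runs on its own):

```r
# Same idea as f(): pad each split result with NA up to the longest length,
# then stack the padded vectors as rows of a character matrix.
pad_split <- function(pieces) {
  n <- max(sapply(pieces, length))   # the widest element sets the column count
  do.call(rbind, lapply(pieces, function(a) c(a, rep(NA, n - length(a)))))
}

x <- c("1:2:3", "4:5", "6")          # illustrative input with up to three pieces
m <- pad_split(strsplit(x, ":"))
m
```

The result is a 3 x 3 character matrix, with NA wherever a value had fewer pieces than the longest one.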
Or, using the same strg as in Dimitris Rizopoulos's reply (Tue, Mar 4, 2008 at 9:06 AM):

read.table(textConnection(strg), sep = ":", fill = TRUE, as.is = TRUE, na.strings = "")

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
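Since the original question was about a data.frame column rather than a bare vector, the read.table() idea might be applied along these lines. This is a sketch under assumptions: the data.frame `df` and the names `id`, `left` and `right` are hypothetical, and as.character() is used in case the column is stored as a factor.

```r
# Hypothetical data.frame; df, id, left and right are illustrative names only.
df <- data.frame(id = c("123:abc", "qwe:789f", "abcde", "a:fd", "567"),
                 stringsAsFactors = FALSE)

# Read the column as if it were a ":"-separated file.
# fill = TRUE pads short rows, and na.strings = "" turns the padding into NA.
con <- textConnection(as.character(df$id))
parts <- read.table(con, sep = ":", fill = TRUE, as.is = TRUE,
                    na.strings = "", col.names = c("left", "right"))
close(con)

# Attach the two new columns next to the original one.
df <- cbind(df, parts)
df
```

One thing to keep in mind with this route is that read.table() applies its usual parsing rules, so values containing quote characters or "#" would need the quote and comment.char arguments adjusted.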