downtowater
2012-Nov-29 14:43 UTC
[R] splitting a string by space except when contained within quotes
I've been trying to split a space-delimited string containing double-quoted fields in R for some time, but without success. An example of such a string is:

rainfall snowfall "Channel storage" "Rivulet storage"

It's important for us because these are column headings that must match the subsequent data.

Here is some code I've been trying:

    str <- 'rainfall snowfall "Channel storage" "Rivulet storage"'
    regex <- "[^\\s\"']+|\"([^\"]*)\""
    split <- strsplit(str, regex, perl=T)

What I would like is:

    [1] "rainfall" "snowfall" "Channel storage" "Rivulet storage"

but what I get is:

    [1] "" " " " " " "

The vector is the right length (which is encouraging), but of course the strings are empty or contain a single space. Any suggestions?

Thanks in advance!

--
View this message in context: http://r.789695.n4.nabble.com/splitting-a-string-by-space-except-when-contained-within-quotes-tp4651286.html
Sent from the R help mailing list archive at Nabble.com.
William Dunlap
2012-Nov-29 15:50 UTC
[R] splitting a string by space except when contained within quotes
Try using scan(quote='"', ...), as in the following:

    > str <- 'rainfall snowfall "Channel storage" "Rivulet storage"'
    > scan(text=str, what="", quote='"', quiet=TRUE)
    [1] "rainfall" "snowfall" "Channel storage" "Rivulet storage"

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
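For reuse, the scan() call above can be wrapped in a small function. This is only a sketch: the name split_quoted and its quote argument are made up for illustration and do not come from the thread. It assumes a single input string and whitespace as the only separator.

```r
# Hypothetical helper (not from the thread): split one string on whitespace,
# keeping double-quoted runs together as single fields.
split_quoted <- function(x, quote = "\"") {
  stopifnot(is.character(x), length(x) == 1L)
  # what = "" makes scan() return a character vector;
  # quote restricts the quoting characters to the ones supplied.
  scan(text = x, what = "", quote = quote, quiet = TRUE)
}

hdr <- 'rainfall snowfall "Channel storage" "Rivulet storage"'
fields <- split_quoted(hdr)
print(fields)
```

Passing quote explicitly, as in the reply above, limits quoting to double quotes; scan()'s default treats single quotes as quoting characters too.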
Gabor Grothendieck
2012-Nov-29 16:06 UTC
[R] splitting a string by space except when contained within quotes
On Thu, Nov 29, 2012 at 9:43 AM, downtowater <downtowater at yahoo.ca> wrote:
> The vector is the right length (which is encouraging) but of course the
> strings are empty or contain a single space. Any suggestions?

Try this:

    > scan(con <- textConnection(str), what = "")
    Read 4 items
    [1] "rainfall" "snowfall" "Channel storage" "Rivulet storage"
    > close(con)

email: ggrothendieck at gmail.com
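Since the original question says the headings must match the data that follows, it may be simpler to let read.table() parse the header and the data together, because it handles quoted fields in the same way as scan(). A minimal sketch, assuming the data are whitespace-separated; the two data rows below are invented purely for illustration.

```r
# Made-up sample: the header line from the thread plus two invented data rows.
txt <- paste(
  'rainfall snowfall "Channel storage" "Rivulet storage"',
  '1.2 0.0 10.5 3.1',
  '0.4 2.3 11.0 2.9',
  sep = "\n"
)

# check.names = FALSE keeps the spaces in the column names instead of
# converting them to syntactically valid names such as Channel.storage.
dat <- read.table(text = txt, header = TRUE, check.names = FALSE)
print(names(dat))
print(dat[["Channel storage"]])
```

Columns whose names contain spaces then have to be accessed with dat[["Channel storage"]] or with backticks.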
arun
2012-Nov-29 16:07 UTC
[R] splitting a string by space except when contained within quotes
Hi,

Maybe this helps:

    str1 <- 'rainfall snowfall "Channel storage" "Rivulet storage"'
    # Drop the quotes and split on spaces, then re-join the word pairs
    # (the positions are hard-coded for this particular string).
    res <- unlist(strsplit(gsub("[\"]", "", str1), " "))
    res1 <- c(res[1], res[2], paste(res[3], res[4]), paste(res[5], res[6]))
    res1
    #[1] "rainfall"        "snowfall"        "Channel storage" "Rivulet storage"

A.K.
arun
2012-Nov-29 16:23 UTC
[R] splitting a string by space except when contained within quotes
Hi,

You could also do this:

    res <- unlist(strsplit(str, "[\"]"))
    res1 <- res[res != " "]
    res2 <- c(unlist(strsplit(res1[grepl("\\s+$", res1)], " ")), res1[!grepl("\\s+$", res1)])
    res2
    #[1] "rainfall"        "snowfall"        "Channel storage" "Rivulet storage"

A.K.
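A regex can also be made to work without hard-coding positions. strsplit() treats every regex match as a delimiter, so a pattern that matches the tokens themselves leaves only the gaps between them, which is why the original attempt returned blanks. Extracting the matches with gregexpr() and regmatches() gives the tokens directly. The sketch below uses a simplified pattern and variable names chosen for illustration (they are not from the thread), and assumes the quotes are balanced with no escaped quotes inside a field.

```r
hdr <- 'rainfall snowfall "Channel storage" "Rivulet storage"'

# Match either a double-quoted run or a run of characters that are
# neither whitespace nor a double quote.
pattern <- "\"[^\"]*\"|[^\\s\"]+"

# Extract the matches instead of splitting on them.
tokens <- regmatches(hdr, gregexpr(pattern, hdr, perl = TRUE))[[1]]

# Strip the surrounding quotes from the quoted tokens.
tokens <- gsub("^\"|\"$", "", tokens)
print(tokens)
```

Unlike the index-based approaches, this keeps the tokens in their original order for any mix of quoted and unquoted fields.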