Hello,
Instead of replacing "record_start" with newlines, you can split the
string on it, then use 'gsub' to tidy up the pieces.
x <- readLines("test.txt")
x
y <- gsub("\"", "", x) # remove the double quotes
y <- unlist(strsplit(y, "record_start,")) # do the split, with comma
# remove leading blanks and trailing blanks/commas
y <- gsub("^[[:blank:]]|[[:blank:]]$|,[[:blank:]]$", "", y)
# see the result and save it to file
y
writeLines(text=y, con="result.txt")
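
The same steps can be tried without touching any files. What follows is only a minimal sketch, assuming the whole input is one long string like the example in the question. The name split_records is a hypothetical helper, not a base R function. As a small variation on the code above, it splits on the bare marker, trims runs of blanks and commas at both ends, and drops the empty piece that strsplit returns before the first "record_start".

```r
# Hypothetical helper (not part of base R): split one long line into records.
split_records <- function(x, marker = "record_start") {
  y <- gsub("\"", "", x, fixed = TRUE)               # remove the double quotes
  y <- unlist(strsplit(y, marker, fixed = TRUE))     # one piece per record
  y <- gsub("^[[:blank:],]+|[[:blank:],]+$", "", y)  # trim blanks/commas at both ends
  y[nzchar(y)]                                       # drop empty pieces
}

# Made-up sample line, shaped like the example in the question
x <- "\"record_start\", \"data item 1\", \"data item 2\", \"record_start\", \"data item 3\", \"data item 4\""
res <- split_records(x)
print(res)
```

Passing res to writeLines then gives one record per line, as in the code above.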
Hope this helps,
Rui Barradas
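
A side note on the attempt quoted below: "/n" is a slash followed by the letter n, while "\n" is the newline escape. Also, write.csv treats a character vector as a one-column table and quotes its values, whereas writeLines writes the text as is. A rough sketch of the replace-based route under those assumptions; the sample string and the temporary file are made up for illustration.

```r
# Made-up sample line, shaped like the example in the question
x <- "\"record_start\", \"data item 1\", \"data item 2\", \"record_start\", \"data item 3\", \"data item 4\""

n_slash <- nchar("/n")   # two characters: a slash and an "n"
n_esc   <- nchar("\n")   # one character: a newline

# Replace the quoted marker (plus the comma and blanks after it) with a newline
y <- gsub("\"record_start\",[[:blank:]]*", "\n", x)

out <- tempfile(fileext = ".csv")
writeLines(y, out)       # writes the text unchanged, no added quotes
z <- readLines(out)      # first element is empty: the marker started the line
print(z)
```

The empty first line corresponds to the "first incorrect" record the question says it can live with. The double quotes around the data items are kept here.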
On 25-06-2012 10:17, Thomas wrote:
> I have a comma separated data file with no carriage returns and what I'd
> like to do is
>
> 1. read the data as a block text
> 2. search for the string that starts each record "record_start", and
> replace this with a carriage return. Replace will do, or just add a
> carriage return before it. The string is the same for each record, but
> it is enclosed in double quote marks in the file.
> 3. Write the results out as a csv file.
>
> Let's say file.txt looked like this:
> "record_start", "data item 1", "data item 2", "record_start", "data item 3", "data item 4"
> and I wanted:
> ,"data item 1", "data item 2",
> "data item 3", "data item 4"
>
> text <- readLines("file.txt",encoding="UTF-8")
> text <- gsub("record_start", "/n", text)
> write.csv(text, "file2.csv")
>
> This gives me "/n" in the text file, enclosed in the double quotes that
> were there in the file around record_start already. Even if the double
> quotes weren't there though, I'm still not sure this would work. (BTW, I
> can live with the first incorrect comma in the output file because I can
> just remove it manually.)
> Can anyone suggest a solution?
> Thank you,
> Thomas Chesney
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help