Quicke, Donald L J
2006-Aug-19 12:54 UTC
[R] need to find (and distinguish types of) carriage returns in a file that is scanned using scan
Hope this is not too trivial.

I am reading a large file using scan. In one part of this file there is a chunk of text within which I need to know the positions of line breaks. But scan seems only

An example of the file is:

"
a 0 1 0
bftt 020
cftt T 1 R

a 0 1 2 1 2
b 0 1 2 2 2
c 0 10 00
"

So, precisely, I need to know where each carriage return is in the file once it has been scanned into R, so that I can then identify the text strings (i.e. a, bftt, cftt, a, b, c) that immediately follow the carriage return.

On a subsidiary matter, it would be very helpful if I could distinguish between Unix, DOS, and Mac carriage returns in the data file.

Thanks.

I should also note that the input file contains much other stuff and is not in the form of a table that can be read using read.table or another read function. Nor do I know beforehand how many elements there are in each line.

Donald

[[alternative HTML version deleted]]
Prof Brian Ripley
2006-Aug-19 15:16 UTC
[R] need to find (and distinguish types of) carriage returns in a file that is scanned using scan
On Sat, 19 Aug 2006, Quicke, Donald L J wrote:

> Hope this is not too trivial
> I am reading a large file using scan.

Why scan?

> In one part of this file there is a chunk of text within which i need to
> know the positions of line breaks. But scan seems only

only what?

> An example of the file is:
> "
> a 0 1 0
> bftt 020
> cftt T 1 R
>
> a 0 1 2 1 2
> b 0 1 2 2 2
> c 0 10 00
> "
>
> so precisely i need in the scanned file in R to know where each carriage
> return is in the file so that i can then identify the text strings (i.e.
> a, bftt, cftt, a, b, c ) that immediately follow the carriage return

Sounds like a job for readLines.

> On a subsidiary matter, it would be very helpful if i could distinguish
> between Unix, Dos, and Mac carriage returns in the data file

AFAIK there is only one type of carriage return character (ASCII code Ctrl-M). If you mean distinguishing between CRLF, LF and perhaps CR line endings, you need to read the file as raw bytes, since R's text mode treats all three equally as a line ending. But that can perfectly well be done using binary-mode connections.

> thanks
>
> i should note also, that the input file contains much other stuff and is
> not in the form of a table that can be read using read.table or other
> read version. Nor do i know beforehand how many elements there are in
> each line

Sounds like a job for connections ...

> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

PLEASE do as we ask.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
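
For readers of the archive, here is a minimal sketch of the two suggestions in the reply: readLines to pick up the string that starts each line, and a binary-mode connection to tell CRLF, LF and CR line endings apart. It is not code from the thread. The helper names line_labels and count_line_endings are hypothetical, and the sketch assumes the file fits in memory and that blank lines can be skipped.

```r
# Hypothetical helper (not from the thread): return the first
# whitespace-delimited token of every non-blank line.
# readLines accepts LF, CRLF or CR as the end-of-line marker.
line_labels <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines)]            # assumption: blank lines carry no label
  vapply(strsplit(lines, "[[:space:]]+"), `[`, character(1), 1)
}

# Hypothetical helper (not from the thread): read the file as raw bytes
# through a binary-mode connection and count each kind of line ending.
count_line_endings <- function(path) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  bytes <- readBin(con, what = "raw", n = file.size(path))
  n <- length(bytes)
  is_cr <- bytes == as.raw(0x0d)           # carriage return, Ctrl-M
  is_lf <- bytes == as.raw(0x0a)           # line feed
  next_is_lf <- c(is_lf[-1], FALSE)        # is the following byte an LF?
  prev_is_cr <- c(FALSE, is_cr[-n])        # is the preceding byte a CR?
  c(CRLF = sum(is_cr & next_is_lf),        # DOS/Windows endings
    LF   = sum(is_lf & !prev_is_cr),       # Unix endings
    CR   = sum(is_cr & !next_is_lf))       # old Mac endings
}

# Usage on a small file with mixed endings, written byte-for-byte.
tmp <- tempfile()
writeBin(charToRaw("a 0 1 0\r\nbftt 020\ncftt T 1 R\r\n"), tmp)
print(line_labels(tmp))
print(count_line_endings(tmp))
```

If the byte offsets of the line breaks are wanted rather than counts, which() applied to the same logical vectors gives them directly.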