Dear all,

I have an ASCII file where records are separated by a blank. I would like to read those data; however, only the data in rows 1, 3, 5, 7, ... are important; the other lines (2, 4, 6, 8, ...) contain no useful information for me.

So far I have used awk/gawk to do it:

    gawk '{if ((FNR % 2) != 0) {print $0}}' infile.txt > outfile.txt

What is the recommended way to accomplish this in R? Simply reading the whole file and deleting all the even-numbered lines is not straightforward, since these lines have different lengths (whereas lines 1, 3, 5, 7, ... all have the same length).

I 'RSiteSearched' for "read every second line from a file", but this search did not yield the desired result. Trying out the arguments nrows and skip of read.table() did not help either.

Maybe someone knows an easy way to do it from within R? -- of course not using system("gawk ....") :-)
If not, it does not matter too much, since I get the job done easily with awk.

Thanks,
Roland
Hi,

You can start by reading all lines from your file:

    lines <- readLines(pathtoyourfile)

then keep only the odd lines:

    oddlines <- lines[seq(1, length(lines), 2)]

You then have to split each line into fields, e.g.

    res <- strsplit(oddlines, split = "\t")

if you have tab as the field separator.

hth
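For the blank-separated file described in the question, the same idea can be sketched end to end as follows. This is only an illustration: the sample data and the names `raw`, `fields`, and `mat` are made up, `textConnection()` stands in for a real file path, and the conversion to a numeric matrix assumes that every kept line holds the same number of numeric fields.

```r
# Made-up sample: odd lines carry three blank-separated values,
# even lines are filler of varying length.
raw <- c("1 2 3",
         "ignore me",
         "4 5 6",
         "some other filler of different length",
         "7 8 9",
         "x")

con <- textConnection(raw)   # stand-in for a real file path
lines <- readLines(con)
close(con)

# Keep lines 1, 3, 5, ...
oddlines <- lines[seq(1, length(lines), by = 2)]

# Split on a single blank (use split = "\t" for tab-separated data)
fields <- strsplit(oddlines, split = " ", fixed = TRUE)

# Assumption: all kept lines have the same number of numeric fields,
# so they can be bound row-wise into a numeric matrix.
mat <- do.call(rbind, lapply(fields, as.numeric))
print(mat)
```

Note that `seq(1, length(lines), by = 2)` raises an error for an empty file; indexing with `seq_along(lines) %% 2 == 1` is one way to avoid that case.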
Use readLines and then just use the odd-numbered lines:

    x.in <- readLines(yourFile)
    x.in <- x.in[seq(1, length(x.in), 2)]  # every 2nd line

or, just to make sure, only delete blank lines:

    x.in <- x.in[!(x.in == "")]

--
Jim Holtman
Cincinnati, OH

What is the problem you are trying to solve?
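The two selections above give different results whenever the unwanted lines are not empty. The following sketch, using made-up data and the illustrative names `by_position` and `non_blank`, shows the difference; `nzchar()` is used here as an equivalent of the `!(x.in == "")` test.

```r
# Made-up sample: line 4 is unwanted but not empty.
x.in <- c("1 2 3", "", "4 5 6", "filler", "7 8 9", "")

# Position-based: keeps lines 1, 3, 5, ... regardless of content.
# The logical vector c(TRUE, FALSE) is recycled along x.in.
by_position <- x.in[c(TRUE, FALSE)]

# Content-based: drops only empty strings, so "filler" survives.
non_blank <- x.in[nzchar(x.in)]

print(by_position)
print(non_blank)
```

If the even-numbered lines can contain arbitrary text, as the original question suggests, the position-based selection is the safer of the two.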
Dear all,

I just realized that I forgot to write a final email for this thread and to thank you for your help. It seems that the recommended procedure in such circumstances has three steps:

1) readLines()
2) select the desired lines
3) strsplit()

Thanks Ferdinand, Jim, and Paul!

Roland
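As a variation on the three steps above, the selected lines can also be handed to read.table() instead of strsplit(), which yields a data frame directly. This is a sketch under assumptions, not something proposed in the thread: `read_odd_lines` is a hypothetical helper name, the demo file is made up, and it assumes an R version in which read.table() accepts the `text` argument.

```r
# Hypothetical helper (not from the thread): read every second line of a
# file, starting at line 1, and parse the kept lines as a table.
read_odd_lines <- function(path, sep = "") {
  lines <- readLines(path)                     # step 1: read all lines
  keep  <- lines[seq_along(lines) %% 2 == 1]   # step 2: lines 1, 3, 5, ...
  read.table(text = keep, sep = sep)           # step 3: parse the fields
}

# Made-up demo file with filler on the even-numbered lines
tmp <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "junk", "4 5 6", "more junk here", "7 8 9", "z"), tmp)
dat <- read_odd_lines(tmp)
unlink(tmp)
print(dat)
```

With the default `sep = ""`, read.table() splits on any whitespace, which covers blank-separated as well as tab-separated fields.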