Hi R-helpers,

I need to read some files that begin with a varying number of lines (I don't know how many lines to skip), and I would like to find a way to start reading the data.frame from the word "source".

ex:

djhsafk
asdfhkjash
shdfjkash
asfhjkash         #those lines contain numbers and words, I want to skip them but they have different sizes
asdfhjkash
asdfhjksa

source
tret 2
res 3

Can anybody help?
Thanks

--
View this message in context: http://r.789695.n4.nabble.com/How-to-scan-df-from-a-specific-word-tp3019841p3019841.html
Sent from the R help mailing list archive at Nabble.com.
Sorry, the explanation wasn't very good, so let me explain better.

I am writing a loop that reads and processes a different file on each pass of the same script. What I want to load into a variable each time is the data.frame that sits below the word "Source" in all of my files.

So I would like to find the word "Source" in the text file and read the table below it up to the next blank line (the file also contains more text below the data frame I want to read).

Here is an example of the file. I want the df to read from "Source" down to the blank line right above the words "Analysis of Variance".

 Notice: 37 singularities detected in design matrix.
   1 LogL=-2664.01     S2=  1.0000       8367 df   :   2 components constrained
   2 LogL=-2269.45     S2=  1.0000       8367 df
   3 LogL=-1698.47     S2=  1.0000       8367 df
   4 LogL=-1252.72     S2=  1.0000       8367 df
   5 LogL=-1013.52     S2=  1.0000       8367 df
   6 LogL=-957.409     S2=  1.0000       8367 df
   7 LogL=-944.252     S2=  1.0000       8367 df
   8 LogL=-939.976     S2=  1.0000       8367 df
   9 LogL=-938.908     S2=  1.0000       8367 df
  10 LogL=-938.798     S2=  1.0000       8367 df
  11 LogL=-938.795     S2=  1.0000       8367 df
  12 LogL=-938.795     S2=  1.0000       8367 df

 Source                Model  terms     Gamma     Component    Comp/SE   % C
 Residual               8383   8367
 at(type,1).Nfam          62     62   10.1131       10.1131       1.81   0 P
 at(type,2).Nfam          62     62   28.1153       28.1153       2.16   0 P
 rep.iblk                768    768   63.2919       63.2919      10.94   0 P
 at(type,1).Nfemale       44     44   29.9049       29.9049       2.93   0 P
 at(type,1).Nclone      2689   2689   109.560       109.560      12.66   0 P
 at(type,2).Nfemale       44     44   14.0305       14.0305       1.68   0 P
 Variance                  0      0   479.040       479.040      36.23   0 P
 Variance                  0      0   490.580       490.580      17.51   0 P
 Variance                  0      0   469.932       469.932      36.51   0 P
 Variance                  0      0   544.654       544.654      17.86   0 P

 Analysis of Variance              NumDF       F_inc
  27 mu                                1     5860.84
  12 culture                           1        0.07
  10 type                              1       29.59
  28 culture.rep                       6       14.06
  30 culture.rep.type                  7        2.17
  36 at(type,1).Nfam                  62 effects fitted
On Fri, Oct 29, 2010 at 6:34 PM, M.Ribeiro <mresendeufv at yahoo.com.br> wrote:
>
> Hi R-helpers,
>
> I need to read some file with different lines (I don't know the number of
> lines to skip) and I would like to find a way to start reading the
> data.frame from the word "source".
>
> ex:
>
> djhsafk
> asdfhkjash
> shdfjkash
> asfhjkash         #those lines contain numbers and words, I want to skip
> then but they have different sizes
> asdfhjkash
> asdfhjksa
>
> source
> tret 2
> res 3
>

Here is a one-line solution, but it does make use of an external utility, gawk. If you are using Linux you probably have it on your system already. You can also get gawk for Windows, or if you download Duncan's Rtools distribution it's included there too. gawk.exe is just a single file, so just make sure you put it somewhere on your PATH.

> read.table(pipe('gawk "/Analysis of Variance/ {exit}; /Source/ {i++}; i" myfile.dat'), header = TRUE, fill = TRUE)
               Source Model terms    Gamma Component Comp.SE X. C
1            Residual  8383  8367       NA        NA      NA NA
2     at(type,1).Nfam    62    62  10.1131   10.1131    1.81  0 P
3     at(type,2).Nfam    62    62  28.1153   28.1153    2.16  0 P
4            rep.iblk   768   768  63.2919   63.2919   10.94  0 P
5  at(type,1).Nfemale    44    44  29.9049   29.9049    2.93  0 P
6   at(type,1).Nclone  2689  2689 109.5600  109.5600   12.66  0 P
7  at(type,2).Nfemale    44    44  14.0305   14.0305    1.68  0 P
8            Variance     0     0 479.0400  479.0400   36.23  0 P
9            Variance     0     0 490.5800  490.5800   17.51  0 P
10           Variance     0     0 469.9320  469.9320   36.51  0 P
11           Variance     0     0 544.6540  544.6540   17.86  0 P

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
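
If installing gawk is not convenient, the same idea can probably be done in base R alone. What follows is only a sketch and not something posted in this thread: read_block is a hypothetical helper name, and it assumes the table starts on the first line that begins with "Source" (ignoring case and leading spaces) and ends at the next blank line, as in the example file above.

```r
# Hypothetical helper (not from the thread): read the table that starts at the
# first line matching `start_pattern` and stops at the next blank line.
read_block <- function(file, start_pattern = "^[[:space:]]*Source") {
  lines <- readLines(file)

  # Index of the header line; NA if the pattern never matches.
  start <- grep(start_pattern, lines, ignore.case = TRUE)[1]
  if (is.na(start)) stop("no line matching start_pattern in ", file)

  # First blank (or whitespace-only) line after the header closes the block;
  # if there is none, read to the end of the file.
  blank <- which(!nzchar(trimws(lines)))
  blank <- blank[blank > start]
  end <- if (length(blank) > 0) blank[1] - 1 else length(lines)

  # fill = TRUE pads short rows such as the "Residual" line with NA.
  read.table(text = lines[start:end], header = TRUE, fill = TRUE,
             stringsAsFactors = FALSE)
}

# Possible use inside a loop over several files (file names are made up):
# files  <- c("run1.txt", "run2.txt")
# tables <- lapply(files, read_block)
```

Unlike the gawk version, this stops at the first blank line after "Source" and does not rely on the text "Analysis of Variance" being there, so it should also cope with files where a different section follows the table.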