Hi all,

On my C: drive I have possibly 1,000 Notepad files with the .txt extension. They are named after the dates on which they were saved, e.g. the 1st file is "Volume_4-18-2008", the 2nd is "Volume_4-21-2008", the 3rd is "Volume_4-22-2008", and so on.

The contents of every file are in the same format:

******** content of 1st file *************
section : 1
----- --------- ---------- -----------
----- --------- ---------- -----------
----- --------- ---------- -----------
----- --------- ---------- -----------
section : 2
----- --------- ---------- -----------
----- --------- ---------- -----------
----- --------- ---------- -----------
----- --------- ---------- -----------
section : 3
----- --------- ---------- -----------
----- --------- ---------- -----------
----- --------- ---------- -----------
----- --------- ---------- -----------
section : 4
----- --------- ---------- -----------
----- --------- ---------- -----------
----- --------- ---------- -----------
----- --------- ---------- -----------

All files have 4 sections, as shown here, but the contents within each section (represented here by the dashed lines) differ from file to file.

What I have to do is fetch the contents of "section : 2" from each file and save them in an R object (a matrix or a list) for further analysis.

Can you please tell me how to do that?

Thanks and regards,
--
View this message in context: http://www.nabble.com/How-to-fetch-specific-part-from-a-number-of-Text-files--tp21011017p21011017.html
Sent from the R help mailing list archive at Nabble.com.
Charles C. Berry
2008-Dec-15 18:37 UTC
[R] How to fetch specific part from a number of Text files?
On Mon, 15 Dec 2008, megh wrote:

> What I have to do is I have to fetch contents of "section : 2" from each
> file and then save it to a R-object, matrix of list for further analysis.
>
> Can you ppl please tell me how to do that?

Here is the outline:

*) use list.files() or Sys.glob() to get a list of the files

*) write a function that takes the file name as its arg, uses readLines() to swallow the text, and uses grep() to find the 'section' lines. Then put the 'dashes' in between two section lines into a separate object (say, dash.lines). Then use

    as.matrix( read.table( con <- textConnection( dash.lines ) ) )
    close(con)

to get the numeric values, or maybe

    sapply( strsplit(dash.lines, "[ ]+"), as.numeric )

(this returns one column per line of text, so wrap it in t() if you want one row per line)

*) debug this on one file

*) use lapply() to step thru the list of file names.

See

    ?list.files
    ?Sys.glob
    ?readLines
    ?grep
    ?textConnection
    ?strsplit
    ?sapply

HTH,

Chuck

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
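The outline above could be sketched roughly as follows. This is only one possible reading of it: the function name read_section, the regular expressions for the header lines, the "Volume_" file pattern, and the assumption that the rows of a section are whitespace-separated numbers are all illustrative guesses, not details confirmed in the thread.

```r
# Hypothetical helper: return the rows that follow "section : <section>" and
# precede the next section header, as a matrix.
# Assumes the rows are whitespace-separated numbers.
read_section <- function(file, section = 2) {
  lines <- readLines(file)

  # Line numbers of every section header, and of the one we want.
  starts <- grep("^[[:space:]]*section[[:space:]]*:", lines)
  this <- grep(paste0("^[[:space:]]*section[[:space:]]*:[[:space:]]*",
                      section, "[[:space:]]*$"), lines)
  stopifnot(length(this) == 1)

  # The section ends just before the next header, or at end of file.
  following <- starts[starts > this]
  end <- if (length(following) > 0) following[1] - 1 else length(lines)

  dash.lines <- lines[this + seq_len(end - this)]
  dash.lines <- dash.lines[nzchar(trimws(dash.lines))]  # drop blank lines

  as.matrix(read.table(text = dash.lines))
}

# Step through all matching files in the working directory
# (the file name pattern is an assumption based on the names in the question).
files <- list.files(pattern = "^Volume_.*\\.txt$")
sec2 <- lapply(files, read_section)
names(sec2) <- files
```

Here sec2 is a list with one matrix per file, named by file name. Searching for the header lines rather than hard-coding line numbers means the function should still work if sections differ in length between files.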
Augusto.Sanabria at ga.gov.au
2008-Dec-15 23:44 UTC
[R] How to fetch specific part from a number of Text files? [SEC=UNCLASSIFIED]
Megh,

You can capture the names of all your external files in R using:

    All_files <- dir(pattern="txt")

Then read the files one by one and store the contents from section 2 (starting, say, at line 10) up to section 3 (say, at line 40) in the list "cont". Adjust skip (the number of lines to pass over) and nrows (the number of rows to read) to match your files:

    no_files <- length(All_files)
    cont <- vector("list",no_files)
    for(i in 1:no_files) cont[[i]] <- read.csv(All_files[i],skip=10,nrows=40)

Now the elements of "cont", one per file, contain 'section 2' of all your external files.

This is an effective but not very elegant way to do what you want.

Hope it helps,

Augusto

--------------------------------------------
Augusto Sanabria. MSc, PhD.
Mathematical Modeller
Risk & Impact Analysis Group
Geospatial & Earth Monitoring Division
Geoscience Australia (www.ga.gov.au)
Cnr. Jerrabomberra Av. & Hindmarsh Dr.
Symonston ACT 2601
Ph. (02) 6249-9155
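If every file really does have the same number of lines per section, the fixed-offset idea can be written with read.table(), which splits on whitespace by default and can be told not to treat the first row as a header. The sketch below is illustrative only: the function name read_fixed_section, the argument names, and the demonstration data are assumptions, and the real line numbers have to be checked against the actual files.

```r
# Hypothetical layout: the header of the wanted section sits on line
# `header_line`, and exactly `n_rows` data rows follow it in every file.
read_fixed_section <- function(file, header_line, n_rows) {
  # skip = header_line passes over everything up to and including the header.
  as.matrix(read.table(file, skip = header_line, nrows = n_rows,
                       header = FALSE))
}

# Small self-contained demonstration on a temporary file (made-up data).
demo <- tempfile(fileext = ".txt")
writeLines(c("section : 1", "1 2 3", "4 5 6",
             "section : 2", "7 8 9", "10 11 12",
             "section : 3", "13 14 15"), demo)

sec2 <- read_fixed_section(demo, header_line = 4, n_rows = 2)
```

This is shorter than searching for the header lines, but it silently returns the wrong rows if even one file has an extra or missing line, so it is worth checking the result on a few files first.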