I would put the data into a 'long' format instead of a 'wide' one, since you
say you have files of different lengths. I took your data, replicated it
3 times, and changed the file name for each duration:
> fileNames <- Sys.glob('/da_zone*') # files to process
> result <- lapply(fileNames, function(.file){
+ # read in data after skipping 11 lines
+ .input <- read.csv(.file, skip=11)
+ # extract the duration from file name
+ .dur <- sub(".*_([[:digit:]]+)hr_.*", "\\1", .file, perl=TRUE)
+ # add to the data frame
+ .input$dur <- .dur
+ .input
+ })
> # put into a single data.frame
> do.call(rbind, result)
avgppt areasqmi dur
1 7.67 0 15
2 7.60 1 15
3 7.52 5 15
4 7.32 10 15
5 6.91 20 15
6 5.90 50 15
7 5.02 100 15
8 4.09 200 15
9 3.55 300 15
10 2.96 500 15
11 2.27 1000 15
12 1.64 2000 15
13 0.82 5000 15
14 0.77 5360 15
15 7.67 0 1
16 7.60 1 1
17 7.52 5 1
18 7.32 10 1
19 6.91 20 1
20 5.90 50 1
21 5.02 100 1
22 4.09 200 1
23 3.55 300 1
24 2.96 500 1
25 2.27 1000 1
26 1.64 2000 1
27 0.82 5000 1
28 0.77 5360 1
29 7.67 0 3
30 7.60 1 3
31 7.52 5 3
32 7.32 10 3
33 6.91 20 3
34 5.90 50 3
35 5.02 100 3
36 4.09 200 3
37 3.55 300 3
38 2.96 500 3
39 2.27 1000 3
40 1.64 2000 3
41 0.82 5000 3
42 0.77 5360 3
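
Once everything is in one long data frame, per-duration summaries and plots no longer depend on the files having the same number of records. Here is a minimal sketch of that idea, not a finished solution. The file names are made up to follow your naming pattern, the nine data values are the first three rows of each duration from your example table, and the names files, long, maxppt, byDur and wide are only for illustration:

```r
# Hypothetical file names that follow the pattern in the question
files <- c("da_zone1_1hr_1166.stdsummary",
           "da_zone1_3hr_1166.stdsummary",
           "da_zone1_15hr_1166.stdsummary")

# Same regular expression as in the reply; as.integer() makes the
# durations sort as numbers (1, 3, 15) instead of as text
dur <- as.integer(sub(".*_([[:digit:]]+)hr_.*", "\\1", files, perl = TRUE))

# A small long-format data frame: one row per (duration, area) pair
long <- data.frame(
  avgppt   = c(3.80, 3.71, 3.69, 6.86, 6.78, 6.72, 7.67, 7.60, 7.52),
  areasqmi = rep(c(0, 1, 5), times = 3),
  dur      = rep(dur, each = 3)
)

# Per-duration summary statistic (here the maximum precipitation)
maxppt <- aggregate(avgppt ~ dur, data = long, FUN = max)

# One data frame per duration, e.g. for plotting each set separately
byDur <- split(long, long$dur)

# Side-by-side layout: one row per area, one avgppt column per duration
wide <- reshape(long, idvar = "areasqmi", timevar = "dur",
                direction = "wide")
```

If you still want the side-by-side layout, reshape() fills any area/duration combination that is missing with NA, so files of unequal length are not a problem.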
On Tue, Sep 1, 2009 at 4:24 PM, Douglas M. Hultstrand <dmhultst at metstat.com> wrote:
> Hello,
>
> I am fairly new to R programming and am stuck with the following problem.
>
> I am trying to read in multiple files (see attached file or at end of
> email). The files all have the same general header information and different
> precipitation (avgppt) and area (areasqmi) values. Sometimes the number of
> records is different in the files.
>
> I want to read in all files (.stdsummary), and create a dataframe that
> contains the area and precipitation for each file (files are different
> duration), and supply a header name that represents the duration (sixth line
> down in header information or extracted from data file
> "da_zone1_15hr_1166.stdsummary").
> For example, this is what the final dataframe would look like for 1hr, 3hr,
> and 15hr datafiles:
> 1hrppt  1hrarea  3hrppt  3hrarea  15hrppt  15hrarea
> 3.8     0        6.86    0        7.67     0
> 3.71    1        6.78    1        7.6      1
> 3.69    5        6.72    5        7.52     5
> 3.56    10       6.55    10       7.32     10
> 3.33    20       6.17    20       6.91     20
> 2.87    50       5.25    50       5.9      50
> 2.45    100      4.35    100      5.02     100
> 1.94    200      3.34    200      4.09     200
> 1.67    300      2.78    300      3.55     300
>
> The end result is to perform QC statistics and then plot each set of data.
> Also, is there a way to create a dataframe that has different # of records?
>
> Datafile example of file below:
>
> Storm number: 1166
> Zone number: 1 (ALL zones)
> Number of stations: 172
> Total analyzed area (sq mi): 5360.8
> Average station density (stns per 1000 sq mi): na
> Duration window (hours): 15
> CPP beg hour index: 1
> CPP end hour index: 15
> Ishohyet interval step (inches): 0.2
> Standard area size summary
> Begin run date/time: Tue Aug 25 01:17:43 2009
> avgppt, areasqmi
> 00007.67,0000000.00
> 00007.60,0000001.00
> 00007.52,0000005.00
> 00007.32,0000010.00
> 00006.91,0000020.00
> 00005.90,0000050.00
> 00005.02,0000100.00
> 00004.09,0000200.00
> 00003.55,0000300.00
> 00002.96,0000500.00
> 00002.27,0001000.00
> 00001.64,0002000.00
> 00000.82,0005000.00
> 00000.77,0005360.00
>
> --
> ---------------------------------
> Douglas M. Hultstrand, MS
> Senior Hydrometeorologist
> Metstat, Inc. Windsor, Colorado
> voice: 970.686.1253
> email: dmhultst at metstat.com
> web: http://www.metstat.com
> ---------------------------------
>
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?