thr3ads.net - R help - [R] Parsing Files in R (USGS StreamFlow data) [Oct 2009]

stephen sefick

2009-Oct-05 01:49 UTC

[R] Parsing Files in R (USGS StreamFlow data)

http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269

I would like to be able to parse this file up:

I can do this:

x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",
                skip=26)

but if I add another gauge to this:

x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269,06018500",
                skip=26)

it does not work, because there are two files appended to each other.

It would be easy enough to write the code so that each individual
gauge would be read in as a different file, but is there a way to get
this information in using the commented part of the file to give the
headers?  This is probably a job for some other programming language
like Perl, but I don't know Perl.

Any help would be much appreciated.
regards,

-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

								-K. Mullis

Gabor Grothendieck

2009-Oct-05 02:14 UTC

It's not completely clear what you want to preserve and what you want
to eliminate, but try this:

> L <- readLines("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269,06018500")
> L.USGS <- grep("^USGS", L, value = TRUE)
> DF <- read.table(textConnection(L.USGS), fill = TRUE)
> head(DF)
    V1       V2         V3    V4   V5   V6   V7
1 USGS 21973269 2009-09-27 00:00 6.96 4990 0.00
2 USGS 21973269 2009-09-27 00:15 6.96 4990 0.00
3 USGS 21973269 2009-09-27 00:30 6.97 5000 0.01
4 USGS 21973269 2009-09-27 00:45 6.97 5000 0.00
5 USGS 21973269 2009-09-27 01:00 6.98 5010 0.00
6 USGS 21973269 2009-09-27 01:15 6.98 5010 0.00
> pat <- "^# +([0-9]+) +([0-9]+) +(.*)"
> L.DD <- grep(pat, L, value = TRUE)
> library(gsubfn)
> DD <- strapply(L.DD, pat, c, simplify = rbind)
> head(DD)
     [,1] [,2]    [,3]
[1,] "01" "00065" "Gage height, feet"
[2,] "02" "00060" "Discharge, cubic feet per second"
[3,] "03" "00045" "Precipitation, total, inches"
[4,] "02" "00065" "Gage height, feet"
[5,] "05" "00060" "Discharge, cubic feet per second"
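
A possible extension of the idea above, as a rough sketch only: keep each gauge in its own data frame and take the column names from the header row that precedes each block. It runs on a small hand-made sample rather than the live URL. The sample lines, the assumption that every block has a tab-separated header row starting with "agency_cd", and the helper name read_block are all illustrative, not taken from the USGS file.

```r
# Hypothetical sample imitating two appended RDB blocks (tab-separated).
# The values and header names are made up for illustration.
L <- c(
  "# Data provided for site 021973269",
  "#    01   00065     Gage height, feet",
  "#    02   00060     Discharge, cubic feet per second",
  "agency_cd\tsite_no\tdatetime\t01_00065\t02_00060",
  "5s\t15s\t16s\t14s\t14s",
  "USGS\t021973269\t2009-09-27 00:00\t6.96\t4990",
  "USGS\t021973269\t2009-09-27 00:15\t6.96\t4990",
  "# Data provided for site 06018500",
  "#    02   00065     Gage height, feet",
  "#    05   00060     Discharge, cubic feet per second",
  "agency_cd\tsite_no\tdatetime\t02_00065\t05_00060",
  "5s\t15s\t16s\t14s\t14s",
  "USGS\t06018500\t2009-09-27 00:00\t2.10\t150"
)

# Assumption: a header row starts with "agency_cd" and opens a new block.
is_header <- grepl("^agency_cd", L)
block_id  <- cumsum(is_header)   # 0 before the first header, then 1, 2, ...
is_data   <- grepl("^USGS", L)   # same filter as in the transcript above

# read_block is a hypothetical helper: it reads the data rows of block i
# and names the columns from that block's own header row.
read_block <- function(i) {
  hdr  <- strsplit(L[is_header][i], "\t", fixed = TRUE)[[1]]
  rows <- L[is_data & block_id == i]
  read.table(text = rows, sep = "\t", col.names = hdr,
             colClasses = c(site_no = "character"),  # keep leading zeros
             check.names = FALSE, stringsAsFactors = FALSE)
}

gauges <- lapply(seq_len(sum(is_header)), read_block)
names(gauges) <- vapply(gauges, function(d) d$site_no[1], character(1))
str(gauges)
```

Reading site_no as character keeps the leading zero that read.table drops in the output above (21973269 instead of 021973269). With the real file, L would come from readLines() on the URL as before.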


Rolf Turner

2009-Oct-05 02:20 UTC

On 5/10/2009, at 2:49 PM, stephen sefick wrote:
> http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269
>
> I would like to be able to parse this file up:
>
> I can do this
> x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",
> skip=26)
>
> but If I add another gauge to this
>
> x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269,06018500",
> skip=26)
> It does not work because there are two files appended to each other.
>
> It would be easy enough to write the code so that each individual
> gauge would be read in as a different file, but is there a way to get
> this information in using the commented part of the file to give the
> headers?  This is probably a job for some other programing language
> like perl, but I don't know perl.
>
> any help would be very helpful.

I'm not completely clear what's going on here: (a) I'm not sure what
you mean by ``using the commented part of the file to give the
headers''; the headers are not ``commented'', and (b) I'm puzzled by
the fact that there are 9 column headers/field names, but only 7
columns/fields.

Be that as it may, here's what I'd do:

x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",
                skip=23,nrows=1,header=TRUE,check.names=FALSE)
y <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",
                skip=26)
names(y) <- names(x)[1:7]

This ***appears*** to give a reasonably sensible data frame.

Is this anything like what you want?

	cheers,

		Rolf Turner
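
One possible explanation for the 9-versus-7 mismatch, offered as an assumption rather than something confirmed in this thread: if the file is tab-delimited, the date and time form a single field and some fields are empty, so splitting on any whitespace (the read.table default) would not line up with the header row. The sketch below shows the effect on a hand-made sample; the header names and values are hypothetical.

```r
# Hypothetical sample: one header row and one data row, tab-separated,
# with empty qualifier ("_cd") fields. The names are assumed for
# illustration, not copied from the USGS file.
txt <- c(
  paste("agency_cd", "site_no", "datetime",
        "01_00065", "01_00065_cd",
        "02_00060", "02_00060_cd",
        "03_00045", "03_00045_cd", sep = "\t"),
  "USGS\t021973269\t2009-09-27 00:00\t6.96\t\t4990\t\t0.00\t"
)

# Default sep = "" splits on any run of whitespace: the date and time
# become two columns and the empty fields vanish.
by_space <- read.table(text = txt[2])

# sep = "\t" keeps one column per tab-separated field, empty or not.
by_tab <- read.table(text = txt, sep = "\t", header = TRUE,
                     check.names = FALSE,
                     colClasses = c(site_no = "character"),
                     stringsAsFactors = FALSE)

ncol(by_space)  # 7
ncol(by_tab)    # 9
names(by_tab)
```

If that assumption holds for the real file, passing sep="\t" to read.table would keep headers and columns aligned, so the names would not need trimming to the first seven.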

