http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269

I would like to be able to parse this file.

I can do this:

x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",
                skip=26)

but if I add another gauge:

x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269,06018500",
                skip=26)

it does not work, because there are two files appended to each other.

It would be easy enough to write the code so that each individual gauge is read in as a different file, but is there a way to get this information in using the commented part of the file to give the headers? This is probably a job for some other programming language like Perl, but I don't know Perl.

Any help would be greatly appreciated.

regards,

--
Stephen Sefick

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals.

-K. Mullis
It's not completely clear what you want to preserve and what you want to eliminate, but try this:

> L <- readLines("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269,06018500")
> L.USGS <- grep("^USGS", L, value = TRUE)
> DF <- read.table(textConnection(L.USGS), fill = TRUE)
> head(DF)
    V1       V2         V3    V4   V5   V6   V7
1 USGS 21973269 2009-09-27 00:00 6.96 4990 0.00
2 USGS 21973269 2009-09-27 00:15 6.96 4990 0.00
3 USGS 21973269 2009-09-27 00:30 6.97 5000 0.01
4 USGS 21973269 2009-09-27 00:45 6.97 5000 0.00
5 USGS 21973269 2009-09-27 01:00 6.98 5010 0.00
6 USGS 21973269 2009-09-27 01:15 6.98 5010 0.00

> pat <- "^# +([0-9]+) +([0-9]+) +(.*)"
> L.DD <- grep(pat, L, value = TRUE)
> library(gsubfn)
> DD <- strapply(L.DD, pat, c, simplify = rbind)
> head(DD)
     [,1] [,2]    [,3]
[1,] "01" "00065" "Gage height, feet"
[2,] "02" "00060" "Discharge, cubic feet per second"
[3,] "03" "00045" "Precipitation, total, inches"
[4,] "02" "00065" "Gage height, feet"
[5,] "05" "00060" "Discharge, cubic feet per second"

On Sun, Oct 4, 2009 at 9:49 PM, stephen sefick <ssefick at gmail.com> wrote:
> [...] is there a way to get this information in using the commented
> part of the file to give the headers?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
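The readLines/grep approach above yields one table of data rows and one table of parameter descriptions, but does not tie them together per gauge. A minimal base-R sketch of one way to do that follows. The sample text is made up and only shaped like the output quoted in this thread, and the rule that a description line belongs to the next data row below it is an assumption about the file layout, not something the USGS documents here.

```r
# Made-up sample, shaped like the lines quoted in this thread.
# The real file layout is an assumption.
L <- c(
  "#    DD parameter   Description",
  "#    01   00065     Gage height, feet",
  "#    02   00060     Discharge, cubic feet per second",
  "USGS 021973269 2009-09-27 00:00 6.96 4990",
  "USGS 021973269 2009-09-27 00:15 6.96 4990",
  "#    DD parameter   Description",
  "#    02   00065     Gage height, feet",
  "#    05   00060     Discharge, cubic feet per second",
  "USGS 06018500 2009-09-27 00:00 1.20 310",
  "USGS 06018500 2009-09-27 00:15 1.21 312"
)

# Data rows, read as character so leading zeros in site numbers survive.
DF <- read.table(text = grep("^USGS", L, value = TRUE), fill = TRUE,
                 colClasses = "character")

# Parameter description lines (same pattern as in the reply above).
pat <- "^# +([0-9]+) +([0-9]+) +(.*)$"
dd_idx <- grep(pat, L)
data_idx <- grep("^USGS", L)

# Assumption: a description line belongs to the first data row below it.
site_of_row <- sub("^USGS[ \t]+([0-9]+).*", "\\1", L[data_idx])
next_data <- sapply(dd_idx, function(i) min(data_idx[data_idx > i]))

DD <- data.frame(
  site        = site_of_row[match(next_data, data_idx)],
  dd          = sub(pat, "\\1", L[dd_idx]),
  parameter   = sub(pat, "\\2", L[dd_idx]),
  description = sub(pat, "\\3", L[dd_idx]),
  stringsAsFactors = FALSE
)

# One data frame per gauge, with the descriptions as column names.
by_site <- lapply(split(DF, DF$V2), function(d) {
  labs <- DD$description[DD$site == d$V2[1]]
  names(d) <- c("agency", "site_no", "date", "time", labs)
  d[labs] <- lapply(d[labs], as.numeric)
  d
})
```

With this, by_site[["06018500"]] is a data frame whose measurement columns are named after the commented descriptions. The sketch assumes every gauge reports as many value columns as it has description lines; if the real data rows carry extra fields, the names() assignment will fail and the mapping needs adjusting.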
On 5/10/2009, at 2:49 PM, stephen sefick wrote:
> [...] is there a way to get this information in using the commented
> part of the file to give the headers?

I'm not completely clear on what's going on here:

(a) I'm not sure what you mean by ``using the commented part of the file to give the headers''; the headers are not ``commented'', and

(b) I'm puzzled by the fact that there are 9 column headers/field names, but only 7 columns/fields.

Be that as it may, here's what I'd do:

x <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",
                skip=23,nrows=1,header=TRUE,check.names=FALSE)
y <- read.table("http://waterdata.usgs.gov/nwis/uv?format=rdb&period=7&site_no=021973269",skip=26)
names(y) <- names(x)[1:7]

This ***appears*** to give a reasonably sensible data frame. Is this anything like what you want?

cheers,

Rolf Turner
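One possible explanation for the header/field mismatch in (b), offered as an assumption rather than a checked fact about this particular file: RDB files are generally tab-delimited, the datetime field contains a space, and the qualifier columns can be empty, so splitting on whitespace gives a different field count than splitting on tabs. The sketch below uses a made-up tab-delimited sample with hypothetical header names to show the effect.

```r
# Made-up tab-delimited sample; the header names are hypothetical.
txt <- c(
  "# some comment line",
  "agency_cd\tsite_no\tdatetime\t01_00065\t01_00065_cd\t02_00060\t02_00060_cd",
  "5s\t15s\t16s\t14s\t14s\t14s\t14s",
  "USGS\t021973269\t2009-09-27 00:00\t6.96\t\t4990\t",
  "USGS\t021973269\t2009-09-27 00:15\t6.96\t\t4990\t"
)

# Drop comment lines, then drop the format row that follows the header.
body <- txt[!grepl("^#", txt)]
body <- body[-2]

# Splitting the first data row on whitespace: the date and time become
# separate fields and the empty qualifier fields vanish.
n_ws <- length(scan(text = body[2], what = "", quiet = TRUE))

# Splitting on tabs keeps one field per header.
y <- read.table(text = body, sep = "\t", header = TRUE,
                check.names = FALSE, colClasses = "character")
y[["01_00065"]] <- as.numeric(y[["01_00065"]])
y[["02_00060"]] <- as.numeric(y[["02_00060"]])
```

Here n_ws counts the whitespace-separated fields in a data row, while ncol(y) matches the number of header names. If the real file behaves the same way, reading it with sep="\t" would avoid the need to truncate the names to the first 7.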