Miluji Sb
2017-Oct-15 21:02 UTC
[R] Download data from NASA for multiple locations - RCurl
Dear all,

I am trying to download time-series climatic data from the GES DISC (NASA) Hydrology Data Rods web service. Unfortunately, no wget method is available.

Five parameters are needed for data retrieval: variable, location, startDate, endDate, and type. For example:

###
https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(-71.06,%2042.36)&type=asc2
###

In this case, variable: Tair_f_inst (temperature); location: (-71.06, 42.36); startDate: 01 January 1970; endDate: 31 December 1979; type: asc2 (output 2-column ASCII).

I am trying to download data for 100 US cities, for which I have the coordinates in the following data.frame (first five rows shown via dput):

###
cities <- structure(list(city = structure(1:5, .Label = c("Boston", "Bridgeport",
    "Cambridge", "Fall River", "Hartford"), class = "factor"),
    state = structure(c(2L, 1L, 2L, 2L, 1L), .Label = c(" CT ", " MA "), class = "factor"),
    lon = c(-71.06, -73.19, -71.11, -71.16, -72.67),
    lat = c(42.36, 41.18, 42.37, 41.7, 41.77)),
    .Names = c("city", "state", "lon", "lat"),
    row.names = c(NA, 5L), class = "data.frame")
###

Is it possible to download the data for the multiple locations automatically (e.g. with RCurl) and save them as csv? Essentially, reading the coordinates from the data.frame and inserting them into the URL. I would also like to add identifying information from the cities data.frame to each of the data files. I have been doing the following for a single file:

###
x <- readLines(con = url("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(-71.06,%2042.36)&type=asc2"))
x <- x[-(1:13)]

mydata <- data.frame(year  = substr(x, 1, 4),
                     month = substr(x, 6, 7),
                     day   = substr(x, 9, 10),
                     hour  = substr(x, 12, 13),
                     temp  = substr(x, 21, 27))

mydata$city  <- rep(cities[1, 1], nrow(mydata))
mydata$state <- rep(cities[1, 2], nrow(mydata))
mydata$lon   <- rep(cities[1, 3], nrow(mydata))
mydata$lat   <- rep(cities[1, 4], nrow(mydata))
###

Help and advice would be greatly appreciated. Thank you!

Sincerely,

Milu
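For illustration, a minimal sketch of the URL construction being asked about, assuming the cities data.frame shown above; the sprintf template and the object names base_url and city_url are illustrative and not part of the original post:

###
## Build the request URL for one row of `cities`; only lon/lat change per city.
## "%%20" in the sprintf template becomes the "%20" (encoded space) seen in the example URL.
base_url <- paste0("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi",
                   "?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst",
                   "&startDate=1970-01-01T00&endDate=1979-12-31T00",
                   "&location=GEOM:POINT(%s,%%20%s)&type=asc2")
city_url <- sprintf(base_url, cities$lon[1], cities$lat[1])  # first city (Boston)
x <- readLines(url(city_url))                                # same readLines step as in the post above
###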
David Winsemius
2017-Oct-15 21:45 UTC
[R] Download data from NASA for multiple locations - RCurl
> On Oct 15, 2017, at 2:02 PM, Miluji Sb <milujisb at gmail.com> wrote:
>
> Is it possible to download the data for the multiple locations
> automatically (e.g. RCurl) and save them as csv? Essentially, reading
> coordinates from the data.frame and entering it in the URL.
>
> I would also like to add identifying information to each of the data files
> from the cities data.frame. I have been doing the following for a single
> file:

Didn't seem that difficult:

library(downloader) # makes things easier for Macs, perhaps not needed;
                    # if not used, you will need to use download.file

for (i in 1:5) {
  target1 <- paste0("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(",
                    cities[i, "lon"],
                    ",%20", cities[i, "lat"],
                    ")&type=asc2")
  target2 <- paste0("~/",   # change for whatever destination directory you may prefer
                    cities[i, "city"],
                    cities[i, "state"], ".asc")
  download(url = target1, destfile = target2)
}

Now I have 5 named files with the extension ".asc" in my user directory (since I'm on a Mac). It is a slow website, so patience is needed.

-- 
David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.' -Gehm's Corollary to Clarke's Third Law
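As the comment in the loop notes, base R's download.file can stand in for downloader::download; a hedged sketch of that variant follows (the method = "libcurl" argument is an assumption for https support and may not be needed on all platforms):

###
## Base-R-only variant of the loop above: download.file instead of downloader::download.
for (i in seq_len(nrow(cities))) {
  target1 <- paste0("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(",
                    cities[i, "lon"], ",%20", cities[i, "lat"], ")&type=asc2")
  target2 <- paste0("~/", cities[i, "city"], cities[i, "state"], ".asc")
  download.file(url = target1, destfile = target2, method = "libcurl")  # method is an assumption
}
###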
Miluji Sb
2017-Oct-15 22:35 UTC
[R] Download data from NASA for multiple locations - RCurl
Dear David,

This is amazing, thank you so much. If I may ask another question: the output looks like the following:

###
dput(head(x, 15))
c("Metadata for Requested Time Series:", "", "prod_name=GLDAS_NOAH025_3H_v2.0",
"param_short_name=Tair_f_inst", "param_name=Near surface air temperature",
"unit=K", "begin_time=1970-01-01T00", "end_time=1979-12-31T21",
"lat= 42.36", "lon=-71.06", "Request_time=2017-10-15 22:20:03 GMT",
"", "Date&Time Data", "1970-01-01T00:00:00\t267.769", "1970-01-01T03:00:00\t264.595")
###

Thus I need to drop the first 13 rows and do the following to add the identifying information:

###
mydata <- data.frame(year  = substr(x, 1, 4),
                     month = substr(x, 6, 7),
                     day   = substr(x, 9, 10),
                     hour  = substr(x, 12, 13),
                     temp  = substr(x, 21, 27))

mydata$city  <- rep(cities[1, 1], nrow(mydata))
mydata$state <- rep(cities[1, 2], nrow(mydata))
mydata$lon   <- rep(cities[1, 3], nrow(mydata))
mydata$lat   <- rep(cities[1, 4], nrow(mydata))
###

Is it possible to incorporate these into your code so the data looks like this?

###
dput(droplevels(head(mydata)))
structure(list(year = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "1970", class = "factor"),
    month = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "01", class = "factor"),
    day = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "01", class = "factor"),
    hour = structure(1:6, .Label = c("00", "03", "06", "09", "12", "15"), class = "factor"),
    temp = structure(c(6L, 4L, 2L, 1L, 3L, 5L), .Label = c("261.559", "262.525",
        "262.648", "264.595", "265.812", "267.769"), class = "factor"),
    city = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Boston", class = "factor"),
    state = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = " MA ", class = "factor"),
    lon = c(-71.06, -71.06, -71.06, -71.06, -71.06, -71.06),
    lat = c(42.36, 42.36, 42.36, 42.36, 42.36, 42.36)),
    .Names = c("year", "month", "day", "hour", "temp", "city", "state", "lon", "lat"),
    row.names = c(NA, 6L), class = "data.frame")
###

Apologies for asking repeated questions and thank you again!

Sincerely,

Milu

On Sun, Oct 15, 2017 at 11:45 PM, David Winsemius <dwinsemius at comcast.net> wrote:
> Now I have 5 named files with the extension ".asc" in my user directory
> (since I'm on a Mac). It is a slow website, so patience is needed.
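One way to fold the parsing from this follow-up into the download loop, as a hedged sketch: it assumes the 13 metadata lines and the fixed substr() column positions above hold for every city; the per-city write.csv and the final rbind are additions for illustration, not from the original posts:

###
## Sketch: fetch, parse, and label each city's series, then optionally
## write one csv per city and stack everything into a single data.frame.
all_cities <- vector("list", nrow(cities))

for (i in seq_len(nrow(cities))) {
  u <- paste0("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(",
              cities[i, "lon"], ",%20", cities[i, "lat"], ")&type=asc2")
  x <- readLines(u)     # 2-column ASCII response
  x <- x[-(1:13)]       # drop the 13 metadata/header lines (assumed constant)

  mydata <- data.frame(year  = substr(x, 1, 4),
                       month = substr(x, 6, 7),
                       day   = substr(x, 9, 10),
                       hour  = substr(x, 12, 13),
                       temp  = substr(x, 21, 27),
                       city  = cities[i, "city"],
                       state = cities[i, "state"],
                       lon   = cities[i, "lon"],
                       lat   = cities[i, "lat"])

  write.csv(mydata, paste0("~/", cities[i, "city"], cities[i, "state"], ".csv"),
            row.names = FALSE)
  all_cities[[i]] <- mydata
}

combined <- do.call(rbind, all_cities)   # all cities in one data.frame
###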