Miluji Sb
2017-Oct-15 21:02 UTC
[R] Download data from NASA for multiple locations - RCurl
Dear all,
I am trying to download time-series climatic data from the GES DISC (NASA)
Hydrology Data Rods web service. Unfortunately, no wget method is
available.
Five parameters are needed for data retrieval: variable, location,
startDate, endDate, and type. For example:
###
https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(-71.06,%2042.36)&type=asc2
###
In this case, variable: Tair_f_inst (near-surface air temperature);
location: (-71.06, 42.36); startDate: 01 January 1970; endDate: 31 December
1979; type: asc2 (two-column ASCII output).
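(For reference, the request URL can be assembled from these five parameters in R, for example with sprintf(); this is only a sketch using the example values above:)
###
# Sketch: build the Data Rods request URL from its five parameters.
# The values are taken from the example request above.
base_url  <- "https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi"
variable  <- "GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst"
startDate <- "1970-01-01T00"
endDate   <- "1979-12-31T00"
lon       <- -71.06
lat       <- 42.36
type      <- "asc2"

request_url <- sprintf(
    "%s?variable=%s&startDate=%s&endDate=%s&location=GEOM:POINT(%s,%%20%s)&type=%s",
    base_url, variable, startDate, endDate, lon, lat, type)
###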
I am trying to download data for 100 US cities, whose coordinates I have in
the following data.frame (the first five rows are shown below):
###
cities <- dput(droplevels(head(cities, 5)))
structure(list(city = structure(1:5, .Label = c("Boston", "Bridgeport",
        "Cambridge", "Fall River", "Hartford"), class = "factor"),
    state = structure(c(2L, 1L, 2L, 2L, 1L), .Label = c(" CT ", " MA "),
        class = "factor"),
    lon = c(-71.06, -73.19, -71.11, -71.16, -72.67),
    lat = c(42.36, 41.18, 42.37, 41.7, 41.77)),
    .Names = c("city", "state", "lon", "lat"),
    row.names = c(NA, 5L), class = "data.frame")
###
Is it possible to download the data for multiple locations automatically
(e.g. with RCurl) and save each series as a CSV file? Essentially, reading
the coordinates from the data.frame and inserting them into the URL.
I would also like to add identifying information from the cities data.frame
to each of the data files. I have been doing the following for a single
file:
###
# Read the raw ASCII response for a single city (Boston) directly from the URL
x <- readLines(con = url("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(-71.06,%2042.36)&type=asc2"))
x <- x[-(1:13)]   # drop the 13 metadata/header lines

# Split each remaining line into date/time components and the temperature value
mydata <- data.frame(year  = substr(x, 1, 4),
                     month = substr(x, 6, 7),
                     day   = substr(x, 9, 10),
                     hour  = substr(x, 12, 13),
                     temp  = substr(x, 21, 27))

# Attach identifying information from the first row of `cities`
mydata$city  <- rep(cities[1, 1], nrow(mydata))
mydata$state <- rep(cities[1, 2], nrow(mydata))
mydata$lon   <- rep(cities[1, 3], nrow(mydata))
mydata$lat   <- rep(cities[1, 4], nrow(mydata))
###
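(Roughly, what I am after is something like the following sketch, which loops over the rows of the cities data.frame with readLines() rather than RCurl and writes one CSV per city; it assumes the 13-line metadata header and the fixed column positions hold for every location:)
###
# Rough sketch: fetch and parse the series for each city, then save a CSV.
# Assumes every response has the same 13-line header and fixed-width layout
# as the single-city example above.
base <- paste0("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi",
               "?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst",
               "&startDate=1970-01-01T00&endDate=1979-12-31T00")

for (i in seq_len(nrow(cities))) {
    u <- paste0(base, "&location=GEOM:POINT(", cities$lon[i], ",%20",
                cities$lat[i], ")&type=asc2")
    x <- readLines(url(u))
    x <- x[-(1:13)]                       # drop the metadata header
    d <- data.frame(year  = substr(x, 1, 4),
                    month = substr(x, 6, 7),
                    day   = substr(x, 9, 10),
                    hour  = substr(x, 12, 13),
                    temp  = substr(x, 21, 27),
                    city  = cities$city[i],
                    state = cities$state[i],
                    lon   = cities$lon[i],
                    lat   = cities$lat[i])
    # output file name is just an example, e.g. "Boston.csv"
    write.csv(d, file = paste0(cities$city[i], ".csv"), row.names = FALSE)
}
###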
Help and advice would be greatly appreciated. Thank you!
Sincerely,
Milu
David Winsemius
2017-Oct-15 21:45 UTC
[R] Download data from NASA for multiple locations - RCurl
> On Oct 15, 2017, at 2:02 PM, Miluji Sb <milujisb at gmail.com> wrote:
>
> Is it possible to download the data for multiple locations automatically
> (e.g. with RCurl) and save each series as a CSV file? Essentially, reading
> the coordinates from the data.frame and inserting them into the URL.

Didn't seem that difficult:

library(downloader)  # makes things easier for Macs, perhaps not needed;
                     # if not used, you will need download.file()

for (i in 1:5) {
    target1 <- paste0("https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(",
                      cities[i, "lon"],
                      ",%20", cities[i, "lat"],
                      ")&type=asc2")
    target2 <- paste0("~/",   # change for whatever destination directory you may prefer
                      cities[i, "city"],
                      cities[i, "state"], ".asc")
    download(url = target1, destfile = target2)
}

Now I have 5 named files with the extension ".asc" in my user directory
(since I'm on a Mac). It is a slow website, so patience is needed.

--
David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'
-Gehm's Corollary to Clarke's Third Law
Miluji Sb
2017-Oct-15 22:35 UTC
[R] Download data from NASA for multiple locations - RCurl
Dear David,
This is amazing, thank you so much. If I may ask another question:
The output looks like the following:
###
dput(head(x, 15))
c("Metadata for Requested Time Series:", "",
  "prod_name=GLDAS_NOAH025_3H_v2.0",
  "param_short_name=Tair_f_inst",
  "param_name=Near surface air temperature",
  "unit=K",
  "begin_time=1970-01-01T00",
  "end_time=1979-12-31T21",
  "lat= 42.36", "lon=-71.06",
  "Request_time=2017-10-15 22:20:03 GMT",
  "",
  "Date&Time Data",
  "1970-01-01T00:00:00\t267.769",
  "1970-01-01T03:00:00\t264.595")
###
Thus I need to drop the first 13 lines and then do the following to add the
identifying information:
###
mydata <- data.frame(year  = substr(x, 1, 4),
                     month = substr(x, 6, 7),
                     day   = substr(x, 9, 10),
                     hour  = substr(x, 12, 13),
                     temp  = substr(x, 21, 27))

mydata$city  <- rep(cities[1, 1], nrow(mydata))
mydata$state <- rep(cities[1, 2], nrow(mydata))
mydata$lon   <- rep(cities[1, 3], nrow(mydata))
mydata$lat   <- rep(cities[1, 4], nrow(mydata))
###
Is it possible to incorporate these steps into your code so that the data
look like this:
dput(droplevels(head(mydata)))
structure(list(year = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "1970",
        class = "factor"),
    month = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "01",
        class = "factor"),
    day = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "01",
        class = "factor"),
    hour = structure(1:6, .Label = c("00", "03", "06", "09", "12", "15"),
        class = "factor"),
    temp = structure(c(6L, 4L, 2L, 1L, 3L, 5L),
        .Label = c("261.559", "262.525", "262.648", "264.595", "265.812",
        "267.769"), class = "factor"),
    city = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Boston",
        class = "factor"),
    state = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = " MA ",
        class = "factor"),
    lon = c(-71.06, -71.06, -71.06, -71.06, -71.06, -71.06),
    lat = c(42.36, 42.36, 42.36, 42.36, 42.36, 42.36)),
    .Names = c("year", "month", "day", "hour", "temp", "city", "state",
    "lon", "lat"), row.names = c(NA, 6L), class = "data.frame")
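(Roughly, something like the following is what I have in mind: read each saved .asc file back in, add the identifying columns, and write one CSV per city; this is only a sketch, assuming the 13-line header and fixed-width layout hold for every city and that the files are named as in your loop:)
###
# Rough sketch built on the downloaded .asc files from the loop above.
all_cities <- vector("list", nrow(cities))

for (i in seq_len(nrow(cities))) {
    asc_file <- paste0("~/", cities[i, "city"], cities[i, "state"], ".asc")
    x <- readLines(asc_file)
    x <- x[-(1:13)]                       # drop the metadata header

    d <- data.frame(year  = substr(x, 1, 4),
                    month = substr(x, 6, 7),
                    day   = substr(x, 9, 10),
                    hour  = substr(x, 12, 13),
                    temp  = substr(x, 21, 27),
                    city  = cities[i, "city"],
                    state = cities[i, "state"],
                    lon   = cities[i, "lon"],
                    lat   = cities[i, "lat"])

    write.csv(d, file = paste0("~/", cities[i, "city"], ".csv"),
              row.names = FALSE)
    all_cities[[i]] <- d
}

mydata <- do.call(rbind, all_cities)      # one combined data.frame, if wanted
###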
Apologies for asking repeated questions and thank you again!
Sincerely,
Milu
On Sun, Oct 15, 2017 at 11:45 PM, David Winsemius <dwinsemius at comcast.net> wrote:

> Now I have 5 named files with the extension ".asc" in my user directory
> (since I'm on a Mac). It is a slow website, so patience is needed.