Sorry for joining a little late.

I put some related links and scripts together a few weeks ago. Then I
stopped, because there is so much out there.

The data format employed by Johns Hopkins CSSE was sort of a big surprise
to me. Germany took the opposite approach and organized its data as big
JSON trees.

Fortunately, both can be "tidied" with R, and they make good didactic
examples for our students.

Here is yet another repo linking to the data:

https://github.com/tpetzoldt/covid

Thomas

On 04.05.2020 at 20:48 James Spottiswoode wrote:
> Sure. The COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University is available here:
>
> https://github.com/CSSEGISandData/COVID-19
>
> All in CSV format.
>
>> On May 4, 2020, at 11:31 AM, Bernard McGarvey <mcgarvey.bernard at comcast.net> wrote:
>>
>> Just curious: does anyone know of a website that has data available in a format that R can download and analyze?
>>
>> Thanks
>>
>> Bernard McGarvey
>> Director, Fort Myers Beach Lions Foundation, Inc.
>> Retired (Lilly Engineering Fellow).
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> James Spottiswoode
> Applied Mathematics & Statistics
> (310) 270 6220
> jamesspottiswoode Skype
> james at jsasoc.com

On Thu, May 7, 2020 at 12:58 AM Thomas Petzoldt <thpe at simecol.de> wrote:
> The data format employed by Johns Hopkins CSSE was sort of a big surprise
> to me.

Why? I find it quite convenient to drop the first few columns and extract the data as a matrix (using data.matrix()).

-Deepayan
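
The data.matrix() route suggested above can be sketched roughly as follows. This is only an illustration, not code from the thread: the small `wide` data frame is a made-up stand-in that mimics the column layout of the CSSE time-series file (two name columns, Lat, Long, then one column per date), and the counts in it are invented. The variable names `counts`, `dates`, `by_country` and `new_cases` are likewise hypothetical.

```r
## Toy stand-in for the CSSE "confirmed" table; the layout mimics the real
## file, but the counts are invented for illustration.
wide <- data.frame(
  "Province/State" = c(NA, "Hubei", "Beijing"),
  "Country/Region" = c("Germany", "China", "China"),
  Lat  = c(51.0, 31.0, 40.2),
  Long = c(9.0, 112.3, 116.4),
  "1/22/20" = c(0, 444, 14),
  "1/23/20" = c(0, 444, 22),
  "1/24/20" = c(0, 549, 36),
  check.names = FALSE,       # keep the original, non-syntactic column names
  stringsAsFactors = FALSE
)

## Drop the four identifier columns and keep the counts as a numeric matrix.
counts <- data.matrix(wide[, -(1:4)])

## The column names are m/d/yy strings; parse them once into Date objects.
dates <- as.Date(colnames(counts), format = "%m/%d/%y")

## Sum the provinces of each country: rowsum() adds up matrix rows by group.
by_country <- rowsum(counts, group = wide[["Country/Region"]])

## Daily new cases are differences along the time axis (the columns).
new_cases <- t(apply(by_country, 1, diff))
```

With the data in matrix form, row-wise operations such as rowsum() and diff() apply directly, which is presumably what makes the wide layout convenient for this kind of work.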

On 07.05.2020 at 11:19 Deepayan Sarkar wrote:
> On Thu, May 7, 2020 at 12:58 AM Thomas Petzoldt <thpe at simecol.de> wrote:
>> The data format employed by Johns Hopkins CSSE was sort of a big surprise
>> to me.
>
> Why? I find it quite convenient to drop the first few columns and
> extract the data as a matrix (using data.matrix()).
>
> -Deepayan

Many thanks for the hint to use data.matrix().

I did not mean to say that it is difficult, especially as R has all the tools for data wrangling. My surprise was that "wide tables" and non-ISO dates as column names are not the "database way" that we generally teach our students.

With reshape2::melt, tidyr::gather, or tidyr::pivot_longer, conversion is quite easy, regardless of whether one wants to use the tidyverse or not; see the example below.

Again, thanks,

Thomas

library("dplyr")
library("readr")
library("tidyr")

url <- "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv"

dat <- read_csv(url)

## rename "Province/State" and "Country/Region" to syntactic names
names(dat)[1:2] <- c("Province_State", "Country_Region")

dat2 <-
  dat %>%
  ## summarize Country/Region duplicates: sum the daily counts per country;
  ## the non-date columns are excluded by name (across() needs dplyr >= 1.0.0)
  group_by(Country_Region) %>%
  summarise(across(-c(Province_State, Lat, Long), sum)) %>%
  ## make it a long table
  pivot_longer(cols = -Country_Region, names_to = "time") %>%
  ## parse the m/d/yy column names into dates (printed in ISO 8601 form)
  mutate(time = as.Date(time, format = "%m/%d/%y"))

--
Dr. Thomas Petzoldt
senior scientist

Technische Universitaet Dresden
Faculty of Environmental Sciences
Institute of Hydrobiology
01062 Dresden, Germany

https://tu-dresden.de/Members/thomas.petzoldt
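
For readers who prefer to stay outside the tidyverse, the same wide-to-long conversion can be sketched in base R. This is one possible approach under stated assumptions, not the method used in the thread: the `wide` data frame is a made-up miniature of the CSSE table with invented counts, and the names `agg` and `long` are hypothetical.

```r
## Toy stand-in for the renamed CSSE table; counts are invented.
wide <- data.frame(
  Country_Region = c("Germany", "China", "China"),
  "1/22/20" = c(0, 444, 14),
  "1/23/20" = c(0, 444, 22),
  check.names = FALSE,       # keep the date strings as column names
  stringsAsFactors = FALSE
)

## Sum duplicate countries: rowsum() adds up matrix rows by group.
agg <- rowsum(data.matrix(wide[, -1]), group = wide$Country_Region)

## as.data.frame() on a table gives one row per (country, date) cell,
## i.e. the long format.
long <- as.data.frame(as.table(agg), stringsAsFactors = FALSE)
names(long) <- c("Country_Region", "time", "value")

## Parse the m/d/yy strings into Date objects and sort.
long$time <- as.Date(long$time, format = "%m/%d/%y")
long <- long[order(long$Country_Region, long$time), ]
```

The result has the same three columns as the pivot_longer() version (country, time, value), so either table could feed the same downstream plotting or modelling code.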