Patrick Connolly
2018-Jan-24 07:23 UTC
[R] Function gutenberg_download in the gutenbergr package
I've been working through https://www.tidytextmining.com/tidytext.html
wherein everything worked until I got to this part in section 1.5

> hgwells <- gutenberg_download(c(35, 36, 5230, 159))
Determining mirror for Project Gutenberg from http://www.gutenberg.org/robot/harvest
Error in open.connection(con, "rb") :
  Failed to connect to www.gutenberg.org port 80: Connection timed out

Which indicates the problem is at the very start:

  if (is.null(mirror)) {
    mirror <- gutenberg_get_mirror(verbose = verbose)
  }

The documentation for gutenberg_get_mirror indicates there's nothing
different I could set.

So I tried specifying my usual mirror:

> hgwells <- gutenberg_download(c(1260, 768, 969, 9182, 767), mirror = "http://cran.stat.auckland.ac.nz")
Error in read_zip_url(full_url) : could not find function "read_zip_url"

Which is, indeed, strange since according to

> help.search("read_zip_url")
Help files with alias or concept or title matching 'read_zip_url' using
regular expression matching:

gutenbergr::read_zip_url
                        Read a file from a .zip URL
  Aliases: read_zip_url

[...]

And according to
library(help = "gutenbergr")

[...]
Index:

gutenberg_authors       Metadata about Project Gutenberg authors
gutenberg_download      Download one or more works using a Project
                        Gutenberg ID
gutenberg_get_mirror    Get the recommended mirror for Gutenberg files
gutenberg_metadata      Gutenberg metadata about each work
gutenberg_strip         Strip header and footer content from a Project
                        Gutenberg book
gutenberg_subjects      Gutenberg metadata about the subject of each
                        work
gutenberg_works         Get a filtered table of Gutenberg work metadata
read_zip_url            Read a file from a .zip URL

[...]

However, when I look at the list for that part of the search(), there
is no read_zip_url but all the rest of that list are present. So it's
not surprising that it isn't found. But it puzzles me that it is not
there.
Ideas as to where I should proceed gratefully appreciated.

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /home/hrapgc/local/R-3.4.2/lib/libRblas.so
LAPACK: /home/hrapgc/local/R-3.4.2/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_NZ.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_NZ.UTF-8        LC_COLLATE=en_NZ.UTF-8
 [5] LC_MONETARY=en_NZ.UTF-8    LC_MESSAGES=en_NZ.UTF-8
 [7] LC_PAPER=en_NZ.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] grDevices utils     stats     graphics  methods   base

other attached packages:
 [1] sos_2.0-0          brew_1.0-6         gutenbergr_0.1.3   ggplot2_2.2.1
 [5] stringr_1.2.0      bindrcpp_0.2       dplyr_0.7.4        janeaustenr_0.1.5
 [9] tidytext_0.1.6     FactoMineR_1.38    readxl_1.0.0       tm_0.7-3
[13] NLP_0.1-11         wordcloud_2.5      RColorBrewer_1.1-2 lattice_0.20-35

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.13         cellranger_1.1.0     compiler_3.4.2
 [4] plyr_1.8.4           bindr_0.1            tokenizers_0.1.4
 [7] tools_3.4.2          gtable_0.2.0         tibble_1.3.4
[10] nlme_3.1-131         pkgconfig_2.0.1      rlang_0.1.2
[13] Matrix_1.2-11        psych_1.7.8          curl_3.0
[16] parallel_3.4.2       xml2_1.1.1           cluster_2.0.6
[19] hms_0.3              flashClust_1.01-2    grid_3.4.2
[22] scatterplot3d_0.3-40 glue_1.1.1           ellipse_0.3-8
[25] R6_2.2.2             foreign_0.8-69       readr_1.1.1
[28] purrr_0.2.4          tidyr_0.7.2          reshape2_1.4.2
[31] magrittr_1.5         scales_0.5.0         SnowballC_0.5.1
[34] MASS_7.3-47          leaps_3.0            assertthat_0.2.0
[37] mnormt_1.5-5         colorspace_1.3-2     labeling_0.3
[40] stringi_1.1.5        lazyeval_0.2.1       munsell_0.4.3
[43] slam_0.1-42          broom_0.4.2

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___    Patrick Connolly
 {~._.~}                   Great minds discuss ideas
 _( Y )_                 Average minds discuss events
(:_~*~_:)                  Small minds discuss people
 (_)-(_)                              ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
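A note on the lookup question in this post: search() lists only attached environments, and an attached package contributes just its exported objects there. A call made from inside a package function is resolved in that package's namespace first, then in its imports, then in base, and only afterwards along the search path. The sketch below checks both places for any installed package. It does not explain why read_zip_url was not found in this session; `where_is` is a hypothetical helper name, not part of gutenbergr or of base R.

```r
# Hypothetical helper (not from any package): report whether `fn` is
# exported by package `pkg` and whether it is defined in its namespace.
where_is <- function(fn, pkg) {
  ns <- asNamespace(pkg)  # loads the namespace if it is not loaded yet
  c(
    exported = fn %in% getNamespaceExports(ns),
    defined  = exists(fn, envir = ns, inherits = FALSE)
  )
}

# Works for any installed package, for example a base one:
print(where_is("median", "stats"))

# The case from the post; guarded so the script also runs where
# gutenbergr is not installed.
if (requireNamespace("gutenbergr", quietly = TRUE)) {
  print(where_is("read_zip_url", "gutenbergr"))
}
```

If both values come back FALSE for gutenbergr, the installed copy does not contain the function at all, which would be worth mentioning to the maintainer along with the package version.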
Jeff Newmiller
2018-Jan-24 15:59 UTC
[R] Function gutenberg_download in the gutenbergr package
I have never used that package, but it seems obvious to me that you need to "reflect" on the meaning of the word "mirror". There is no reason to assume that a site hosting a mirror of the CRAN archive is also going to host a mirror of Project Gutenberg [1].

If, after you know you are giving reasonable inputs, the package does not seem to work as designed, please remember that contributed packages have maintainers [2] and not all of them subscribe to r-help.

[1] https://www.gutenberg.org/MIRRORS.ALL
[2] ?maintainer
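The two suggestions in this reply can be sketched as follows. This is a minimal example under stated assumptions: the mirror URL below is only an illustration and should be checked against the MIRRORS.ALL list [1] before use, and the `mirror` argument is used exactly as in the call from the original post. The download is wrapped in tryCatch because it needs network access.

```r
# Assumption: an example Project Gutenberg mirror; confirm it is still
# listed in https://www.gutenberg.org/MIRRORS.ALL before relying on it.
gutenberg_mirror <- "http://aleph.gutenberg.org"

# A CRAN mirror serves R packages, not books, so it must not be passed here.
hgwells <- NULL
if (requireNamespace("gutenbergr", quietly = TRUE)) {
  hgwells <- tryCatch(
    gutenbergr::gutenberg_download(c(35, 36, 5230, 159),
                                   mirror = gutenberg_mirror),
    error = function(e) {
      message("Download failed: ", conditionMessage(e))
      NULL
    }
  )
}

# utils::maintainer() returns the maintainer's name and e-mail address,
# or NA if the package is not installed.
print(maintainer("gutenbergr"))
```

If the download still fails with a Project Gutenberg mirror, the maintainer address printed by the last line is the place to report it, together with the sessionInfo() output.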