Ben Heavner
2018-May-10 22:31 UTC
[Rd] readLines() behaves differently for gzfile connection
When I read a .gz file with readLines() in 3.4.3, it returns text (and a warning). In 3.5.0, it gives a warning, but no text. Is this expected behavior or a bug?

3.4.3:

> source_file = "1k_annotation.gz"
> readfile_con <- gzfile(source_file, "r")
> readLines(readfile_con, n = 5)
[1] "#chr\tpos\tref\talt\t

<truncated output here>

Warning message:
In readLines(readfile_con, n = 5) :
  seek on a gzfile connection returned an internal error
> close(readfile_con)
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.4.3

---------------------------------------------

3.5.0:

> source_file = "1k_annotation.gz"
> readfile_con <- gzfile(source_file, "r")
> readLines(readfile_con, n = 5)
[1] "" "" "" "" ""
Warning message:
In readLines(readfile_con, n = 5) :
  seek on a gzfile connection returned an internal error
> close(readfile_con)
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.0

----------------------------------------

(note: I'm running 3.5.0 via the docker rocker/tidyverse:3.5 container, and 3.4.3 on my mac desktop machine)

Thanks!
Ben Heavner
Michael Lawrence
2018-May-10 23:17 UTC
[Rd] readLines() behaves differently for gzfile connection
Would it be possible to get that file or a representative subset of it somewhere so that I can reproduce this?

Thanks,
Michael
Ben Heavner
2018-May-10 23:21 UTC
[Rd] readLines() behaves differently for gzfile connection
You bet - it's available on github at https://github.com/UW-GAC/wgsaparsr/blob/master/tests/testthat/1k_annotation.gz

-Ben