Hervé Pagès
2013-May-07  22:08 UTC
[Rd] error when calling seek() twice on a gzfile connection
Hi,
I get an "internal error" when calling seek() twice on a gzfile
connection.
Create a gzip file:
   bigraw <- sample(charToRaw("abcdef"), 30000000, replace=TRUE)
   save(bigraw, file="bigraw.rda")
Open it:
   con <- gzfile("bigraw.rda", "rb")
Then:
   > seek(con, where=1)
   [1] 0
   > seek(con, where=24980000)
   [1] 1
   Warning message:
   In seek.connection(con, where = 24980000) :
     seek on a gzfile connection returned an internal error
   > seek(con)
   [1] 286
I don't get this error if I omit the 1st call to seek(), or if
I use a smaller 'where' value in the 2nd call to seek().
According to the man page, gzfile connections support seek()
but with a number of limitations. It doesn't seem that what I'm
trying to do falls into any of the limitations mentioned in
the man page though.
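A possible workaround, sketched here only as an illustration: instead of
calling seek() with a large 'where', read and discard bytes until the
desired offset in the uncompressed stream is reached. The helper name
skip_forward() and its 'chunk' argument are hypothetical (not part of
base R), and the small temporary file merely stands in for "bigraw.rda".

```r
# Hypothetical helper, not part of base R: advance a connection opened in
# "rb" mode by `offset` bytes of uncompressed data, reading and discarding
# in chunks rather than calling seek().
skip_forward <- function(con, offset, chunk = 1e6) {
  remaining <- offset
  while (remaining > 0) {
    n <- length(readBin(con, "raw", n = min(chunk, remaining)))
    if (n == 0L) stop("reached end of stream before the requested offset")
    remaining <- remaining - n
  }
  invisible(offset)
}

# Small stand-in file (an assumption for illustration only).
tmp <- tempfile(fileext = ".gz")
out <- gzfile(tmp, "wb")
writeBin(as.raw(rep(1:100, 50)), out)
close(out)

con2 <- gzfile(tmp, "rb")
skip_forward(con2, 1234)             # now positioned at 0-based offset 1234
nextbyte <- readBin(con2, "raw", n = 1)
close(con2)
```

This only moves forward from the current position, so it assumes a freshly
opened connection; going backwards would require reopening the file.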
As a side note, an internal error like this seems serious enough to
be signaled as a real error rather than just a warning.
Thanks,
H.
 > sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
loaded via a namespace (and not attached):
[1] tools_3.0.0
-- 
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
