Michael Lawrence
2018-Apr-26 18:21 UTC
[Rd] readLines() for non-blocking pipeline behaves differently in R 3.5
The issue is that readLines() tries to seek (for reasons I don't
understand) in the non-blocking case, but silently fails for "stdin"
since it's a stream. This confused the buffering logic. The fix is to
mark "stdin" as unable to seek, but I do wonder why readLines() is
seeking in the first place.

Anyway, I'll get this into patched ASAP. Thanks for the report.

Michael

On Wed, Apr 25, 2018 at 5:13 PM, Michael Lawrence <michafla at gene.com> wrote:
> Probably related to the switch to buffered connections. I will look
> into this soon.
>
> On Wed, Apr 25, 2018 at 2:34 PM, Randy Lai <randy.cs.lai at gmail.com> wrote:
>> It seems that the behavior of readLines() in R 3.5 has changed for non-blocking pipeline.
>>
>> Consider the following R script, which reads from STDIN line by line.
>> ```
>> con <- file("stdin")
>> open(con, blocking = FALSE)
>>
>> while (TRUE) {
>>     txt <- readLines(con, 1)
>>     if (length(txt) > 0) {
>>         cat(txt, "\n", file = stdout())
>>     }
>>     Sys.sleep(0.1)
>> }
>> close(con)
>> ```
>>
>> In R 3.4.4, it works as expected.
>>
>> ```
>> (randymbpro)-Desktop$ echo "abc\nfoo" | R --slave -f test.R
>> abc
>> foo
>> ```
>>
>> In R 3.5, only the first line is printed
>> ```
>> (randymbpro)-Desktop$ echo "abc\nfoo" | R --slave -f test.R
>> abc
>> ```
>>
>> Is this change expected? If I change `blocking` to `TRUE` above, the above code would
>> work. But I need non-blocking connection in my use case of piping buffer from
>> another program.
>>
>> Best,
>>
>> R 3.5 @ macOS 10.13
>>
>> Randy
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
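The report notes that the same loop works when `blocking` is `TRUE`. The following is a minimal sketch of that blocking variant, offered only as a possible stopgap for inputs that eventually reach end-of-file; it does not address the non-blocking use case Randy describes. `echo_lines` is a hypothetical helper name, not something from R or from this thread.

```r
# Hypothetical helper (not part of R): echo each line read from an
# already open connection until no more lines come back.
echo_lines <- function(con) {
  n <- 0L
  repeat {
    txt <- readLines(con, n = 1L)
    if (length(txt) == 0L) break      # zero lines returned: end of input
    cat(txt, "\n", sep = "")          # print the line, no trailing space
    n <- n + 1L
  }
  invisible(n)                        # number of lines echoed
}
```

In the script from the report this would presumably be used as `con <- file("stdin"); open(con, blocking = TRUE); echo_lines(con); close(con)`, so that `close(con)` is actually reached once the input ends.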
Gábor Csárdi
2018-Apr-26 18:35 UTC
[Rd] readLines() for non-blocking pipeline behaves differently in R 3.5
I suspect the reason for the seek is this:

cat("1\n", file = "foobar")
f <- file("foobar", blocking = FALSE, open = "r")
readLines(f)
#> [1] "1"

cat("2\n", file = "foobar", append = TRUE)
readLines(f)
#> [1] "2"

cat("3\n", file = "foobar", append = TRUE)
readLines(f)
#> [1] "3"

I.e. R can emulate a file connection with non-blocking reads.
AFAICT there is no such thing, in Unix at least.
For this emulation, it needs to seek to the "current" position.

Gabor

On Thu, Apr 26, 2018 at 7:21 PM, Michael Lawrence
<lawrence.michael at gene.com> wrote:
> The issue is that readLines() tries to seek (for reasons I don't
> understand) in the non-blocking case, but silently fails for "stdin"
> since it's a stream. This confused the buffering logic. The fix is to
> mark "stdin" as unable to seek, but I do wonder why readLines() is
> seeking in the first place.
>
> Anyway, I'll get this into patched ASAP. Thanks for the report.
>
> Michael
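The two points made in this thread so far (a non-blocking file connection picks up appended lines, and a stream cannot seek) can be sketched side by side. This is an illustration under stated assumptions, not the R-core fix: `path` is a throwaway temporary file, the variable names are made up for the example, and the pipe command assumes a shell that provides `echo`.

```r
# Sketch only: `path` is a temporary file, not a name used in the thread.
path <- tempfile()
cat("1\n", file = path)

# Open the file for non-blocking reads, as in the example above.
f <- file(path, open = "r", blocking = FALSE)
first <- readLines(f)        # reads everything currently in the file

# Append after the connection has already reached end-of-file once.
cat("2\n", file = path, append = TRUE)
second <- readLines(f)       # picks up only the newly appended line

# A regular file can be repositioned; a pipe is a stream and cannot.
file_seekable <- isSeekable(f)
p <- pipe("echo abc", open = "r")
pipe_seekable <- isSeekable(p)
invisible(readLines(p))      # drain the pipe before closing it

close(p)
close(f)
unlink(path)

print(list(first = first, second = second,
           file_seekable = file_seekable, pipe_seekable = pipe_seekable))
```

The contrast between the two `isSeekable()` results is the distinction the fix relies on: connections that are really streams should report that they cannot seek, so readLines() does not try to reposition them.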
Michael Lawrence
2018-Apr-26 20:46 UTC
[Rd] readLines() for non-blocking pipeline behaves differently in R 3.5
Thanks for the clear explanation. At first glance seeking to the
current position seemed like it would be a no-op, but obviously things
are more complicated under the hood.

On Thu, Apr 26, 2018 at 11:35 AM, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
> I suspect the reason for the seek is this:
>
> cat("1\n", file = "foobar")
> f <- file("foobar", blocking = FALSE, open = "r")
> readLines(f)
> #> [1] "1"
>
> cat("2\n", file = "foobar", append = TRUE)
> readLines(f)
> #> [1] "2"
>
> cat("3\n", file = "foobar", append = TRUE)
> readLines(f)
> #> [1] "3"
>
> I.e. R can emulate a file connection with non-blocking reads.
> AFAICT there is no such thing, in Unix at least.
> For this emulation, it needs to seek to the "current" position.
>
> Gabor