On Fri, Nov 11, 2016 at 12:08 PM, Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
>>>>>> Gergely Daróczi <daroczig at rapporter.net>
>>>>>> on Thu, 10 Nov 2016 16:48:12 +0100 writes:
>
> > Dear All,
> > I'm developing an R application running inside of a Java daemon on
> > multiple threads, and interacting with the parent daemon via stdin and
> > stdout.
>
> > Everything works perfectly fine except for having some memory leaks
> > somewhere. Simplified version of the R app:
>
> > while (TRUE) {
> >     con <- file('stdin', open = 'r', blocking = TRUE)
> >     line <- scan(con, what = character(0), nlines = 1, quiet = TRUE)
> >     close(con)
> > }
>
> > This loop uses more and more RAM as time passes (see more on this
> > below), not sure why, and I have no idea currently on how to debug
> > this further. Can someone please try to reproduce it and give me some
> > hints on what is the problem?
>
> > Sample bash script to trigger an R process with such memory leak:
>
> > Rscript --vanilla -e "while(TRUE)cat(runif(1),'\n')" |
> >     Rscript --vanilla -e "cat(Sys.getpid(),'\n');while(TRUE){con<-file('stdin',open='r',blocking=TRUE);line<-scan(con,what=character(0),nlines=1,quiet=TRUE);close(con);rm(con);gc()}"
>
> > Maybe you have to escape '\n' depending on your shell.
>
> > Thanks for reading this and any hints would be highly appreciated!
>
> I have no hints, sorry... but give some more "data":
>
> I've changed the above to *print* the gc() result every 1000th
> iteration, and after 100'000 iterations, there is still no
> memory increase from the point of view of R itself.
>
> However, monitoring the process (via 'htop', e.g.) shows about
> 1 MB per second increase in memory footprint of the process.
>
> One could argue that the error is with the OS / pipe / bash
> rather than with R itself... but I'm not expert enough to
> argue here at all.
>
> Here's my version of your sample bash script and its output:
>
> $ Rscript --vanilla -e "while(TRUE)cat(runif(1),'\n')" |
>     Rscript --vanilla -e "cat(Sys.getpid(),'\n');i <- 0; while(TRUE){con<-file('stdin',open='r',blocking=TRUE);line<-scan(con,what=character(0),nlines=1,quiet=TRUE);close(con);rm(con);a <- gc(); i <- i+1; if(i %% 1000 == 1) {cat('i=',i,'\\n'); print(a)} }"
>
> 11059
> i= 1
>          used (Mb) gc trigger  (Mb) max used (Mb)
> Ncells  83216  4.5   10000000 534.1   213529 11.5
> Vcells 172923  1.4   16777216 128.0   562476  4.3
> i= 1001
>          used (Mb) gc trigger  (Mb) max used (Mb)
> Ncells  83255  4.5   10000000 534.1   213529 11.5
> Vcells 172958  1.4   16777216 128.0   562476  4.3
> .......
> ...............................................
> ...............................................
> ...............................................
> i= 80001
>          used (Mb) gc trigger  (Mb) max used (Mb)
> Ncells  83255  4.5   10000000 534.1   213529 11.5
> Vcells 172958  1.4   16777216 128.0   562476  4.3
> i= 81001
>          used (Mb) gc trigger  (Mb) max used (Mb)
> Ncells  83255  4.5   10000000 534.1   213529 11.5
> Vcells 172959  1.4   16777216 128.0   562476  4.3
> i= 82001
>          used (Mb) gc trigger  (Mb) max used (Mb)
> Ncells  83255  4.5   10000000 534.1   213529 11.5
> Vcells 172959  1.4   16777216 128.0   562476  4.3
> i= 83001
>          used (Mb) gc trigger  (Mb) max used (Mb)
> Ncells  83255  4.5   10000000 534.1   213529 11.5
> Vcells 172958  1.4   16777216 128.0   562476  4.3
> i= 84001
>          used (Mb) gc trigger  (Mb) max used (Mb)
> Ncells  83255  4.5   10000000 534.1   213529 11.5
> Vcells 172958  1.4   16777216 128.0   562476  4.3
>
Thank you very much, this was very useful!
I tried to do some more research on this, as Gabor Csardi also
suspected that the memory growth might be due to the writer being
faster than the reader, so that data simply accumulates in the input
buffer of the reader. I double-checked this via:
Rscript --vanilla -e "i<-1;while(TRUE){cat(runif(1),'\n');i<-i+1;if(i==1e6){Sys.sleep(15);i<-1}}" |
    Rscript --vanilla -e "cat(Sys.getpid(),'\n');i<-0;while(TRUE){con<-file('stdin',open='r',blocking=TRUE);line<-scan(con,what=character(0),nlines=1,quiet=TRUE);close(con);rm(con);a<-gc();i<-i+1;if(i%%1e3==1){cat('i=',i,'\\n');print(a)}}"
So the writer generates a good number of lines, but sleeps for 15
seconds after a while so that the reader can catch up. Monitoring the
memory footprint of the process (by the way gc reported no memory
increase in the reader, just like in Martin's output) shows that the
memory grows when the writer sends data, and it's constant when the
writer is sleeping, but it never decreases: http://imgur.com/r7T02pK
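For anyone who wants to log both views of memory from inside the reader, here is a rough sketch. It assumes Linux with a /proc filesystem; rss_kb() and r_heap_mb() are helper names made up for this mail, not part of R, and reading VmRSS from /proc is just one way to get at what htop shows.

```r
# Resident set size (kB) of a process, read from /proc/<pid>/status.
# Assumption: Linux. Returns NA_real_ where that file is not available.
rss_kb <- function(pid = Sys.getpid()) {
  status_file <- sprintf("/proc/%d/status", as.integer(pid))
  if (!file.exists(status_file)) return(NA_real_)
  status <- readLines(status_file, warn = FALSE)
  vmrss <- grep("^VmRSS:", status, value = TRUE)
  if (length(vmrss) == 0L) return(NA_real_)
  # The line looks like "VmRSS:     12345 kB"; keep only the digits.
  as.numeric(gsub("[^0-9]", "", vmrss[1]))
}

# Memory in use as seen by R itself: the "used (Mb)" column of gc(),
# summed over Ncells and Vcells.
r_heap_mb <- function() sum(gc()[, 2])
```

Printing both values every 1000th iteration of the reader loop should show whether the OS-level figure climbs while the gc() figure stays flat.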
Based on this, it may well be more of an OS-specific question, as you
suggest, but I was not able to reproduce the same memory issue in
plain bash via:
while :;do echo '1';done | bash -c "while :;do read;done"
But I'm not sure if this does exactly the same as the original R
script, so this is rather just a guess.
On the other hand, I tried to modify the original minimal R script in
other ways as well to see which part might result in the strange
memory growth, and it seems that opening the connection once but
keeping the rest of the script (so still generating and reading tons
of lines without any sleep), did not show any memory leak:
Rscript --vanilla -e "while(TRUE)cat(runif(1),'\n')" |
    Rscript --vanilla -e "cat(Sys.getpid(),'\n');con<-file('stdin',open='r',blocking=TRUE);while(TRUE){line<-scan(con,what=character(0),nlines=1,quiet=TRUE);};close(con)"
Based on this, I think I can (should) modify my R application to open
stdin only once and read from that connection in the infinite loop,
but I'm still interested in understanding what's causing the extra
memory usage when opening and closing many connections (if my above
findings are correct).
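A minimal sketch of what that "open once" structure could look like in the application. This is only an illustration: read_loop() and its handler argument are hypothetical names, and it uses readLines() rather than the scan() call from the one-liners above, so each line arrives as a single string instead of being split on whitespace.

```r
# Read lines from an already open connection until EOF, passing each
# line to `handler` (a hypothetical callback supplied by the caller).
# Returns the number of lines processed, invisibly.
read_loop <- function(con, handler) {
  n <- 0L
  repeat {
    line <- readLines(con, n = 1L, warn = FALSE)
    if (length(line) == 0L) break  # EOF: nothing left to read
    handler(line)
    n <- n + 1L
  }
  invisible(n)
}

# Intended use in the reader process (not run here):
#   con <- file("stdin", open = "r", blocking = TRUE)  # opened once
#   read_loop(con, function(line) cat("got:", line, "\n"))
#   close(con)
```

The connection is created and closed by the caller, so the loop itself never opens a new one.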
Thank you very much again, and I'm still looking for any suggestion or
advice on how to debug this further.
>
> > Best,
> > Gergely
>
> > PS1 see the image posted at
> > http://stackoverflow.com/questions/40522584/memory-leak-with-closed-connections
> > on memory usage over time
> > PS2 the issue doesn't seem to be due to writing more data in the first
> > R app compared to what the second R app can handle, as I tried the
> > same with adding a Sys.sleep(0.01) in the first app and that's not an
> > issue at all in the real application
> > PS3 I also tried using stdin() instead of file('stdin'), but that did
> > not work well for the stream running on multiple threads started by
> > the same parent Java daemon
> > PS4 I've tried this on Linux using R 3.2.3 and 3.3.2
>
> For me, it's Linux, too (Fedora 24), using 'R 3.3.2 patched'..
>
> Martin