frederik at ofb.net
2017-Apr-21 17:34 UTC
[Rd] tempdir() may be deleted during long-running R session
Hi Mikko,

I was bitten by this recently and I think some of the replies are
missing the point. As I understand it, the problem consists of these
elements:

1. When R starts, it creates a directory like /tmp/RtmpVIeFj4

2. Right after R starts I can create files in this directory with no
   error

3. After some hours or days I can no longer create files in this
   directory, because it has been deleted

If R expected the directory to be deleted at random, and if we expect
users to call dir.create every time they access tempdir, then why did
R create the directory for us at the beginning of the session? That's
just setting people up to get weird bugs, which only appear in
difficult-to-reproduce situations (i.e. after the session has been
open for a long time).

I think before we dismiss this we should think about possible in-R
solutions and why they are not feasible. Are there any packages which
would break if a call to 'tempdir' automatically recreated this
directory? (Or would it be too much of a performance hit to have
'tempdir' check and even just issue a warning when the directory is
found not to exist?) Should we have a timer which periodically updates
the modification time of tempdir()? What do other long-running
programs do (e.g. screen, emacs)?

Thank you,

Frederick

P.S. I noticed that dir.create does not seem to update the access or
modification time of the file. So there is also a remote possibility
that the directory could be "cleaned up" in between calling
'dir.create()' and putting a file in it. Maybe this is nitpicky, but
if we accept that the *really* correct practice is more complicated
than just calling 'dir.create()', this also argues for putting the
proper invocations into some kind of standard function - either
'tempdir()' or something else.
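For illustration, the "check, recreate and touch" idea from the message above could look roughly like the following in user code. This is only a sketch: ensure_dir() is a hypothetical helper, not an existing R function, and it assumes the cleaner decides what to delete based on timestamps. It narrows the race described in the P.S. but cannot close it.

```r
# Hypothetical helper (not part of base R): make sure `dir` exists before
# it is used, and refresh its timestamps so it looks recently used.
ensure_dir <- function(dir = tempdir()) {
  if (!dir.exists(dir)) {
    warning("temporary directory was missing, recreating: ", dir)
    # mode 0700 mirrors the drwx------ permissions of R's Rtmp* directories
    ok <- dir.create(dir, recursive = TRUE, mode = "0700")
    if (!ok) stop("could not recreate temporary directory: ", dir)
  }
  # Update the modification time; a cleaner could still run between this
  # call and the next file write, so the window is smaller, not gone.
  Sys.setFileTime(dir, Sys.time())
  invisible(dir)
}

# Usage: go through the helper instead of calling tempdir() directly.
path <- file.path(ensure_dir(), "scratch.txt")
writeLines("hello", path)
```

Calling the helper at every access costs one file-system check per call, which is the performance question raised above.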
Dirk Eddelbuettel
2017-Apr-23 14:15 UTC
[Rd] tempdir() may be deleted during long-running R session
On 21 April 2017 at 10:34, frederik at ofb.net wrote:
| Hi Mikko,
|
| I was bitten by this recently and I think some of the replies are
| missing the point. As I understand it, the problem consists of these
| elements:
|
| 1. When R starts, it creates a directory like /tmp/RtmpVIeFj4
|
| 2. Right after R starts I can create files in this directory with no
|    error
|
| 3. After some hours or days I can no longer create files in this
|    directory, because it has been deleted

Nope. That is local to your system.

Witness eg at my workstation:

/tmp$ ls -ltGd Rtmp*
drwx------ 3 edd 4096 Apr 21 16:12 Rtmp9K6bSN
drwx------ 3 edd 4096 Apr 21 11:48 RtmpRRbaMP
drwx------ 3 edd 4096 Apr 21 11:28 RtmpFlguFy
drwx------ 3 edd 4096 Apr 20 13:06 RtmpWJDF3U
drwx------ 3 edd 4096 Apr 18 15:58 RtmpY7ZIS1
drwx------ 3 edd 4096 Apr 18 12:12 Rtmpzr9W0v
drwx------ 2 edd 4096 Apr 16 16:02 RtmpeD27El
drwx------ 2 edd 4096 Apr 16 15:57 Rtmp572FHk
drwx------ 3 edd 4096 Apr 13 11:08 RtmpqP0JSf
drwx------ 3 edd 4096 Apr 10 18:47 RtmpzRzyFb
drwx------ 3 edd 4096 Apr  6 15:21 RtmpQhvAUb
drwx------ 3 edd 4096 Apr  6 11:24 Rtmp2lFKPz
drwx------ 3 edd 4096 Apr  5 20:57 RtmprCeWUS
drwx------ 2 edd 4096 Apr  3 15:12 Rtmp8xviDl
drwx------ 3 edd 4096 Mar 30 16:50 Rtmp8w9n5h
drwx------ 3 edd 4096 Mar 28 11:33 RtmpjAg6iY
drwx------ 2 edd 4096 Mar 28 09:26 RtmpYHSgZG
drwx------ 2 edd 4096 Mar 27 11:21 Rtmp0gSV4e
drwx------ 2 edd 4096 Mar 27 11:21 RtmpOnneiY
drwx------ 2 edd 4096 Mar 27 11:17 RtmpIWeiTJ
drwx------ 3 edd 4096 Mar 22 08:51 RtmpJkVsSJ
drwx------ 3 edd 4096 Mar 21 10:33 Rtmp9a5KxL
/tmp$

Clearly still there after a month. I tend to have some longer-running R
sessions in either Emacs/ESS or RStudio.

So what I wrote in my last message here *clearly* applies to you: a local
issue for which you have to take local action as R cannot know. You also
have a choice of setting variables to affect this.

| If R expected the directory to be deleted at random, and if we expect
| users to call dir.create every time they access tempdir, then why did
| R create the directory for us at the beginning of the session? That's
| just setting people up to get weird bugs, which only appear in
| difficult-to-reproduce situations (i.e. after the session has been
| open for a long time).

I disagree. R has been doing this many years, possibly two decades.

| I think before we dismiss this we should think about possible in-R
| solutions and why they are not feasible. Are there any packages which
| would break if a call to 'tempdir' automatically recreated this
| directory? (Or would it be too much of a performance hit to have
| 'tempdir' check and even just issue a warning when the directory is
| found not to exist?) Should we have a timer which periodically updates
| the modification time of tempdir()? What do other long-running
| programs do (e.g. screen, emacs)?

There are options you have right now.

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
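The message above does not name the variables it refers to. According to R's documentation for tempdir(), they are TMPDIR, TMP and TEMP, checked in that order when R starts. The sketch below only inspects them from inside a session. To take effect they have to be set before R is launched (for example in the shell that starts R), because changing them inside a running session does not move tempdir().

```r
# Show the variables R consults at startup when choosing where to create
# its per-session directory; unset ones are reported as NA.
vars <- Sys.getenv(c("TMPDIR", "TMP", "TEMP"), unset = NA)
print(vars)

# The per-session directory itself, and the parent directory R picked.
print(tempdir())
parent <- dirname(tempdir())
print(parent)
```

Pointing one of these variables at a location that the system's cleanup job does not touch is one local workaround of the kind Dirk describes.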
frederik at ofb.net
2017-Apr-24 19:34 UTC
[Rd] tempdir() may be deleted during long-running R session
Dirk,

Your message felt a bit antagonistic to me, or maybe I'm not
understanding what you're trying to say.

We all seem to agree that different configurations exist, and that
some Linux distributions are configured to delete files in /tmp/ after
a certain amount of time (seems to be 10 days for Arch Linux, not sure
about Ubuntu or Debian).

The question of how users of such distributions can individually work
around the problem Mikko identified has already been answered.

The question that remains is what we expect new users to do. It's not
really helpful to pretend that they will be reading the mailing list,
as exciting as it is, or that they'll read the "R Installation and
Administration" manual to make sure that their distribution did a good
job of packaging R. There are plenty of more visible places where this
"gotcha" could be documented than a manual I had never heard of until
now.

Even if a particular solution has to be implemented by the package
maintainers of various distributions, I think it is fitting to discuss
and solicit such solutions here on this mailing list. But it felt like
you were trying to stifle such discussion.

As it is, I don't even know what distributions are affected. I'm not
sure how to look up the contents of a "default" configuration on other
distributions.

Frederick
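On the question of how to look up a distribution's configuration: on systems that use systemd, age-based cleanup of /tmp is normally configured through tmpfiles.d files. The sketch below lists the rules that mention /tmp. The helper name tmp_rules() is made up for this example, and systems that clean /tmp by some other mechanism are not covered.

```r
# Directories where systemd-tmpfiles looks for configuration. This assumes
# a systemd-based system; elsewhere none of them may exist.
conf_dirs <- c("/etc/tmpfiles.d", "/run/tmpfiles.d", "/usr/lib/tmpfiles.d")
conf_dirs <- conf_dirs[dir.exists(conf_dirs)]
conf_files <- if (length(conf_dirs)) {
  list.files(conf_dirs, pattern = "\\.conf$", full.names = TRUE)
} else {
  character(0)
}

# Hypothetical helper: for each file, keep the non-comment rules whose
# path field (the second field) is /tmp or something below it.
tmp_rules <- function(files) {
  rules <- lapply(files, function(f) {
    lines <- tryCatch(readLines(f, warn = FALSE),
                      error = function(e) character(0))
    lines <- trimws(lines)
    lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
    fields <- strsplit(lines, "[[:space:]]+")
    keep <- vapply(fields, function(x) {
      length(x) >= 2 && grepl("^/tmp(/|$)", x[2])
    }, logical(1))
    lines[keep]
  })
  names(rules) <- files
  rules[lengths(rules) > 0]
}

# The last field of a matching rule, if present, is the age after which
# entries are removed (a value such as 10d would mean ten days).
print(tmp_rules(conf_files))
```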
Martin Maechler
2017-Apr-25 11:00 UTC
[Rd] tempdir() may be deleted during long-running R session
>>>>> Dirk Eddelbuettel <edd at debian.org>
>>>>>     on Sun, 23 Apr 2017 09:15:18 -0500 writes:

> On 21 April 2017 at 10:34, frederik at ofb.net wrote:
> | Hi Mikko,
> |
> | I was bitten by this recently and I think some of the replies are
> | missing the point. As I understand it, the problem consists of these
> | elements:
> |
> | 1. When R starts, it creates a directory like /tmp/RtmpVIeFj4
> |
> | 2. Right after R starts I can create files in this directory with no
> |    error
> |
> | 3. After some hours or days I can no longer create files in this
> |    directory, because it has been deleted

> Nope. That is local to your system.

Correct. OTOH, Mikko and Frederik have a point in my view (below).

> Witness eg at my workstation:

> /tmp$ ls -ltGd Rtmp*
> drwx------ 3 edd 4096 Apr 21 16:12 Rtmp9K6bSN
> drwx------ 3 edd 4096 Apr 21 11:48 RtmpRRbaMP
> drwx------ 3 edd 4096 Apr 21 11:28 RtmpFlguFy
> drwx------ 3 edd 4096 Apr 20 13:06 RtmpWJDF3U
> drwx------ 3 edd 4096 Apr 18 15:58 RtmpY7ZIS1
> drwx------ 3 edd 4096 Apr 18 12:12 Rtmpzr9W0v
> drwx------ 2 edd 4096 Apr 16 16:02 RtmpeD27El
> drwx------ 2 edd 4096 Apr 16 15:57 Rtmp572FHk
> drwx------ 3 edd 4096 Apr 13 11:08 RtmpqP0JSf
> drwx------ 3 edd 4096 Apr 10 18:47 RtmpzRzyFb
> drwx------ 3 edd 4096 Apr  6 15:21 RtmpQhvAUb
> drwx------ 3 edd 4096 Apr  6 11:24 Rtmp2lFKPz
> drwx------ 3 edd 4096 Apr  5 20:57 RtmprCeWUS
> drwx------ 2 edd 4096 Apr  3 15:12 Rtmp8xviDl
> drwx------ 3 edd 4096 Mar 30 16:50 Rtmp8w9n5h
> drwx------ 3 edd 4096 Mar 28 11:33 RtmpjAg6iY
> drwx------ 2 edd 4096 Mar 28 09:26 RtmpYHSgZG
> drwx------ 2 edd 4096 Mar 27 11:21 Rtmp0gSV4e
> drwx------ 2 edd 4096 Mar 27 11:21 RtmpOnneiY
> drwx------ 2 edd 4096 Mar 27 11:17 RtmpIWeiTJ
> drwx------ 3 edd 4096 Mar 22 08:51 RtmpJkVsSJ
> drwx------ 3 edd 4096 Mar 21 10:33 Rtmp9a5KxL
> /tmp$

> Clearly still there after a month. I tend to have some longer-running R
> sessions in either Emacs/ESS or RStudio.

> So what I wrote in my last message here *clearly* applies to you: a local
> issue for which you have to take local action as R cannot know. You also
> have a choice of setting variables to affect this.

Thank you Dirk (and Brian). That is all true, and of course I have
known about this myself "forever" as well.

> | If R expected the directory to be deleted at random, and if we expect
> | users to call dir.create every time they access tempdir, then why did
> | R create the directory for us at the beginning of the session? That's
> | just setting people up to get weird bugs, which only appear in
> | difficult-to-reproduce situations (i.e. after the session has been
> | open for a long time).

> I disagree. R has been doing this many years, possibly two decades.

Yes, R has been doing this for a long time, including all the
configuration options with environment variables, and yes this is
sufficient "in principle".

> | I think before we dismiss this we should think about possible in-R
> | solutions and why they are not feasible.

Here Mikko and Frederik do have a point I think.

> | Are there any packages which
> | would break if a call to 'tempdir' automatically recreated this
> | directory? (Or would it be too much of a performance hit to have
> | 'tempdir' check and even just issue a warning when the directory is
> | found not to exist?)

> | Should we have a timer which periodically updates
> | the modification time of tempdir()? What do other long-running
> | programs do (e.g. screen, emacs)?

Valid questions, in my view. Before answering, let's try to see
how hard it would be to make the tempdir() function in R more versatile.

As I've found, it is not at all hard to add an option which checks
the existence and, if the directory is no longer "valid", tries to
recreate it (and if it fails doing that it calls the famous
R_Suicide(), as it does when R starts up and tempdir() cannot be
initialized correctly).

The proposed entry in NEWS is

   • tempdir(check=TRUE) recreates the tempdir() if it is no longer valid.

and of course the default would be status quo, i.e., check = FALSE,
and once this is in R-devel, we (those who install R-devel) can
experiment with it.

Martin
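A sketch of how code might use the proposed argument while still running on versions of R that lack it. The wrapper name session_tmp() is made up for this example. The feature test only inspects the formal arguments of tempdir(), so it makes no assumption about which release contains the change.

```r
# Hypothetical wrapper: use tempdir(check = TRUE) where the running R
# offers the `check` argument, otherwise fall back to the status quo.
session_tmp <- function() {
  if ("check" %in% names(formals(tempdir))) {
    tempdir(check = TRUE)   # recreates the directory if it has vanished
  } else {
    tempdir()               # older behaviour: no existence check
  }
}

# Usage: build file names inside the (possibly recreated) directory.
td <- session_tmp()
out <- tempfile(pattern = "result-", tmpdir = td, fileext = ".csv")
write.csv(data.frame(x = 1:3), out, row.names = FALSE)
```

Because the default stays check = FALSE, existing callers of tempdir() would see no change in behaviour unless they opt in.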