Martin Maechler
2017-Apr-25 11:00 UTC
[Rd] tempdir() may be deleted during long-running R session
>>>>> Dirk Eddelbuettel <edd at debian.org>
>>>>>     on Sun, 23 Apr 2017 09:15:18 -0500 writes:

    > On 21 April 2017 at 10:34, frederik at ofb.net wrote:
    > | Hi Mikko,
    > |
    > | I was bitten by this recently and I think some of the replies are
    > | missing the point. As I understand it, the problem consists of these
    > | elements:
    > |
    > | 1. When R starts, it creates a directory like /tmp/RtmpVIeFj4
    > |
    > | 2. Right after R starts I can create files in this directory with no
    > |    error
    > |
    > | 3. After some hours or days I can no longer create files in this
    > |    directory, because it has been deleted

    > Nope. That is local to your system.

Correct. OTOH, Mikko and Frederik have a point in my view (below).

    > Witness eg at my workstation:

    > /tmp$ ls -ltGd Rtmp*
    > drwx------ 3 edd 4096 Apr 21 16:12 Rtmp9K6bSN
    > drwx------ 3 edd 4096 Apr 21 11:48 RtmpRRbaMP
    > drwx------ 3 edd 4096 Apr 21 11:28 RtmpFlguFy
    > drwx------ 3 edd 4096 Apr 20 13:06 RtmpWJDF3U
    > drwx------ 3 edd 4096 Apr 18 15:58 RtmpY7ZIS1
    > drwx------ 3 edd 4096 Apr 18 12:12 Rtmpzr9W0v
    > drwx------ 2 edd 4096 Apr 16 16:02 RtmpeD27El
    > drwx------ 2 edd 4096 Apr 16 15:57 Rtmp572FHk
    > drwx------ 3 edd 4096 Apr 13 11:08 RtmpqP0JSf
    > drwx------ 3 edd 4096 Apr 10 18:47 RtmpzRzyFb
    > drwx------ 3 edd 4096 Apr  6 15:21 RtmpQhvAUb
    > drwx------ 3 edd 4096 Apr  6 11:24 Rtmp2lFKPz
    > drwx------ 3 edd 4096 Apr  5 20:57 RtmprCeWUS
    > drwx------ 2 edd 4096 Apr  3 15:12 Rtmp8xviDl
    > drwx------ 3 edd 4096 Mar 30 16:50 Rtmp8w9n5h
    > drwx------ 3 edd 4096 Mar 28 11:33 RtmpjAg6iY
    > drwx------ 2 edd 4096 Mar 28 09:26 RtmpYHSgZG
    > drwx------ 2 edd 4096 Mar 27 11:21 Rtmp0gSV4e
    > drwx------ 2 edd 4096 Mar 27 11:21 RtmpOnneiY
    > drwx------ 2 edd 4096 Mar 27 11:17 RtmpIWeiTJ
    > drwx------ 3 edd 4096 Mar 22 08:51 RtmpJkVsSJ
    > drwx------ 3 edd 4096 Mar 21 10:33 Rtmp9a5KxL
    > /tmp$

    > Clearly still there after a month. I tend to have some longer-running R
    > sessions in either Emacs/ESS or RStudio.
    > So what I wrote in my last message here *clearly* applies to you: a local
    > issue for which you have to take local action as R cannot know. You also
    > have a choice of setting variables to affect this.

Thank you Dirk (and Brian). That is all true, and of course I have
known about this myself "forever" as well.

    > | If R expected the directory to be deleted at random, and if we expect
    > | users to call dir.create every time they access tempdir, then why did
    > | R create the directory for us at the beginning of the session? That's
    > | just setting people up to get weird bugs, which only appear in
    > | difficult-to-reproduce situations (i.e. after the session has been
    > | open for a long time).

    > I disagree. R has been doing this many years, possibly two decades.

Yes, R has been doing this for a long time, including all the
configuration options with environment variables, and yes this is
sufficient "in principle".

    > | I think before we dismiss this we should think about possible in-R
    > | solutions and why they are not feasible.

Here Mikko and Frederik do have a point I think.

    > | Are there any packages which
    > | would break if a call to 'tempdir' automatically recreated this
    > | directory? (Or would it be too much of a performance hit to have
    > | 'tempdir' check and even just issue a warning when the directory is
    > | found not to exist?)

    > | Should we have a timer which periodically updates
    > | the modification time of tempdir()? What do other long-running
    > | programs do (e.g. screen, emacs)?

Valid questions, in my view. Before answering, let's try to see
how hard it would be to make the tempdir() function in R more versatile.

As I've found it is not at all hard to add an option which
checks the existence and if the directory is no longer "valid",
tries to recreate it (and if it fails doing that it calls the
famous R_Suicide(), as it does when R starts up and tempdir()
cannot be initialized correctly).

The proposed entry in NEWS is

    • tempdir(check=TRUE) recreates the tempdir() if it is no longer valid.

and of course the default would be status quo, i.e., check = FALSE,
and once this is in R-devel, we (those who install R-devel) can
experiment with it.

Martin
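
[Editorial sketch] The check-and-recreate behaviour described in this message can be approximated at user level. The following is a minimal illustration only, not the code that went into R: `ensure_dir()` is a hypothetical helper name, and unlike the proposal it raises an ordinary R error rather than calling R_Suicide() when the directory cannot be recreated.

```r
# Hypothetical helper (not part of R): make sure a directory exists,
# recreating it if something removed it during the session.
ensure_dir <- function(path = tempdir()) {
  if (!dir.exists(path)) {
    # recursive = TRUE also recreates missing parent directories
    ok <- dir.create(path, recursive = TRUE, showWarnings = FALSE,
                     mode = "0700")
    if (!ok && !dir.exists(path))
      stop("cannot recreate directory: ", path)
  }
  invisible(path)
}

# Simulate a cleanup job deleting a directory, then recover from it.
d <- file.path(tempdir(), "demo-session-dir")
dir.create(d)
unlink(d, recursive = TRUE)   # the directory is gone now
ensure_dir(d)                 # ... and recreated here
```

Calling `ensure_dir()` with no argument applies the same check to the session's own tempdir().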
Jeroen Ooms
2017-Apr-25 13:05 UTC
[Rd] tempdir() may be deleted during long-running R session
On Tue, Apr 25, 2017 at 1:00 PM, Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
> As I've found it is not at all hard to add an option which
> checks the existence and if the directory is no longer "valid",
> tries to recreate it (and if it fails doing that it calls the
> famous R_Suicide(), as it does when R starts up and tempdir()
> cannot be initialized correctly).

Perhaps this can also fix the problem with mcparallel deleting the
tempdir() when one of its children dies:

    file.exists(tempdir()) #TRUE
    parallel::mcparallel(q('no'))
    file.exists(tempdir()) # FALSE
Cook, Malcolm
2017-Apr-25 14:41 UTC
[Rd] tempdir() may be deleted during long-running R session
Chiming in late on this thread...

> > | Are there any packages which
> > | would break if a call to 'tempdir' automatically recreated this
> > | directory? (Or would it be too much of a performance hit to have
> > | 'tempdir' check and even just issue a warning when the directory is
> > | found not to exist?)
> >
> > | Should we have a timer which periodically updates
> > | the modification time of tempdir()? What do other long-running
> > | programs do (e.g. screen, emacs)?
>
> Valid questions, in my view. Before answering, let's try to see
> how hard it would be to make the tempdir() function in R more versatile.

Might this combination serve the purpose:

* R session keeps an open handle on the tempdir it creates,
* whatever tempdir harvesting cron job the user has be made sensitive
  enough not to delete open files (including open directories)

> As I've found it is not at all hard to add an option which
> checks the existence and if the directory is no longer "valid",
> tries to recreate it (and if it fails doing that it calls the
> famous R_Suicide(), as it does when R starts up and tempdir()
> cannot be initialized correctly).
>
> The proposed entry in NEWS is
>
>     • tempdir(check=TRUE) recreates the tempdir() if it is no longer valid.
>
> and of course the default would be status quo, i.e., check = FALSE,
> and once this is in R-devel, we (those who install R-devel) can
> experiment with it.
>
> Martin
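
[Editorial sketch] The quoted question about periodically updating the modification time of tempdir() could look roughly like this in R. It is only an illustration under assumptions: `touch_dir()` is a hypothetical helper, the scheduling (a timer or task callback) is left out, and whether a fresh timestamp actually protects the directory depends on how the local cleanup job decides what is stale.

```r
# Hypothetical helper: refresh a directory's modification time,
# similar in spirit to the shell command 'touch'.
touch_dir <- function(path = tempdir()) {
  stopifnot(dir.exists(path))
  Sys.setFileTime(path, Sys.time())
}

# Demonstrate on a scratch directory: age it artificially, then refresh it.
d <- file.path(tempdir(), "touch-demo")
dir.create(d)
Sys.setFileTime(d, Sys.time() - 12 * 24 * 3600)  # pretend it is 12 days old
touch_dir(d)
age_secs <- as.numeric(difftime(Sys.time(), file.mtime(d), units = "secs"))
```

This only helps while the directory still exists; it does nothing once a cleaner has already removed it.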
Martin Maechler
2017-Apr-25 15:53 UTC
[Rd] tempdir() may be deleted during long-running R session
>>>>> Jeroen Ooms <jeroenooms at gmail.com>
>>>>>     on Tue, 25 Apr 2017 15:05:51 +0200 writes:

    > On Tue, Apr 25, 2017 at 1:00 PM, Martin Maechler
    > <maechler at stat.math.ethz.ch> wrote:
    >> As I've found it is not at all hard to add an option
    >> which checks the existence and if the directory is no
    >> longer "valid", tries to recreate it (and if it fails
    >> doing that it calls the famous R_Suicide(), as it does
    >> when R starts up and tempdir() cannot be initialized
    >> correctly).

    > Perhaps this can also fix the problem with mcparallel
    > deleting the tempdir() when one of its children dies:

    > file.exists(tempdir()) #TRUE
    > parallel::mcparallel(q('no'))
    > file.exists(tempdir()) # FALSE

Thank you, Jeroen, for the extra example.

I have now committed the new feature (completely back compatible:
in R's code tempdir() is not yet called with an argument and the
default is check = FALSE), actually in a "suicide-free" way ...
which needed only slightly more code.

In the worst case, one could save the R session by

    Sys.setenv(TMPDIR = "<something writable>")

if for instance /tmp/ suddenly became unwritable for the user.

What we could consider is making the default of 'check' settable
by an option, and experiment with setting the option to TRUE,
so all such problems would be auto-solved (says the incurable
optimist ...).

Martin
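
[Editorial sketch] The idea of making the default of 'check' settable by an option can be tried out with a thin wrapper. This assumes an R version whose tempdir() already accepts the `check` argument; the wrapper name `session_tempdir()` and the option name `tempdir.check` are invented for this sketch and are not R options.

```r
# 'tempdir.check' is a made-up option name used only for this sketch;
# session_tempdir() is a hypothetical wrapper, not an R function.
session_tempdir <- function(check = getOption("tempdir.check", FALSE)) {
  tempdir(check = isTRUE(check))
}

# Opt in for the whole session, then use the wrapper instead of tempdir().
options(tempdir.check = TRUE)
d <- session_tempdir()
```

With the option unset the wrapper behaves like plain tempdir(), which keeps the status quo as the default.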
frederik at ofb.net
2017-Apr-26 04:13 UTC
[Rd] tempdir() may be deleted during long-running R session
On Tue, Apr 25, 2017 at 02:41:58PM +0000, Cook, Malcolm wrote:
> Might this combination serve the purpose:
> * R session keeps an open handle on the tempdir it creates,
> * whatever tempdir harvesting cron job the user has be made sensitive
>   enough not to delete open files (including open directories)

Good suggestion but doesn't work with the (increasingly popular)
"Systemd":

    $ mkdir /tmp/somedir
    $ touch -d "12 days ago" /tmp/somedir/
    $ cd /tmp/somedir/
    $ sudo systemd-tmpfiles --clean
    $ ls /tmp/somedir/
    ls: cannot access '/tmp/somedir/': No such file or directory

I would advocate just changing 'tempfile()' so that it recreates the
directory where the file is (the "dirname") before returning the file
path. This would have fixed the issue I ran into. Changing 'tempdir()'
to recreate the directory is another option.

Thanks,

Frederick
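
[Editorial sketch] The tempfile() variant suggested here can be imitated with a wrapper that recreates the "dirname" of the generated path before returning it. `safe_tempfile()` is a hypothetical name, and this is a user-level approximation rather than a change to tempfile() itself.

```r
# Hypothetical wrapper: generate a temporary file name as tempfile() does,
# but make sure its parent directory exists before handing the path back.
safe_tempfile <- function(...) {
  f <- tempfile(...)
  d <- dirname(f)
  if (!dir.exists(d))
    dir.create(d, recursive = TRUE, showWarnings = FALSE, mode = "0700")
  f
}

# The returned path can be written to straight away.
f <- safe_tempfile(pattern = "demo", fileext = ".txt")
writeLines("hello", f)
```

Like tempfile(), the wrapper only returns a name; the file itself is created by the caller.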