Mikko Korpela
2017-Apr-21 10:49 UTC
[Rd] tempdir() may be deleted during long-running R session
Temporary files not accessed for a long time are automatically removed in some Linux distributions and probably other operating systems too, depending on system configuration. This may affect the per-session temporary directory, the path of which is returned by tempdir(). I think it would be nice if R automatically tried to recreate a missing tempdir() but this could have some performance implications.

I ran the same test (below) on R 3.3.3 patched, R 3.4.0 beta, and R-devel, all at r72499 (2017-04-09) and compiled by myself. The results from the test were practically identical on all of those versions, the test platform being Ubuntu 14.04.5 LTS. This system is configured for a /tmp cleanup threshold of 7 days of inactivity (which is the default). After a wait of roughly 10 days, the R temporary directory had been deleted by an automatic cleanup procedure, and a call to `?` failed.

This StackExchange question has some answers about the Ubuntu /tmp cleanup practice: https://askubuntu.com/q/20783

a <- print(tempdir())
# [1] "/tmp/user/1069138/RtmpGc9M5z"
dir.exists(a) # TRUE
# [1] TRUE
Sys.time()
# [1] "2017-04-10 16:00:30 EEST"
## Wait for one week (Ubuntu 14.04.5 LTS)
print(Sys.time()); ?regex
# [1] "2017-04-20 14:17:29 EEST"
# Error in file(out, "wt") : cannot open the connection
# In addition: Warning message:
# In file(out, "wt") :
#   cannot open file '/tmp/user/1069138/RtmpGc9M5z/Rtxt3dbb65870ad4': No such file or directory
b <- print(tempdir())
# [1] "/tmp/user/1069138/RtmpGc9M5z"
identical(a, b)
# [1] TRUE
dir.exists(b)
# [1] FALSE

-- 
Mikko Korpela
Department of Geosciences and Geography
University of Helsinki
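The failure reported above can be imitated without waiting a week. The following is only a sketch: it deletes a scratch subdirectory (the name "simulated-session-dir" is made up for illustration) instead of the real session directory, and then attempts the same kind of file(out, "wt") call that appears in the error message.

```r
# Simulate the cleanup on a scratch subdirectory rather than on the real
# session directory; the directory name is hypothetical.
scratch <- file.path(tempdir(), "simulated-session-dir")
dir.create(scratch, showWarnings = FALSE)

# Stand-in for a periodic /tmp cleaner removing an idle directory.
unlink(scratch, recursive = TRUE)

# Try to create a new file inside the now-missing directory, mirroring the
# file(out, "wt") call seen in the reported error.
out <- file.path(scratch, "Rtxt-example")
result <- suppressWarnings(tryCatch({
  con <- file(out, "wt")
  close(con)
  "opened"
}, error = function(e) conditionMessage(e)))

print(dir.exists(scratch))  # the directory is gone
print(result)               # the error message instead of "opened"
```

The path stored in `scratch` stays valid as a string after the directory is removed, which is the same situation as tempdir() returning an unchanged path in the transcript.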
Prof Brian Ripley
2017-Apr-21 11:03 UTC
[Rd] tempdir() may be deleted during long-running R session
From the R-admin manual §5:

'Various environment variables can be set to determine where R creates its per-session temporary directory. The environment variables TMPDIR, TMP and TEMP are searched in turn and the first one which is set and points to a writable area is used. If none do, the final default is /tmp on Unix-alikes and the value of R_USER on Windows. The path should be an absolute path not containing spaces (and it is best to avoid non-alphanumeric characters such as +).

Some Unix-alike systems are set up to remove files and directories periodically from /tmp, for example by a cron job running tmpwatch. Set TMPDIR to another directory before starting long-running jobs on such a system.'

On 21/04/2017 11:49, Mikko Korpela wrote:
> Temporary files not accessed for a long time are automatically removed
> in some Linux distributions and probably other operating systems too,
> depending on system configuration. This may affect the per-session
> temporary directory, the path of which is returned by tempdir(). I think

Not for those who follow the manual and know that sysadmins have enabled such a script.

> it would be nice if R automatically tried to recreate a missing
> tempdir() but this could have some performance implications.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
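The search order quoted from the manual can be sketched in R as follows. This is an illustration of the documented rule only, not R's actual startup code; the function name candidate_tmp_root() is hypothetical.

```r
# Mimic the documented lookup: TMPDIR, then TMP, then TEMP; the first one
# that is set and points to a writable directory wins.
# candidate_tmp_root() is a made-up name for this sketch.
candidate_tmp_root <- function(env = Sys.getenv(c("TMPDIR", "TMP", "TEMP"))) {
  for (p in env) {
    # file.access(mode = 2) returns 0 when the path is writable.
    if (nzchar(p) && dir.exists(p) && file.access(p, mode = 2) == 0) {
      return(p)
    }
  }
  # Documented final defaults when none of the variables qualifies.
  if (.Platform$OS.type == "windows") Sys.getenv("R_USER") else "/tmp"
}

# Show the variables as this session sees them and the root the rule selects.
print(Sys.getenv(c("TMPDIR", "TMP", "TEMP")))
print(candidate_tmp_root())
```

As the manual says, TMPDIR has to be set before R is started; the sketch only helps to check which directory a freshly started session would be expected to use.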
Joris Meys
2017-Apr-21 11:42 UTC
[Rd] tempdir() may be deleted during long-running R session
In defense of the OP: I would have checked ?tempdir and missed the information in the manual as well. On the help page there's ample information on the underlying processes that create the dir on multiple platforms. I think adding the last two sentences of Prof. Ripley's quote as a warning to the help page would be worth the effort.

I do wonder though why you would run something that lasts 10 days and still rely on something that is called a "temporary" directory.

Best regards
Joris

-- 
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics
Mikko Korpela
2017-Apr-21 12:13 UTC
[Rd] tempdir() may be deleted during long-running R session
On 21/04/17 14:03, Prof Brian Ripley wrote:
> From the R-admin manual §5:
>
> 'Various environment variables can be set to determine where R creates
> its per-session temporary directory. The environment variables TMPDIR,
> TMP and TEMP are searched in turn and the first one which is set and
> points to a writable area is used. If none do, the final default is /tmp
> on Unix-alikes and the value of R_USER on Windows. The path should be an
> absolute path not containing spaces (and it is best to avoid
> non-alphanumeric characters such as +).
>
> Some Unix-alike systems are set up to remove files and directories
> periodically from /tmp, for example by a cron job running tmpwatch. Set
> TMPDIR to another directory before starting long-running jobs on such a
> system.'

I am sorry for having missed this part of the manual, where the issue indeed is clearly documented.

> On 21/04/2017 11:49, Mikko Korpela wrote:
>> Temporary files not accessed for a long time are automatically removed
>> in some Linux distributions and probably other operating systems too,
>> depending on system configuration. This may affect the per-session
>> temporary directory, the path of which is returned by tempdir(). I think
>
> Not for those who follow the manual and know that sysadmins have
> enabled such a script.
>
>> it would be nice if R automatically tried to recreate a missing
>> tempdir() but this could have some performance implications.

Despite my obvious failure to read the manual and report this properly, I will try to make a case. I understand that data stored in a temporary file may disappear, and for that reason using an alternative TMPDIR might be advisable. However, I think that creating a new temporary file is a different case, and it would be nice if `?` and `help` continued to work, for example.

I understand if this will not be put on the R core list of things to do.

-- 
Mikko Korpela
Department of Geosciences and Geography
University of Helsinki
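A user-level workaround in the spirit of the request above might look like the sketch below. It is an assumption-laden illustration, not something R did at the time of this thread: the helper name ensure_tempdir() is hypothetical, and it only recreates the directory, not any files that were in it.

```r
# Recreate the session temporary directory if it has disappeared.
# ensure_tempdir() is a made-up helper name for this sketch.
ensure_tempdir <- function(path = tempdir()) {
  if (!dir.exists(path)) {
    # recursive = TRUE also restores missing parent directories;
    # mode "0700" keeps the directory private to the user.
    dir.create(path, recursive = TRUE, mode = "0700")
  }
  invisible(dir.exists(path))
}

# Calling it before an operation that writes a new temporary file:
ensure_tempdir()
tmp <- tempfile("example")
writeLines("still works", tmp)
print(file.exists(tmp))
```

The check costs one dir.exists() call per use, which is the kind of overhead the original post alludes to. As far as I am aware, R 3.5.0 and later provide tempdir(check = TRUE), which performs a similar recreation.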