mjbauer at eecs.tufts.edu
2008-Jun-18  20:30 UTC
[Rd] Improper directory removal causes file system havoc (PR#11738)
Full_Name: Michael Bauer
Version: 2.7.0
OS: Solaris 10 (sparc)
Submission from: (NULL) (130.64.21.7)
When running 'make check' on a fresh R 2.7.0 build on Solaris 10 (sparc),
reg-tests-1.R fails consistently because the test directory /tmp/R<random>
cannot be deleted.  The deletion fails because the file system thinks the
directory is not empty: its link count is 5, not 2.  The key bit of
reg-tests-1.R is:
dd <- c("dir1", "dir2", "dirs", "moredirs")
for(d in dd) dir.create(d)
dir(".")
file.create(file.path(dd, "somefile"))
dir(".", recursive=TRUE)
stopifnot(unlink("dir?") == 1) # not an error
unlink("dir?", recursive = TRUE)
stopifnot(file.exists(dd) == c(FALSE, FALSE, FALSE, TRUE))
unlink("*dir*", recursive = TRUE)
stopifnot(!file.exists(dd))
I've run this code snippet in the R command line.  The initial state of the
directory is:
root@sunfire16# mkdir testdir
root@sunfire16# cd testdir
root@sunfire16# ls -al
total 34
drwx------   2 root          512 Jun 18 16:12 .
drwxrwxrwt  18 root        33280 Jun 18 16:12 ..
After the file.create command, the directory state is:
root@sunfire16# ls -al
total 38
drwx------   6 root          512 Jun 18 16:13 .
drwxrwxrwt  19 root        33280 Jun 18 16:13 ..
drwx------   2 root          512 Jun 18 16:13 dir1
drwx------   2 root          512 Jun 18 16:13 dir2
drwx------   2 root          512 Jun 18 16:13 dirs
drwx------   2 root          512 Jun 18 16:13 moredirs
root@sunfire16# ls -al *
dir1:
total 2
drwx------   2 root          512 Jun 18 16:13 .
drwx------   6 root          512 Jun 18 16:13 ..
-rw-------   1 root            0 Jun 18 16:13 somefile
dir2:
total 2
drwx------   2 root          512 Jun 18 16:13 .
drwx------   6 root          512 Jun 18 16:13 ..
-rw-------   1 root            0 Jun 18 16:13 somefile
dirs:
total 2
drwx------   2 root          512 Jun 18 16:13 .
drwx------   6 root          512 Jun 18 16:13 ..
-rw-------   1 root            0 Jun 18 16:13 somefile
moredirs:
total 2
drwx------   2 root          512 Jun 18 16:13 .
drwx------   6 root          512 Jun 18 16:13 ..
-rw-------   1 root            0 Jun 18 16:13 somefile
However, immediately after the unlink("dir?") command, the directory state is
incorrect.  Although dir1, dir2, and dirs are gone, the link count on the
parent directory is still 6, not 3:
root@sunfire16# ls -al
total 35
drwx------   6 root          512 Jun 18 16:13 .
drwxrwxrwt  20 root        33280 Jun 18 16:20 ..
drwx------   2 root          512 Jun 18 16:13 moredirs
root@sunfire16# ls -al *
total 2
drwx------   2 root          512 Jun 18 16:13 .
drwx------   6 root          512 Jun 18 16:13 ..
-rw-------   1 root            0 Jun 18 16:13 somefile
Once the rest of the code runs, the directory state is still incorrect, with the
link count on a now-empty directory still being 5 instead of 2:
root@sunfire16# ls -al
total 34
drwx------   5 root          512 Jun 18 16:23 .
drwxrwxrwt  20 root        33280 Jun 18 16:20 ..
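The expected values quoted above (6, 3, and 2) follow from the usual convention on traditional Unix file systems: a directory has one link from its entry in the parent, one from its own ".", and one from the ".." entry of each immediate subdirectory. The sketch below is only that arithmetic; expected_links is a hypothetical helper name, and the convention itself is an assumption that not every file system follows.

```r
# Expected link count of a directory under the traditional convention:
# 2 (parent entry and ".") plus one ".." per immediate subdirectory.
# expected_links is a hypothetical helper, not an R or Solaris function.
expected_links <- function(n_subdirs) 2L + as.integer(n_subdirs)

expected_links(4)  # 6: dir1, dir2, dirs, and moredirs present
expected_links(1)  # 3: only moredirs left
expected_links(0)  # 2: empty directory
```

Against these values, the observed counts of 6 with one subdirectory and 5 with none are each too high by 3, which matches the three directories removed by the first unlink("dir?") call.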
The directory now cannot be removed, either within R or by hand in the shell:
root@sunfire16# cd ..
root@sunfire16# rmdir testdir
rmdir: directory "testdir": Directory not empty
The only thing I've found to fix this is unmounting the file system and
running fsck on it.  This finds the mis-deleted directories and fixes the
broken link counts.
I'm not certain if this is a bug in R or in Solaris, though it certainly looks
like a Solaris bug.  I have submitted the bug to both you and Solaris.
I also have truss output from a run of this, which shows R using a mix of
unlink() and unlinkat() for directory removal.  unlink() is the call that
incorrectly removes the directories.  The pertinent bits are:
17500:  unlink("dir1")                                  = 0
17500:  unlink("dir2")                                  = 0
17500:  unlink("dirs")                                  = 0
17500:  unlink("/tmp/RtmpKrefN4/41c6167e")              = 0
17702:  lstat64("/", 0xFFBFF640)                        = 0
17702:  fstatat64(-3041965, "/tmp/RtmpKrefN4", 0xFFBFF5C8, 0x00001000) = 0
17702:  unlinkat(-3041965, "/tmp/RtmpKrefN4", 0x00000000) Err#1 EPERM [sys_linkdir]
17702:  unlinkat(3, "file153e1776", 0x00000000)         = 0
17702:  unlinkat(3, "file47b5546c", 0x00000000)         = 0
17702:  unlinkat(3, "moredirs", 0x00000000)             Err#1 EPERM [sys_linkdir]
17702:  fstatat64(3, "moredirs", 0xFFBFF530, 0x00001000) = 0
17702:  openat64(3, "moredirs", O_RDONLY|O_NONBLOCK|O_NOCTTY|O_NOFOLLOW) = 4
17702:  unlinkat(4, "somefile", 0x00000000)             = 0
17702:  unlinkat(3, "moredirs", 0x00000001)             = 0
17702:  unlinkat(-3041965, "/tmp/RtmpKrefN4", 0x00000001) Err#22 EINVAL
I can provide the full truss output if desired.
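If the damage does come from a non-recursive unlink() reaching a directory, as the truss output suggests, one defensive pattern is to filter directories out before calling it. The following is a minimal sketch under that assumption, not a fix inside R itself; unlink_files_only is a hypothetical helper name, and the demonstration runs in a scratch directory under tempdir().

```r
# Hypothetical helper (not part of R): expand a wildcard pattern and pass
# only non-directories to unlink(), so that a non-recursive call never
# attempts to remove a directory.
unlink_files_only <- function(pattern) {
  paths  <- Sys.glob(pattern)
  is_dir <- file.info(paths)$isdir
  files  <- paths[!(is_dir %in% TRUE)]  # drop anything known to be a directory
  unlink(files)
}

# Demonstration in a throwaway directory.
scratch <- tempfile("unlinkdemo")
dir.create(scratch)
old <- setwd(scratch)
dir.create("dir1")
file.create("dir1/somefile", "dirx")    # "dirx" is a plain file matching "dir?"
unlink_files_only("dir?")
still_there <- file.exists(c("dir1", "dirx"))  # dir1 kept, dirx removed
setwd(old)
unlink(scratch, recursive = TRUE)
```

Here still_there records that the directory dir1 survives while the plain file dirx is removed, so the wildcard never hands a directory to a non-recursive unlink().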
