# For linux-cifs-client: this paragraph is for the Debian bug tracking # system control robot. Please ignore it. reassign 447153 linux-2.6 thanks On Fri, Oct 19, 2007 at 12:03:01AM +0200, Michal Suchanek wrote:> On 18/10/2007, Colin Watson <cjwatson at debian.org> wrote: > > On Thu, Oct 18, 2007 at 03:32:27PM +0200, Hramrach wrote: > > > When copying to a cifs share scp fails to notice write errors and > > > happily continues copying when there is no disk space. > > > Note that cifs probably only reports these errors on close(), not > > > write(). > > cp reports the error: > > cp firefox-2.0.0.4.tar.gz /mnt/ > cp: closing `/mnt/firefox-2.0.0.4.tar.gz': No space left on device > > scp produces files of the same size with different content for small > files and truncated (although not always zero) files for larger files, > and notices no problem.I've reproduced this locally and I believe it's a kernel bug. Here's an strace excerpt: [pid 25301] <... read resumed> "C", 1) = 1 [pid 25301] read(0, "0", 1) = 1 [pid 25301] read(0, "6", 1) = 1 [pid 25301] read(0, "4", 1) = 1 [pid 25301] read(0, "4", 1) = 1 [pid 25301] read(0, " ", 1) = 1 [pid 25301] read(0, "1", 1) = 1 [pid 25301] read(0, "4", 1) = 1 [pid 25301] read(0, "1", 1) = 1 [pid 25301] read(0, " ", 1) = 1 [pid 25301] read(0, "t", 1) = 1 [pid 25301] read(0, ".", 1) = 1 [pid 25301] read(0, "p", 1) = 1 [pid 25301] read(0, "y", 1) = 1 [pid 25301] read(0, "\n", 1) = 1 [pid 25301] stat64("cifstest-mount//t.py", {st_mode=S_IFREG|0744, st_size=141, ...}) = 0 [pid 25301] open("cifstest-mount//t.py", O_WRONLY|O_CREAT|O_LARGEFILE, 0644) = 3 [pid 25301] write(1, "\0", 1) = 1 [pid 25301] fstat64(3, <unfinished ...> [pid 25301] <... fstat64 resumed> {st_mode=S_IFREG|0744, st_size=141, ...}) = 0 [pid 25301] read(0, <unfinished ...> [pid 25301] <... read resumed> "import gtk\nimport locale\nlocale."..., 141) = 141 [pid 25301] write(3, "import gtk\nimport locale\nlocale."..., 141) = 141 [pid 25301] ftruncate64(3, 141) = 0 [pid 25301] close(3) = 0 [pid 25301] read(0, "\0", 1) = 1 [pid 25301] write(1, "\0", 1) = 1 [pid 25301] read(0, <unfinished ...> [pid 25301] <... read resumed> "", 1) = 0 [pid 25301] exit_group(0) = ? As you can see, it simply isn't getting any error back from the kernel. Here's a reduced test program that exhibits the same problem as scp when run with a filename on a CIFS mount of a full filesystem: #include <stdio.h> #include <unistd.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <errno.h> int main(int argc, char **argv) { int fd; if (argc <= 1) { fprintf(stderr, "Usage: %s filename\n", argv[0]); return 1; } fd = open(argv[1], O_CREAT | O_WRONLY, 0644); if (fd < 0) { perror("open"); return 1; } while (write(fd, "x", 1) < 1) { if (errno == EINTR) continue; perror("write"); return 1; } if (ftruncate(fd, 1) < 0) { perror("ftruncate"); return 1; } if (close(fd) < 0) { perror("close"); return 1; } return 0; } No error, but you end up with a one-byte hole rather than either (a) "x" or (b) an error. If you leave out the ftruncate, then close returns ENOSPC. In either case, you get "CIFS VFS: Write2 ret -28, written = 0" in syslog, but the error code doesn't make it to userspace if there's an ftruncate between write and close. Is this a CIFS bug? I can't find anything in the ftruncate documentation that suggests it is allowed to do this; I think that if write claims to have written all the bytes then userspace ought to be able to assume that ftruncate(fd, st_size) is a no-op. To openssh-unix-dev: does anyone think this is worth a workaround? The ftruncate seems rather unnecessary if we've already written out the required number of bytes anyway. I've attached a patch which only does it if that isn't the case (although I have some trouble seeing how we could ever get to the ftruncate without either writing the required number of bytes or encountering a write error). If people think it's a good idea I'll file it in Bugzilla. Thanks, -- Colin Watson [cjwatson at debian.org] -------------- next part -------------- A non-text attachment was scrubbed... Name: scp-ftruncate.patch Type: text/x-diff Size: 1443 bytes Desc: not available Url : http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20071112/28e8e46a/attachment.bin
Michal Suchanek
2007-Nov-12 19:13 UTC
Bug#447153: /usr/bin/scp: Fails to notice write errors
Hello On 12/11/2007, Colin Watson <cjwatson at debian.org> wrote: Thanks for looking into this.> Here's a reduced test program that exhibits the same problem as scp when > run with a filename on a CIFS mount of a full filesystem: > > #include <stdio.h> > #include <unistd.h> > #include <sys/types.h> > #include <sys/stat.h> > #include <fcntl.h> > #include <errno.h> > > int main(int argc, char **argv) > { > int fd; > if (argc <= 1) { > fprintf(stderr, "Usage: %s filename\n", argv[0]); > return 1; > } > fd = open(argv[1], O_CREAT | O_WRONLY, 0644); > if (fd < 0) { > perror("open"); > return 1; > } > while (write(fd, "x", 1) < 1) { > if (errno == EINTR) > continue; > perror("write"); > return 1; > } > if (ftruncate(fd, 1) < 0) { > perror("ftruncate"); > return 1; > } > if (close(fd) < 0) { > perror("close"); > return 1; > } > return 0; > } > > No error, but you end up with a one-byte hole rather than either (a) "x" > or (b) an error. If you leave out the ftruncate, then close returns > ENOSPC. In either case, you get "CIFS VFS: Write2 ret -28, written = 0" > in syslog, but the error code doesn't make it to userspace if there's an > ftruncate between write and close. Is this a CIFS bug? I can't find > anything in the ftruncate documentation that suggests it is allowed to > do this; I think that if write claims to have written all the bytes then > userspace ought to be able to assume that ftruncate(fd, st_size) is a > no-op. > > > To openssh-unix-dev: does anyone think this is worth a workaround? The > ftruncate seems rather unnecessary if we've already written out the > required number of bytes anyway. I've attached a patch which only does > it if that isn't the case (although I have some trouble seeing how we > could ever get to the ftruncate without either writing the required > number of bytes or encountering a write error). If people think it's a > good idea I'll file it in Bugzilla. >I do not know the scp code so I might be missing something. However, truncating the file might be necessary when there is already a file, and it was originally longer. It looks like a bug either in the kernel or in ftruncate documentation. It is certainly true that the write error should get reported at some point if it occurs, and a filesystem that chooses to not return it on write() should save the errors for close(). However, using ftruncate() on the file does, in some sense, successfully extend the file to the desired length. It looks like such mixing of calls was not expected by the fs driver, and may not be well defined in general. I would suggest closing the file, and if it needs truncating, reopen and truncate it (or do some voodoo with duplicated fds if possible). The file could also be truncated in advance, but that would make dealing with files that change during transfer even harder. Although one cannot guarantee any sane results in that case anyway. Thanks Michal
On Mon, Nov 12, 2007 at 06:33:54PM +0000, Colin Watson wrote:> To openssh-unix-dev: does anyone think this is worth a workaround?Gut feeling is that it should be fixed wherever the problem is.> The ftruncate seems rather unnecessary if we've already written out > the required number of bytes anyway.Not neccessarily, we may be overwriting a larger file with the same name. //Peter