samba-bugs at samba.org
2018-Mar-01 16:38 UTC
[Bug 13317] New: rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

Bug ID: 13317
Summary: rsync returns success when target filesystem is full
Product: rsync
Version: 3.1.2
Hardware: x64
OS: FreeBSD
Status: NEW
Severity: major
Priority: P5
Component: core
Assignee: wayned at samba.org
Reporter: rui.desousa at icloud.com
QA Contact: rsync-qa at samba.org

rsync returns success when the target filesystem is full (ZFS using quotas), leaving a corrupted (truncated or sparsely populated) file on the target while returning success.

That is: the target system does not have enough space, yet rsync does not fail while syncing the file data, only on the rename of the temporary file. If there happens to be just enough space left over for the rename, the complete operation is marked as success and a truncated or corrupted sparse file is left on the target filesystem.

This causes issues with applications that rely on the return code, like PostgreSQL's archive process, which deletes the original WAL file after a success code is returned. The result is that the WAL file was never archived properly, resulting in a corrupted database backup.

[postgres at hades ~/dbc1/pg_wal]$ rsync -a 000000010000005B00000035 ~/arch/dbc1/wal/000000010000005B00000035
rsync: rename "/usr/home/postgres/arch/dbc1/wal/.000000010000005B00000035.6bjO94" -> "000000010000005B00000035": Disc quota exceeded (69)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1200) [sender=3.1.2]

A more detailed example:
https://www.postgresql.org/message-id/21AF0CB4-0873-4960-BA14-E72FA08B352E%40icloud.com
samba-bugs at samba.org
2018-Mar-01 17:07 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #1 from Kevin Korb <rsync at sanitarium.net> ---

Are you saying that it is exiting with an exit code of 0 after outputting that error, or that sometimes in the same condition it shows no error and exits with code 0? Either way it would probably be helpful to see an output with -v and -i.

Also, in your link, that isn't what -c does. It does not verify that the file was transferred correctly.
samba-bugs at samba.org
2018-Mar-01 17:18 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #2 from Rui DeSousa <rui.desousa at icloud.com> ---
(In reply to Kevin Korb from comment #1)

I'm saying that in some cases the rename does not fail, and in those cases rsync returns success despite there not being enough space to store the original file. It looks like it should have errored before the rename occurs but does not, thus leaving a truncated or sparsely populated file.
samba-bugs at samba.org
2018-Mar-05 20:26 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #3 from Dave Gordon <dg32768 at zoho.eu> ---
(In reply to Rui DeSousa from comment #2)

Here's a snippet from Oracle's ZFS help:
https://docs.oracle.com/cd/E19253-01/819-5461/gitfx/index.html

"Enforcement of user and group quotas might be delayed by several seconds. This delay means that users might exceed their quota before the system notices that they are over quota and refuses additional writes with the EDQUOT error message."

So, rsync(1) might
(a) successfully write a bunch of data that falls just short of the quota,
(b) "successfully" (i.e. with no reported error) write some more that takes it over quota,
(c) close the file (ignoring any result),
(d) rename the file -- which may or may not work.

ZFS probably notices the quota problem somewhere between (b) and (c), drops the excess data, and returns EDQUOT to the close(2) call. The success of the rename depends only on whether there is sufficient free space at that instant; any previous failure in writing the file won't affect the rename directly. Rename may not always need any extra space anyway.

BTW: apropos your comment about "cat" in thread
https://www.postgresql.org/message-id/E6AB850C-D05E-405B-8D4E-DE18E128F402%40icloud.com
it's not the cat that makes this reliable, it's the ssh! :)

HTH,
.Dave.
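[For illustration, a minimal C sketch of the (a)-(d) sequence above, assuming a local copy into a temporary file. This is not rsync's actual code; copy_then_rename and the buffer size are made up for the example. The point is that if every write(2) appears to succeed, a delayed EDQUOT can only surface at the close(2) or rename(2) steps.]

/* Minimal sketch, not rsync's code; step letters match the list above. */
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

void copy_then_rename(int in_fd, const char *tmp, const char *dst)
{
    char buf[65536];
    ssize_t n;
    int fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        err(1, "open %s", tmp);
    while ((n = read(in_fd, buf, sizeof buf)) > 0)  /* (a) and (b): writes may    */
        if (write(fd, buf, n) != n)                 /* "succeed" while over quota */
            err(1, "write %s", tmp);
    if (n < 0)
        err(1, "read");
    if (close(fd) != 0)                             /* (c): delayed EDQUOT could land here */
        err(1, "close %s", tmp);
    if (rename(tmp, dst) != 0)                      /* (d): needs little or no new space   */
        err(1, "rename %s -> %s", tmp, dst);
}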
samba-bugs at samba.org
2018-Mar-05 20:42 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #4 from Dave Gordon <dg32768 at zoho.eu> ---

To see whether rsync is getting any errors reported by any system calls it makes, one could run it under strace(1) on Linux, or dtrace on Solaris. Presumably FreeBSD has at least one of these, or something similar?

Linux:

$ strace -ff -o rsync-trace -e trace=file,desc rsync ...
$ grep -w 'E[[:upper:][:digit:]]*' rsync-trace.*
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/usr/etc/popt", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/popt", O_RDONLY) = -1 ENOENT (No such file or directory)
stat("/etc/popt.d", 0x7ffdafe82860) = -1 ENOENT (No such file or directory)

The output is a list of all filename- or descriptor-related syscalls that failed. If any of them show up as EDQUOT or relate to the problematic output file, that might be a big clue :)

HTH,
.Dave.
samba-bugs at samba.org
2018-Mar-05 22:05 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #5 from Rui DeSousa <rui.desousa at icloud.com> ---
(In reply to Dave Gordon from comment #4)

Hi Dave,

I'm not seeing any errors on the write calls. Would an fsync() be required to force the error?

[postgres at hades ~]$ grep ERR rsync.test.log
52419: lstat("/usr/local/etc/libmap.d",0x7fffffffc728) ERR#2 'No such file or directory'
52419: access("/lib/libiconv.so.2",F_OK) ERR#2 'No such file or directory'
52419: access("/usr/lib/libiconv.so.2",F_OK) ERR#2 'No such file or directory'
52419: access("/usr/lib/compat/libiconv.so.2",F_OK) ERR#2 'No such file or directory'
52419: readlink("/etc/malloc.conf",0x7fffffffd7e0,1024) ERR#2 'No such file or directory'
52419: openat(AT_FDCWD,"/etc/popt",O_RDONLY,00) ERR#2 'No such file or directory'
52419: openat(AT_FDCWD,"/usr/home/postgres/.popt",O_RDONLY,00) ERR#2 'No such file or directory'
52881: stat("arch/000000010000005E00000017",0x7fffffffca10) ERR#2 'No such file or directory'
52881: __acl_get_file(0x455d12,0x3,0x8012ca000) ERR#22 'Invalid argument'
52881: lstat("000000010000005E00000017",0x7fffffffbc60) ERR#2 'No such file or directory'
52987: openat(AT_FDCWD,"000000010000005E00000017",O_RDONLY,00) ERR#2 'No such file or directory'
52987: __acl_get_file(0x455d12,0x3,0x8012ca000) ERR#22 'Invalid argument'
52987: rename(".000000010000005E00000017.kmkocG","000000010000005E00000017") ERR#69 'Disc quota exceeded'
52987: stat("/usr/share/nls/C/libc.cat",0x7fffffff9e28) ERR#2 'No such file or directory'
52987: stat("/usr/share/nls/libc/C",0x7fffffff9e28) ERR#2 'No such file or directory'
52987: stat("/usr/local/share/nls/C/libc.cat",0x7fffffff9e28) ERR#2 'No such file or directory'
52987: stat("/usr/local/share/nls/libc/C",0x7fffffff9e28) ERR#2 'No such file or directory'
52987: select(1,{ 0 },{ },{ 0 },{ 60.000000 }) ERR#4 'Interrupted system call'
52881: nanosleep({ 0.020000000 }) ERR#4 'Interrupted system call'
52881: wait4(-1,0x7fffffffc3d4,WNOHANG,0x0) ERR#10 'No child processes'
52881: sigreturn(0x7fffffffc400) ERR#4 'Interrupted system call'
52881: wait4(52987,0x7fffffffc9d4,WNOHANG,0x0) ERR#10 'No child processes'
52419: nanosleep({ 0.020000000 }) ERR#4 'Interrupted system call'
52419: wait4(-1,0x7fffffffc4d4,WNOHANG,0x0) ERR#10 'No child processes'
52419: sigreturn(0x7fffffffc500) ERR#4 'Interrupted system call'
52419: wait4(52881,0x7fffffffcaf4,WNOHANG,0x0) ERR#10 'No child processes'
52419: wait4(52881,0x7fffffffcaf4,WNOHANG,0x0) ERR#10 'No child processes'
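[A way to test that hypothesis: a minimal sketch (hypothetical helper, not rsync code) that flushes with fsync(2) before close(2), so any quota error ZFS deferred past the individual write(2) calls has a chance to be reported. Rui posts a complete utility along these lines in comment #18 below.]

/* Hypothetical probe, not rsync code: does fsync(2) surface the
 * quota error that the preceding write(2) calls did not report? */
#include <err.h>
#include <unistd.h>

void flush_or_die(int fd)
{
    if (fsync(fd) != 0)   /* a deferred EDQUOT/ENOSPC should appear here */
        err(1, "fsync");
    if (close(fd) != 0)   /* close(2) can also return a delayed error */
        err(1, "close");
}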
samba-bugs at samba.org
2018-Mar-05 23:09 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #6 from Rui DeSousa <rui.desousa at icloud.com> ---
(In reply to Rui DeSousa from comment #5)

It looks like no error is returned and the result is a sparse file. I think a sync() would be required; otherwise the file is truncated on close to meet the quota.

[postgres at hades ~]$ df -h arch
Filesystem                  Size    Used   Avail Capacity  Mounted on
hydra/home/postgres/arch    1.0G    1.0G    316K   100%    /usr/home/postgres/arch

[postgres at hades ~]$ rsync -av --inplace 000000010000005E00000017 arch/000000010000005E00000017
sending incremental file list
000000010000005E00000017

sent 67,125,370 bytes  received 35 bytes  8,950,054.00 bytes/sec
total size is 67,108,864  speedup is 1.00

[postgres at hades ~]$ ls -lh arch/000000010000005E00000017
-rw-------  1 postgres  postgres    64M Mar  5 16:02 arch/000000010000005E00000017

[postgres at hades ~]$ du -h arch/000000010000005E00000017
362K    arch/000000010000005E00000017

Here's an example of dd exceeding the quota; however, it ends up with a truncated file instead of a sparse one.

[postgres at hades ~/arch]$ dd if=/dev/random of=test.dmp bs=256k
^C3080+0 records in
2+6156 records out
807403520 bytes transferred in 10.843296 secs (74461081 bytes/sec)

[postgres at hades ~/arch]$ ls -lh test.dmp
-rw-r--r--  1 postgres  postgres   384K Mar  5 18:05 test.dmp

[postgres at hades ~/arch]$ du -h test.dmp
393K    test.dmp
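[The ls/du mismatch above can also be checked programmatically. A minimal sketch, assuming POSIX stat(2); report_allocation is a made-up name. On a sparse or quota-truncated file, st_blocks * 512 falls far below st_size -- though note that LZ4 compression alone also lowers the allocated figure, as comment #11 below points out.]

/* Hypothetical check: logical size versus space actually allocated
 * (st_blocks is counted in 512-byte units). */
#include <err.h>
#include <stdio.h>
#include <sys/stat.h>

void report_allocation(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        err(1, "stat %s", path);
    printf("%s: logical %lld bytes, allocated %lld bytes\n", path,
           (long long)st.st_size, (long long)st.st_blocks * 512);
}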
samba-bugs at samba.org
2018-Mar-05 23:57 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #7 from Dave Gordon <dg32768 at zoho.eu> ---
(In reply to Rui DeSousa from comment #5)

That was a run where the rename failed. Do you know whether the temporary file was truncated or corrupted in that scenario? [HINT: one can stop the rsync with a signal, find the temporary file, create a hardlink to it, and then let rsync resume; in the event of an error, the temporary won't disappear because of the hardlink, and then one can examine it to see what state it was in when rsync quit.]

Another thing to try would be adding the options --itemize-changes and --debug=deltasum3,recv2,io2

BTW I checked whether the return code from close() is verified, and, while there are many close() calls whose result is ignored, the specific one at the end of a file transfer *is* checked. So if ZFS did return a delayed error at that point, it would be detected and cause the transfer to fail.

.Dave.
Dan Stromberg
2018-Mar-06 00:03 UTC
[Bug 13317] rsync returns success when target filesystem is full
On Mon, Mar 5, 2018 at 3:09 PM, just subscribed for rsync-qa from bugzilla via rsync <rsync at lists.samba.org> wrote:

> https://bugzilla.samba.org/show_bug.cgi?id=13317
>
> --- Comment #6 from Rui DeSousa <rui.desousa at icloud.com> ---
> (In reply to Rui DeSousa from comment #5)
>
> It looks like no error is returned and result is a sparse file. I think a
> sync() would be required otherwise the file is truncated on close to meet the
> quota.

IINM, to create a sparse file, you have to seek past the concrete part of a file (if any), and write something.

I'm not sure that a sync would change whether you've gone over your quota. At least, if I were designing a quota system, I'd make it depend on what you've written to the buffer cache and disk, not disk alone.

Here's an example of creating a sparse file:

$ /usr/local/pypy3-5.10.0/bin/pypy3
Python 3.5.3 (09f9160b643e, Dec 22 2017, 10:10:27)
[PyPy 5.10.0 with GCC 6.2.0 20160901] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> file_ = open('/tmp/sparse-file', 'wb')
>>> file_.seek(1024*1024*8)
8388608
>>> file_.write(b'a')
1
>>> file_.close()

$ ls -l /tmp/sparse-file
-rw-rw-r-- 1 dstromberg dstromberg 8388609 Mar  5 16:02 /tmp/sparse-file

$ du -sh /tmp/sparse-file
4.0K    /tmp/sparse-file
samba-bugs at samba.org
2018-Mar-06 00:08 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #8 from Ben RUBSON <ben.rubson at gmail.com> ---

(In reply to Dave Gordon from comment #3)
> ZFS probably notices the quota problem somewhere between (b) and (c), drops the excess data, and returns EDQUOT to the close(2) call.

(In reply to Dave Gordon from comment #7)
> So if ZFS did return a delayed error at that point, it would be detected and cause the transfer to fail.

I confirm ZFS does not return delayed errors. userquota enforcement is delayed, so a user may store more than the userquota allows; however, what is stored is stored. The other types of ZFS quota, quota and refquota, are not delayed, and ZFS takes extreme care over the last bytes stored around the limit, returning EDQUOT after the very last allowed bytes (throughput around the limit is terribly slow).
samba-bugs at samba.org
2018-Mar-06 00:11 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #9 from Dave Gordon <dg32768 at zoho.eu> ---
(In reply to Rui DeSousa from comment #6)

In your example:

$ rsync -av --inplace 000000010000005E00000017 arch/000000010000005E00000017
sending incremental file list
000000010000005E00000017

sent 67,125,370 bytes  received 35 bytes  8,950,054.00 bytes/sec
total size is 67,108,864  speedup is 1.00

$ ls -lh arch/000000010000005E00000017
-rw-------  1 postgres  postgres    64M Mar  5 16:02 arch/000000010000005E00000017

$ du -h arch/000000010000005E00000017
362K    arch/000000010000005E00000017

how much of the source file is non-sparse? 'Cos ZFS can "sparsify" a file if it detects that you've got big chunks of zeros. For that matter, if you've got dedup enabled, it should be able to detect any repeated pattern that's block-sized and -aligned. That might let you create files that go over quota, since they're not really using very much space.

In your dd(1) example, do you get a different result if you source /dev/zero rather than /dev/(u)random?

.Dave.
samba-bugs at samba.org
2018-Mar-06 00:34 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #10 from Dave Gordon <dg32768 at zoho.eu> ---

BTW, have you tried *either* --sparse *or* --preallocate (but not both together, please, as that will trigger bug 13320 - https://bugzilla.samba.org/show_bug.cgi?id=13320)

Do you get the same problem (i.e. file corruption with no reported error) with each of these, or an error, or some different type of misbehaviour?

.Dave.
samba-bugs at samba.org
2018-Mar-06 01:51 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #11 from Rui DeSousa <rui.desousa at icloud.com> ---
(In reply to Dave Gordon from comment #7)

This is the result of the hardlink on the temp file where the rename failed.

root at hades:~postgres/arch # ls -lh rsync.temp ; du -h rsync.temp
-rw-------  1 postgres  postgres   1.1M Mar  5 16:02 rsync.temp
329K    rsync.temp

The filesystem has LZ4 compression; thus the source file is actually 19MB on disk.

root at hades:~postgres # ls -lh 000000010000005E00000017
-rw-------  1 postgres  postgres    64M Mar  5 16:02 000000010000005E00000017
root at hades:~postgres # du -h 000000010000005E00000017
 19M    000000010000005E00000017

Since it's using compression, I thought /dev/zero would be a bad option to test with.
samba-bugs at samba.org
2018-Mar-06 01:56 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #12 from Rui DeSousa <rui.desousa at icloud.com> ---
(In reply to Dave Gordon from comment #10)

The sparse option errors out :).

[postgres at hades ~]$ rsync -av --sparse 000000010000005E00000017 arch/000000010000005E00000017
sending incremental file list
000000010000005E00000017
rsync: write failed on "/usr/home/postgres/arch/000000010000005E00000017": Disc quota exceeded (69)
rsync error: error in file IO (code 11) at receiver.c(400) [receiver=3.1.2]
rsync: [sender] write error: Broken pipe (32)

[postgres at hades ~]$ rsync -av --preallocate 000000010000005E00000017 arch/000000010000005E00000017
preallocation is not supported on this Server
rsync error: syntax or usage error (code 1) at compat.c(205) [Receiver=3.1.2]
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [sender=3.1.2]
samba-bugs at samba.org
2018-Mar-06 02:09 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #13 from Rui DeSousa <rui.desousa at icloud.com> ---
(In reply to Rui DeSousa from comment #12)

Running truss on the --sparse option does show the error being returned during the write call.

[postgres at hades ~]$ truss -f -o sparse.log rsync -av --sparse 000000010000005E00000017 arch/000000010000005E00000017
sending incremental file list
000000010000005E00000017
rsync: write failed on "/usr/home/postgres/arch/000000010000005E00000017": Disc quota exceeded (69)
rsync error: error in file IO (code 11) at receiver.c(400) [receiver=3.1.2]
rsync: [sender] write error: Broken pipe (32)

[postgres at hades ~]$ grep 68408 sparse.log | grep ERR
68408: openat(AT_FDCWD,"000000010000005E00000017",O_RDONLY,00) ERR#2 'No such file or directory'
68408: write(1,"\M^W\M-P\^E\0\^A\0\0\0\0\0\^P\\^"...,1024) ERR#69 'Disc quota exceeded'
68408: stat("/usr/share/nls/C/libc.cat",0x7fffffff9dd8) ERR#2 'No such file or directory'
68408: stat("/usr/share/nls/libc/C",0x7fffffff9dd8) ERR#2 'No such file or directory'
68408: stat("/usr/local/share/nls/C/libc.cat",0x7fffffff9dd8) ERR#2 'No such file or directory'
68408: stat("/usr/local/share/nls/libc/C",0x7fffffff9dd8) ERR#2 'No such file or directory'
samba-bugs at samba.org
2018-Mar-06 13:33 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #14 from Dave Gordon <dg32768 at zoho.eu> ---
(In reply to Rui DeSousa from comment #6)

So you now have an example which reliably produces a bad outcome (corrupted file)? With the combination of (a) insufficient space before copying, and (b) --inplace so that no rename is needed, you get no error from rsync but the file uses far less space than it should? And the contents are not as expected?

Perhaps you could try this sequence:

$ src=000000010000005E00000017
$ dst=arch/000000010000005E00000017
$ rm $dst
$ rsync -avi --inplace --debug=deltasum2,recv2,io2 $src $dst
$ touch $src
$ rsync -avic --inplace --debug=deltasum3,recv2,io2 $src $dst
$ md5sum $src $dst

and if the md5sum shows a discrepancy then also capture a truss listing of all the syscalls made by the first run of rsync (where the destination does not exist).

Rationale: I'm wondering whether ZFS is able to detect that the destination file is identical to the source file (even though it's on a different logical filesystem, it's probably still in the same pool) and merge the disk blocks (retrospectively?).

So we start with a file whose logical size is 64M, but which is compressed to 19M on disk. ZFS uncompresses it on the fly and delivers 64M of data to the first rsync. rsync sequentially writes 64M, checking the success of each write. The last write should end at an offset of 64M, then the destination file is closed (and the return from that is checked). The truss listing will show how many writes were made, and whether any of them failed. ZFS will recompress (and dedup?) the written data, resulting in an unknown amount of new space being used.

The second rsync will read both the source and destination files -- presumably both now being decompressed on demand -- and compare checksums. Any mismatch here will result in (parts of) the file being retransferred; obviously, this shouldn't happen if the first transfer succeeded. md5sum will check whether the files really do match.

HTH,
.Dave.
samba-bugs at samba.org
2018-Mar-06 13:39 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #15 from Ben RUBSON <ben.rubson at gmail.com> ---

Rui, just to be sure, which type of ZFS quota are you using? quota? refquota? userquota?
samba-bugs at samba.org
2018-Mar-06 15:31 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #16 from Rui DeSousa <rui.desousa at icloud.com> ---
(In reply to Ben RUBSON from comment #15)

I just set the quota property.

NAME                      PROPERTY       VALUE    SOURCE
hydra/home/postgres/arch  quota          1G       local
hydra/home/postgres/arch  reservation    none     default
hydra/home/postgres/arch  refquota       none     default
hydra/home/postgres/arch  compression    lz4      inherited from hydra/home/postgres
hydra/home/postgres/arch  compressratio  3.66x    -
samba-bugs at samba.org
2018-Mar-06 17:11 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #17 from Rui DeSousa <rui.desousa at icloud.com> ---
(In reply to Dave Gordon from comment #14)

Here's the output you requested. ZFS wouldn't use the same block even if it's the same data, as I don't have dedup enabled.

[postgres at hades ~]$ ls arch/
dbc1

[postgres at hades ~]$ du -h 000000010000005E00000017 ; df -h arch/
 19M    000000010000005E00000017
Filesystem                  Size    Used   Avail Capacity  Mounted on
hydra/home/postgres/arch    977M    975M    1.9M   100%    /usr/home/postgres/arch

[postgres at hades ~]$ rsync -avi --inplace --debug=deltasum2,recv2,io2 000000010000005E00000017 arch/000000010000005E00000017
[Receiver] safe_read(0)=4
[sender] safe_read(5)=4
[sender] safe_read(5)=1
[sender] safe_read(5)=4
sending incremental file list
server_recv(2) starting pid=64788
get_local_name count=1 arch/000000010000005E00000017
send_files mapped 000000010000005E00000017 of size 67108864
calling match_sums 000000010000005E00000017
>f+++++++++ 000000010000005E00000017
sending file_sum
false_alarms=0 hash_hits=0 matches=0
recv_files(1) starting
recv_files(000000010000005E00000017)
got file_sum
finishing 000000010000005E00000017
[receiver] send_msg_int(100, 1)
recv_files phase=1
total: matches=0  hash_hits=0  false_alarms=0 data=67108864
recv_files phase=2
recv_files finished
[receiver] send_msg(10, 8)

sent 67,125,370 bytes  received 313 bytes  2,200,842.07 bytes/sec
total size is 67,108,864  speedup is 1.00

[postgres at hades ~]$ echo $?
0

[postgres at hades ~]$ du -h arch/000000010000005E00000017; md5 000000010000005E00000017 arch/000000010000005E00000017
1.9M    arch/000000010000005E00000017
MD5 (000000010000005E00000017) = b607d345c5527152450c02c23d778cf2
MD5 (arch/000000010000005E00000017) = f90772e440b04b63490a88e7dafeac84

[postgres at hades ~]$ touch 000000010000005E00000017
[postgres at hades ~]$ rsync -avic --inplace --debug=deltasum3,recv2,io2 000000010000005E00000017 arch/000000010000005E00000017
[Receiver] safe_read(0)=4
[sender] safe_read(5)=4
[sender] safe_read(5)=1
[sender] safe_read(5)=4
sending incremental file list
server_recv(2) starting pid=67722
get_local_name count=1 arch/000000010000005E00000017
count=0 n=0 rem=0
send_files mapped 000000010000005E00000017 of size 67108864
calling match_sums 000000010000005E00000017
>fc.t...... 000000010000005E00000017
recv_files(1) starting
recv_files(000000010000005E00000017)
recv mapped 000000010000005E00000017 of size 67108864
data recv 32768 at 0
data recv 32768 at 32768
data recv 32768 at 65536
data recv 32768 at 98304
[... similar "data recv 32768 at <offset>" lines, one per 32768-byte chunk, up to offset 6160384 ...]
sending file_sum
false_alarms=0 hash_hits=0 matches=0
data recv 32768 at 6193152
data recv 32768 at 6225920
data recv 32768 at 6258688
[receiver] send_msg(1, 100)
[generator] send_msg(1, 100)
rsync: write failed on "/usr/home/postgres/arch/000000010000005E00000017": Disc quota exceeded (69)
[receiver] send_msg(3, 76)
[generator] send_msg(3, 76)
rsync error: error in file IO (code 11) at receiver.c(400) [receiver=3.1.2]
[receiver] send_msg_int(86, 11)
[generator] send_msg_int(86, 11)
[sender] send_msg(86, 0)
rsync: [sender] write error: Broken pipe (32)

[postgres at hades ~]$ echo $?
11
samba-bugs at samba.org
2018-Mar-06 17:19 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #18 from Rui DeSousa <rui.desousa at icloud.com> ---

I also wrote a little util; it gets the correct error at the write call.

[postgres at hades ~]$ cat 000000010000005E00000017 | ./fwrite/fwrite arch/000000010000005E00000017
fwrite: write: Disc quota exceeded
[postgres at hades ~]$ echo $?
1
[postgres at hades ~]$ du -h arch/000000010000005E00000017.GghJVR
1.9M    arch/000000010000005E00000017.GghJVR

Here's the code:

#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define BUFSIZE 131072

int main(int argc, char *argv[])
{
    int fd, r, w;
    char *buf;
    char *name;

    if (argc != 2) {
        fprintf(stderr, "usage: fwrite [file]\n");
        exit(1);
    }

    if ((buf = malloc(BUFSIZE)) == NULL)
        err(1, "malloc");

    ++argv;
    if ((name = (char *) malloc(strlen(*argv) + 10)) == NULL)
        err(1, "malloc");

    strcat(strcpy(name, *argv), ".XXXXXX");

    if ((fd = mkstemp(name)) < 0)
        err(1, "mkstemp");

    while ((r = read(STDIN_FILENO, buf, BUFSIZE)) > 0)
        if ((w = write(fd, buf, r)) == -1)
            err(1, "write");

    if (r < 0)
        err(1, "read");
    if (fsync(fd) != 0)
        err(1, "fsync");
    if (close(fd) != 0)
        err(1, "close");
    if (rename(name, *argv) != 0)
        err(1, "rename");

    free(name);
    exit(0);
}
samba-bugs at samba.org
2018-Mar-06 18:16 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #19 from Ben RUBSON <ben.rubson at gmail.com> ---

I managed to reproduce the issue on 11.0-RELEASE-p16. Below is a simple test case, without compression, without deduplication. Note that the issue is reproducible with quota, but not with userquota.

# f=/test
# z=zroot/ROOT/default$f

### quota / refquota (same results)

# zfs destroy $z
# zfs create $z
# zfs set compression=off $z
# zfs set dedup=off $z
# zfs set quota=1G $z

# dd if=/dev/random bs=1M count=600 of=$f/f1
# rsync -a --inplace $f/f1 $f/f2 ; echo $? ; diff $f/f1 $f/f2 ; rm $f/f2
0
Files /test/f1 and /test/f2 differ
# rsync -a $f/f1 $f/f2 ; echo $? ; rm $f/f2
rsync: rename "/test/.f2.6NVNwD" -> "f2": Disc quota exceeded (69)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1192) [sender=3.1.3]
23
// the quota error should have triggered on write instead //

### userquota

# zfs destroy $z
# zfs create $z
# zfs set compression=off $z
# zfs set dedup=off $z
# zfs set userquota@root=1G $z

# dd if=/dev/random bs=1M count=600 of=$f/f1
# rsync -a --inplace $f/f1 $f/f2 ; echo $? ; diff $f/f1 $f/f2 ; rm $f/f2
rsync: [sender] write error: Broken pipe (32)
rsync: write failed on "/test/f2": Disc quota exceeded (69)
rsync error: error in file IO (code 11) at receiver.c(404) [receiver=3.1.3]
11
Files /test/f1 and /test/f2 differ
# rsync -a $f/f1 $f/f2 ; echo $? ; rm $f/f2
rsync: [sender] write error: Broken pipe (32)
rsync: write failed on "/test/f2": Disc quota exceeded (69)
rsync error: error in file IO (code 11) at receiver.c(404) [receiver=3.1.3]
11
# dd if=/dev/random bs=1M count=535 of=$f/f1
# rsync -a --inplace $f/f1 $f/f2 ; echo $? ; diff $f/f1 $f/f2
0
# zfs get -Hp -o value used,userquota@root $z
1123188736
1073741824
# touch $f/file
touch: /test/file: Disc quota exceeded
samba-bugs at samba.org
2018-Mar-06 19:18 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #20 from Dave Gordon <dg32768 at zoho.eu> ---

So, this is looking like a ZFS issue, but one triggered in some way by the specific behaviour of rsync (e.g. writing a certain block size/pattern causes the quota error to be lost). The truss listing from a failing case should show exactly what operations were performed on the destination filesystem; maybe someone from the ZFS team could see whether there's some pathological case that's not handled properly.

.Dave.
samba-bugs at samba.org
2018-Mar-06 19:57 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #21 from Dave Gordon <dg32768 at zoho.eu> ---

Created attachment 14019
--> https://bugzilla.samba.org/attachment.cgi?id=14019&action=edit

Test patch to see whether fdatasync() or fsync() detects a write failure
samba-bugs at samba.org
2018-Mar-06 20:15 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #22 from Dave Gordon <dg32768 at zoho.eu> ---
(In reply to Ben RUBSON from comment #19)

Just to be totally certain about what ZFS may or may not share between files, could you try this variant of your testcase:

# zfs destroy $z
# zfs create $z
# zfs set compression=off $z
# zfs set dedup=off $z
# zfs set quota=1G $z

Then with "tmpfs" referring to any non-ZFS filesystem:

# dd if=/dev/random bs=1M count=600 of=$tmpfs/f1
# rsync -a --inplace $tmpfs/f1 $f/f2 ; echo $? ; diff $tmpfs/f1 $f/f2
# dd if=/dev/random bs=1M count=600 of=$tmpfs/f1
# rsync -a --inplace $tmpfs/f1 $f/f3 ; echo $? ; diff $tmpfs/f1 $f/f3

The first rsync should succeed, and the files should be identical. The second should fail, but if it doesn't then at least it can't be down to ZFS finding matching datablocks.

.Dave.
samba-bugs at samba.org
2018-Mar-06 20:48 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #23 from Dave Gordon <dg32768 at zoho.eu> ---

Looks like this ZFS problem could be a FreeBSD-specific issue; one of the commits mentioned in this FreeNAS bug report has the subject

    zfs_write: fix problem with writes appearing to succeed when over quota

See https://redmine.ixsystems.com/issues/26650
FreeNAS Bug #26650: Correctly report ZFS dataset quota overflows

Updated by Alexander Motin 4 months ago:
Investigation of the problem brought me to FreeBSD-specific change r298105 by avg@ on 2016-04-16. If quota overflow is detected during write, the write will fail, but the error status can be lost, falsely reporting partial completion. As a result, written data are flowing to nowhere and indefinitely, as fast as the CPU can handle the loop.

HTH,
.Dave.
samba-bugs at samba.org
2018-Mar-06 20:53 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #24 from Ben RUBSON <ben.rubson at gmail.com> ---

ZFS only shares blocks between files with dedup on.

So the first rsync / diff succeeded; the second gave the same result as before:

# rsync -a --inplace $tmpfs/f1 $f/f3 ; echo $? ; diff $tmpfs/f1 $f/f3
0
Files /mnt/f1 and /test/f3 differ
# ls -l $f
total 1048593
-rw-------  1 root  wheel  629145600  6 Mar 09:38 f2
-rw-------  1 root  wheel  629145600  6 Mar 09:40 f3
# du -sm $f/*
601     /test/f2
424     /test/f3
samba-bugs at samba.org
2018-Mar-06 21:23 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #25 from Ben RUBSON <ben.rubson at gmail.com> ---

And here is the patch applied to FreeBSD stable/10:
https://reviews.freebsd.org/rS326428
And to FreeBSD stable/11:
https://reviews.freebsd.org/rS326427
It will then be in FreeBSD 11.3.

Note that this patch depends on another one, as per its description. It would be nice to give the stable/10 or stable/11 branch a try, but I think this ticket can now be closed.
samba-bugs at samba.org
2018-Mar-06 21:51 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #26 from Rui DeSousa <rui.desousa at icloud.com> ---
(In reply to Ben RUBSON from comment #25)

That is awesome. Thank you for all of your efforts!
samba-bugs at samba.org
2018-Mar-07 14:52 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #27 from Carson Gaspar <carson at taltos.org> ---
(In reply to Dave Gordon from comment #23)

Reading this, I took a look at the rsync sources, and, indeed, rsync has a bug. perform_io() does not correctly check the return code from write(). safe_write() does - please steal the correct logic from it. perform_io() only checks for write returning <= 0, not for write returning something != len. So in the case of a partial write() it will corrupt the destination file. I haven't looked for additional coding errors of this type.
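[For context, a minimal sketch of the kind of retry loop that safe_write()-style code uses; this is illustrative only (full_write is a made-up name), not rsync's actual safe_write(). A short write is not an error by itself, so the caller must advance the buffer and keep writing until everything is out or write(2) actually fails.]

/* Illustrative full-write loop, not rsync's code:
 * handles both errors (-1) and short writes (0 < n < len). */
#include <errno.h>
#include <unistd.h>

ssize_t full_write(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: retry the same chunk */
            return -1;          /* real error (e.g. EDQUOT/ENOSPC)   */
        }
        done += n;              /* short write: advance and loop     */
    }
    return (ssize_t)done;
}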
samba-bugs at samba.org
2018-Mar-07 23:26 UTC
[Bug 13317] rsync returns success when target filesystem is full
https://bugzilla.samba.org/show_bug.cgi?id=13317

--- Comment #28 from Dave Gordon <dg32768 at zoho.eu> ---
(In reply to Carson Gaspar from comment #27)

Hmm? If you're referring to line 810 of io.c, which is the only write(2) call I can see in perform_io(), in the current HEAD it looks like this:

810     if ((n = write(iobuf.out_fd, out->buf + out->pos, len)) <= 0) {
        ...
822     }
        ...
835     if ((out->pos += n) == out->size) {
        ...

so that while the immediate test is only for <= 0, the actual result is stored in n and subsequently used to adjust counters, pointers, etc. I don't think there's a bug there.

BTW, AFAICT perform_io() is only used for IPC. The received file is written out by write_file() in fileio.c:

140     /* write_file does not allow incomplete writes.  It loops internally
141      * until len bytes are written or errno is set.  Note that use_seek and
142      * offset are only used in sparse processing (see write_sparse()). */
143     int write_file(int f, int use_seek, OFF_T offset, const char *buf, int len)

.Dave.