In short:
Platform: linux with 2.4 kernel
Version: rsync 2.6.6
Command line:
rsync266 -av -W --bwlimit=1 /mnt/somedir/rsync-2.6.6.tar.gz ./
Destination: local disk
Source: file on a smbfs mounted filesystem; share is exported on a NT 4.0
workstation over a very slow and unstable link
Result: Rsync completes operation with no special message, but the
resulting file is damaged, large parts are filled with binary zeroes
Strace shows that result value from read system call seems to be
ignored.
In more detail:
I am quite often using rsync as a replacement for cp (because of --bwlimit
option not present in cp).
The error I enountered with many versions of rsync, including 2.6.6
happens if I try to copy a a files from a remote filesystem mounted
on linux smbfs. The NT 4.0 computer sharing the data is on a very slow and
unstable link, so unstable, in fact, that I cannot get larger files
(usually about 2 MBytes) copies when working on a NT 4.0 workstation.
I can usually copy the file when working on a Linux system, but not with rsync.
I have run rsync though strace to see what is happening; here are the relevant
lines;
File handle 3 is source file;
File handle 4 is most probably a pipe or other interprocess communication
endpoint
(writing to destination file takes place in another rsync process)
13:21:42.172254 open("/mnt/somedir/rsync-2.6.6.tar.gz",
O_RDONLY|O_LARGEFILE) = 3
...
13:21:47.045818 read(3,
"\37\213\10\0\3102\351B\0\3\354<ks\333\266\262\371j\375"...,
262144) = 36864
[rsync requested 262144 bytes, but got only 36864 bytes]
...
13:22:43.980934 write(4,
"\37\213\10\0\3102\351B\0\3\354<ks\333\266\262\371j\375"..., 512) =
512
3:22:43.981111 gettimeofday({1127215363, 981132}, NULL) = 0
13:22:43.981325 select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
13:22:43.981463 write(4,
"-\342%K\27b\305\270\24\10\vcR\311#5\23R\n\317al\230\2\376"..., 512) =
512
...
13:23:17.494472 write(4, "\321Ka\232\247\203\323\375Wr\266I\223\226\364w
p\322\276"..., 512) = 512
13:23:17.494628 gettimeofday({1127215397, 494644}, NULL) = 0
13:23:17.494825 select(0, NULL, NULL, NULL, {0, 497760}) = 0 (Timeout)
13:23:17.994212 gettimeofday({1127215397, 994254}, NULL) = 0
13:23:17.994294 select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
13:23:17.994439 write(4,
"\"\315\260q\4n\23.\314\204\200\303\241\227P?\32\204(\252"...,
512) = 512
13:23:17.994596 gettimeofday({1127215397, 994612}, NULL) = 0
13:23:17.994793 select(0, NULL, NULL, NULL, {0, 498736}) = 0 (Timeout)
13:23:18.494219 gettimeofday({1127215398, 494260}, NULL) = 0
13:23:18.494300 select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
13:23:18.494447 write(4,
"\203\250\256\326\232\276b\212\260\211\271:\201\304\245"..., 512) =
512
13:23:18.494602 gettimeofday({1127215398, 494618}, NULL) = 0
13:23:18.494800 select(0, NULL, NULL, NULL, {0, 499712}) = 0 (Timeout)
13:23:18.994219 gettimeofday({1127215398, 994259}, NULL) = 0
13:23:18.994299 select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
13:23:18.994444 write(4, "\272\326
j\20v\1\264\340\0063e\252\374Y\370Q\356N\306t"..., 512) = 512
13:23:18.994602 gettimeofday({1127215398, 994618}, NULL) = 0
13:23:18.994799 select(0, NULL, NULL, NULL, {0, 499712}) = 0 (Timeout)
13:23:19.494223 gettimeofday({1127215399, 494266}, NULL) = 0
13:23:19.494306 select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
13:23:19.494453 write(4,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) =
512
13:23:19.494615 gettimeofday({1127215399, 494631}, NULL) = 0
13:23:19.494819 select(0, NULL, NULL, NULL, {0, 499712}) = 0 (Timeout)
13:23:19.994223 gettimeofday({1127215399, 994264}, NULL) = 0
13:23:19.994304 select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
13:23:19.994450 write(4,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) =
512
[ so no real data, only nulls are written in last two write calls]
....
13:26:59.074571 write(4,
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) =
512
13:26:59.074734 gettimeofday({1127215619, 74750}, NULL) = 0
13:26:59.075146 select(0, NULL, NULL, NULL, {0, 499712}) = 0 (Timeout)
13:26:59.574049 gettimeofday({1127215619, 574074}, NULL) = 0
13:26:59.574298 select(5, NULL, [4], NULL, {60, 0}) = 1 (out [4], left {60, 0})
13:26:59.574447 write(4, "\0\200\0\0", 4) = 4
13:26:59.574588 gettimeofday({1127215619, 574604}, NULL) = 0
13:26:59.574796 _llseek(3, 262144, [262144], SEEK_SET) = 0
13:26:59.574845 read(3,
"o\257\236\3546\17\3266\33\315\203\264\2f\36*\214\31_\315"..., 262144)
= 45056
[ here lseek to offset 262144 on file descriptor 3, though only 36864 were
actually read ]
and story repeats here;
The result is as follows:
$ xxd -g 1 -a rsync-2.6.6.tar.gz | egrep -A 2 -B 1 '00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00'
0008ff0: 6f 11 88 18 6e 2f fd a9 d9 15 4b 3c fc 4d 71 a4 o...n/....K<.Mq.
0009000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
*
0040000: 6f af 9e ec 36 0f d6 36 1b cd 83 b4 02 66 1e 2a o...6..6.....f.*
--
004aff0: 81 e7 cb a7 fe cf 4d b5 b0 e9 61 64 33 e8 30 5a ......M...ad3.0Z
004b000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
*
0080000: 1a 65 07 97 77 88 01 90 20 ab e7 ea 90 66 4c fb .e..w... ....fL.
--
008bff0: 70 85 f7 83 44 7e ac e7 e1 74 80 40 b3 01 b5 87 p...D~...t.@....
008c000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
*
00a8790: 00 00 ..
You can see that starting at offset 0x9000 = 36864 we have nulls instead of
original data up to offset 0x0040000 = 262144.
When reading at offset 262144, we got 45056=0xb000 bytes instead of 262144
requested;
After that there are nulls in resulting file up to 0x80000;
When reading at offset 524288, we got 49152=0xc000 bytes instead of 165778
requested;
After that there are nulls in resulting file till the end.
I hope this is enough information to fix the problem.
If not I will be glad to send more.
Best regards,
Wojtek