samba-bugs at samba.org
2011-May-29 01:40 UTC
[Bug 8177] New: Problems with big sparsed files
https://bugzilla.samba.org/show_bug.cgi?id=8177 Summary: Problems with big sparsed files Product: rsync Version: 3.0.8 Platform: x64 OS/Version: Linux Status: NEW Severity: normal Priority: P5 Component: core AssignedTo: wayned at samba.org ReportedBy: joluinfante at gmail.com QAContact: rsync-qa at samba.org Hi. I'm using rsync (3.0.8, 32 bits, compiled from sources) to sync vmdk's (vmware files, 36gb of size) files in sparse format, in esxi 4.1 filesystem, to ext3 filesystem. The first time, the files copy ok (converting sparse files to non-sparse files), but, next time, the update of files don't work. It fail with messages: receiving incremental file list ./ cata-920-aux.xml cata-920-flat.vmdk WARNING: cata-920-flat.vmdk failed verification -- update discarded (will try ag ain). rsync: read errors mapping "/vmfs/volumes/cata-021-lun0/cata-920/cata-920-flat.v mdk": Invalid argument (22) vmware.log cata-920-flat.vmdk ERROR: cata-920-flat.vmdk failed verification -- update discarded. rsync: read errors mapping "/vmfs/volumes/cata-021-lun0/cata-920/cata-920-flat.v mdk": Invalid argument (22) sent 2601134 bytes received 771180 bytes 3716.05 bytes/sec total size is 8591035869 speedup is 2547.52 rsync error: some files/attrs were not transferred (see previous errors) (code 2 3) at main.c(1518) [generator=3.0.8] I did use this sentence: rsync-static -av --rsync-path=/bin/rsync-static --rsh="ssh -i /root/.ssh/${REMOTE_HOST}" --exclude="*00000*" --exclude="*.vswp" --exclude="*. vmem" --exclude="*Snapshot*" --exclude="*lck" --exclude="*.vmsd" --exclude="*.vm ss" --exclude="*.vmx" --sparse --delete ${REMOTE_HOST}:/vmfs/volumes/${LUN}/${VM NAME}/ . This is a bug? I did check this "bug" is similar to 3925, but this was resolved in a previous version. TIA jorge infante rosario - santa fe - argentina -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #1 from joluinfante at gmail.com 2011-05-29 01:43:29 UTC --- More information: I did compile the rsync 3.0.8 version, statically, in centos 3.9 32 bits, because this is the platform to build utilities for esxi 4.1. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #2 from joluinfante at gmail.com 2011-05-29 03:18:22 UTC --- I did download the git'ed version (3.1.0), and the problem continue. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #3 from joluinfante at gmail.com 2011-05-29 04:46:15 UTC --- The size of file is 8gb (not 36gb). Without parameters --inplace and --sparse, I did check, in the destination disk, the temporary file (.cata-920-flat.vmdk.WKiDRx), increase up to 8gb, and, then, I see the rsync error in screen, and the temporary file is deleted. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #4 from joluinfante at gmail.com 2011-05-30 19:12:21 UTC --- Hi, forum! Anyone out there? -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 Wayne Davison <wayned at samba.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED --- Comment #5 from Wayne Davison <wayned at samba.org> 2011-06-18 17:41:13 UTC --- Any time there is a read error, rsync tweaks the checksum so that it doesn't match, thus alerting the receiver that the file is bogus. You need to figure out what is causing this error from your OS: rsync: read errors mapping "/vmfs/volumes/cata-021-lun0/cata-920/cata-920-flat.vmdk": Invalid argument (22) That error is reported at the end of all the reading, and indicates that one of the read() calls for the sender's file returned that errno. If you strace the sending side, you should be able to see what the parameters are to the read() that trigger that OS error. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #6 from joluinfante at gmail.com 2011-07-07 15:38:30 UTC --- Excuse me by the delay, and thanks for your reply. The problem are: * I need use the rsync (statically compiled) to rsync files between a esxi-free 4.1 (vmware) and a linux filesystem. * Then, I did compile the rsync from sources, in a rhel platform (appropiate to generate binaries for this version of vmware), and transfer it to esxi host. * In esxi-free 4.1 (the sender side), we don't have tools to debug. * I don't have experience with sparse files in other platforms, to test it. Can you give me a little help to modify the source to add debug information in the syscall?. I did try it, but, because the size of file is very large, and I see a lot of debug information. (In reply to comment #5)> Any time there is a read error, rsync tweaks the checksum so that it doesn't > match, thus alerting the receiver that the file is bogus. You need to figure > out what is causing this error from your OS: > > rsync: read errors mapping > "/vmfs/volumes/cata-021-lun0/cata-920/cata-920-flat.vmdk": Invalid argument > (22) > > That error is reported at the end of all the reading, and indicates that one of > the read() calls for the sender's file returned that errno. If you strace the > sending side, you should be able to see what the parameters are to the read() > that trigger that OS error.-- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #7 from joluinfante at gmail.com 2012-01-18 10:51:39 UTC --- Hi! Any news with this "bug"? Can I help with any thing? TIA jorge infante -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #8 from Rick <rick at ict-concept.nl> 2012-03-01 10:08:28 UTC --- Hi Jorge, You're not alone :) I'm using rsync the backup some vmware servers. Small files work without any problems but the large (-flat.vmdk) file only works the first time. (I do create a snapshot before syncing) This is the log: -------------------------- Sync the -flat.vmdk file (Vcenter Server-flat.vmdk) rsync: read errors mapping "/vmfs/volumes/4cf36d72-d19e4c8a-cec8-d485645b13ad/Vcenter Server/Vcenter Server-flat.vmdk": Invalid argument (22) ERROR: Vcenter Server-flat.vmdk failed verification -- update discarded. rsync: read errors mapping "/vmfs/volumes/4cf36d72-d19e4c8a-cec8-d485645b13ad/Vcenter Server/Vcenter Server-flat.vmdk": Invalid argument (22) rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1518) [generator=3.0.9] Thu Mar 1 10:43:00 CET 2012 FAILED: rsync on Vcenter Server failed (vmdk-flat file) (0) -------------------------- Some details: The both systems are running VMware ESX 4.1.0 build-260247 (Red Hat 5.1). The filesystem is ext3. I tried both rsync 3.0.7 and 3.0.9. The filesize is 80G / 200G (both ESX servers make copies of the other ESX server. Both ESX servers give the same error) Removing the --sparse parameter doesn't work The rsync command used is: /usr/local/bin/rsync -t -e "ssh -i /operator/vmbackup/vm_image_key -l root" --rsync-path=/operator/rsync --sparse --stats "192.168.1.10:/vmfs/volumes/4cf36d72-d19e4c8a-cec8-d485645b13ad/Server02/Server02-flat.vmdk" "/vmfs/volumes/VM-Datastore-VMHOST02//Server02/" The backup script runs at multiple customers (some are also ESX 4.1 machines), but this is the only one giving an error. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #9 from Remco Hosman <remco at hosman.xs4all.nl> 2012-03-10 20:36:00 UTC --- Created attachment 7373 --> https://bugzilla.samba.org/attachment.cgi?id=7373 strace dumps -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #10 from Remco Hosman <remco at hosman.xs4all.nl> 2012-03-10 20:36:28 UTC --- Created attachment 7374 --> https://bugzilla.samba.org/attachment.cgi?id=7374 strace dumps -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #11 from Remco Hosman <remco at hosman.xs4all.nl> 2012-03-10 20:37:32 UTC --- I am having the same issue with ESXi5.0 and rsync 3.0.9. Compiled rsync on a debian 4.0 system. I have a VM and make a snapshot. at that point the original disk file should not change anymore. When the problem does not occur: *) when the VM is not running *) when the image file does not exists on the target when the problem does occur: *) when the VM is running and the file already exists on the target. I attached a tgz with straces of the following command: strace -v -tt -f -o out.log ./rsync --copy-links --recursive --compress --progress --inplace --bwlimit=100 /vmfs/volumes/4e9007cc-5eb44d46-eace-0019b9e8ff4d/backups/ user at host:/targetdir/ in all 4 cases (running/not running and existing/non existing target file) but it comes down to this: --- VM off, file already exists on target machine --- 7754511 20:20:40.298664 open("backuptest/backuptest_7-flat.vmdk", O_RDONLY|O_LARGEFILE) = 3 7754511 20:20:40.300134 fstat64(0x3, 0xfff6cacc) = 0 7754511 20:20:40.300296 write(1, "backuptest/backuptest_7-flat.vmd"..., 34) 34 7754511 20:20:40.317672 mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x782f0000 7754511 20:20:40.317831 read(3, "\25\0\2161\246\210wN\253Y\332O`T\357\354\21\307\344\31"..., 262144) = 262144 7754511 20:20:40.341102 gettimeofday({1331410840, 341167}, NULL) = 0 7754511 20:20:40.341230 ioctl(1, TIOCGPGRP, [7748366]) = 0 7754511 20:20:40.341434 write(1, " 4096 0% 13.38kB/s "..., 41) 41 -- bunch of gettimeofday()'s -- 7754511 20:20:40.382444 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262143) = 262143 --- VM on, file already exists on target machine --- 7672734 20:24:08.828084 open("backuptest/backuptest_7-flat.vmdk", O_RDONLY|O_LARGEFILE) = 3 7672734 20:24:08.828808 fstat64(0x3, 0xfff6cacc) = 0 7672734 20:24:08.828976 write(1, "backuptest/backuptest_7-flat.vmd"..., 34) 34 7672734 20:24:08.829380 mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x782f0000 7672734 20:24:08.829529 read(3, "\25\0\2161\246\210wN\253Y\332O`T\357\354\21\307\344\31"..., 262144) = 262144 7672734 20:24:08.831043 gettimeofday({1331411048, 831103}, NULL) = 0 7672734 20:24:08.831165 ioctl(1, TIOCGPGRP, [7748509]) = 0 7672734 20:24:08.831367 write(1, " 4096 0% 15.09kB/s "..., 41) 41 -- bunch of gettimeofdays()'s -- 7672734 20:24:08.847043 read(3, 0x782f0009, 262143) = -1 EINVAL (Invalid argument) I am kinda lost at this point. Remco -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #12 from grarpamp at gmail.com 2012-03-10 22:47:50 UTC --- As an aside, be aware that you may encounter rather poor performance with sparse files. If so, and until this bug is addressed, you may need to look at other transfer options. https://bugzilla.samba.org/show_bug.cgi?id=7854 You may also want to backup a snapshot (maybe one application level write point of fail) instead of missing all the writes numbered 2+ that will slip by while waiting on those hundred GB to copy. More properly, shut the vm down first or use any built in snapshot tool the vm has. Also, check *.* syslog and dmesg for any errors, run memtest86. But it say mmap 22, so maybe not disk media error. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #13 from Remco Hosman <remco at hosman.xs4all.nl> 2012-03-11 08:08:22 UTC --- The reason for me to use rsync is the following: I want to backup a remote server over the internet. so bandwidth usage is an issue. The backup target is a btrfs system (again, not the cause of the problem, happens with ext3 on the target too). I make a reflink copy of yesterdays file and do rsync --inplace to backup. uses very little bandwidth and diskspace. The time it takes for the rsync to take place is not an issue. The snapshot is made and does not change over time until delete it after rsync is done. i did some more testing and found out the problem is not related to the file being sparse or big. when i have a non-sparse 16MB file, the problem persists. now im not an expert in interpreting strace output, but i dont see much difference. but does strace display things differently depending on the return value of a call? 3978200 20:24:54.964491 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 262144) = 262144 7672734 20:24:08.847043 read(3, 0x782f0009, 262143) = -1 EINVAL (Invalid argument) what is that 0x782f0009 ? the address rsync wants read to put the data? Remco -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #14 from Remco Hosman <remco at hosman.xs4all.nl> 2012-03-11 10:38:02 UTC --- Is it possible the issue in the non-blocksized-read ? the first read() is always 262144 bytes. when the target does not exist yet, all other read()'s are 262144 bytes too, however, when the target file does exist, the next read()'s are 1 byte less: 262143 bytes. Why this only matters when the VM is running i do not know. Must be an internal VMFS issue. Remco -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #15 from Remco Hosman <remco at hosman.xs4all.nl> 2012-03-11 10:52:55 UTC --- confirmed. the following text program demonstrates the problem: #include <stdio.h> #include <unistd.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> int main(void) { int fd, size; char data[300000]; fd open("/vmfs/volumes/4e9007cc-5eb44d46-eace-0019b9e8ff4d/backups/backuptest/backuptest_7-flat.vmdk",O_RDONLY); printf("open() : %d\n",fd); size = read(fd, data, 262144); printf("read() : %d\n",size); size = read(fd, data, 262143); printf("read() : %d\n",size); close(fd); } the first read() returns 262144, as it should. the 2nd returns -1 on VMFS when the VM is running. The required block-alignment is a multiple of 512 bytes. Remco -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #16 from Remco Hosman <remco at hosman.xs4all.nl> 2012-03-11 20:11:46 UTC --- Someone pointed out i forgot to test the error returned by the failed reads. so here they are. after a failed read errno is 22, meaning EINVAL (also the same numerical error rsync reports). I also found that lseek()'ing to a non-512 byte boundary makes read()'s of any size fail until you seek to a 512-byte boundary again. Remco -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #17 from molniev <molniev at hotmail.com> 2012-10-19 12:49:07 UTC --- Hi. I'am using rsync 3.0.9 on ESXi 5.0 and detect this bug (thanks Remco Hosman). I made ??a patch that solves the problem. This code reads the files, aligned with the 4096 border. The size of the blocks in (4096 * 4096 * 2) bytes (I think it will be faster, but you sacrifice memory). Replace the file "fileio.c" function "map_ptr" (line 180-264) with the following code: char *map_ptr(struct map_struct *map, OFF_T offset, int32 len) { OFF_T ret = 0; int32 read_size = 0; int32 n = 0; if (0 == len) return NULL; if (len < 0) { rprintf(FERROR, "invalid len passed to map_ptr: %ld\n", (long)len); exit_cleanup(RERR_FILEIO); } if ( (offset >= map->p_offset) && (offset + len <= map->p_offset + map->p_len) ) return map->p + (offset - map->p_offset); map->p_offset = (offset >> 12); map->p_offset = (map->p_offset << 12); while (map->p_len < (len + 4096) ) map->p_len += (4096 * 4096 * 2); if (map->p_len > map->p_size) { map->p_size = map->p_len; map->p = realloc_array(map->p, char, map->p_size); if (!map->p) out_of_memory("map_ptr"); } if (map->p_fd_offset != map->p_offset) { ret = do_lseek(map->fd, map->p_offset, SEEK_SET); if (ret != map->p_offset) { rsyserr(FERROR, errno, "lseek returned %.0f, not %.0f", (double)ret, (double)map->p_offset); exit_cleanup(RERR_FILEIO); } map->p_fd_offset = map->p_offset; } if (map->p_offset + map->p_len > map->file_size) { if (map->p_offset > map->file_size) { map->status = ENODATA; printf("DBG path function map_ptr error: %i\n", (int)__LINE__); memset(map->p + (offset - map->p_offset), 0, len); return (map->p + (offset - map->p_offset) ); } map->p_len = map->file_size - map->p_offset; if (len > (map->p_len - (offset - map->p_offset) ) ) { map->status = ENODATA; printf("DBG path function map_ptr error: %i\n", (int)__LINE__); memset(map->p + map->p_len, 0, map->p_len - len); } } read_size = map->p_len; while (read_size > 0) { n = read(map->fd, map->p, read_size); if (n <= 0) { if (!map->status) map->status = n ? errno : ENODATA; printf("DBG path function map_ptr error: %i\n", (int)__LINE__); memset(map->p, 0, map->p_len); break; } map->p_fd_offset += n; read_size -= n; } return map->p + (map->p_offset - offset); } If you find a bug in my code - please let - I'll try to fix it. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #18 from molniev <molniev at hotmail.com> 2012-10-19 12:57:52 UTC --- Created attachment 8090 --> https://bugzilla.samba.org/attachment.cgi?id=8090 Correct fileio.c -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #19 from ciscolyon <support at fidit.com> 2013-01-02 17:49:39 UTC --- (In reply to comment #18)> Created attachment 8090 [details] > Correct fileio.cwhere can I get a Rsync binary for esxi 5.1 Thank you -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #20 from molniev <molniev at hotmail.com> 2013-01-03 18:54:22 UTC --- (In reply to comment #19)> (In reply to comment #18) > > Created attachment 8090 [details] [details] > > Correct fileio.c > > where can I get a Rsync binary for esxi 5.1 > > Thank youUse article: http://www.virtuallyghetto.com/2011/02/how-to-compile-statically-linked-rsync.html For 5.1 compatible not only Centos 3.9. Use newest distributive (4.x or 5.x). -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 --- Comment #21 from ciscolyon <support at fidit.com> 2013-01-16 18:04:57 UTC --- I find the binay on http://www.virtuallyghetto.com/2011/02/how-to-compile-statically-linked-rsync.html and it works -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.
https://bugzilla.samba.org/show_bug.cgi?id=8177 Wayne Davison <wayned at samba.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED --- Comment #22 from Wayne Davison <wayned at samba.org> 2013-01-18 23:38:52 UTC --- I have checked in a change that makes the map_ptr() function align its reads on 1024-byte boundaries. Hopefully that will fix this issue without making things inefficient. I looked at the patch from molniev, and it has a bug in the read look should the read return less than the number of bytes requested (the next read would overwrite earlier bytes instead of appending them). My patch is instead just a simpler bit of rounding changes made to the stock code. -- Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the QA contact for the bug.