bugzilla-daemon@dp3.samba.org
2005-Nov-22 21:17 UTC
[Bug 3186] Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186

------- Comment #1 from foner-rsync-bugzilla@media.mit.edu 2005-11-22 14:16 MST -------
Any ideas on this? It's been open 5 weeks and probably got overlooked... Tnx!

--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
bugzilla-daemon@dp3.samba.org
2005-Nov-23 06:42 UTC
[Bug 3186] Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186

------- Comment #2 from vanes002@umn.edu 2005-11-22 23:41 MST -------
What version are you using? You have 2.6.7 selected in the bug report, but that's still in development.

The copy-on-write optimization wasn't done until v2.6.1 (Apr 2004):

    - The generator is now better about not modifying the file list
      during the transfer in order to avoid a copy-on-write memory
      bifurcation (on systems where fork() uses shared memory).
      Previously, rsync's shared memory would slowly become unshared,
      resulting in real memory usage nearly doubling on the receiving
      side by the end of the transfer. Now, as long as permissions
      are being preserved, the shared memory should remain that way
      for the entire transfer.

You use the -p option, so you meet the "permissions being preserved" condition.

The file_struct data and other chunks of data are allocated out of the free memory pool. If lots of those allocated chunks are returned to the pool, pool management modifies that memory, and those writes would unshare the affected pages.

Wayne - does rsync free up anything substantial at any time after the fork that might trigger this?
bugzilla-daemon@dp3.samba.org
2005-Nov-23 07:36 UTC
[Bug 3186] Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186

------- Comment #3 from foner-rsync-bugzilla@media.mit.edu 2005-11-23 00:34 MST -------
I'm actually using 2.6.7. In fact, I'm using a version from CVS in which Wayne added --min-size and --max-size, and fixed (and then re-fixed) a bug in hlink. This isn't currently the very latest CVS, but I believe corresponds to rsync-HEAD-20051014-2036GMT, plus the hlink stuff.

I could fairly trivially update to the very latest CVS if it was useful in figuring out what's going on; my fundamental question was, "Is this a bug or am I misunderstanding something?" and it's sounding like you believe it's a bug.

I'm reasonably sure that I saw this same behavior in the rsync that ships in Ubuntu Breezy, which is their (somewhat Debianized and with a broken --min|max-size) version of 2.6.5. If it was really necessary, I could re-verify that I saw this behavior in that version, although it'd take some work.
bugzilla-daemon@dp3.samba.org
2005-Nov-23 12:21 UTC
[Bug 3186] Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186

------- Comment #4 from vanes002@umn.edu 2005-11-23 05:19 MST -------
Thanks for confirming that you are using CVS. No need to update to the very latest.

Hmm. You are using --delete, which is done before any file transfers. Ahhhhh. Looks like it builds an entire equivalent file list for files on the receiving side. So it may also be building a 200MB file list to do deletes.

Try running without --delete and see if that extra memory usage still happens. If it goes away, try using --delete-during, which now that I think about it, was intended to solve this very problem with large numbers of files.
bugzilla-daemon@dp3.samba.org
2005-Nov-24 00:12 UTC
[Bug 3186] Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186

wayned@samba.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

------- Comment #5 from wayned@samba.org 2005-11-23 17:11 MST -------
I have been trying various copy commands to try to duplicate this, but haven't seen anything wrong. If -p is left off, the memory for the shared file list will become unshared, but I haven't yet seen another scenario that would cause that.

> Looks like it builds an entire equivalent file list
> for files on the receiving side.

Older rsync versions did create a duplicate file list when deleting, but that was changed in recent releases to only need enough memory for a single directory at a time plus whatever memory is needed for the scanning function to recurse down to the deepest dir.
bugzilla-daemon@dp3.samba.org
2005-Nov-24 06:08 UTC
[Bug 3186] Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186

------- Comment #6 from foner-rsync-bugzilla@media.mit.edu 2005-11-23 23:07 MST -------
I tried adding --delete-during, so the full invocation now looks like

    rsync -vrltH --delete -pgo --stats -z -D --numeric-ids -i --delete-during --exclude-from=/FS/dirvish/HOST/20051124/exclude --link-dest=/FS/dirvish/HOST/20051123/tree root@HOST:/ /FS/dirvish/HOST/20051124/tree

and the behavior didn't change.

But now that I think about it, it's not clear if --delete could be a problem in the first place, because these are dirvish runs. That means that I'm using -H and --link-dest to populate a tree that originally starts out empty, and winds up containing only a very few files that consume actual disk space (whatever got created or modified since the dirvish run yesterday), and about 2 million hardlinks into the previous day's tree.

If rsync is writing to an otherwise-empty tree, it seems to me that --delete has nothing to do---which makes me wonder why dirvish even bothers to supply it automatically, frankly, since dirvish -always- starts from an empty destination tree. Is there some reason why it makes sense to supply --delete at all? (Unfortunately, we can't ask dirvish's original author why he did this, alas.) Or does --delete cause process inflation if there -isn't- much to do instead of if there -is-?

Once this run completes in a couple hours (I'm debugging some other, unrelated things at the same time in this run), I may just blow its tree away and start over without --delete in any form (by editing the dirvish script) and see if that changes its behavior, but I'd be pretty mystified if it did unless my understanding of --delete, --link-dest, and empty destination trees is just wrong.

Just in case I'm being completely faked out here and the second process really is sharing most of its memory, here are the top few lines of "top" running on the destination host:

    top - 00:34:30 up 8 days,  8:25,  2 users,  load average: 3.05, 2.00, 0.96
    Tasks:  65 total,   3 running,  62 sleeping,   0 stopped,   0 zombie
    Cpu(s):  6.0% us, 39.9% sy,  0.0% ni,  0.0% id, 51.3% wa,  2.0% hi,  0.7% si
    Mem:    516492k total,   508012k used,     8480k free,    76180k buffers
    Swap:  1544184k total,    12476k used,  1531708k free,    68404k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    10795 root      18   0  259m 253m  676 D 15.2 50.3   0:47.19 rsync
    10865 root      16   0  251m 245m  688 S  0.0 48.7   0:00.08 rsync

What's actually kinda interesting there is that it claims to have 8m free, and 76m of buffers, -and- to have 253+245m of rsync resident, all on a machine with only 512m total memory (and not including ~30m of other processes). And yet I'm pretty sure I -saw- the free memory go from about 200m to about 0 when that second process started up, on previous runs. (On this one, I didn't quite catch it in the act and am not sure how much free memory there was before the inflation of the second process.)
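The mismatch the reporter points out in the `top` output above can be put into numbers. What follows is a minimal sketch and not part of the original report: the function name is made up for illustration, and it assumes that none of rsync's resident pages are counted under "buffers" or "cached", so the result is only a rough lower bound.

```python
def min_shared_kib(res_mib, used_kib, buffers_kib, cached_kib):
    """Rough lower bound (in KiB) on memory that must be counted in more
    than one process's RES figure.

    Assumption (not from the report): buffers and page cache hold none of
    the processes' resident pages, so "used - buffers - cached" is the most
    memory the processes could occupy between them.
    """
    res_total_kib = sum(res_mib) * 1024               # top's "m" suffix is MiB
    non_cache_kib = used_kib - buffers_kib - cached_kib
    return max(0, res_total_kib - non_cache_kib)


# Figures copied from the Mem:/Swap: lines and the two rsync rows of top.
overlap = min_shared_kib(
    [253, 245], used_kib=508012, buffers_kib=76180, cached_kib=68404
)
print(f"combined RES: {(253 + 245) * 1024} KiB")
print(f"at least {overlap} KiB (~{overlap / 1024:.0f} MiB) is counted twice")
```

By this estimate the two RES figures cannot both be fully private: at least roughly 143 MiB has to be counted in both processes. That suggests the tiny SHR values may not capture pages shared copy-on-write after fork(), though it does not explain the drop in free memory the reporter saw on earlier runs.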
bugzilla-daemon@dp3.samba.org
2005-Nov-24 18:29 UTC
[Bug 3186] Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186

------- Comment #7 from wayned@samba.org 2005-11-24 11:27 MST -------
You are right that dirvish does not need to use --delete when copying into a new directory, but it also doesn't hurt anything (since it won't actually do much of anything).

I read something that made it sound like this might be a recent change in the Linux kernel, so I added a sleep(1000) to both the generator and the receiver right after they fork and then ran a test on a system with a linux-2.4 kernel and a linux-2.6 kernel. The processes stayed shared on the 2.4 system, but became unshared right after the fork on the 2.6 kernel, so this appears to be something that needs to be investigated in Linux itself.
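One way to cross-check whether two forked processes are still sharing pages, independently of top's SHR column, is to total the per-mapping counters in `/proc/<pid>/smaps`. The sketch below assumes a kernel that exposes that file with the usual `Shared_Clean`, `Shared_Dirty`, `Private_Clean` and `Private_Dirty` fields; the function name and the sample text are invented for illustration and are not output from the systems in this thread.

```python
import re
from collections import Counter

# Matches the four per-mapping counters of interest, e.g.
# "Shared_Dirty:     250000 kB"
_FIELD = re.compile(
    r"^(Shared_Clean|Shared_Dirty|Private_Clean|Private_Dirty):\s+(\d+) kB$"
)


def summarize_smaps(text):
    """Sum shared and private resident memory (KiB) over all mappings."""
    totals = Counter()
    for line in text.splitlines():
        match = _FIELD.match(line.strip())
        if match:
            totals[match.group(1)] += int(match.group(2))
    return {
        "shared_kib": totals["Shared_Clean"] + totals["Shared_Dirty"],
        "private_kib": totals["Private_Clean"] + totals["Private_Dirty"],
    }


# Invented sample: one large anonymous mapping still shared after fork(),
# plus a small file-backed mapping.
sample = """\
Size:             256000 kB
Rss:              250880 kB
Shared_Clean:          0 kB
Shared_Dirty:     250000 kB
Private_Clean:         0 kB
Private_Dirty:       880 kB
Shared_Clean:        676 kB
Private_Dirty:        12 kB
"""
summary = summarize_smaps(sample)
print(summary)
```

On a live system the same function could be fed the contents of `/proc/PID/smaps` for each rsync process while it sits in the sleep; a large `private_kib` in the child right after the fork would match the 2.6 behaviour described above, while a large `shared_kib` would mean the pages are still shared.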
bugzilla-daemon@dp3.samba.org
2005-Nov-24 19:30 UTC
[Bug 3186] Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186

------- Comment #8 from foner-rsync-bugzilla@media.mit.edu 2005-11-24 12:28 MST -------
Yikes. Well, I'm certainly running 2.6 on everything here. You're certainly in a better position than I am to try to bug-report this to the kernel developers; is that your next step? (I'm assuming there wasn't some API change that makes this "not a bug", but I haven't investigated.) (Thanks!)

Btw, I suspect that dirvish's use of --delete might have originated in debugging, or if someone tries to recreate a failed run by redoing it on the same day (and hence in the same tree) without blowing away the tree first, since by default the trees are named by dates. In those cases, --delete would make sense, and since leaving it in theoretically will do nothing in the usual case, I'll ignore it.