samba-bugs@samba.org
2005-Oct-18 03:00 UTC
[Bug 3186] New: Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186

           Summary: Surprisingly large unshared memory usage
           Product: rsync
           Version: 2.6.7
          Platform: x86
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: core
        AssignedTo: wayned@samba.org
        ReportedBy: foner-rsync-bugzilla@media.mit.edu
         QAContact: rsync-qa@samba.org
                CC: foner-rsync-bugzilla@media.mit.edu

I'm running a command like "rsync -vrltH --delete -pgo --stats -z -D --numeric-ids -i --link-dest=foo blah:bar baz" (part of a dirvish run) with an input fileset of about 2.4 million files, about 280G total. About 400K of those files are hardlinked to each other on the sending machine and remain that way on the receiving machine. All but about 30 of the files are unchanged, so virtually all 2.4M of them also wind up hardlinked to the --link-dest directory.

It takes about 10 minutes to scan a filesystem of this size, and the rsync processes on both the sending and receiving machines slowly expand to about 200M during this scan; that's understandable. But then, as soon as the scan is done, the second rsync process on the receiving side inflates (over the course of about 5 seconds or so) to -another- 200M. I don't think I'm being faked out by shared memory being reported twice, since the free memory on the machine declines precipitously at exactly the same time.

This isn't quite screwing me yet (the machine's got half a gig of RAM and very little else that must stay resident during the run), but if the filesystem gets much bigger, I fear massive thrashing due to swapping. (Really, what I'll have to do is buy more RAM.)

I was under the impression that this wasn't supposed to happen---that rsync tried hard not to modify lots of pages after the fork, and that Linux (I'm running Ubuntu Breezy, which has a 2.6 kernel) had copy-on-write fork semantics. Is the essentially instantaneous inflation of the second rsync process happening because of either the -H or the --link-dest, or is it a bug?
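One way to check whether the second process's growth is private memory rather than shared pages counted twice is to sum the per-mapping counters in /proc/<pid>/smaps, on kernels that expose that file. The following is a minimal sketch, not part of rsync; the helper names (summarize_smaps, private_and_shared_kb, read_smaps) are hypothetical, and it assumes the usual "Name:  N kB" counter lines.

```python
from collections import defaultdict


def summarize_smaps(text):
    """Sum every 'Name:   N kB' counter across all mappings in smaps text."""
    totals = defaultdict(int)
    for line in text.splitlines():
        parts = line.split()
        # Counter lines have exactly three fields, e.g. "Private_Dirty: 400 kB".
        # Mapping header lines (address range, perms, ...) do not match.
        if (len(parts) == 3 and parts[0].endswith(":")
                and parts[1].isdigit() and parts[2] == "kB"):
            totals[parts[0][:-1]] += int(parts[1])
    return dict(totals)


def private_and_shared_kb(text):
    """Return (private_kB, shared_kB) totals for one process."""
    t = summarize_smaps(text)
    private = t.get("Private_Clean", 0) + t.get("Private_Dirty", 0)
    shared = t.get("Shared_Clean", 0) + t.get("Shared_Dirty", 0)
    return private, shared


def read_smaps(pid):
    """Read /proc/<pid>/smaps (Linux only; needs permission to read it)."""
    with open(f"/proc/{pid}/smaps") as f:
        return f.read()
```

Calling private_and_shared_kb(read_smaps(pid)) for each receiving-side rsync process before and after the scan finishes would show whether the extra 200M appears under the private counters (pages that were copied after the fork) or under the shared ones.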
[This transfer also accumulates about an hour of CPU time on this Athlon 1200MHz CPU; I assume this is due to the expense of -H, and works out to about 1.5 milliseconds of processing per file, assuming I haven't goofed on the math; this is about a million instructions (or 21000 non-cached memory fetches) per file. I'd love it if this could be brought down, but I'm probably being unrealistic about an essentially O(n^2) algorithm...]

-- 
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
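The per-file estimate in the bracketed note can be checked with a few lines of arithmetic. This sketch uses only the figures quoted in the report (one hour of CPU time, 2.4 million files, a 1200MHz clock, 21000 fetches per file); the function name per_file_cost is hypothetical.

```python
def per_file_cost(cpu_seconds, files, clock_hz, fetches_per_file):
    """Return (seconds per file, clock cycles per file, ns per memory fetch)."""
    seconds_per_file = cpu_seconds / files
    cycles_per_file = seconds_per_file * clock_hz
    # Latency each fetch would need for the fetches alone to fill the time.
    ns_per_fetch = seconds_per_file / fetches_per_file * 1e9
    return seconds_per_file, cycles_per_file, ns_per_fetch


secs, cycles, ns = per_file_cost(
    cpu_seconds=60 * 60,        # "about an hour of CPU time"
    files=2_400_000,            # "about 2.4 million files"
    clock_hz=1_200_000_000,     # 1200MHz
    fetches_per_file=21_000,    # figure quoted in the report
)
print(f"{secs * 1e3:.2f} ms/file, {cycles / 1e6:.1f}M cycles/file, {ns:.0f} ns/fetch")
```

The result is 1.5 ms and about 1.8 million clock cycles per file, which is consistent with "about a million instructions" if the CPU retires somewhat less than one instruction per cycle (that ratio is an assumption, not something stated in the report). The 21000-fetch figure corresponds to roughly 71 ns per non-cached fetch.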