samba-bugs@samba.org
2005-Oct-18 03:00 UTC
[Bug 3186] New: Surprisingly large unshared memory usage
https://bugzilla.samba.org/show_bug.cgi?id=3186
Summary: Surprisingly large unshared memory usage
Product: rsync
Version: 2.6.7
Platform: x86
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P3
Component: core
AssignedTo: wayned@samba.org
ReportedBy: foner-rsync-bugzilla@media.mit.edu
QAContact: rsync-qa@samba.org
CC: foner-rsync-bugzilla@media.mit.edu
I'm running a command like "rsync -vrltH --delete -pgo --stats -z -D
--numeric-ids -i --link-dest=foo blah:bar baz" (part of a dirvish run) with an
input fileset of about 2.4 million files (400K of those files are actually
hardlinked to each other on the sending machine, and remain that way on the
receiving machine---and in fact all but about 30 of them haven't changed, so
virtually all 2.4M of those files also wind up hardlinked to the --link-dest
directory; this is about 280G total).
It takes about 10 minutes to scan a filesystem of this size, and the rsync
processes on both the sending & receiving machines slowly expand to about 200M
during this scan; that's understandable. But then, as soon as the scan is done,
the second rsync process on the receiving side inflates (over the course of
about 5 seconds or so) to -another- 200M. I don't think I'm being faked out by
shared memory being reported twice, since the free memory on the machine
declines precipitously at exactly the same time. This isn't quite screwing me
yet (the machine's got half a gig of RAM and very little else that must stay
resident during the run), but if the filesystem gets much bigger, I fear massive
thrashing due to swapping. (Really, what I'll have to do is buy more RAM.)
I was under the impression that this wasn't supposed to happen---that rsync
tried hard not to modify lots of pages after the fork, and that Linux (I'm
running Ubuntu Breezy, which has a 2.6 kernel) had copy-on-write fork semantics.
Is the essentially instantaneous inflation of the second rsync process
happening because of either the -H or the --link-dest, or is it a bug?
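One way to check whether the second process's growth is really private memory, rather than shared pages counted twice, is to sum the per-mapping counters in /proc/<pid>/smaps. The sketch below is only an illustration: it assumes a Linux kernel that exposes /proc/<pid>/smaps with the usual "Private_Dirty: N kB" style lines (the kernel on the machine in this report may not), and the function names summarize_smaps and unshared_kb are hypothetical helpers, not part of rsync.

```python
import re

# Counters of interest in /proc/<pid>/smaps (assumed present; values are in kB).
_COUNTER = re.compile(
    r"^(Rss|Shared_Clean|Shared_Dirty|Private_Clean|Private_Dirty):\s+(\d+)\s+kB$"
)

def summarize_smaps(smaps_text):
    """Sum each counter over all mappings in the given smaps text (kB)."""
    totals = {}
    for line in smaps_text.splitlines():
        match = _COUNTER.match(line.strip())
        if match:
            name, value = match.group(1), int(match.group(2))
            totals[name] = totals.get(name, 0) + value
    return totals

def unshared_kb(totals):
    """Memory owned by this process alone: private clean plus private dirty pages."""
    return totals.get("Private_Clean", 0) + totals.get("Private_Dirty", 0)

# Usage (hypothetical pid of the second receiving-side rsync process):
#   with open("/proc/12345/smaps") as f:
#       print(unshared_kb(summarize_smaps(f.read())), "kB unshared")
```

If the unshared total of the second process jumps by roughly 200M right after the scan, the pages really were copied after the fork rather than merely reported twice.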
[This transfer also accumulates about an hour of CPU time on this Athlon 1200MHz
CPU; I assume this is due to the expense of -H, and works out to about 1.5
milliseconds of processing per file, assuming I haven't goofed on the math; this
is about a million instructions (or 21000 non-cached memory fetches) per file.
I'd love it if this could be brought down, but I'm probably being unrealistic
about an essentially O(n^2) algorithm...]
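The per-file arithmetic in the bracketed note can be rechecked with a few lines. This is a back-of-the-envelope sketch using only the figures quoted in the report (one hour of CPU time, 2.4 million files, a 1200MHz clock, 21000 fetches per file); the variable names are illustrative.

```python
# Figures taken from the report; everything below is derived from them.
cpu_seconds = 3600.0       # "about an hour of CPU time"
files = 2.4e6              # "about 2.4 million files"
clock_hz = 1200e6          # 1200MHz CPU
fetches_per_file = 21000   # "21000 non-cached memory fetches"

seconds_per_file = cpu_seconds / files
ms_per_file = seconds_per_file * 1e3
cycles_per_file = seconds_per_file * clock_hz
ns_per_fetch = seconds_per_file / fetches_per_file * 1e9

print(f"{ms_per_file:.2f} ms per file")
print(f"{cycles_per_file:.0f} clock cycles per file")
print(f"{ns_per_fetch:.1f} ns per fetch implied by the 21000 figure")
```

The 1.5 millisecond figure holds. At 1200MHz that is about 1.8 million clock cycles per file, so "about a million instructions" corresponds to somewhat under one instruction per cycle, and the 21000-fetch figure implies roughly 71 ns per non-cached fetch.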
