Mermgfurt !

I have a problem with syncing two machines which are connected over a Gigabit connection. I'm trying to use rsync with ssh because of the authorisation mechanisms (keys). It starts quite ok with 18 MB/s (this low speed may have something to do with our internal net) and falls down to 400 KB/s (!!!). This goes on for a long time because the files I want to copy are very big (up to 70 GB per file). Even though I tried to increase the block size, the speed just goes down and won't go up again. It really does write 18 MB/s at first, so it's not just a problem of --partial or something similar. Ok, I haven't tried it without ssh yet, but it really looks very strange.

The version is the rsync 2.5.6cvs version from debian-unstable.

Thanks for any help !

Mermgfurt,
Udo
-- 
Udo Wolter                  |    /"\
email: uwp@dicke-aersche.de |    \ /   ASCII RIBBON CAMPAIGN
www: www.dicke-aersche.de   |     X    AGAINST HTML MAIL
dark: heaven@lutz-ziffer.de |    / \
At 16:30 11-11-2002 +0100, you wrote:
>Mermgfurt !
>
>I have some problem with syncing two machines which are connected
>over a Gigabit-connection. I'm trying to use rsync with ssh because of
>the authorisation mechanisms (keys). It starts quite ok with 18 MB/s
>(this small speed may have something to do with our internal net)
>and falls down to 400 KB/s (!!!). This happens over a long period
>because those files I want to copy are very big (upto 70 GB per file).
>Even though I tried to increase the blocking size the speed just goes
>down and won't go up again. In fact it really writes 18 MB/s, it's not
>just a problem of -partial or something similar. Ok, I haven't tried
>it without ssh yet, but it really looks very strange.
>The version is the rsync 2.5.6cvs version from debian-unstable.

Look at the processor usage on the machines that are transferring the files. You'll probably see that one of those machines is at about 100% processor usage, given that the big files are about 70 GB. Try massive block sizes; apart from that I have no ideas. The problem is unrelated to network bandwidth, though...

Bruno Ferreira
I don't have a system with ssh available to check with (believe it or not, it's not approved for our network), but I think sshd_config or ssh_config might be able to specify compression as a default. Is ssh on the sending side, perchance, using a lot of CPU? I don't know of any CPU that can create anything close to a GB/sec compressed _input_, much less output. I don't even remember if you can turn the compression off if it IS the default.

Barring that, if you aren't concerned about somebody sniffing the content you're syncing, perhaps you could use the internal transport? If you can protect your ssh private keys, you can protect your rsync password-file as well. This also has the advantage of cutting down on context switches, as one process is doing both the sync stream AND the communication.

Tim Conway
conway.tim@spilihp.com
reorder name and reverse domain
303.682.4917 office, 303.921.0301 cell
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, caesupport2 on AIM
"There are some who call me.... Tim?"

In reply to:
uwp@dicke-aersche.de
Sent by: rsync-admin@lists.samba.org
11/11/02 08:30 AM
To: rsync@lists.samba.org
Subject: Speed problem
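Tim's question about a compression default can be checked directly in the ssh client configuration. The following is only a minimal sketch: the function name `show_ssh_compression` is made up for illustration, and the paths in the usage comment are the usual OpenSSH defaults, which may differ on your systems.

```shell
#!/bin/sh
# Print any active Compression directives found in the given ssh config
# files. Files that do not exist or are unreadable are skipped.
# (show_ssh_compression is a hypothetical helper name.)
show_ssh_compression() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        # Passing /dev/null as a second file makes grep prefix each match
        # with the file name. Commented-out lines do not match the anchor.
        grep -iE '^[[:space:]]*Compression[[:space:]=]' "$f" /dev/null || true
    done
    return 0
}

# Typical use (assumed default locations):
#   show_ssh_compression "$HOME/.ssh/config" /etc/ssh/ssh_config
# Forcing compression off for a single rsync run:
#   rsync -a -e "ssh -o Compression=no" bigfile remotehost:/target/
```

If nothing is printed, no explicit Compression setting is active in those files and the ssh built-in default applies.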
On Mon, Nov 11, 2002 at 04:30:05PM +0100, uwp@dicke-aersche.de wrote:
> Mermgfurt !
>
> I have some problem with syncing two machines which are connected
> over a Gigabit-connection. I'm trying to use rsync with ssh because of
> the authorisation mechanisms (keys). It starts quite ok with 18 MB/s
> (this small speed may have something to do with our internal net)
> and falls down to 400 KB/s (!!!). This happens over a long period
> because those files I want to copy are very big (upto 70 GB per file).
> Even though I tried to increase the blocking size the speed just goes
> down and won't go up again. In fact it really writes 18 MB/s, it's not
> just a problem of -partial or something similar. Ok, I haven't tried
> it without ssh yet, but it really looks very strange.
> The version is the rsync 2.5.6cvs version from debian-unstable.
>
> Thanks for any help !

You haven't really provided enough data to even guess what is limiting your performance.

You need to look at CPU utilization, I/O load and network load on both ends. It isn't likely to be the network unless there is contention, but CPU and/or I/O are probably the problem. I know in my case the receiver is CPU bound. You could also be suffering from bus contention.

Is the 18MB/s actual data transfer (just what has changed) during actual transfer? If so you may be up against the limitation of your disk+file systems.

My gut reaction is that 18MB/s is probably just short of the best sustained throughput you can get out of your disk subsystem as configured. I'd suggest testing to see how fast your receiver can handle sustained writes and how fast your sender can read. And burst rates don't count. You need sustained transfer rates.

-- 
________________________________________________________________
J.W. Schultz            Pegasystems Technologies
email address:          jw@pegasys.ws

Remember Cernan and Schmitt
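One way to get a rough sustained-write figure of the kind suggested above is to time a large sequential write with dd and include the final flush in the measurement. This is a sketch only: `sustained_write_rate` is a hypothetical helper, the result is in whole MiB/s with one-second timer resolution, and the test file has to be much larger than RAM for the number to mean anything.

```shell
#!/bin/sh
# Write $2 MiB of zeros to file $1, flush to disk, print the rate in MiB/s.
# Zeros are a best case and say nothing about real file contents.
# (sustained_write_rate is a hypothetical helper name.)
sustained_write_rate() {
    file=$1
    mib=$2
    start=$(date +%s)
    dd if=/dev/zero of="$file" bs=1048576 count="$mib" 2>/dev/null
    sync                                # count the time to flush cached writes
    end=$(date +%s)
    elapsed=$((end - start))
    [ "$elapsed" -gt 0 ] || elapsed=1   # avoid dividing by zero on short runs
    echo $((mib / elapsed))
}

# Example (path and size are placeholders; use several times your RAM size):
#   sustained_write_rate /raid/ddtest.bin 20000
# A matching read test on the sender could time
#   dd if=/raid/ddtest.bin of=/dev/null bs=1048576
# in the same way, again with a file larger than RAM.
```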
On Mon, 11 Nov 2002 jw schultz wrote:

> You haven't really provided enough data to even guess what
> is limiting your performance.

As I said in my last mail: one limit for sure is ssh. But with arcfour I'm getting 18 MB/s, and that's the rate rsync actually starts at. It then just goes down and down, and that's the strange point.

> You need look at CPU utilization, I/O load and network load

The CPU is at 100% because of the ssh encryption. But there is not much I/O.

> Is the 18MB/s actual data transfer (just what has changed)
> during actual transfer? If so you may be up against the
> limitation of your disk+file systems.

The disks have an upper limit of 52 MB/s (ext2) or 45 MB/s (ext3). It's an IDE RAID with 12 WD disks.

> My gut reaction is that 18MB/s is probably just short of the
> best sustained throughput you can get out of your disk

Nope. The average of the disks over a long period is between 31 and 38 MB/s. I tested it without ssh.

Just to say it again: I wouldn't have a problem with 18 MB/s, that's what I expected. I just have a problem with the fact that it goes down to 400 KB/s in half an hour... :-(

Mermgfurt,
Udo
On Mon, 11 Nov 2002 Bruno Ferreira wrote:

> Look for the processor usage in the machines that are transfering the
> files. You'll probably see that one of those machines has about 100%

This doesn't seem to be the worst point. I mean: the machine is not going down under pressure or something like that. You can really work with it. Only the speed is going down.

> processor usage, given that the big files are about 70Gb. Try massive
> block sizes, apart from that I have no ideas.

I tried this and it got slightly better, but the higher I set the block size, the longer I have to wait before it starts. And all it does is delay the point at which the speed begins to drop again. After maybe 20 or 30 minutes I ran into the same problem.

> The problem is unrelated from network bandwith, though...

Can you elaborate on that ? What's the problem ? Apart from this problem, rsync would be the perfect tool to move heavy data between the machines...

Mermgfurt,
Udo
On Mon, 11 Nov 2002, Paul Faure wrote:

> Try it without ssh.

But ssh has those nice authentication features...

> ssh may be waiting in the random pool for more entropy (randomness).
> When it grabs a lot of random data, it must wait for more "random" things

Are you sure about that ? I'm throwing a lot of data through my machines with ssh (also more than 15 - 20 MB/s) and I never had such a strange problem, where network speed starts high and ends up in a terribly slow mess.

> to happen to populate the random pool, if it did not do this, your random
> data would be predictable and thus insecure.

Hm, but I never saw those effects when creating those big files. It ran straight all the way. It's just a problem during rsync to another machine ?

> Try `ls -R /*` on your system when it slows down.

You don't mean that, do you ? I mean, this process should run automatically at night at some point...

And to say it clearly again: it's just the network transfer that slows down, not the whole machine, and it does it only with rsync. Tomorrow I'll try to test it with standard rsh; maybe it's the connection between ssh and rsync that doesn't work well.

Anyway, thanks for all your answers. At least I'm a small step further...

Mermgfurt,
Udo
On Mon, 11 Nov 2002, I've been saying:

> But why does it only happen with rsync ?

Ok, the last tests with rsync/rsh have shown the following (all on the receiving side):

CPU:        100%
Load:       2.5
blocks in:  38000/s even though nothing gets written (no statistics);
            when it starts to write, it goes from 15000 to 32000 blocks in
blocks out: no problems there, it just writes 58000 - 60000 blocks
            and then has 4 seconds of 0 blocks bo

Funny: rsh alone also brings just 12-18 MB/s, just like native scp. The effects are the same: it starts at a very high rate and drops after a while.

(BTW: Why are 2 rsync processes running ?)

This whole thing only happens with very big files. If the files are bigger than 5 or 6 GB, this strange thing happens. (I can't say exactly where the limit is, but it seems to be above the old 2 GB limit; anyway, my system can handle files that big without any problems.)

Once rsync is in this state it never comes back to a normal state, which means: if the file is completed and the next file is smaller, the effect is the same, the rate won't go up.

Mermgfurt,
Udo
Ok, now I found something. When the heavy speed drop occurs, it doesn't seem to send many bytes anymore. The block-in rate on the receiving side drops dramatically from 31000/s to 5000 every 4-8 seconds (which results in a rate of nearly 1 MB/s, that's what I got in the end). CPU load goes down on both sides. It seems as if the rsync process simply doesn't do anything any further.

At the beginning, the sending and receiving sides are both dealing with 37000 blocks/s, and after a while the sending side begins to do only 19000 blocks/s for 4 or 5 seconds. After that come 4 or 5 seconds of 0 blocks/s, and then again 19000 blocks/s. After some minutes it goes down to 14000 and the gaps between sending out blocks get longer, which means: more zero blocks/s.

Very strange effect. It's as if someone is pulling the plug very softly...

To be sure that it's not the system I did the same with scp: everything was fine, blocks were sent out steadily, came in steadily and went onto the disk. No speed drops, no pauses, no pulling the plug.

Mermgfurt,
Udo
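The pauses described above can be counted instead of eyeballed. The following sketch assumes the block figures come from vmstat (its "bi"/"bo" columns); it reads vmstat output on stdin and reports every sample in which nothing was written out. `report_stalls` is a hypothetical name, and the column positions are looked up from the header line because they differ between vmstat versions.

```shell
#!/bin/sh
# Read `vmstat N` output on stdin and report samples where blocks-out (bo)
# is zero. (report_stalls is a hypothetical helper name.)
report_stalls() {
    awk '
        # Non-numeric first field: a header line. Remember where bi/bo are.
        $1 !~ /^[0-9]+$/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "bi") bi = i
                if ($i == "bo") bo = i
            }
            next
        }
        # Data line, once the columns are known.
        bo {
            n++
            if ($bo == 0) { stalls++; printf "sample %d: bi=%s bo=0\n", n, $bi }
        }
        END { printf "%d of %d samples had bo=0\n", stalls, n }
    '
}

# Example: ten minutes of one-second samples on the receiving side
#   vmstat 1 600 | report_stalls
```

Running the same command during an scp of the same file gives a baseline to compare against.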
Heya !

It seems that we found it. It's the partial flag.

We tested a lot of stuff here with strace and could see that after a while timeouts appeared on some descriptors (0 = stdin). We saw that once those timeouts got heavy, the blocks in/out dropped heavily. But the reason wasn't clear at first. So we went on searching with iptraf and found the problem.

If you have a big file that was only 50% transmitted, the rsync process on the other side just copies the data that's already there to another file:

.file-strangestring

This copy takes a long time on big files. The bigger the file, the longer the sending side has to wait until it can really send something. It seems as if the sending side gets a hard timeout after a while. After this big and last one the speed drops extremely, to 700 KB/s or even less. It looks like a busy loop, because the rsync process eats all of the CPU time on the sending side without actually sending much.

I'd call it a bug. But maybe we're the first ones who tried to send that much data in one file...

The workaround is: if you have big files, just don't use -P or --partial.

Even this can lead to problems: when the file is already there, it seems to do --partial anyway. Maybe it's just a bug in the 2.5.6cvs version, or maybe this is what --force is meant for ?

Now we have a constant rate of 13 MB/s. This isn't much on a Gigabit line but it's sufficient for our needs.

Thanks to all of you.

Mermgfurt,
Udo
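For a nightly job, the workaround can be enforced rather than remembered. The sketch below checks an rsync argument list for --partial or for -P (which rsync documents as --partial --progress) before starting. Both function names are hypothetical, and the check is deliberately simple: it would misfire on a short option whose attached value happens to contain a capital P.

```shell
#!/bin/sh
# Return 0 if the given rsync arguments enable --partial, either directly
# or via -P. (uses_partial and sync_bigfiles are hypothetical names.)
uses_partial() {
    for arg in "$@"; do
        case "$arg" in
            --)        return 1 ;;   # end of options, only paths follow
            --partial) return 0 ;;
            --*)       ;;            # any other long option
            -*P*)      return 0 ;;   # short option cluster containing P
        esac
    done
    return 1
}

# Wrapper for big-file runs: refuse to start if --partial is requested.
sync_bigfiles() {
    if uses_partial "$@"; then
        echo "refusing to run: --partial/-P is set" >&2
        return 2
    fi
    rsync "$@"
}

# Example (host and paths are placeholders):
#   sync_bigfiles -a -e "ssh -c arcfour" /data/big/ otherhost:/data/big/
```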
I agree, rsh as root is bad. I wouldn't suggest that. I'm talking about running "rsync --daemon", using /etc/rsyncd.conf to control the form of the access. It's pretty good for reading, and mostly works for writing.

Oh, on our security - no ssh, but rsh is ok for root. I'm also exaggerating a bit on the reason we don't have ssh. I just can't convince the decision makers that it is important to have it and that rsh is leaving them buck naked to being 0wnz0r3d. Oh, well, it makes it easy for me to get root in an emergency, and so far, we can trust everybody on our network.

Tim Conway

uwp@dicke-aersche.de
Sent by: uwp@quatar.philips.com
11/11/02 01:45 PM
Please respond to uwp
To: Tim Conway/LMT/SC/PHILIPS@AMEC
cc: rsync@lists.samba.org
Subject: Re: Speed problem

On Mon, 11 Nov 2002 tim.conway@philips.com wrote:

> This would be an option, but doing rsh with a root account is not
> possible (couldn't get it to work, I haven't found any way to do it for
> root and I really want to do it with the root account because I wanna
> have the same file, directory and user permissions on the files) and
> also not quite recommended... (.rhosts ??? Sheer horror ! ;-))
>
> Maybe it is the encryption but I use ssh otherwise too and can get on
> the same line results upto 12 MB/s (blowfish) and even 20 MB/s (arcfour)
> without any loss of speed. The funny thing is, it seems to happen only
> after a short while. The first 5 minutes seem to be going good, almost
> 18 MB/s (also arcfour which means, this is very similar) and then it
> goes down. It never goes up again, even when a new file get transferred,
> but it starts at 18 MB/s when I start the complete rsync again.
Here's one of my setups. It's invoked from inetd.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Tools@timsync /home/Tools/newsync/clients/sparetool>grep rsync /etc/inetd.conf /etc/services ;cat /etc/rsyncd.conf
/etc/inetd.conf:rsync   stream  tcp     nowait  root    /usr/bin/rsync rsyncd --daemon
/etc/services:rsync           873/tcp         rsyncd          # rsync daemon
log file = /var/tmp/rsyncd.log
pid file = /var/run/rsyncd.pid

[master1]
        path = /mastertoolservers/master1
        refuse options = checksum
        read only = yes
        use chroot = no
        uid = Tools
        gid = Tools
        ignore nonreadable = yes

[master2]
        path = /mastertoolservers/master2
        refuse options = checksum
        read only = yes
        use chroot = no
        uid = Tools
        gid = Tools
        ignore nonreadable = yes

[admin]
        path = /mastertoolservers/master2/admin
        refuse options = checksum
        read only = yes
        use chroot = no
        uid = Tools
        gid = Tools
        ignore nonreadable = yes

[incoming]
        path = /users/Tools/incoming
        read only = no
        use chroot = no
        uid = Tools
        gid = Tools
        list = no
Tools@timsync /home/Tools/newsync/clients/sparetool>
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Here's a little script I crapped together to fire one up in any arbitrary site where I don't have root. An idling rsyncd doesn't eat much cpu or ram. I just reference it in the crontab for my user, and there's always one waiting for me.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#!/bin/sh
PATH=/bin:/usr/bin:/usr/sbin:/sbin:/etc:/cadappl/encap/bin
export PATH
pidfile=$HOME/.rsyncd.pid
logfile=$HOME/.rsyncd.log
configfile=$HOME/.rsyncd.conf

# Nothing to do if the pid file points at a running rsync.
[ -f "$pidfile" -a -s "$pidfile" ] && ps -p `cat "$pidfile"` |grep rsync>/dev/null && exit 0

# Otherwise rewrite the config and start a daemon on an unprivileged port,
# detached from the terminal (stdin, stdout and stderr all on /dev/null).
{
echo "log file = $logfile
pid file = $pidfile

[cadappldist]
        lockfile = /var/tmp/rsyncd.cadappldist.lock
        max connections = 2
        path = /cadappldist
        use chroot = no
        read only = yes
        uid = Tools
        gid = Tools
        list = yes

[cadappldistrw]
        lockfile = /var/tmp/rsyncd.cadappldistrw.lock
        max connections = 1
        path = /cadappldist
        use chroot = no
        read only = no
        uid = Tools
        gid = Tools
        list = no" >"$configfile"
rsync --daemon --port=4024 --config="$configfile"
}</dev/null >&0 2>&1 &
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

There's a lot more useful info in the man pages. Examine "--port=" and "--daemon", and maybe "--no-detach" in rsync(1), and read rsyncd.conf(5) all the way through. You can have password authentication, exclusions, parameter control... lots of stuff.

Good luck.

Tim Conway

uwp@dicke-aersche.de
Sent by: rsync-admin@lists.samba.org
11/13/02 12:45 PM
Please respond to uwp
To: Tim Conway/LMT/SC/PHILIPS@AMEC
cc: rsync@lists.samba.org <uwp@quatar.philips.com>
Subject: Re: Speed problem

On Wed, 13 Nov 2002 tim.conway@philips.com wrote:

> I agree, rsh as root is bad. I wouldn't suggest that. I'm talking about
> running "rsync --daemon", using /etc/rsyncd.conf to control the form of
> the access. It's pretty good for reading, and mostly works for writing.

Do I get you right ? You don't need any transport mechanism, rsync can do everything by itself ? I thought rsh or ssh is a must. Can you give an example how to do it ?

Thank you !

Mermgfurt,
Udo

-- 
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html
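On the client side, talking to a daemon like the one started by the script above needs no rsh or ssh at all: the module is addressed either with an rsync:// URL or with the double-colon form. A small sketch follows, in which the helper name `rsync_daemon_url`, the host name "toolhost" and the local target path are placeholders; only the module name and port 4024 are taken from the script.

```shell
#!/bin/sh
# Build an rsync:// URL for a daemon module. The port defaults to 873,
# the registered rsync port. (rsync_daemon_url is a hypothetical helper.)
rsync_daemon_url() {
    host=$1
    module=$2
    port=${3:-873}
    printf 'rsync://%s:%s/%s/\n' "$host" "$port" "$module"
}

# Pull the read-only module from a daemon on port 4024
# ("toolhost" and /local/cadappldist/ are placeholders):
#   rsync -av "$(rsync_daemon_url toolhost cadappldist 4024)" /local/cadappldist/
# The same transfer in double-colon form:
#   rsync -av --port=4024 toolhost::cadappldist/ /local/cadappldist/
```

Note that the daemon transport is unencrypted, which is the trade-off mentioned earlier in the thread.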