Hello,

I have ported OpenSSH 3.8p1 to a LynxOS platform. Recently I heard a report from the field that v2 is perceived to be significantly slower than v1. Is this a known issue? Are there any configuration parameters that can be modified to make v2 faster?

Thanks in advance for your response.

Amba
Amba Giri wrote:
> I have ported OpenSSH 3.8p1 to a LynxOS platform. Recently I heard a
> report from the field that v2 is perceived to be significantly slower
> than v1. Is this a known issue? Are there any configuration parameters
> that can be modified to make v2 faster?

In general, SSHv2 is slower because it's stronger. That said, there are some things that can be done to speed it up.

If you haven't already, fiddle with your compiler flags for both OpenSSL and OpenSSH. In particular, enabling hardware multiply instructions (e.g. -mv8 on SPARCs) makes a noticeable difference to the Diffie-Hellman exchange.

If you upgraded sshd, make sure you use the moduli file from a recent distribution. Older ones had 2kbit moduli that were actually 2k-1 bits, so sshd would end up using larger ones than requested. Older OpenSSH clients would ask for larger moduli than intended too, so newer clients ought to be faster as well.

You can also fiddle with the moduli file itself: keep only the lines with a generator of 2 (exponentiating 2 may be faster than 5 on some architectures).

(Most of this only applies if your clients are using DH Group Exchange.)

There's some more information here: http://www.openssh.com/faq.html#3.3

--
Darren Tucker (dtucker at zip.com.au)
GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4 37C9 C982 80C7 8FF4 FA69
Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.
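The moduli filtering suggested above can be sketched in C as follows. This is only an illustration, assuming the moduli(5) layout in which the generator is the sixth whitespace-separated field; the function names keep_moduli_line and filter_moduli are hypothetical and not part of OpenSSH.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if a moduli line should be kept. Comments and blank lines are
 * preserved; entries are kept only when the generator field (assumed to
 * be the sixth whitespace-separated field) is exactly "2".
 */
int keep_moduli_line(const char *line)
{
    char generator[8];

    /* Skip leading whitespace so comments and blank lines are detected. */
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '#' || *line == '\n' || *line == '\0')
        return 1;

    /* Skip five fields, then read the sixth (the generator). */
    if (sscanf(line, "%*s %*s %*s %*s %*s %7s", generator) != 1)
        return 0;               /* Malformed line: drop it. */
    return strcmp(generator, "2") == 0;
}

/*
 * Copy `in` to `out`, keeping only the lines accepted above.
 * Assumes no line is longer than the buffer.
 */
void filter_moduli(FILE *in, FILE *out)
{
    char line[8192];

    while (fgets(line, sizeof(line), in) != NULL)
        if (keep_moduli_line(line))
            fputs(line, out);
}
```

Calling filter_moduli(stdin, stdout) from a small main() and redirecting to a new file would leave the installed moduli file untouched until the result has been checked.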
Amba Giri wrote:
> Hello
>
> I have ported OpenSSH 3.8p1 to a LynxOS platform. Recently I heard a
> report from the field that v2 is perceived to be significantly slower
> than v1. Is this a known issue? Are there any configuration parameters
> that can be modified to make v2 faster?

Protocol 2 is slower because it includes a real per-packet MAC instead of a weak checksum. You can save some overhead by using a truncated MAC like hmac-sha1-96, but there is always going to be more work per packet.

I have looked at implementing AES CCM, which could be much faster, particularly on platforms with AES implemented in CPU instructions, but it doesn't fit nicely in the cipher and MAC negotiation mechanism.

-d
Should I be filing a Change Request or a bug report for this issue? It appears to me from these emails that there are ways to enhance the performance of SSH v2, and it may be useful for everyone to get the correct changes for this into the next release.

Amba

>>> Michael A Stevens <mstevens at cmu.edu> 02/28/05 11:09AM >>>
It's somewhat more complicated than this. The correct solution is to scale the SSH window to a value that matches the window of the protocol encapsulating the SSH data. Hard-coding window values into the binary isn't a great idea. For any given protocol stack, the windows should all match: smaller ones will be bottlenecks, and larger ones will waste space.

There is also a complication with just scaling the SSH window to a larger size, because of a small bug in the channel code that grows a buffer to something larger than the buffer check allows. This really shouldn't happen, though, as SSH should be able to depend on the underlying TCP buffer to hold data that it has not fetched yet.

Mike

On Mon, 28 Feb 2005, Markus Friedl wrote:
> On Mon, Feb 28, 2005 at 11:09:26AM -0500, Christopher Rapier wrote:
>> bandwidth = (MIN(tcp rwin, SSH2 FC buf))
>>             ----------------------------
>>                         RTT
>>
>> Since the effective SSH2 flow control buffer is 64K (it's actually
>> defined as 128K but only 1/2 of it is actually used) and most TCP
>
> well this can be changed.
>
> Index: channels.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/ssh/channels.c,v
> retrieving revision 1.211
> diff -u -r1.211 channels.c
> --- channels.c  29 Oct 2004 21:47:15 -0000      1.211
> +++ channels.c  27 Nov 2004 14:56:18 -0000
> @@ -1518,10 +1518,13 @@
>  static int
>  channel_check_window(Channel *c)
>  {
> -       if (c->type == SSH_CHANNEL_OPEN &&
> -           !(c->flags & (CHAN_CLOSE_SENT|CHAN_CLOSE_RCVD)) &&
> -           c->local_window < c->local_window_max/2 &&
> -           c->local_consumed > 0) {
> +       if (c->type != SSH_CHANNEL_OPEN ||
> +           c->flags & (CHAN_CLOSE_SENT|CHAN_CLOSE_RCVD) ||
> +           c->local_consumed <= 0)
> +               return 1;
> +       if ((c->local_window_max - c->local_window <
> +           3 * CHAN_SES_PACKET_DEFAULT) ||
> +           c->local_window < c->local_window_max/2) {
>                 packet_start(SSH2_MSG_CHANNEL_WINDOW_ADJUST);
>                 packet_put_int(c->remote_id);
>                 packet_put_int(c->local_consumed);
> Index: channels.h
> ===================================================================
> RCS file: /cvs/src/usr.bin/ssh/channels.h,v
> retrieving revision 1.75
> diff -u -r1.75 channels.h
> --- channels.h  29 Oct 2004 21:47:15 -0000      1.75
> +++ channels.h  27 Nov 2004 14:55:27 -0000
> @@ -118,7 +118,7 @@
>
>  /* default window/packet sizes for tcp/x11-fwd-channel */
>  #define CHAN_SES_PACKET_DEFAULT        (32*1024)
> -#define CHAN_SES_WINDOW_DEFAULT        (4*CHAN_SES_PACKET_DEFAULT)
> +#define CHAN_SES_WINDOW_DEFAULT        (40*CHAN_SES_PACKET_DEFAULT)
>  #define CHAN_TCP_PACKET_DEFAULT        (32*1024)
>  #define CHAN_TCP_WINDOW_DEFAULT        (4*CHAN_TCP_PACKET_DEFAULT)
>  #define CHAN_X11_PACKET_DEFAULT        (16*1024)
>
> _______________________________________________
> openssh-unix-dev mailing list
> openssh-unix-dev at mindrot.org
> http://www.mindrot.org/mailman/listinfo/openssh-unix-dev
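As a worked illustration of the bandwidth bound quoted in the message above, the following C sketch computes min(tcp rwin, SSH2 flow control buffer) / RTT. The function name max_throughput and the example figures (a 1 MiB TCP window, a 100 ms round trip) are assumptions chosen for illustration, not measurements from this thread.

```c
#include <stdio.h>

/*
 * Upper bound on throughput in bytes per second:
 *   min(TCP receive window, SSH flow-control window) / round-trip time.
 * All parameters are illustrative inputs supplied by the caller.
 */
double max_throughput(double tcp_rwin_bytes, double ssh_window_bytes,
    double rtt_seconds)
{
    double limit = tcp_rwin_bytes < ssh_window_bytes ?
        tcp_rwin_bytes : ssh_window_bytes;

    return limit / rtt_seconds;
}

/* Print the bound for one hypothetical path. */
void print_example(void)
{
    /* Assumed: 1 MiB TCP window, 64 KiB effective SSH window, 100 ms RTT. */
    printf("%.0f bytes/s\n", max_throughput(1048576.0, 65536.0, 0.1));
}
```

Under these assumptions the 64 KiB effective SSH window, not the TCP window, sets the limit at roughly 640 KiB/s, so enlarging only the TCP window would not raise it.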
Markus Friedl wrote:
> On Mon, Feb 28, 2005 at 02:09:34PM -0500, Michael A Stevens wrote:
>
>> It's somewhat more complicated than this. The correct solution is to scale
>> the SSH window to a value that matches the window of the protocol
>
> the point of the patch is not to hard-code the window but
> to change the only-half-window-is-used bug.

The question is why only half the buffer was being used in the first place. I think it's reasonably safe to assume that there was a reason for it being set up this way.
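To make the "only half the window is used" behaviour concrete, here is a simplified C model of the two window-adjust conditions in the patch quoted earlier in the thread. It is a sketch, not OpenSSH code: the struct and function names are hypothetical, and the channel type and close-flag checks are left out.

```c
#include <stdbool.h>

#define PACKET_DEFAULT (32 * 1024)  /* mirrors CHAN_SES_PACKET_DEFAULT */

/* Hypothetical stand-in for the three Channel fields the patch touches. */
struct window {
    unsigned int local_window;      /* bytes the peer may still send */
    unsigned int local_window_max;  /* configured window size */
    unsigned int local_consumed;    /* bytes consumed since last adjust */
};

/* Before the patch: adjust only once the window has fallen below half. */
bool adjust_before_patch(const struct window *w)
{
    return w->local_window < w->local_window_max / 2 &&
        w->local_consumed > 0;
}

/*
 * After the patch: with data consumed, adjust either when less than three
 * packets' worth of the window is in use, or when the window has fallen
 * below half of its maximum.
 */
bool adjust_after_patch(const struct window *w)
{
    if (w->local_consumed == 0)
        return false;
    return (w->local_window_max - w->local_window < 3 * PACKET_DEFAULT) ||
        w->local_window < w->local_window_max / 2;
}
```

With a 128K maximum and 1K consumed, the first condition is false and the second is true, which is the difference the patch is aiming at. Whether the original half-window threshold was deliberate is not answered by this model.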
Amba Giri wrote:
> Should I be filing a Change Request or a bug report for this issue?
> It appears to me that there are ways to enhance the performance of
> SSH v2 from these emails and it may be useful for everyone to get
> the correct changes for this into the next release.

You might want to check out http://www.psc.edu/networking/projects/hpn-ssh

I presented the results of this work at Supercomputing 2004 (SC04). It was reasonably well received. We also got a grant from Cisco to continue work on it. (To allay any concerns: we aren't forking, just doing development for the HPC and HPN community we are associated with.)
Christopher, thanks for your response.

Our customers have 3.8.1p1 binaries today. What is the safest and most efficient way to get the v2 performance enhancements to them? I am not very familiar with the Open Source patch process. Is there a link for this?

Should I manually apply the changes that I see in the link you sent to the source files of 3.8.1p1? Are there any risks or drawbacks to this approach? Is it better to upgrade the customer to 3.9.1p1 and, if so, does the latter binary have any v2 performance enhancements?

Thank you in advance...

Amba
Autotuning is becoming more common outside research environments; it's already in the Linux kernel.

http://www.csm.ornl.gov/~dunigan/netperf/auto.html

Mike

On Tue, 1 Mar 2005, Markus Friedl wrote:
> On Tue, Mar 01, 2005 at 11:47:13AM -0500, Michael A Stevens wrote:
>> There is also another subtle issue of how often should the window be
>> polled. Polling the tcp window on a WINDOW_ADJUST message could prove to
>> be costly if they happen very often.
>
> retrieving the tcp receive space once after connection
> establishment should be enough for almost all tcp
> stacks out there.
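Reading the receive space once after connection establishment, as suggested in the quoted message, might look like the following C sketch. It uses the standard getsockopt() call with SO_RCVBUF; the function name socket_receive_space is hypothetical, and how OpenSSH would use the value is not shown.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Return the kernel's receive buffer size for `fd` in bytes, or -1 on
 * error. Intended to be called once, right after the connection is set up.
 */
int socket_receive_space(int fd)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) == -1)
        return -1;
    return size;
}
```

On a stack that autotunes its buffers, a single reading like this may not reflect later changes, which is presumably why autotuning is raised in the reply above.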