The current code in channels.c keeps an array of pointers to Channel
structs (each initially NULL) and iterates through it, in full, every
time a channel needs to be dealt with. If only one channel is in use,
which is relatively common, the code still loops through the entire array.
This patch creates a linked list of pointers to these structs, and the
code steps through the linked list instead. Since the linked list is only
as long as the number of active channels, the amount of time spent
looping is reduced, in some cases considerably.
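To illustrate the idea (this is not the attached patch, only a minimal
sketch), the following compares a full scan of a slot table with a walk
over a list of active channels. The Channel struct here is a simplified
stand-in, and MAX_CHANNELS, slots, active_head, the next field, and the
sketch_* and count_* functions are all hypothetical names chosen for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CHANNELS 256            /* hypothetical table size */

/* Simplified stand-in for the real Channel struct. */
typedef struct Channel {
    int id;                         /* index of this channel's slot */
    struct Channel *next;           /* hypothetical link to next active channel */
} Channel;

static Channel *slots[MAX_CHANNELS];    /* NULL means the slot is unused */
static Channel *active_head = NULL;     /* head of the active-channel list */

/* Array approach: every slot is inspected, used or not. */
static size_t count_by_array(size_t *steps)
{
    size_t found = 0;
    *steps = 0;
    for (size_t i = 0; i < MAX_CHANNELS; i++) {
        (*steps)++;
        if (slots[i] != NULL)
            found++;
    }
    return found;
}

/* List approach: only active channels are visited. */
static size_t count_by_list(size_t *steps)
{
    size_t found = 0;
    *steps = 0;
    for (Channel *c = active_head; c != NULL; c = c->next) {
        assert(slots[c->id] == c);  /* list and table must agree */
        (*steps)++;
        found++;
    }
    return found;
}

/* Allocate a channel in the given slot and push it onto the list. */
static Channel *sketch_open(int id)
{
    if (id < 0 || id >= MAX_CHANNELS || slots[id] != NULL)
        return NULL;
    Channel *c = calloc(1, sizeof(*c));
    if (c == NULL)
        return NULL;
    c->id = id;
    c->next = active_head;
    active_head = c;
    slots[id] = c;
    return c;
}

/* Unlink the channel from the list, clear its slot, and free it. */
static void sketch_close(int id)
{
    if (id < 0 || id >= MAX_CHANNELS || slots[id] == NULL)
        return;
    Channel *c = slots[id];
    for (Channel **pp = &active_head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            break;
        }
    }
    slots[id] = NULL;
    free(c);
}
```

With a single open channel, count_by_array still takes MAX_CHANNELS
steps while count_by_list takes one. Note that in this sketch closing a
channel walks the list to unlink it; a doubly linked list would avoid
that at the cost of an extra pointer per channel.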
I've included the first 15 lines of an oprofile report comparing
multiple data transfers using the standard code and the patched code as
follows.
Standard:
samples  cum. samples  %        cum. %   symbol name
15360    15360         11.4140  11.4140  client_loop
13277    28637         9.8661   21.2801  packet_send2_wrapped
11017    39654         8.1867   29.4668  channel_output_poll
8070     47724         5.9968   35.4635  buffer_append_space
7914     55638         5.8809   41.3444  channel_handler
5970     61608         4.4363   45.7807  arc4random
5346     66954         3.9726   49.7533  channel_pre_open
5159     72113         3.8336   53.5869  packet_read_poll_seqnr
4253     76366         3.1604   56.7473  channel_post_open
3864     80230         2.8713   59.6186  cipher_crypt
3849     84079         2.8602   62.4788  buffer_len
3541     87620         2.6313   65.1101  channel_prepare_select
3267     90887         2.4277   67.5378  client_wait_until_can...
2557     93444         1.9001   69.4379  buffer_append
Patched:
samples  cum. samples  %        cum. %   symbol name
15832    15832         11.4148  11.4148  client_loop
15059    30891         10.8575  22.2723  packet_send2_wrapped
9635     40526         6.9468   29.2191  channel_output_poll
7486     48012         5.3974   34.6165  buffer_append_space
7087     55099         5.1097   39.7262  arc4random
6130     61229         4.4197   44.1459  channel_post_open
5388     66617         3.8847   48.0306  channel_pre_open
4875     71492         3.5149   51.5455  packet_read_poll_seqnr
4729     76221         3.4096   54.9550  channel_handler
4394     80615         3.1681   58.1231  cipher_crypt
4179     84794         3.0130   61.1361  channel_prepare_select
4044     88838         2.9157   64.0519  buffer_len
3388     92226         2.4427   66.4946  client_wait_until_can...
2812     95038         2.0274   68.5220  buffer_append
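As you can see, less time is spent in some of the channel routines:
channel_handler drops from 5.88% to 3.41% of samples and
channel_output_poll from 8.19% to 6.95%, although channel_post_open and
channel_prepare_select take a somewhat larger share.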
I am *not* sure that at this time I can prove that this translates into
increased performance. Nor can I prove at this time that this will
remain true in all usage situations. However, my gut instinct is that in
multiuser environments this may prove to have some benefit.
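As a rough way to put one number on the comparison, the sketch below
sums the "%" column for the five channel_* symbols that appear in both
reports above. The struct and function names are hypothetical; the
figures are copied from the two listings. It only compares sample
shares and says nothing about wall-clock throughput.

```c
#include <assert.h>
#include <stddef.h>

/* One channel_* symbol with its share of samples in each report. */
struct profile_row {
    const char *symbol;
    double standard_pct;    /* "%" column, standard code */
    double patched_pct;     /* "%" column, patched code */
};

static const struct profile_row channel_rows[] = {
    { "channel_output_poll",    8.1867, 6.9468 },
    { "channel_handler",        5.8809, 3.4096 },
    { "channel_pre_open",       3.9726, 3.8847 },
    { "channel_post_open",      3.1604, 4.4197 },
    { "channel_prepare_select", 2.6313, 3.0130 },
};

/* Sum the share of samples; patched must be 0 (standard) or 1 (patched). */
static double channel_share(int patched)
{
    assert(patched == 0 || patched == 1);
    double total = 0.0;
    size_t n = sizeof(channel_rows) / sizeof(channel_rows[0]);
    for (size_t i = 0; i < n; i++)
        total += patched ? channel_rows[i].patched_pct
                         : channel_rows[i].standard_pct;
    return total;
}
```

By this measure the listed channel routines account for roughly 23.8%
of samples in the standard run and roughly 21.7% in the patched run.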
I was hoping some people could take a look at the attached patches, try
them out, see what they think, and let me know if they feel there is any
merit to them. They did pass all regression tests under Linux, NetBSD,
and OS X.
Thank you for your time.
Chris Rapier
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: channels.h.diff
Url:
http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20070726/4c1c7d54/attachment-0002.ksh
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: channels.c.diff
Url:
http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20070726/4c1c7d54/attachment-0003.ksh