Hi Damien,
On Feb 23 10:28, Corinna Vinschen wrote:> On Feb 22 07:59, Damien Miller wrote:
> > On Sat, 21 Feb 2015, Corinna Vinschen wrote:
> > > - The failing last loop in the "forwarding" script as
reported back
> > > during 6.7 testing is still failing for me more often than not.
It's
> > > always the same reason, the script tries to use in-use port
numbers.
> > > Reducing the forwarding script to only this last test loop
succeeds
> > > every time, but is quite a hack for testing.
> >
> > Is it colliding with itself or with other services running on your
> > test host? (especially with ones windows starts itself)
>
> It's colliding with itself, afaics. It tries to use ports 3301/3302
> again after it already used it in a former test in the script, and with
> a high probability they are still taken. The same ports are used for
> earlier tests in the same script. On the testmachine, only very few
> ports are taken by Windows processes and only one of them (RDP, port
> 3389) in the range used by the tests.
>
> Note that this is still the same as described in
> http://lists.mindrot.org/pipermail/openssh-unix-dev/2014-August/032842.html
>
> See also
> http://lists.mindrot.org/pipermail/openssh-unix-dev/2014-August/032854.html
>
> And this workaround still lets the test succeed:
> http://lists.mindrot.org/pipermail/openssh-unix-dev/2014-August/032862.html
I think I'm a step closer to a solution.
I just added debug output to the forwarding.sh script and it turns out
that the test prior to the "transfer over chained unix domain socket
forwards ..." test, namely
echo "LocalForward ${base}01 127.0.0.1:$PORT" >>
$OBJ/ssh_config
echo "RemoteForward ${base}02 127.0.0.1:${base}01" >>
$OBJ/ssh_config
for p in 1 2; do
trace "config file: start forwarding, fork to background"
${SSH} -$p -F $OBJ/ssh_config -f somehost sleep 10
trace "config file: transfer over forwarded channels and check
result"
${SSH} -F $OBJ/ssh_config -p${base}02 -o 'ConnectionAttempts=4' \
somehost cat ${DATA} > ${COPY}
test -s ${COPY} || fail "failed copy of ${DATA}"
cmp ${DATA} ${COPY} || fail "corrupted copy of ${DATA}"
wait
done
leaves the ports 3301/3302 in TIME_WAIT state (as is 4242 from some
earlier test). Here are the relevant excerpts from ps -e and (Windows)
netstat output. The first group is the output prior to the above test:
PID PPID PGID WINPID TTY UID STIME COMMAND
1320 3632 1320 3964 ? 1049577 11:19:26
/home/corinna/sshbuild/bin/sshd
3512 3632 3512 3444 ? 1049577 11:19:27
/home/corinna/sshbuild/bin/sshd
3632 1 3632 3632 ? 1049577 11:17:49
/home/corinna/sshbuild/bin/sshd
Active Connections
Proto Local Address Foreign Address State PID
TCP 127.0.0.1:4242 0.0.0.0:0 LISTENING 3632
TCP 127.0.0.1:61665 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61666 127.0.0.1:61667 TIME_WAIT 0
TCP 127.0.0.1:61668 127.0.0.1:61669 TIME_WAIT 0
TCP 127.0.0.1:61673 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61674 127.0.0.1:61675 TIME_WAIT 0
TCP 127.0.0.1:61676 127.0.0.1:61677 TIME_WAIT 0
TCP 127.0.0.1:61679 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61680 127.0.0.1:61681 TIME_WAIT 0
TCP 127.0.0.1:61682 127.0.0.1:61683 TIME_WAIT 0
TCP 127.0.0.1:61686 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61687 127.0.0.1:61688 TIME_WAIT 0
TCP 127.0.0.1:61689 127.0.0.1:61690 TIME_WAIT 0
TCP 127.0.0.1:61691 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61692 127.0.0.1:61693 TIME_WAIT 0
TCP 127.0.0.1:61694 127.0.0.1:61695 TIME_WAIT 0
the second group is the output from right between the above test and the
"transfer over chained unix domain socket forwards..."
PID PPID PGID WINPID TTY UID STIME COMMAND
3112 376 376 3212 ? 1049577 11:19:29 /usr/bin/sleep
3520 388 388 3716 ? 1049577 11:19:32 /usr/bin/sleep
3056 3632 3056 2100 ? 1049577 11:19:31
/home/corinna/sshbuild/bin/sshd
4048 1 4048 4048 ? 1049577 11:19:31
/home/corinna/sshbuild/bin/ssh
2372 3632 2372 4024 ? 1049577 11:19:33
/home/corinna/sshbuild/bin/sshd
2728 1 2728 2728 ? 1049577 11:19:28
/home/corinna/sshbuild/bin/ssh
2908 3632 2908 328 ? 1049577 11:19:28
/home/corinna/sshbuild/bin/sshd
3632 1 3632 3632 ? 1049577 11:17:49
/home/corinna/sshbuild/bin/sshd
Active Connections
Proto Local Address Foreign Address State PID
TCP 127.0.0.1:3301 0.0.0.0:0 LISTENING 2640
TCP 127.0.0.1:3301 127.0.0.1:61714 CLOSE_WAIT 2640
TCP 127.0.0.1:3302 0.0.0.0:0 LISTENING 328
TCP 127.0.0.1:3302 127.0.0.1:61713 CLOSE_WAIT 328
TCP 127.0.0.1:4242 0.0.0.0:0 LISTENING 3632
TCP 127.0.0.1:4242 127.0.0.1:61696 ESTABLISHED 3632
TCP 127.0.0.1:4242 127.0.0.1:61708 ESTABLISHED 3632
TCP 127.0.0.1:4242 127.0.0.1:61715 CLOSE_WAIT 3632
TCP 127.0.0.1:61673 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61679 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61680 127.0.0.1:61681 TIME_WAIT 0
TCP 127.0.0.1:61682 127.0.0.1:61683 TIME_WAIT 0
TCP 127.0.0.1:61686 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61687 127.0.0.1:61688 TIME_WAIT 0
TCP 127.0.0.1:61689 127.0.0.1:61690 TIME_WAIT 0
TCP 127.0.0.1:61691 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61692 127.0.0.1:61693 TIME_WAIT 0
TCP 127.0.0.1:61694 127.0.0.1:61695 TIME_WAIT 0
TCP 127.0.0.1:61696 127.0.0.1:4242 ESTABLISHED 2640
TCP 127.0.0.1:61697 127.0.0.1:61698 TIME_WAIT 0
TCP 127.0.0.1:61699 127.0.0.1:61700 TIME_WAIT 0
TCP 127.0.0.1:61701 127.0.0.1:3302 TIME_WAIT 0
TCP 127.0.0.1:61702 127.0.0.1:3301 TIME_WAIT 0
TCP 127.0.0.1:61703 127.0.0.1:4242 TIME_WAIT 0
TCP 127.0.0.1:61704 127.0.0.1:61705 TIME_WAIT 0
TCP 127.0.0.1:61706 127.0.0.1:61707 TIME_WAIT 0
TCP 127.0.0.1:61708 127.0.0.1:4242 ESTABLISHED 2988
TCP 127.0.0.1:61709 127.0.0.1:61710 TIME_WAIT 0
TCP 127.0.0.1:61711 127.0.0.1:61712 TIME_WAIT 0
TCP 127.0.0.1:61713 127.0.0.1:3302 FIN_WAIT_2 3380
TCP 127.0.0.1:61714 127.0.0.1:3301 FIN_WAIT_2 2728
TCP 127.0.0.1:61715 127.0.0.1:4242 FIN_WAIT_2 328
TCP 127.0.0.1:61716 127.0.0.1:61717 TIME_WAIT 0
TCP 127.0.0.1:61718 127.0.0.1:61719 TIME_WAIT 0
This may well be a problem local to Windows. Btw., the large number of
AF_INET sockets is a result of the way how Cygwin implements AF_LOCAL
sockets: They are emulated by local AF_INET sockets since WIndows
doesn't know the concept of AF_LOCAL sockets.
Note that there are still sleep processes running. So on a hunch I just
added a `sleep 30' between the two tests and, lo and behold, the
forwarding.sh test completes successfully every time:
diff --git a/regress/forwarding.sh b/regress/forwarding.sh
index f799d49..1489407 100644
--- a/regress/forwarding.sh
+++ b/regress/forwarding.sh
@@ -120,6 +120,8 @@ for p in 1 2; do
wait
done
+sleep 30
+
for p in 2; do
trace "transfer over chained unix domain socket forwards and check
result"
rm -f $OBJ/unix-[123].fwd
Thanks,
Corinna
--
Corinna Vinschen
Cygwin Maintainer
Red Hat
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL:
<http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20150223/ba968482/attachment-0001.bin>