The 4.6p1 sshd is logging this error during remote commands or file transfers: error: channel 0: chan_read_failed for istate 3 Platform is Solaris 8, 4.6p1 + OpenSSL 0.9.8d. The commands and transfers work correctly, so the error message appears to be spurious. The error message does not appear when processing logins. Otherwise 4.6p1 is running without any apparent problems. This error did not occur running 4.5p1. Here is a trace: Accepted hostbased for jws from 171.64.11.11 port 64578 ssh2 debug1: Entering interactive session for SSH2. debug2: fd 4 setting O_NONBLOCK debug2: fd 5 setting O_NONBLOCK debug1: server_init_dispatch_20 debug1: server_input_channel_open: ctype session rchan 0 win 131072 max 32768 debug1: input_session_request debug1: channel 0: new [server-session] debug1: session_new: init debug1: session_new: session 0 debug1: session_open: channel 0 debug1: session_open: session 0: link with channel 0 debug1: server_input_channel_open: confirm session debug1: server_input_channel_req: channel 0 request exec reply 0 debug1: session_by_channel: session 0 channel 0 debug1: session_input_channel_req: session 0 req exec debug2: fd 3 setting TCP_NODELAY debug2: fd 7 setting O_NONBLOCK debug3: fd 7 is O_NONBLOCK debug2: fd 9 setting O_NONBLOCK debug2: channel 0: read 38 from efd 9 debug2: channel 0: rwin 131072 elen 38 euse 1 debug2: channel 0: sent ext data 38 debug2: channel 0: read 15 from efd 9 debug2: channel 0: rwin 131034 elen 15 euse 1 debug2: channel 0: sent ext data 15 debug2: channel 0: read 117 from efd 9 debug2: channel 0: rwin 131019 elen 117 euse 1 debug2: channel 0: sent ext data 117 debug2: channel 0: read 73 from efd 9 debug2: channel 0: rwin 130902 elen 73 euse 1 debug2: channel 0: sent ext data 73 debug2: channel 0: read 103 from efd 9 debug2: channel 0: rwin 130829 elen 103 euse 1 debug2: channel 0: sent ext data 103 debug2: channel 0: read<=0 rfd 7 len 0 debug2: channel 0: read failed debug2: channel 0: close_read debug2: channel 0: input open -> drain debug2: channel 0: ibuf_empty delayed efd 9/(0) debug2: channel 0: read 0 from efd 9 debug2: channel 0: closing read-efd 9 debug2: channel 0: ibuf empty debug2: channel 0: send eof debug2: channel 0: input drain -> closed debug2: notify_done: reading debug1: Received SIGCHLD. debug1: session_by_pid: pid 11033 debug1: session_exit_message: session 0 channel 0 pid 11033 debug2: channel 0: request exit-status confirm 0 debug1: session_exit_message: release channel 0 debug2: channel 0: write failed debug2: channel 0: close_write debug2: channel 0: output open -> closed debug2: channel 0: read<=0 rfd 7 len 0 debug2: channel 0: read failed channel 0: chan_read_failed for istate 3 debug2: channel 0: send close debug3: channel 0: will not send data after close debug2: channel 0: read<=0 rfd 7 len 0 debug2: channel 0: read failed channel 0: chan_read_failed for istate 3 debug2: channel 0: rcvd close debug3: channel 0: will not send data after close debug2: channel 0: is dead debug2: channel 0: gc: notify user debug1: session_by_channel: session 0 channel 0 debug1: session_close_by_channel: channel 0 child 0 debug1: session_close: session 0 pid 0 debug2: channel 0: gc: user detached debug2: channel 0: is dead debug2: channel 0: garbage collecting debug1: channel 0: free: server-session, nchannels 1 debug3: channel 0: status: The following connections are open: #0 server-session (t4 r0 i3/0 o3/0 fd 7/7 cfd -1) debug3: channel 0: close_fds r 7 w 7 e -1 c -1 Connection closed by 171.64.11.11 debug1: do_cleanup Closing connection to 171.64.11.11
On Fri, Mar 23, 2007 at 03:54:46PM -0700, Jim Stosick wrote:> The 4.6p1 sshd is logging this error during remote commands or file > transfers: > > error: channel 0: chan_read_failed for istate 3 > > Platform is Solaris 8, 4.6p1 + OpenSSL 0.9.8d. > > The commands and transfers work correctly, so the error message appears > to be spurious. The error message does not appear when processing logins. > > Otherwise 4.6p1 is running without any apparent problems. This error > did not occur running 4.5p1.[...] I suspect this is related to the change for bug #52. You could try reverting the following patch ("patch -p1 -R") and see if the error messages stop. Index: openssh/channels.c diff -u openssh/channels.c:1.250 openssh/channels.c:1.251 --- openssh/channels.c:1.250 Fri Jan 5 16:30:16 2007 +++ openssh/channels.c Mon Jan 29 10:16:28 2007 @@ -1449,10 +1449,11 @@ int len; if (c->rfd != -1 && - FD_ISSET(c->rfd, readset)) { + (c->detach_close || FD_ISSET(c->rfd, readset))) { errno = 0; len = read(c->rfd, buf, sizeof(buf)); - if (len < 0 && (errno == EINTR || errno == EAGAIN)) + if (len < 0 && (errno == EINTR || + (errno == EAGAIN && !(c->isatty && c->detach_close)))) return 1; #ifndef PTY_ZEROREAD if (len <= 0) { @@ -1604,11 +1605,12 @@ c->local_consumed += len; } } else if (c->extended_usage == CHAN_EXTENDED_READ && - FD_ISSET(c->efd, readset)) { + (c->detach_close || FD_ISSET(c->efd, readset))) { len = read(c->efd, buf, sizeof(buf)); debug2("channel %d: read %d from efd %d", c->self, len, c->efd); - if (len < 0 && (errno == EINTR || errno == EAGAIN)) + if (len < 0 && (errno == EINTR || + (errno == EAGAIN && !c->detach_close))) return 1; if (len <= 0) { debug2("channel %d: closing read-efd %d", Index: openssh/serverloop.c diff -u openssh/serverloop.c:1.150 openssh/serverloop.c:1.151 --- openssh/serverloop.c:1.150 Tue Oct 24 03:02:41 2006 +++ openssh/serverloop.c Mon Jan 29 10:16:28 2007 @@ -280,6 +280,7 @@ struct timeval tv, *tvp; int ret; int client_alive_scheduled = 0; + int program_alive_scheduled = 0; /* * if using client_alive, set the max timeout accordingly, @@ -317,6 +318,7 @@ * the client, try to get some more data from the program. */ if (packet_not_very_much_data_to_write()) { + program_alive_scheduled = child_terminated; if (!fdout_eof) FD_SET(fdout, *readsetp); if (!fderr_eof) @@ -362,8 +364,16 @@ memset(*writesetp, 0, *nallocp); if (errno != EINTR) error("select: %.100s", strerror(errno)); - } else if (ret == 0 && client_alive_scheduled) - client_alive_check(); + } else { + if (ret == 0 && client_alive_scheduled) + client_alive_check(); + if (!compat20 && program_alive_scheduled && fdin_is_tty) { + if (!fdout_eof) + FD_SET(fdout, *readsetp); + if (!fderr_eof) + FD_SET(fderr, *readsetp); + } + } notify_done(*readsetp); } @@ -407,7 +417,8 @@ if (!fdout_eof && FD_ISSET(fdout, readset)) { errno = 0; len = read(fdout, buf, sizeof(buf)); - if (len < 0 && (errno == EINTR || errno == EAGAIN)) { + if (len < 0 && (errno == EINTR || + (errno == EAGAIN && !child_terminated))) { /* do nothing */ #ifndef PTY_ZEROREAD } else if (len <= 0) { @@ -425,7 +436,8 @@ if (!fderr_eof && FD_ISSET(fderr, readset)) { errno = 0; len = read(fderr, buf, sizeof(buf)); - if (len < 0 && (errno == EINTR || errno == EAGAIN)) { + if (len < 0 && (errno == EINTR || + (errno == EAGAIN && !child_terminated))) { /* do nothing */ #ifndef PTY_ZEROREAD } else if (len <= 0) { -- Darren Tucker (dtucker at zip.com.au) GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4 37C9 C982 80C7 8FF4 FA69 Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.
On 3/24/2007 4:49 AM, Darren Tucker wrote:> On Fri, Mar 23, 2007 at 03:54:46PM -0700, Jim Stosick wrote: >> The 4.6p1 sshd is logging this error during remote commands or file >> transfers: >> >> error: channel 0: chan_read_failed for istate 3 >> >> Platform is Solaris 8, 4.6p1 + OpenSSL 0.9.8d.> I suspect this is related to the change for bug #52. You could try > reverting the following patch ("patch -p1 -R") and see if the error > messages stop.Yes, that eliminates the chan_read_failed error messages.
On Mon, 26 Mar 2007, Jim Stosick wrote:> On 3/24/2007 4:49 AM, Darren Tucker wrote: > > > On Fri, Mar 23, 2007 at 03:54:46PM -0700, Jim Stosick wrote: > >> The 4.6p1 sshd is logging this error during remote commands or file > >> transfers: > >> > >> error: channel 0: chan_read_failed for istate 3 > >> > >> Platform is Solaris 8, 4.6p1 + OpenSSL 0.9.8d. > > > I suspect this is related to the change for bug #52. You could try > > reverting the following patch ("patch -p1 -R") and see if the error > > messages stop. > > Yes, that eliminates the chan_read_failed error messages.Well, it also undoes the fix for bug #52. I think it is safe to simply hush the error message: Index: nchan.c ==================================================================RCS file: /var/cvs/openssh/nchan.c,v retrieving revision 1.56 diff -u -p -r1.56 nchan.c --- nchan.c 5 Aug 2006 02:39:40 -0000 1.56 +++ nchan.c 4 Apr 2007 03:56:11 -0000 @@ -135,15 +135,16 @@ chan_rcvd_oclose1(Channel *c) void chan_read_failed(Channel *c) { - debug2("channel %d: read failed", c->self); + debug2("channel %d: read failed, istate %d", c->self, c->istate); switch (c->istate) { case CHAN_INPUT_OPEN: chan_shutdown_read(c); chan_set_istate(c, CHAN_INPUT_WAIT_DRAIN); break; default: - error("channel %d: chan_read_failed for istate %d", - c->self, c->istate); + if (!c->isatty || !c->detach_close) + error("channel %d: chan_read_failed for istate %d", + c->self, c->istate); break; } }
I just applied the nchan.c patch to openssh-4.6p1 as furnished in the previous message from Damien Miller:> Well, it also undoes the fix for bug #52. I think it is safe to simply hush > the error message:> Index: nchan.c > ==================================================================> CS file: /var/cvs/openssh/nchan.c,v > retrieving revision 1.56 > diff -u -p -r1.56 nchan.c > --- nchan.c 5 Aug 2006 02:39:40 -0000 1.56 > +++ nchan.c 4 Apr 2007 03:56:11 -0000 > @@ -135,15 +135,16 @@ chan_rcvd_oclose1(Channel *c) > void > chan_read_failed(Channel *c) > { > - debug2("channel %d: read failed", c->self); > + debug2("channel %d: read failed, istate %d", c->self, c->istate); > switch (c->istate) { > case CHAN_INPUT_OPEN: > chan_shutdown_read(c); > chan_set_istate(c, CHAN_INPUT_WAIT_DRAIN); > break; > default: > - error("channel %d: chan_read_failed for istate %d", > - c->self, c->istate); > + if (!c->isatty || !c->detach_close) > + error("channel %d: chan_read_failed for istate %d",< + c->self, c->istate);> break; > } > }Unfortunately, sshd still produces the message: error: channel 0: chan_read_failed for istate 3 -- --------------------------------------- ___ _ _ _ ___ _ _ | _ |_) | | | _ \__ _| |_ __| |_ | _ \ | | | | / _` | | '_ \ ' \ |___/_|_|_| |_|_\__,_|_| .__/_||_| I work for the US Navy.|_| We are not responsible for each other's mistakes. ---------------------------------------
On Wed, 4 Apr 2007, Damien Miller wrote:> Well, it also undoes the fix for bug #52. I think it is safe to simply hush > the error message:ugh, the logic in that patch was almost, but not completely, wrong. Please try this one instead: Index: channels.c ==================================================================RCS file: /var/cvs/openssh/channels.c,v retrieving revision 1.251 diff -u -p -r1.251 channels.c --- channels.c 28 Jan 2007 23:16:28 -0000 1.251 +++ channels.c 8 Apr 2007 22:43:25 -0000 @@ -932,7 +932,7 @@ channel_pre_x11_open(Channel *c, fd_set } else if (ret == -1) { logit("X11 connection rejected because of wrong authentication."); debug2("X11 rejected %d i%d/o%d", c->self, c->istate, c->ostate); - chan_read_failed(c); + chan_read_failed(c, 0); buffer_clear(&c->input); chan_ibuf_empty(c); buffer_clear(&c->output); @@ -1472,14 +1472,15 @@ channel_handle_rfd(Channel *c, fd_set *r c->type = SSH_CHANNEL_INPUT_DRAINING; debug2("channel %d: input draining.", c->self); } else { - chan_read_failed(c); + chan_read_failed(c, (errno == EAGAIN && + !(c->isatty && c->detach_close))); } return -1; } if (c->input_filter != NULL) { if (c->input_filter(c, buf, len) == -1) { debug2("channel %d: filter stops", c->self); - chan_read_failed(c); + chan_read_failed(c, 0); } } else if (c->datagram) { buffer_put_string(&c->input, buf, len); @@ -1643,7 +1644,7 @@ channel_handle_ctl(Channel *c, fd_set *r chan_mark_dead(c); return -1; } else { - chan_read_failed(c); + chan_read_failed(c, 0); chan_write_failed(c); } return -1; Index: channels.h ==================================================================RCS file: /var/cvs/openssh/channels.h,v retrieving revision 1.81 diff -u -p -r1.81 channels.h --- channels.h 5 Aug 2006 02:39:39 -0000 1.81 +++ channels.h 8 Apr 2007 22:43:41 -0000 @@ -240,7 +240,7 @@ void chan_mark_dead(Channel *); /* channel events */ void chan_rcvd_oclose(Channel *); -void chan_read_failed(Channel *); +void chan_read_failed(Channel *, int); void chan_ibuf_empty(Channel *); void chan_rcvd_ieof(Channel *); Index: nchan.c ==================================================================RCS file: /var/cvs/openssh/nchan.c,v retrieving revision 1.56 diff -u -p -r1.56 nchan.c --- nchan.c 5 Aug 2006 02:39:40 -0000 1.56 +++ nchan.c 8 Apr 2007 22:41:36 -0000 @@ -133,17 +133,19 @@ chan_rcvd_oclose1(Channel *c) } } void -chan_read_failed(Channel *c) +chan_read_failed(Channel *c, int simulated) { - debug2("channel %d: read failed", c->self); + debug2("channel %d: read failed%s, istate %d", c->self, + simulated ? " (simulated)" : "", c->istate); switch (c->istate) { case CHAN_INPUT_OPEN: chan_shutdown_read(c); chan_set_istate(c, CHAN_INPUT_WAIT_DRAIN); break; default: - error("channel %d: chan_read_failed for istate %d", - c->self, c->istate); + if (!simulated) + error("channel %d: chan_read_failed for istate %d", + c->self, c->istate); break; } }
On Mon, Apr 09, 2007 at 08:46:56AM +1000, Damien Miller wrote:> On Wed, 4 Apr 2007, Damien Miller wrote: > > > Well, it also undoes the fix for bug #52. I think it is safe to simply hush > > the error message: > > ugh, the logic in that patch was almost, but not completely, wrong. Please > try this one instead:I can reproduce the problem on Solaris 8 but that patch does not resolve the spurious errors for me (which appears to be because the read is returning zero). BTW, While looking at it, though, I was bothered enormously by the PTY_ZEROREAD ifdef mess which made it hard to read. I think we should do something like this once the original problem is sorted out. Index: Makefile.in ==================================================================RCS file: /usr/local/src/security/openssh/cvs/openssh/Makefile.in,v retrieving revision 1.284 diff -u -p -r1.284 Makefile.in --- Makefile.in 25 Mar 2007 08:26:01 -0000 1.284 +++ Makefile.in 9 Apr 2007 09:01:03 -0000 @@ -71,7 +71,7 @@ LIBSSH_OBJS=acss.o authfd.o authfile.o b atomicio.o key.o dispatch.o kex.o mac.o uidswap.o uuencode.o misc.o \ monitor_fdpass.o rijndael.o ssh-dss.o ssh-rsa.o dh.o kexdh.o \ kexgex.o kexdhc.o kexgexc.o scard.o msg.o progressmeter.o dns.o \ - entropy.o scard-opensc.o gss-genr.o + entropy.o scard-opensc.o gss-genr.o platform.o SSHOBJS= ssh.o readconf.o clientloop.o sshtty.o \ sshconnect.o sshconnect1.o sshconnect2.o @@ -86,7 +86,7 @@ SSHDOBJS=sshd.o auth-rhosts.o auth-passw auth-krb5.o \ auth2-gss.o gss-serv.o gss-serv-krb5.o \ loginrec.o auth-pam.o auth-shadow.o auth-sia.o md5crypt.o \ - audit.o audit-bsm.o platform.o + audit.o audit-bsm.o MANPAGES = scp.1.out ssh-add.1.out ssh-agent.1.out ssh-keygen.1.out ssh-keyscan.1.out ssh.1.out sshd.8.out sftp-server.8.out sftp.1.out ssh-rand-helper.8.out ssh-keysign.8.out sshd_config.5.out ssh_config.5.out MANPAGES_IN = scp.1 ssh-add.1 ssh-agent.1 ssh-keygen.1 ssh-keyscan.1 ssh.1 sshd.8 sftp-server.8 sftp.1 ssh-rand-helper.8 ssh-keysign.8 sshd_config.5 ssh_config.5 Index: channels.c ==================================================================RCS file: /usr/local/src/security/openssh/cvs/openssh/channels.c,v retrieving revision 1.251 diff -u -p -r1.251 channels.c --- channels.c 28 Jan 2007 23:16:28 -0000 1.251 +++ channels.c 9 Apr 2007 09:05:35 -0000 @@ -1455,12 +1455,7 @@ channel_handle_rfd(Channel *c, fd_set *r if (len < 0 && (errno == EINTR || (errno == EAGAIN && !(c->isatty && c->detach_close)))) return 1; -#ifndef PTY_ZEROREAD - if (len <= 0) { -#else - if ((!c->isatty && len <= 0) || - (c->isatty && (len < 0 || (len == 0 && errno != 0)))) { -#endif + if (platform_rdrw_eof(len, c->isatty, errno)) { debug2("channel %d: read<=0 rfd %d len %d", c->self, c->rfd, len); if (c->type != SSH_CHANNEL_OPEN) { Index: platform.c ==================================================================RCS file: /usr/local/src/security/openssh/cvs/openssh/platform.c,v retrieving revision 1.1 diff -u -p -r1.1 platform.c --- platform.c 30 Aug 2006 17:24:41 -0000 1.1 +++ platform.c 9 Apr 2007 09:06:36 -0000 @@ -44,3 +44,15 @@ platform_post_fork_child(void) solaris_contract_post_fork_child(); #endif } + +int +platform_rdrw_eof(size_t len, int is_tty, int errno) +{ +#ifdef PTY_ZEROREAD + if (is_tty && len == 0 && errno == 0) + return 0; +#endif + if (len <= 0) + return 1; + return 0; +} Index: platform.h ==================================================================RCS file: /usr/local/src/security/openssh/cvs/openssh/platform.h,v retrieving revision 1.1 diff -u -p -r1.1 platform.h --- platform.h 30 Aug 2006 17:24:41 -0000 1.1 +++ platform.h 9 Apr 2007 09:05:35 -0000 @@ -21,3 +21,4 @@ void platform_pre_fork(void); void platform_post_fork_parent(pid_t child_pid); void platform_post_fork_child(void); +int platform_rdrw_eof(size_t, int, int); Index: serverloop.c ==================================================================RCS file: /usr/local/src/security/openssh/cvs/openssh/serverloop.c,v retrieving revision 1.151 diff -u -p -r1.151 serverloop.c --- serverloop.c 28 Jan 2007 23:16:28 -0000 1.151 +++ serverloop.c 9 Apr 2007 09:05:35 -0000 @@ -420,12 +420,7 @@ process_input(fd_set *readset) if (len < 0 && (errno == EINTR || (errno == EAGAIN && !child_terminated))) { /* do nothing */ -#ifndef PTY_ZEROREAD - } else if (len <= 0) { -#else - } else if ((!isatty(fdout) && len <= 0) || - (isatty(fdout) && (len < 0 || (len == 0 && errno != 0)))) { -#endif + } else if (platform_rdrw_eof(len, isatty(fderr), errno)) { fdout_eof = 1; } else { buffer_append(&stdout_buffer, buf, len); @@ -439,12 +434,7 @@ process_input(fd_set *readset) if (len < 0 && (errno == EINTR || (errno == EAGAIN && !child_terminated))) { /* do nothing */ -#ifndef PTY_ZEROREAD - } else if (len <= 0) { -#else - } else if ((!isatty(fderr) && len <= 0) || - (isatty(fderr) && (len < 0 || (len == 0 && errno != 0)))) { -#endif + } else if (platform_rdrw_eof(len, isatty(fderr), errno)) { fderr_eof = 1; } else { buffer_append(&stderr_buffer, buf, len); -- Darren Tucker (dtucker at zip.com.au) GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4 37C9 C982 80C7 8FF4 FA69 Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.
On Mon, 9 Apr 2007, Damien Miller wrote:> On Wed, 4 Apr 2007, Damien Miller wrote:>> Well, it also undoes the fix for bug #52. I think it is safe to simply hush >> the error message:> ugh, the logic in that patch was almost, but not completely, wrong. Please > try this one instead:It would also be a good idea to include errno in the debug2 message after a read(2) failure in channel_handle_rfd(). Doing so would make any future surprises in this area easier to diagnose. Marc. +----------------------------------+----------------------------------+ | Marc Aurele La France | work: 1-780-492-9310 | | Academic Information and | fax: 1-780-492-1729 | | Communications Technologies | email: tsi at ualberta.ca | | 352 General Services Building +----------------------------------+ | University of Alberta | | | Edmonton, Alberta | Standard disclaimers apply | | T6G 2H1 | | | CANADA | | +----------------------------------+----------------------------------+ XFree86 developer and VP. ATI driver and X server internals.
Possibly Parallel Threads
- Error message after upgraing the openssh 4.6P1
- [Bug 52] ssh hangs on exit
- intermittent stderr
- [Bug 1306] Spurious : "chan_read_failed for istate 3" errors from sshd
- [8.0p1] After the remote ssh command execution, the session does not terminate until an active disconnect