Hi list,
At work we have an old C program that works with industrial handheld terminals
from Honeywell.
The terminal has its own ssh/telnet client, which it uses to connect to a
Red Hat Linux 6.6 server (openssh-server-5.3p1-104.el6_6.1.x86_64).
Once it is connected to the Linux box (as a particular user), the bash shell
launches a C program with the following settings:
export TERM=vt200
stty raw icrnl -echo
$APLI_EXEC/program param1 param2
So the flow is: ssh client -> ssh server -> bash -> C program
(telnet client -> telnet server -> bash -> C program works fine, no
problems reported).
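For reference, the `stty raw icrnl -echo` line above corresponds roughly to the termios settings in the sketch below. It is only an approximation of what stty does (the flag list is a subset), and the function names are made up for illustration; they are not part of the application.

```c
#include <assert.h>
#include <stddef.h>
#include <termios.h>

/* Approximate "stty raw icrnl -echo" on an in-memory termios struct.
 * Hypothetical helper, not taken from the application. */
void make_raw_icrnl_noecho(struct termios *t)
{
    assert(t != NULL);
    /* "raw": no input translation, no XON/XOFF flow control ... */
    t->c_iflag &= ~(IGNBRK | BRKINT | PARMRK | ISTRIP | INLCR | IGNCR |
                    IXON | IXOFF);
    /* ... then "icrnl" turns CR -> NL translation back on. */
    t->c_iflag |= ICRNL;
    /* "raw": no output post-processing. */
    t->c_oflag &= ~OPOST;
    /* "raw": no line editing, no signal keys; "-echo": no echo. */
    t->c_lflag &= ~(ICANON | ISIG | ECHO);
    /* "raw" also sets min 1 time 0: read() returns as soon as 1 byte arrives. */
    t->c_cc[VMIN] = 1;
    t->c_cc[VTIME] = 0;
}

/* Apply the settings to an open tty; returns 0 on success, -1 on error. */
int set_raw_icrnl_noecho(int fd)
{
    struct termios t;
    if (tcgetattr(fd, &t) != 0)
        return -1;
    make_raw_icrnl_noecho(&t);
    return tcsetattr(fd, TCSANOW, &t);
}
```

Calling `set_raw_icrnl_noecho(0)` at program start would have about the same effect as the stty line. Note that raw mode clears IXON/IXOFF, so the tty layer itself should not be honouring Ctrl+S/Ctrl+Q here.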
The application works fine (or so it seems), but sometimes (between 1 and 5
times per week) a random terminal stops receiving data from the server, although
the application still receives input from it. It is as if you had pressed Ctrl+S
in a shell.
Debugging the application and the sshd process with strace, I noticed
something strange.
The sshd process has the following file descriptors:
lr-x------ 1 root root 64 Feb 15 17:12 9 -> pipe:[383586491]
lr-x------ 1 root root 64 Feb 15 17:12 8 -> /var/lib/sss/mc/group
lrwx------ 1 root root 64 Feb 15 17:12 7 -> socket:[383586484]
lrwx------ 1 root root 64 Feb 15 17:12 6 -> socket:[383586478]
lrwx------ 1 root root 64 Feb 15 17:12 5 -> socket:[383586458]
lrwx------ 1 root root 64 Feb 15 17:12 4 -> socket:[383586457]
lrwx------ 1 root root 64 Feb 15 17:12 3 -> socket:[383585929]
lrwx------ 1 root root 64 Feb 15 17:12 2 -> /dev/null
lrwx------ 1 root root 64 Feb 15 17:12 14 -> /dev/ptmx
lrwx------ 1 root root 64 Feb 15 17:12 13 -> /dev/ptmx
lrwx------ 1 root root 64 Feb 15 17:12 11 -> /dev/ptmx
l-wx------ 1 root root 64 Feb 15 17:12 10 -> pipe:[383586491]
lrwx------ 1 root root 64 Feb 15 17:12 1 -> /dev/null
lrwx------ 1 root root 64 Feb 15 17:12 0 -> /dev/null
An strace of the sshd process:
select(14, [3 9 13], [], NULL, {900, 0}) = 1 (in [13], left {899, 998835})
    <<- sshd sees data on fd #13 from the C application
read(13, "\33[1;23H1\33[1;24H", 16384) = 15
    <<- sshd reads the data from fd #13
select(14, [3 9 13], [3], NULL, {900, 0}) = 1 (out [3], left {899, 999998})
    <<- sshd sends data to fd #3 (socket)
write(3, "\301\236W\250\333\260\r\204\316o]:*1K\203\242\204\257Vb,V\347l\242\352K\341,,\307d\273\277\202.l\32F\2471\257DJt3\36\303\5\256\21K6\27\212\253\326|l\33\270\262S", 64) = 64 (1)
    <<- sshd encrypts the data to be sent
select(14, [3 9 13], [3], NULL, {900, 0}) = 1 (out [3], left {899, 999998})
    <<- sshd sends data through the socket
select(14, [3 9 13], [], NULL, {900, 0}) = 1 (in [13], left {899, 998569})
    <<- sshd sees data on fd #13 from the C application
read(13, "\7\33[1;16H \33[6;6H_______\33[7;1H -INFORME CANT. RECOGIDA-\33[7;26H", 16384) = 67
    <<- sshd reads the data from fd #13
select(14, [3 9], [], NULL, {900, 0}) = 1 (in [3], left {892, 12016})
    <<- sshd now waits only on fds #3 and #9 (and fd #3, the socket, becomes
        readable) but... where is fd #13, which sshd has to read from?
The terminal receives "\7\33[1;16H " but the rest of the string,
"\33[6;6H_______\33[7;1H -INFORME CANT. RECOGIDA-\33[7;26H", is never received
on the client side, and fd #13 seems to be lost?
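To make the symptom in the last select() call concrete: select() only reports descriptors that the caller put into the set, so data pending on an fd that was left out of the read set is simply never noticed. The sketch below shows this with two pipes standing in for the pty and the socket. The function name is hypothetical, and this says nothing about why sshd stopped including fd 13; it only illustrates what the trace shows.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Poll the descriptors in `watched` with select() and say whether `fd` was
 * reported readable: 1 = yes, 0 = no, -1 = select() failed.
 * Hypothetical helper for illustration only. */
int reported_readable(int fd, const int *watched, int n_watched)
{
    fd_set rset;
    struct timeval tv = {0, 0};     /* zero timeout: just poll */
    int maxfd = -1;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rset);
    for (int i = 0; i < n_watched; i++) {
        assert(watched[i] >= 0 && watched[i] < FD_SETSIZE);
        FD_SET(watched[i], &rset);
        if (watched[i] > maxfd)
            maxfd = watched[i];
    }
    if (select(maxfd + 1, &rset, NULL, NULL, &tv) < 0)
        return -1;
    /* On return, rset holds only the watched fds that are readable. */
    return FD_ISSET(fd, &rset) ? 1 : 0;
}
```

With one byte written into a pipe, the read end is reported when it is in the watched list and is not reported when the list omits it, even though the byte is still sitting in the pipe.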
I have tried to reproduce the behaviour, without success.
This is what happens in each case.
Transmission OK
read(13, "\7\33[1;16H \33[6;6H_______\33[7;1H -INFORME CANT. RECOGIDA-\33[7;26H", 16384) = 67
write(3, "\306n\273W\315\265\204\333\201\334\240\346(\354qtz\274L?V\214$\374m\345\321\206\242\235D\255uJ\357\315\313\230\20\375\262\241E\302\360\306\37g1Y\352\7\330h\257\250\276%\344\375o=\227\316\354@!\205\356\177\330\213\35\330&\251F\225\335\n\312)n08\246x\245\202K\2138C,\10zJ\303\2002\237\321)U\217h\1771\215\3z)", 112) = 112
Transmission KO
read(13, "\7\33[1;16H \33[6;6H_______\33[7;1H -INFORME CANT. RECOGIDA-\33[7;26H", 16384) = 67
write(3, "P\247\244}\277\322\260\21\3314\7\227\223~\317\360\35\334\232\372\237\250\320\312\1;\25\37\23\363\363O&0\355i{zUbr\365,\362yyl\222", 48) = 48
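As a rough sanity check on the write sizes above, the sketch below computes how many bytes one SSH channel-data packet takes on the wire for a given amount of payload, following the binary packet layout in RFC 4253 and the channel-data message in RFC 4254. The 16-byte cipher block and 16-byte MAC are assumptions (the negotiated algorithms are not visible in the trace), as are no compression and one packet per write(); the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Wire size of one SSH_MSG_CHANNEL_DATA packet carrying `data_len` bytes.
 * `block` is the cipher block size, `mac_len` the MAC length (both are
 * assumptions supplied by the caller). */
size_t channel_data_wire_size(size_t data_len, size_t block, size_t mac_len)
{
    /* message type (1) + recipient channel (4) + string length (4) + data */
    size_t payload = 1 + 4 + 4 + data_len;
    /* packet_length (4) + padding_length (1) + payload + at least 4 padding */
    size_t padded = 4 + 1 + payload + 4;

    assert(block > 0);
    if (block < 8)
        block = 8;                  /* padding unit is max(block size, 8) */
    padded = (padded + block - 1) / block * block;   /* round up */
    return padded + mac_len;
}

/* Largest data_len whose packet is exactly `wire` bytes, or -1 if none. */
long max_data_for_wire_size(size_t wire, size_t block, size_t mac_len)
{
    long best = -1;
    for (size_t n = 0; channel_data_wire_size(n, block, mac_len) <= wire; n++)
        if (channel_data_wire_size(n, block, mac_len) == wire)
            best = (long)n;
    return best;
}
```

Under those assumptions, 67 bytes of data give a 112-byte packet and 15 bytes give 64, which matches the good writes in the traces. A 48-byte write could carry at most 14 bytes of data, so in the KO case only the beginning of the 67 bytes read from fd 13 would have gone out. This is arithmetic only, not a diagnosis.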
It seems as if the file descriptor disappears while the read() system call is
reading from it?
I have run out of ideas. Can anybody point me in the right direction?
Regards, Nacho.