Matty
2007-Sep-26 19:56 UTC
[zfs-discuss] TCP connections not getting cleaned up after application exits
Howdy,

We are running zones on a number of Solaris 10 update 3 hosts, and we are bumping into an issue where the kernel doesn't clean up connections after an application exits. When this issue occurs, the netstat utility doesn't show anything listening on the port the application uses (8080 in the example below), but connections are still listed in the ESTABLISHED state:

$ netstat -an | grep LISTEN
*.22                 *.*                  0      0 49152      0 LISTEN
127.0.0.1.25         *.*                  0      0 49152      0 LISTEN
127.0.0.1.587        *.*                  0      0 49152      0 LISTEN

$ netstat -an | grep ESTAB | grep 8080
10.32.51.230.8080    10.10.12.6.34252 65535      0 49248      0 ESTABLISHED
10.32.51.230.8080    10.10.12.7.54136     1      0 49680      0 ESTABLISHED
10.32.51.230.8080    10.10.12.8.19335 62975      0 49248      0 ESTABLISHED
< .... >

Normally I would open a ticket with Sun when I bump into issues with Solaris 10, but I couldn't find anything in the bug database to indicate this was a known problem. Does anyone happen to know if this is a known issue? I rebooted the server with the '-d' option the last time this issue occurred, so I have a core file available if anyone is interested in investigating the issue (assuming this is an unknown problem).

Thanks for any insight,
- Ryan
--
UNIX Administrator
http://prefetch.net
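For anyone who wants to spot this condition mechanically, here is a minimal sketch that scans `netstat -an` text for local ports that have ESTABLISHED connections but no LISTEN socket. The function name `orphaned_ports` and the `SAMPLE` variable are hypothetical, and the parsing assumes the Solaris address layout shown above (port after the final dot, state in the last column); it is not a Sun-provided tool.

```python
def orphaned_ports(netstat_text):
    """Return {port: count} for local ports with ESTABLISHED
    connections but no LISTEN socket in the given netstat text."""
    listening = set()
    established = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank lines and headers
        state = fields[-1]
        # Assumed Solaris-style local address: the port follows the
        # final dot, e.g. 10.32.51.230.8080 or *.22
        port = fields[0].rpartition(".")[2]
        if not port.isdigit():
            continue
        if state == "LISTEN":
            listening.add(port)
        elif state == "ESTABLISHED":
            established[port] = established.get(port, 0) + 1
    return {p: n for p, n in established.items() if p not in listening}


# Sample rows copied from the netstat output in this message.
SAMPLE = """\
*.22 *.* 0 0 49152 0 LISTEN
127.0.0.1.25 *.* 0 0 49152 0 LISTEN
127.0.0.1.587 *.* 0 0 49152 0 LISTEN
10.32.51.230.8080 10.10.12.6.34252 65535 0 49248 0 ESTABLISHED
10.32.51.230.8080 10.10.12.7.54136 1 0 49680 0 ESTABLISHED
10.32.51.230.8080 10.10.12.8.19335 62975 0 49248 0 ESTABLISHED
"""

print(orphaned_ports(SAMPLE))  # → {'8080': 3}
```

A port showing up here is not proof of a kernel bug on its own, since a server that closed its listener while still serving accepted connections would look the same.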
James Carlson
2007-Sep-26 20:23 UTC
[zfs-discuss] [networking-discuss] TCP connections not getting cleaned up after application exits
[how is this zfs-related?]

Matty writes:
> We are running zones on a number of Solaris 10 update 3 hosts, and we
> are bumping into an issue where the kernel doesn't clean up
> connections after an application exits.

Are you sure? One possible cause of this sort of problem is that the application has forked, and one of the forked processes is still holding the file descriptor open.

To check this, use pgrep for processes the application may have started, or pfiles to list the open files for given processes, or download and install the open source `lsof' tool to search for specific open file descriptors.

> When this issue occurs, the
> netstat utility doesn't show anything listening on the port the
> application uses (8080 in the example below), but connections are
> still listed in the ESTABLISHED state:

Note that the listening descriptor is separate from the accepted descriptor(s). Thus, it's possible for one to be closed, while the other is still open.

> Normally I would open a ticket with Sun when I bump into issues with
> Solaris 10, but I couldn't find anything in the bug database to
> indicate this was a known problem. Does anyone happen to know if this
> is a known issue? I rebooted the server with the '-d' option the last
> time this issue occurred, so I have a core file available if anyone is
> interested in investigating the issue (assuming this is an unknown
> problem).

Contacting Sun's support group sounds like a good impulse, especially given that you're using Solaris 10, and that's not OpenSolaris. I don't see any problems exactly like this for U3 in the database, though.

--
James Carlson, Solaris Networking <james.d.carlson at sun.com>
Sun Microsystems / 1 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677
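The two points above (a listening descriptor is independent of accepted ones, and any surviving copy of an accepted descriptor keeps the connection up) can be sketched on loopback as follows. This is only an illustration under stated assumptions: the function name `connection_survives_closes` is hypothetical, and `socket.dup()` stands in for the copy of the descriptor that a forked child would inherit; no actual fork takes place.

```python
import socket


def connection_survives_closes():
    """Close the listener and the original accepted socket, then show
    that a duplicate of the accepted descriptor still receives data."""
    # Port 0 asks the kernel for any free loopback port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    accepted, _ = listener.accept()

    # Closing the listening descriptor removes the LISTEN entry only;
    # the accepted connection is a separate descriptor.
    listener.close()

    # Stand-in for a forked child: a second descriptor that refers to
    # the same underlying connection.
    inherited = accepted.dup()

    # The "application" closes its own copy of the connection.
    accepted.close()

    # The connection stays up because one descriptor is still open.
    client.sendall(b"ping")
    data = inherited.recv(4)

    inherited.close()
    client.close()
    return data


print(connection_survives_closes())
```

While `inherited` is open, a netstat run on the same host would be expected to show the connection as ESTABLISHED with nothing listening on the port, which is the same picture as in the original report.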
Matty
2007-Sep-26 21:23 UTC
[zfs-discuss] [networking-discuss] TCP connections not getting cleaned up after application exits
On 9/26/07, James Carlson <james.d.carlson at sun.com> wrote:
> [how is this zfs-related?]
>
> Matty writes:
> > We are running zones on a number of Solaris 10 update 3 hosts, and we
> > are bumping into an issue where the kernel doesn't clean up
> > connections after an application exits.
>
> Are you sure? One possible cause of this sort of problem is that the
> application has forked, and one of the forked processes is still
> holding the file descriptor open.

We run a single Java process in the zone, and I am 100% certain it is dead (it doesn't show up in the output of ps, and the pid isn't listed in the /proc file system).

> > Normally I would open a ticket with Sun when I bump into issues with
> > Solaris 10, but I couldn't find anything in the bug database to
> > indicate this was a known problem. Does anyone happen to know if this
> > is a known issue? I rebooted the server with the '-d' option the last
> > time this issue occurred, so I have a core file available if anyone is
> > interested in investigating the issue (assuming this is an unknown
> > problem).
>
> Contacting Sun's support group sounds like a good impulse, especially
> given that you're using Solaris 10, and that's not OpenSolaris.

I don't see any issues similar to this in the opensolaris bug database, which is why I thought it would be appropriate for the opensolaris mailing lists. If anyone is interested in looking at the core files, I can make them available. Assuming this is an unknown bug, identifying the root cause of it would help everyone who uses Solaris and opensolaris.

Thanks for the feedback,
- Ryan
--
UNIX Administrator
http://prefetch.net