thr3ads.net - freebsd stable - socketpair: No buffer space available [Mar 2007]

Marc G. Fournier

2007-Mar-24 18:22 UTC

socketpair: No buffer space available

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Almost like clockwork, every 3 days, I have one server that starts to generate
errors similar to the one below ... it isn't continuous at the start, but
gradually grows worse ... it just started happening again today, after 3 days,
2hrs of uptime ...

Mar 20 07:59:26 mars sshd[717]: error: reexec socketpair: No buffer space available

As unrelated as this might sound, out of three servers that are virtually
identical, this is the only one using gmirror for its drives vs a hardware raid
controller, two of the three running kernels from about the same time ...

# ssh jupiter uname -a
FreeBSD jupiter.hub.org 6.2-STABLE FreeBSD 6.2-STABLE #1: Fri Mar 16 13:13:02
ADT 2007     root@jupiter.hub.org:/usr/obj/usr/src/sys/kernel  i386

vs

# ssh mars uname -a
FreeBSD mars.hub.org 6.2-STABLE FreeBSD 6.2-STABLE #5: Tue Mar 13 02:29:37 ADT
2007     root@mars.hub.org:/usr/obj/usr/src/sys/kernel  i386

jupiter is running more on it than mars right now ...

So, I either have something mis-configured on mars that is done right on
jupiter, or there is a bug that is being tickled on mars that isn't being
tickled on jupiter ...

If I have a login session on the machine, I can easily do a reboot of the
machine, and it seems to come up clean every time (ie. no fsck's need to be
run) ...

Does anyone have any ideas of what I can look at?

I've checked nmbclusters between the two machines, and both are at 25600, but
not sure what sysctl to look at for how much is actually used out of that
25600 ...


- ----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org                              MSN . scrappy@hub.org
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGA+sG4QvfyHIvDvMRAoRuAJ9LXJ5RUZNXEQhEwkDFiMudThyASgCeNJXu
9Y7KZ6fSlk07/WmHGywTvJ4=n3XS
-----END PGP SIGNATURE-----

Bruce M. Simpson

2007-Mar-25 23:08 UTC

socketpair: No buffer space available

Marc G. Fournier wrote:
> Mar 20 07:59:26 mars sshd[717]: error: reexec socketpair: No buffer space
> available
>
> If I have a login session on the machine, I can easily do a reboot of the
> machine, and it seems to come up clean every time (ie. no fsck's need to be
> run) ...
> Does anyone have any ideas of what I can look at?

How odd. The re-exec feature is not documented in the man page. It
appears that it can be turned off with the -r switch according to
sshd.c. Can you give that a try and see if that offers symptomatic
relief? It would be somewhat less secure as sshd will fork rather than
fork..exec.

The code does indeed appear to use socketpair. FreeBSD implements 
socketpair as a system call. Only AF_UNIX, SOCK_STREAM sockets are 
accepted.  A quick look in KScope suggests the first place where this 
can fail with ENOBUFS is soalloc() from socreate().
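
A minimal sketch of how that failure looks to a userland caller, assuming Python 3 is available on the host (an assumption; the thread only discusses sshd's C code). The helper name try_socketpair is hypothetical. It creates an AF_UNIX, SOCK_STREAM pair, as sshd's re-exec path does, and reports the errno name if the kernel refuses:

```python
import errno
import socket


def try_socketpair():
    """Create and immediately close an AF_UNIX stream socket pair.

    Returns None on success, or the errno name (e.g. "ENOBUFS") on failure.
    """
    try:
        a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    except OSError as exc:
        # Map the numeric errno to its symbolic name where one is known.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    a.close()
    b.close()
    return None


if __name__ == "__main__":
    result = try_socketpair()
    print("socketpair ok" if result is None else f"socketpair failed: {result}")
```

Run periodically from cron, a probe like this could timestamp when the machine first starts returning ENOBUFS, rather than waiting for sshd or sendmail to log it.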

Is this machine under heavy memory load in any way? soalloc() uses a 
zone allocator. I'm not sure how to track that from userland, vmstat -m 
only deals with kernel malloc() stats.

BMS

Marc G. Fournier

2007-Mar-26 00:40 UTC

socketpair: No buffer space available

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



- --On Monday, March 26, 2007 00:08:07 +0100 "Bruce M. Simpson"
<bms@FreeBSD.org> wrote:

> Marc G. Fournier wrote:
>> Mar 20 07:59:26 mars sshd[717]: error: reexec socketpair: No buffer space
>> available
>>
>> If I have a login session on the machine, I can easily do a reboot of the
>> machine, and it seems to come up clean every time (ie. no fsck's need to be
>> run) ...
>> Does anyone have any ideas of what I can look at?
>>
> How odd. The re-exec feature is not documented in the man page. It appears
> that it can be turned off with the -r switch according to sshd.c. Can you
> give that a try and see if that offers symptomatic relief? It would be
> somewhat less secure as sshd will fork rather than fork..exec.

That was actually just one example ... I get more of:

sendmail[82066]: l2NEA1Ht082066: SYSERR(root): makeconnection: cannot create socket: No buffer space available

than I do the sshd errors ... in another 15 hours or so, they will all start up
again, like clockwork :(


- ----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org                              MSN . scrappy@hub.org
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGBxZ84QvfyHIvDvMRAoNTAKDBkGZL7aCOXEW22QibCCpnJJJnEgCfafMa
ex0pM7sKPgCjVdURJ9nwfH0=egaO
-----END PGP SIGNATURE-----

Robert Watson

2007-Mar-27 20:29 UTC

socketpair: No buffer space available

On Fri, 23 Mar 2007, Marc G. Fournier wrote:

> I've checked nmbclusters between the two machines, and both are at 25600,
> but not sure what sysctl to look at for how much is actually used out of
> that 25600 ...

netstat -mb

nmbclusters directly affects the number of clusters available in the network 
stack; it also indirectly affects the scaling of other settings, such as 
resource limits on the number of sockets.  vmstat -z is also generally useful.
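
One way to watch for a zone creeping toward its limit is to parse vmstat -z output and compare USED against LIMIT. The sketch below assumes the colon-and-comma column layout (ITEM, SIZE, LIMIT, USED, FREE, ...) and treats a LIMIT of 0 as unlimited; the function names and the SAMPLE figures are hypothetical and are not taken from mars or jupiter.

```python
def parse_vmstat_z(text):
    """Parse `vmstat -z` style text into {zone: {size, limit, used, free}}."""
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.rpartition(":")
        if not sep:
            continue  # header or blank line: no "name:" prefix
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) < 4 or not all(f.isdigit() for f in fields[:4]):
            continue  # not a data row in the assumed layout
        size, limit, used, free = (int(f) for f in fields[:4])
        zones[name.strip()] = {
            "size": size, "limit": limit, "used": used, "free": free,
        }
    return zones


def near_limit(zones, threshold=0.9):
    """Names of zones whose USED count is at or above threshold * LIMIT."""
    return sorted(
        name for name, z in zones.items()
        if z["limit"] > 0 and z["used"] >= threshold * z["limit"]
    )


# Hypothetical sample, for illustration only.
SAMPLE = """\
ITEM                     SIZE     LIMIT      USED      FREE  REQUESTS

UMA Kegs:                 140,        0,       72,        0,       72
socket:                   356,    25608,    25600,        8,   901234
mbuf_cluster:            2048,    25600,     1024,      512,   777777
"""

if __name__ == "__main__":
    print(near_limit(parse_vmstat_z(SAMPLE)))
```

On a live machine the text could instead come from subprocess.run(["vmstat", "-z"], capture_output=True, text=True).stdout. Logging the result every few minutes would show which zone, if any, grows over the three-day cycle.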

There are a few paths to ENOBUFS in the socket allocation code--one path is if 
you are over-committed on socket buffer resources with respect to the resource 
limits of the user.  Check the output of limits and the socket buffer size 
limit.
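
The per-user socket buffer limit can also be read from inside a process. A minimal sketch, assuming Python 3.4 or later, where the resource module exposes RLIMIT_SBSIZE on FreeBSD only; on other platforms the helper returns None. The helper names are hypothetical:

```python
import resource


def sbsize_limit():
    """Return (soft, hard) for RLIMIT_SBSIZE, or None where it is not defined."""
    rlim = getattr(resource, "RLIMIT_SBSIZE", None)
    if rlim is None:
        return None
    return resource.getrlimit(rlim)


def describe(limit):
    """Render the limit pair in a form similar to what limits(1) reports."""
    if limit is None:
        return "RLIMIT_SBSIZE not available on this platform"

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"

    soft, hard = limit
    return f"sbsize soft={fmt(soft)} hard={fmt(hard)}"


if __name__ == "__main__":
    print(describe(sbsize_limit()))
```

The value is inherited from the parent process, so it should be checked as the user that runs the failing daemon, not only from a root shell.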

Robert N M Watson
Computer Laboratory
University of Cambridge
