thr3ads.net - freebsd stable - bsnmpd always died on HDD detach [Sep 2012]

Miroslav Lachman

2012-Sep-09 22:04 UTC

bsnmpd always died on HDD detach

I am running bsnmpd with a basic snmpd.config (only community and location
changed).

When there is a problem with an HDD and the disk disappears from the ATA
channel (e.g. the disk is physically removed), bsnmpd always dumps core:

kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)

I have seen this for a long time on all releases of the 7.x and 8.x
branches (i386 and amd64). I have not tested 9.x.

Is it a known bug, or should I file a PR?

Miroslav Lachman

Mikolaj Golub

2012-Sep-10 05:45 UTC

On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
> I am running bsnmpd with a basic snmpd.config (only community and location
> changed).
>
> When there is a problem with an HDD and the disk disappears from the ATA
> channel (e.g. the disk is physically removed), bsnmpd always dumps core:
>
> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
>
> I have seen this for a long time on all releases of the 7.x and 8.x
> branches (i386 and amd64). I have not tested 9.x.
>
> Is it a known bug, or should I file a PR?

Do you happen to run bsnmp-ucd too? If so, which version is it?
In bsnmp-ucd-0.3.5 I introduced a bug that led to a bsnmpd crash on
disk detach. It has been fixed (thanks to Brian Somers) in 0.3.6.

-- 
Mikolaj Golub

Miroslav Lachman

2012-Sep-10 14:46 UTC

Mikolaj Golub wrote:
> On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
>> I am running bsnmpd with a basic snmpd.config (only community and
>> location changed).
>>
>> When there is a problem with an HDD and the disk disappears from the ATA
>> channel (e.g. the disk is physically removed), bsnmpd always dumps core:
>>
>> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
>>
>> I have seen this for a long time on all releases of the 7.x and 8.x
>> branches (i386 and amd64). I have not tested 9.x.
>>
>> Is it a known bug, or should I file a PR?
>
> Do you happen to run bsnmp-ucd too? If so, which version is it?
> In bsnmp-ucd-0.3.5 I introduced a bug that led to a bsnmpd crash on
> disk detach. It has been fixed (thanks to Brian Somers) in 0.3.6.

No, I have never installed bsnmp-ucd. We are using plain bsnmpd from base
without any additional modules.
It is used by MRTG only for network traffic. Nothing else.

Miroslav Lachman

Mikolaj Golub

2012-Sep-10 20:35 UTC

On Mon, Sep 10, 2012 at 04:46:15PM +0200, Miroslav Lachman wrote:
> Mikolaj Golub wrote:
> > On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
> >> I am running bsnmpd with a basic snmpd.config (only community and
> >> location changed).
> >>
> >> When there is a problem with an HDD and the disk disappears from the ATA
> >> channel (e.g. the disk is physically removed), bsnmpd always dumps core:
> >>
> >> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
> >>
> >> I have seen this for a long time on all releases of the 7.x and 8.x
> >> branches (i386 and amd64). I have not tested 9.x.
> >>
> >> Is it a known bug, or should I file a PR?
> >
> > Do you happen to run bsnmp-ucd too? If so, which version is it?
> > In bsnmp-ucd-0.3.5 I introduced a bug that led to a bsnmpd crash on
> > disk detach. It has been fixed (thanks to Brian Somers) in 0.3.6.
>
> No, I have never installed bsnmp-ucd. We are using plain bsnmpd from base
> without any additional modules.
> It is used by MRTG only for network traffic. Nothing else.

Then a backtrace might be useful:

gdb /usr/sbin/bsnmpd /path/to/bsnmpd.core
bt

-- 
Mikolaj Golub

Mikolaj Golub

2012-Sep-15 12:50 UTC

On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
> I am running bsnmpd with a basic snmpd.config (only community and location
> changed).
>
> When there is a problem with an HDD and the disk disappears from the ATA
> channel (e.g. the disk is physically removed), bsnmpd always dumps core:
>
> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
>
> I have seen this for a long time on all releases of the 7.x and 8.x
> branches (i386 and amd64). I have not tested 9.x.

OK, I was able to reproduce this under qemu by running

  atacontrol detach ata1

It crashes in snmp_hostres module, in

  refresh_device_tbl->refresh_disk_storage_tbl->disk_OS_get_ATA_disks

when traversing device_map list and dereferencing map->entry_p, which
is NULL here.

device_map table is used for consistent device table indexing.

refresh_device_tbl(), refresh routine for hrDeviceTable, checks the
list of available devices and calls device_entry_delete() for devices
that have gone. It does not remove the entry from device_map table,
but just sets entry_p to NULL for it (to preserve index reuse by
another device).

Then refresh_disk_storage_tbl() is called, which in turn calls

 disk_OS_get_ATA_disks();
 disk_OS_get_MD_disks();
 disk_OS_get_disks();

and it crashes in disk_OS_get_ATA_disks() when the removed map entry
is dereferenced.

I am attaching the patch that fixes the issue for me.

I was wondering why the issue was not observed after md device
removal, as disk_OS_get_MD_disks() does the same thing. It turns
out that hostres simply does not see md devices, so this function is
currently useless. hostres gets its devices from devinfo(3), which does
not return md devices.

disk_OS_get_disks() calls kern.disks sysctl to get the list of disks,
and uses device_map differently, so it is not affected.
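
[Editor's illustration, not the attached patch.] The failure mode described
above, a map slot that is kept for index stability while its entry_p is
cleared, can be sketched in isolation. Everything below is a simplified,
hypothetical stand-in: only the names device_map and entry_p come from the
message, while the struct layouts and the functions map_mark_gone(),
count_present() and demo_count_after_detach() are assumptions made for the
example and are not the hostres code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a device table entry. */
struct device_entry {
	const char *name;
};

/* Hypothetical stand-in for one slot of the device_map list. */
struct device_map_entry {
	int index;                      /* stable table index for this slot */
	struct device_entry *entry_p;   /* NULL once the device has gone */
	struct device_map_entry *next;
};

/* Mark a device as gone but keep its slot in the map. */
void
map_mark_gone(struct device_map_entry *map)
{
	map->entry_p = NULL;
}

/*
 * Count mapped devices whose name starts with prefix.
 * Slots whose entry_p was cleared are skipped instead of dereferenced.
 */
int
count_present(const struct device_map_entry *head, const char *prefix)
{
	int n = 0;

	for (const struct device_map_entry *map = head; map != NULL;
	    map = map->next) {
		if (map->entry_p == NULL)
			continue;       /* device detached: nothing to read */
		if (strncmp(map->entry_p->name, prefix, strlen(prefix)) == 0)
			n++;
	}
	return (n);
}

/* Build a three-slot map, "detach" ad1, then count the remaining ad* disks. */
int
demo_count_after_detach(void)
{
	struct device_entry ad0 = { "ad0" }, ad1 = { "ad1" }, acd0 = { "acd0" };
	struct device_map_entry m2 = { 3, &acd0, NULL };
	struct device_map_entry m1 = { 2, &ad1, &m2 };
	struct device_map_entry m0 = { 1, &ad0, &m1 };

	map_mark_gone(&m1);
	return (count_present(&m0, "ad"));
}
```

With ad1 marked gone, demo_count_after_detach() returns 1. Without the NULL
check in the loop, the same traversal would dereference the cleared entry_p,
which is the kind of access that ends in signal 11.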

-- 
Mikolaj Golub
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hostres_diskstorage_tbl.c.skip.patch
Type: text/x-diff
Size: 940 bytes
Desc: not available
Url :
http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20120915/71997662/hostres_diskstorage_tbl.c.skip.bin

Andrey V. Elsukov

2012-Sep-16 13:56 UTC

On 15.09.2012 16:50, Mikolaj Golub wrote:
> I am attaching the patch that fixes the issue for me.
>
> I was wondering why the issue was not observed after md device
> removal, as disk_OS_get_MD_disks() does the same thing. It turns
> out that hostres simply does not see md devices, so this function is
> currently useless. hostres gets its devices from devinfo(3), which does
> not return md devices.
>
> disk_OS_get_disks() calls kern.disks sysctl to get the list of disks,
> and uses device_map differently, so it is not affected.

I also have a big patch to the hostres module, but it is not yet
finished. I should probably commit the part related to the disk
subsystem. This part has been rewritten to be GEOM aware.

-- 
WBR, Andrey V. Elsukov

Miroslav Lachman

2012-Sep-16 17:07 UTC

Mikolaj Golub wrote:
> On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
>> I am running bsnmpd with a basic snmpd.config (only community and
>> location changed).
>>
>> When there is a problem with an HDD and the disk disappears from the ATA
>> channel (e.g. the disk is physically removed), bsnmpd always dumps core:
>>
>> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
>>
>> I have seen this for a long time on all releases of the 7.x and 8.x
>> branches (i386 and amd64). I have not tested 9.x.
>
> OK, I was able to reproduce this under qemu by running
>
>    atacontrol detach ata1
[...]
> and it crashes in disk_OS_get_ATA_disks() when the removed map entry
> is dereferenced.
>
> I am attaching the patch that fixes the issue for me.

I am glad to read that you found the bug!
The fix (patch) seems trivial - will it be committed / MFCed? :)

Thank you for your work on this problem!

Miroslav Lachman

Mikolaj Golub

2012-Sep-16 18:53 UTC

On Sun, Sep 16, 2012 at 05:56:22PM +0400, Andrey V. Elsukov wrote:
> On 15.09.2012 16:50, Mikolaj Golub wrote:
> > I am attaching the patch that fixes the issue for me.
> >
> > I was wondering why the issue was not observed after md device
> > removal, as disk_OS_get_MD_disks() does the same thing. It turns
> > out that hostres simply does not see md devices, so this function is
> > currently useless. hostres gets its devices from devinfo(3), which does
> > not return md devices.
> >
> > disk_OS_get_disks() calls kern.disks sysctl to get the list of disks,
> > and uses device_map differently, so it is not affected.
>
> I also have a big patch to the hostres module, but it is not yet
> finished. I should probably commit the part related to the disk
> subsystem. This part has been rewritten to be GEOM aware.

Wonderful! And as I understand it, it will solve this problem too? Then I
think there is no need to commit my patch, unless you are not planning to
merge your work to stable/[78] (where any fix for this problem is highly
desirable).

-- 
Mikolaj Golub

Mikolaj Golub

2012-Sep-17 07:41 UTC

On Sun, Sep 16, 2012 at 07:07:20PM +0200, Miroslav Lachman wrote:
> I am glad to read that you found the bug!
> The fix (patch) seems trivial - will it be committed / MFCed? :)

Andrey told me that he was not sure when he would be able to commit
his work, so I have just committed my fix. I am going to MFC it.

-- 
Mikolaj Golub
