Mikolaj Golub wrote:
> On Fri, 29 Jan 2010 12:37:52 +0100 Gustau Pérez wrote:
>
>
>> Hi,
>>
>> I'm using cacti to monitor some servers running FBSD. I was using 7.2
>> with SCHED_4BSD. With this configuration, bsnmpd+bsnmp-ucd was
>> returning the right values for the cores' load.
>>
>> I recently updated the servers (via csup) to RELENG_8 and bsnmpd is
>> returning negative values for the cores' load. If I try something like
>> this on a 4-core system:
>>
>> snmpwalk -v 2c -c community server .1.3.6.1.2.1.25.3.3.1
>>
>> what I get is :
>>
>> .1.3.6.1.2.1.25.3.3.1.1.6 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.1.10 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.1.14 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.1.18 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.2.6 = INTEGER: -182
>> .1.3.6.1.2.1.25.3.3.1.2.10 = INTEGER: -182
>> .1.3.6.1.2.1.25.3.3.1.2.14 = INTEGER: -182
>> .1.3.6.1.2.1.25.3.3.1.2.18 = INTEGER: -182
>>
>> I tried an old bsnmp-ucd (0.2.1, works fine on a 7.2 system) with an
>> 8.0 system. Same wrong results. And it seems bsnmpd in /usr/src/contrib
>> has not changed between 7.2 and 8.0.
>>
>> Any ideas? I'm not an expert, but with tcpdump I see different
>> results. Against an old 7.2 system, the field related to each core's load
>> gives the right value. Against an 8.0 system, however, those fields show
>> values (in hex) like fd 4b. What I don't know is how bsnmp-ucd retrieves
>> those values and how it constructs the UDP response packet.
>>
>
> bsnmp-ucd has nothing to do with HOST-RESOURCES-MIB. These MIBs are provided
> by the snmp_hostres(3) module (/usr/lib/snmp_hostres.so). So something is wrong
> there (I suppose it is not in sync with some recent changes in the kernel or
> libkvm).
>
>
You are right. I checked
usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c. I think it
has something to do with the processor_getpcpu function (line 122). The
code is:
>     if (ccpu == 0 || fscale == 0)
>         return (0.0);
>
> #define fxtofl(fixpt) ((double)(fixpt) / fscale)
>     return (100.0 * fxtofl(ki_p->ki_pctcpu) /
>         (1.0 - exp(ki_p->ki_swtime * log(fxtofl(ccpu)))));
I checked it on a 4-core SCHED_ULE system and ccpu is always 0
(sysctl kern.ccpu gives 0 too), so this routine always returns 0.0. That
makes the save_sample routine fill e->samples[#cpu] with 100. If I
comment out the ccpu == 0 check (I know, I changed the code), then I see
strange values. With some printfs, I see that the value returned when
bsnmpd starts is 98~99, but then it goes up to 350~400 (strange). I added
more printfs and saw that when the daemon starts it returns 98~99 for
each processor and ki_pctcpu is 2026 (in my case). The next time bsnmpd
refreshes its values, it returns wrong values and ki_pctcpu has gone up
fourfold. So the function returns nearly 400% of idle time for each
processor...
So I checked it with SCHED_4BSD on an 8-core system. Same behaviour,
but this time ki_pctcpu increased eightfold.
Now I'm stuck here. I think the kinfo_proc info is obtained by
using kvm_getprocs. Do you have any idea why it returns those values?
Regards,
Gus
--
PGP KEY : http://www-entel.upc.edu/gus/gus.asc