I want to obtain real-time CPU load of the system using dtrace. My solution is to record the interval between a pair of sched:::on-cpu and sched:::off-cpu probes when the pid is 0 (the idle process). Each second, I record how long the idle process has been running, so I can deduce the load of the CPU. The idle time should be divided by the number of CPUs on the system, so can anyone tell me how to obtain the number of CPUs in dtrace? Or does anyone have a better solution to get the real-time CPU load in dtrace? Thank you!
NiuLin wrote:
> I want to obtain real time CPU loads of the system using dtrace.
>
> My solution is to record the interval between a pair of sched:::on and
> sched:::off probes if the pid is 0 (the idle process).

Just one comment: 0 is the proc id for all kernel threads, that is *definitely not* the idle thread (although the idle thread also is associated with proc 0, IINM). You'll have to find some other entity to watch, I'm afraid.

Michael
--
Michael Schuster
Sun Microsystems, Inc.
recursion, n: see 'recursion'
Brendan Gregg - Sun Microsystems
2007-Jul-19 18:20 UTC
[dtrace-discuss] Real time CPU load
G'Day,

On Thu, Jul 19, 2007 at 10:27:33PM +0800, NiuLin wrote:
> I want to obtain real time CPU loads of the system using dtrace.
>
> My solution is to record the interval between a pair of sched:::on and
> sched:::off probes if the pid is 0 (the idle process).

See "Kernel/cputimes" in the DTraceToolkit, which I wrote to provide something similar. I intended to use DTrace to break down kernel activity further, but haven't coded that yet.

> Each second, I record how long the idle process has been running, so I
> can deduce the load of the CPU.

What do you mean by load? Time that CPUs are not in the idle thread? (which can be called "utilization" time). CPU microstate accounting does provide accurate values for user/kernel/idle times, and can be accessed via kstat:

 # kstat -p cpu::sys:cpu_nsec\*
 cpu:0:sys:cpu_nsec_idle     5850667205853232
 cpu:0:sys:cpu_nsec_kernel   168126646534655
 cpu:0:sys:cpu_nsec_user     185623549296986
 cpu:1:sys:cpu_nsec_idle     5854782315586659
 cpu:1:sys:cpu_nsec_kernel   169522406342757
 cpu:1:sys:cpu_nsec_user     180100808387217

The values are cumulative, so real time values can be obtained through the delta of two measurements.

> The idle time should be divided by the number of CPUs on the system,
> so can anyone tell me how to obtain the number of CPUs in dtrace?

The "`ncpus_online" variable will tell you how many are online; use "`ncpus" for the total count.

> Or does anyone have a better solution to get the real time CPU load in dtrace?

The last three fields from "vmstat 1" are from those CPU microstates, and are real time values. I'd also pay attention to the length of the CPU dispatcher queues, as a measure of CPU saturation. There is a significant difference between 0% idle time with no saturation, and 0% idle time with heavy saturation. In other words, idle times don't tell the full story.

no worries,

Brendan
--
Brendan
[CA, USA]
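The kstat approach above reduces to simple arithmetic: take two samples of the cumulative cpu_nsec_* counters, subtract, and normalize each state's delta by the total. A minimal sketch follows; the sample values are hypothetical stand-ins for parsed `kstat -p cpu::sys:cpu_nsec\*` output taken about one second apart, not real measurements.

```python
# Derive real-time CPU utilization from the delta of two cumulative
# microstate samples (nanosecond counters, as reported by kstat).
# The t0/t1 dictionaries are hypothetical example values.

def utilization(sample1, sample2):
    """Return {state: percent of elapsed time} from two cumulative samples."""
    delta = {k: sample2[k] - sample1[k] for k in sample1}
    total = sum(delta.values())
    return {k: 100.0 * v / total for k, v in delta.items()}

# Two hypothetical samples for one CPU, ~1 second apart:
t0 = {"idle": 5_850_000_000_000, "kernel": 168_000_000_000, "user": 185_000_000_000}
t1 = {"idle": 5_850_600_000_000, "kernel": 168_100_000_000, "user": 185_300_000_000}

print(utilization(t0, t1))  # idle 60%, kernel 10%, user 30%
```

To aggregate across a multi-CPU box, sum the per-CPU deltas first and then normalize, which has the same effect as the divide-by-ncpus step mentioned in the question.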
All,

I'd like to be able to correlate INUMs with devices (sun4u sunfire) but don't seem to be able to find the appropriate information yet... some mdb -k output; note that 1536 == 24 << 6, these are all devices on schizo 24, aka "pci@18":

> ::softint
INUM ADDR             PEND PIL ARG           HANDLER
1540 000000007001c100 0    6   6000f2dbf20 0 pci_intr_wrapper
1541 000000007001c140 0    6   6000f2db660 0 pci_intr_wrapper
1542 000000007001c180 0    4   300001dfce8 0 pci_intr_wrapper
1543 000000007001c1c0 0    4   600188dac40 0 pci_intr_wrapper
1544 000000007001c200 0    6   6000f2dbdd0 0 pci_intr_wrapper
1548 000000007001c300 0    6   6000f2db5f0 0 pci_intr_wrapper
1552 000000007001c400 0    11  300001dfd58 0 pci_intr_wrapper
1584 000000007001cc00 0    14  300001e0978 0 ecc_intr
1585 000000007001cc40 0    14  300001e09b0 0 ecc_intr
1586 000000007001cc80 0    14  6000f118c80 0 pbm_error_intr
1587 000000007001ccc0 0    14  300001c9e40 0 pbm_error_intr
1588 000000007001cd00 0    14  300001dfc08 0 cb_buserr_intr
1589 000000007001cd40 0    14  300001c96c0 0 pci_pbm_cdma_intr
1590 000000007001cd80 0    14  6000f118640 0 pci_pbm_cdma_intr

and here are all the "pci@18" devices in /etc/path_to_inst:

"/ssm@0,0/pci@18,700000" 0 "pcisch"
"/ssm@0,0/pci@18,700000/network@3" 0 "ce"
"/ssm@0,0/pci@18,700000/pci@2" 0 "pci_pci"
"/ssm@0,0/pci@18,700000/pci@2/network@0" 1 "ce"
"/ssm@0,0/pci@18,700000/pci@2/network@1" 2 "ce"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2" 0 "glm"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2/sd@0,0" 2 "sd"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2/sd@1,0" 0 "sd"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2/sd@4,0" 5 "sd"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2/sd@6,0" 9 "sd"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2/ses@2,0" 2 "ses"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2/ses@3,0" 3 "ses"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2,1" 1 "glm"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2,1/sd@0,0" 4 "sd"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2,1/sd@1,0" 6 "sd"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2,1/sd@2,0" 8 "sd"
"/ssm@0,0/pci@18,700000/pci@2/scsi@2,1/sd@3,0" 10 "sd"
"/ssm@0,0/pci@18,700000/bootbus-controller@4" 0 "sgsbbc"
"/ssm@0,0/pci@18,600000" 1 "pcisch"
"/ssm@0,0/pci@18,600000/network@1" 3 "ce"

The question is which INUMs are assigned to which devices; just knowing that the handler is "pci_intr_wrapper" doesn't help much. Anyone got a Solaris-10/11, SPARC-sun4u-SunFire, answer to this one?

thanks,
Pete Lawrence.

PS, here is `intradm' output for pci@18, but everyone says it is un-supported and un-reliable; however it's the only tool that seems to have the correspondence I'm looking for (albeit obscured: these INUMs have to be right-shifted by 1 to match mdb ::softint's INUMs, for reasons that I haven't figured out).

INUM PIL DRIVER   CPU PATH
c08  6   ce#1     14  /ssm@0,0/pci@18,700000/pci@2/network@0
c0a  6   ce#2     0   /ssm@0,0/pci@18,700000/pci@2/network@1
c0c  4   glm#0    0   /ssm@0,0/pci@18,700000/pci@2/scsi@2
c0e  4   glm#1    0   /ssm@0,0/pci@18,700000/pci@2/scsi@2,1
c10  6   ce#0     11  /ssm@0,0/pci@18,700000/network@3
c18  6   ce#3     0   /ssm@0,0/pci@18,600000/network@1
c20  b   sgsbbc#0 581 /ssm@0,0/pci@18,700000/bootbus-controller@4
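The right-shift-by-1 correspondence described in the PS can be checked numerically against the two listings above: intradm's hex INUMs, shifted right by one bit, land on ::softint's decimal INUMs for entries whose PILs also agree. A small sketch (the device pairings are read off the listings, not independently verified):

```python
# Check Pete's observation: intradm INUMs (hex), right-shifted by 1,
# should match the decimal INUMs shown by mdb's ::softint.
# Pairs taken from the listings above; PILs match in both outputs too.
pairs = {
    0xc08: 1540,  # ce#1,  pci@2/network@0  (PIL 6 in both)
    0xc0c: 1542,  # glm#0, pci@2/scsi@2     (PIL 4 in both)
    0xc0e: 1543,  # glm#1, pci@2/scsi@2,1   (PIL 4 in both)
    0xc10: 1544,  # ce#0,  network@3        (PIL 6 in both)
}
for intradm_inum, softint_inum in pairs.items():
    assert intradm_inum >> 1 == softint_inum
print("all intradm INUMs match ::softint after >> 1")
```

Why the kernel stores the shifted value is the open question in the post; the arithmetic at least confirms the two tools are describing the same interrupt sources.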