Jürgen Keil
2005-Oct-18 14:41 UTC
[dtrace-discuss] intrstat prints incomplete interrupt statistics
> There are times when arg0 can be NULL [for the interrupt-start or
> interrupt-complete probe] if there's no associated device.
> There's already a bug open to address the error in the documentation.
> 6284911 args to interrupt-start and interrupt-complete need more info

Hmm, I noticed that, too. There's currently a thread on yahoo's solarisx86 mailing list, with an snv_20 user complaining about high interrupt rates (> 32000 interrupts/sec) on a w2100z:

http://groups.yahoo.com/group/solarisx86/message/29841

The interrupts are most likely acpi interrupts, but "intrstat" does not report any acpi interrupt activity. This is because arg0 == NULL for acpi interrupts in the interrupt-start action inside intrstat, so that intrstat won't report acpi interrupts:

http://groups.yahoo.com/group/solarisx86/message/29998

Other interrupts that are missing in intrstat output: cbe_fire(), xc_serv(), kcpc_hw_overflow_intr(), apic_error_intr().

Wouldn't it make sense to extend intrstat to include interrupt statistics for interrupt handlers that don't have a "struct dev_info *" associated device?

This message posted from opensolaris.org
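As a rough illustration of the point above, here is a small Python model of per-device interrupt counting. It is not intrstat's code: the event tuples, the device name "nic0", the handler name "nic_intr", the `count_interrupts` function, and the "<no device>" label are all hypothetical. `None` stands in for arg0 == NULL.

```python
from collections import Counter

NO_DEVICE = "<no device>"  # hypothetical label for handlers without a dev_info


def count_interrupts(events, keep_null=True):
    """Count interrupts per device.

    events: iterable of (device_name_or_None, handler_name) pairs,
    where None models an interrupt whose arg0 is NULL.
    """
    counts = Counter()
    for device, handler in events:
        if device is None:
            if not keep_null:
                continue  # models dropping the event when arg0 == NULL
            # Fall back to the handler name so the interrupt stays visible.
            device = f"{NO_DEVICE}:{handler}"
        counts[device] += 1
    return counts


# Made-up sample: one device interrupt and three without a device.
events = [
    ("nic0", "nic_intr"),
    (None, "cbe_fire"),
    (None, "cbe_fire"),
    (None, "xc_serv"),
]
```

With `keep_null=False` only the "nic0" interrupt is counted, which mirrors the behaviour described above. With the default, the device-less handlers appear under their handler names.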
Brendan Gregg
2005-Oct-19 15:38 UTC
[dtrace-discuss] intrstat prints incomplete interrupt statistics
G'Day Folks,

On Tue, 18 Oct 2005, Jürgen Keil wrote:
> > There are times when arg0 can be NULL [for the interrupt-start or
> > interrupt-complete probe] if there's no associated device.
> > There's already a bug open to address the error in the documentation.
> > 6284911 args to interrupt-start and interrupt-complete need more info
>
> Hmm, I noticed that, too. There's currently a thread on yahoo's solarisx86
> mailing list, with an snv_20 user complaining about high interrupt rates
> (> 32000 interrupts/sec) on a w2100z:
>
> http://groups.yahoo.com/group/solarisx86/message/29841
>
> The interrupts are most likely acpi interrupts, but "intrstat" does not report
> any acpi interrupt activity. This is because arg0 == NULL for acpi
> interrupts in the interrupt-start action inside intrstat, so that intrstat
> won't report acpi interrupts:

Oh boy! Try this one,

http://www.brendangregg.com/DTrace/intrtime

which doesn't toss out NULL dev_infos; it reports them with the name "-1".

Of course, Bryan/Mike/Adam may have already fixed intrstat by now, which makes intrtime redundant (intrstat is better ... yeah, I should really drop intrtime off my site (I never put it in the toolkit anyway)).

...

<rant>
I'm only excited as it may be the one and only time my crufty intrtime script is actually useful.

Back in February I realised measuring interrupt time was doable with DTrace (HOORAY! If you don't know why that's exciting, try doing it in Solaris 9), wrote intrtime, came up with a perl-crufty way to translate major numbers to instance names (let's suck in /etc/name_to_major), and then posted it on the web.

Then I found similar scripts while more carefully reading the DTrace Guide (http://docs.sun.com/app/docs/doc/817-6223/6mlkidlkd?a=view), which had a better translation technique, AND discovered /usr/sbin/intrstat, which was much better also. oops.
I added a "SEE ALSO" section to the script's comments, but (remembering the sleepless nights while trying to solve this problem on Solaris 9) couldn't bear to remove intrtime from my site. Yeah, I should remove it, or change "SEE ALSO" to "NO, I REALLY MEAN SEE ALSO".

/usr/sbin/intrstat is way cool. Some people think the CPU cost of processing network traffic is negligible, and maybe fair enough, since it wasn't easy to measure before. Take an Ultra 5 and run Solaris 10 (even WITH FireEngine and TCP_MDT reducing CPU overhead dramatically), and driving the network can still chew 40% CPU for a 100 Mb/s interface. No, no, I said Ultra 5s - don't use that ratio for any remotely new Sun server that will have much faster CPUs! (I just happen to own a few Ultra 5s.)
</rant>

> Wouldn't it make sense to extend intrstat to include interrupt statistics
> for interrupt handlers that don't have a "struct dev_info *" associated
> device?

Yes, it makes sense. I would guess it hasn't been done as we simply haven't encountered this before. struct dev_info is indeed handy for the major numbers (and then the driver name). If I can hunt down a server with one of these shy interrupts I'll dig out the major number another way (shouldn't be hard).

Brendan
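The name_to_major translation mentioned in this message can be sketched roughly as follows. This is an illustrative Python version, not the Perl in intrtime: the function names are hypothetical, the sample entries and their numbers are made up, and the handling of blank lines and "#" comments is a defensive assumption about the file format. The "-1" label for a missing device follows the behaviour described in the message.

```python
def parse_name_to_major(text):
    """Build {major_number: driver_name} from name_to_major-style text.

    Each useful line is assumed to look like "<driver_name> <major_number>".
    """
    majors = {}
    for line in text.splitlines():
        # Assumption: anything after '#' is a comment; blank lines are skipped.
        line = line.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) != 2 or not fields[1].isdigit():
            continue  # ignore lines that do not match the expected shape
        name, major = fields
        majors[int(major)] = name
    return majors


def driver_label(majors, major):
    """Return the driver name for a major number.

    None models a NULL dev_info and is reported as "-1".
    Unknown major numbers fall back to the number itself as a string.
    """
    if major is None:
        return "-1"
    return majors.get(major, str(major))


# Made-up sample content; real systems will have different entries.
sample = "cn 0\nrootnex 1\n\n# a comment line\nhme 7\n"
```

In practice the text would come from reading /etc/name_to_major, for example `parse_name_to_major(open("/etc/name_to_major").read())`.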
Adam Leventhal
2005-Oct-27 16:32 UTC
[dtrace-discuss] intrstat prints incomplete interrupt statistics
On Thu, Oct 20, 2005 at 01:38:04AM +1000, Brendan Gregg wrote:
> Of course, Bryan/Mike/Adam may have already fixed intrstat by now, which
> makes intrtime redundant (intrstat is better ... yeah, I should really
> drop intrtime off my site (I never put it in the toolkit anyway)).

Sorry to let you down, but we didn't even have a bug filed on the issue. I've added this one to the list:

6342692 intrstat should handle ACPI interrupts

Adam

--
Adam Leventhal, Solaris Kernel Development
http://blogs.sun.com/ahl