Qiang Liu
2006-Mar-30 23:12 UTC
[dtrace-discuss] iostat -xn 5 _donot_ update: how to use DTrace
on Solaris 10 5.10 Generic_118822-23 sun4v sparc SUNW,Sun-Fire-T200 I run #iostat -xn 5 to monitor the IO statistics on SF T2000 server. The system also have a heavy IO load, for some reason iostat donot refresh (no any update). It seems like iostat is calling pause() and stucked there. Also my HBA driver''s interrupt stack trace indicates there is a lot of swtch(), the overall IOPS is not good. I have 24 CPUs on this system. First I am curious about why no update in iostat output. If pause() is waiting for a signal why the wakeup signal does not have chance to get delivered. I hope DTrace could help dig into somewhere. Thanks in advance. This message posted from opensolaris.org
Eric C. Saxe
2006-Mar-31 00:12 UTC
[dtrace-discuss] Re: iostat -xn 5 _donot_ update: how to use DTrace
You could try the following simple script to see what iostat is doing once (and if) it gets the CPU: #!/usr/sbin/dtrace -Fs fbt::: /execname == "iostat"/ { trace(timestamp); } If you never see any output, then iostat may not be getting any CPU time. If that''s the case, you can dig deeper with the probes described in the sched provider (in the DTrace answerbook) to understand why that might be happening. If you are getting CPU time, (but iostat just isn''t making much progress), then the above script should show where, in the kernel, it''s spending it''s time. Is it just iostat giving you problems, or are other commands sluggish? -Eric This message posted from opensolaris.org
Qiang Liu
2006-Mar-31 00:49 UTC
[dtrace-discuss] Re: iostat -xn 5 _donot_ update: how to use DTrace
Thank Eric, no any output. we are using vdbench Sun provided. Yes, the IOs are running but the benchmark couldn''t report any result. I am going to read sched provider. john This message posted from opensolaris.org
Qiang Liu
2006-Mar-31 19:33 UTC
[dtrace-discuss] Re: iostat -xn 5 _donot_ update: how to use DTrace
a gentalman recommend some tips to check the system. and I tried ontario# mdb -k Loading modules: [ unix krtld genunix specfs dtrace ufs sd ip sctp usba random fcp fctl emlxs nca lofs nfs ssd sppp crypto logindmux ptm md cpc fcip ipc ]> 0t616::pid2proc| ::print proc_t p_tlist |::findstack stack pointer for thread 300028f4cc0: 2a101ac5181 [ 000002a101ac5181 cv_wait_sig_swap_core+0x130() ] 000002a101ac5231 pause+0x10() 000002a101ac52e1 syscall_trap32+0xcc()>ontario# ps -ef | grep iostat root 1189 588 0 17:44:07 pts/2 0:43 iostat -nx 3 root 1654 1617 0 11:04:50 pts/1 0:00 grep iostat ontario# dtrace -n syscall::timer_create:return''{ printf("timer_create returned %d\n", arg1) }'' dtrace: description ''syscall::timer_create:return'' matched 1 probe ^C ontario# dtrace -n proc:::signal\*''/execname ="iostat"/{ @[pid, probename] = count() }'' dtrace: description ''proc:::signal*'' matched 4 probes ^C dtrace: couldn''t stop tracing: Invalid argument ontario# pargs 1189 1189: iostat -nx 3 argv[0]: iostat argv[1]: -nx argv[2]: 3 In a healthy system with iostat keeping refreshed I tried # dtrace -n proc:::signal\*''/execname == "iostat"/{ @[pid, probename] = count() }'' dtrace: description ''proc:::signal*'' matched 4 probes ^C 3140 signal-handle 2 # pargs 3140 3140: iostat -xn 5 argv[0]: iostat argv[1]: -xn argv[2]: 5 # As you can see, there is no signal-handle in my system in question. then how to use dtrace to go deeper and deeper...? Thank you dtrace. This message posted from opensolaris.org
Qiang Liu
2006-Apr-06 00:40 UTC
[dtrace-discuss] Re: iostat -xn 5 _donot_ update: how to use DTrace
I haven''t given up yet, this sounds like a dumb questions. I read OpenSolaris iostat.c to understand set_timer() and pause() inside its main(). In short I run iostat -xn 5 and I want to get a snapshot every 5 second, however I end up getting it every 1hr or so. Here is my dtrace for the normal iostat run (refresh every 5 sec) 0 | cv_wait_sig_swap_core:return 0 <- cv_wait_sig_swap_core 0 -> cv_wait_sig_swap_core 0 | cv_wait_sig_swap_core:entry 0 -> thread_lock 0 <- thread_lock 0 -> cv_block 0 -> new_mstate 0 -> cpu_update_pct 0 -> cpu_grow 0 -> cpu_decay 0 -> exp_x 0 <- exp_x 0 <- cpu_decay 0 <- cpu_grow 0 <- cpu_update_pct 0 <- new_mstate 0 -> disp_lock_enter_high 0 <- disp_lock_enter_high 0 -> disp_lock_exit_high 0 <- disp_lock_exit_high 0 <- cv_block 0 -> sleepq_insert 0 <- sleepq_insert 0 -> disp_lock_exit_nopreempt 0 <- disp_lock_exit_nopreempt 0 -> sigcheck 0 <- sigcheck 0 -> swtch 0 -> disp 0 -> disp_lock_enter 0 <- disp_lock_enter 0 -> disp_lock_exit_high 0 <- disp_lock_exit_high 0 -> disp_ratify 0 <- disp_ratify 0 -> restore_mstate 0 -> cpu_update_pct 0 -> cpu_decay 0 -> exp_x 0 <- exp_x 0 <- cpu_decay 0 <- cpu_update_pct 0 <- restore_mstate 0 <- disp 0 <- swtch 0 -> resume 0 -> savectx 0 -> schedctl_save 0 <- schedctl_save 0 <- savectx 0 -> restorectx 0 -> schedctl_restore 0 <- schedctl_restore 0 <- restorectx 0 <- resume 0 -> sigcheck 0 <- sigcheck 0 -> issig 0 <- issig 0 -> issig_forreal 0 -> schedctl_finish_sigblock 0 <- schedctl_finish_sigblock 0 -> fsig 0 -> sigdiffset 0 <- sigdiffset 0 <- fsig 0 -> fsig 0 -> sigdiffset 0 <- sigdiffset 0 -> lowbit 0 <- lowbit 0 <- fsig 0 -> sigdeq 0 <- sigdeq 0 -> isjobstop 0 <- isjobstop 0 -> undo_watch_step 0 <- undo_watch_step 0 <- issig_forreal Can anybody explain the above trace results from -> resume anybody can tell me how the set_timer() and pause() are related in iostat.c''s main() In my test with abnormal iostat''s behavior, pause() won''t return in 5 sec even if we run iostat -xn 5. basically the dtrace output has different stuff after -> resume. Thanks This message posted from opensolaris.org
Qiang Liu
2006-Apr-07 00:41 UTC
[dtrace-discuss] Re: iostat -xn 5 _donot_ update: how to use DTrace
Hi Gurus, we try to figure out why our iostat -xn 5 does not update for quite sometime, here is what we thought what may have happened. in iostat.c set_timer() eventually setup a *real-time* callout which will be handled through a soft interrupt. This very software interrupt use 1 as the priority level in softcall. Our HBA''s hardware interrupt priority level is 5. iostat thread in the kernel is being context switched and in the system''s sleeping queue. The system clock interrupt handler with priority level 10 will trigger softcall (priority level 1) to try to interrupt the processor but failed since the system has a heavy IO (processor is servcing hba interrupt, or swtch, or normal IO holding lock) Then we tried to create a kernel thread to service the low priority work in our hardware interrupt handler, the created kernel thread has lower global priority (99 < 101) than softcall''s, then the iostat output delay issue is gone. We want to ask Sun''s Guru''s to see if we are in the right direction to understand this issue. We only see this issue in T2000, which we want to know why. Thanks This message posted from opensolaris.org
Roch Bourbonnais - Performance Engineering
2006-Apr-07 08:41 UTC
[dtrace-discuss] Re: iostat -xn 5 _donot_ update: how to use DTrace
Qiang Liu writes: > Hi Gurus, > we try to figure out why our iostat -xn 5 does not update for quite > sometime, here is what we thought what may have happened. > > in iostat.c set_timer() eventually setup a *real-time* callout which > will be handled through a soft interrupt. This very software interrupt > use 1 as the priority level in softcall. > > Our HBA''s hardware interrupt priority level is 5. > > iostat thread in the kernel is being context switched and in the > system''s sleeping queue. The system clock interrupt handler with > priority level 10 will trigger softcall (priority level 1) to try to > interrupt the processor but failed since the system has a heavy IO > (processor is servcing hba interrupt, or swtch, or normal IO holding > lock) > > Then we tried to create a kernel thread to service the low priority > work in our hardware interrupt handler, the created kernel thread has > lower global priority (99 < 101) than softcall''s, then the iostat > output delay issue is gone. > > We want to ask Sun''s Guru''s to see if we are in the right direction to > understand this issue. We only see this issue in T2000, which we want > to know why. Sounds like an ok direction. If you have low priority work, don''t use the interrupts for that. Part of the problem is that we have not provided a stable interface to thread_create for drivers. But Solaris now has a stable way to do it using taskq(9F). As for the T2000, maybe multiple interrupts are colliding onto a single core, if that''s the case, only enable interrupts on a single strand per core using psradm. If you have a single interrupt source which possibly requires more punch than what a single strand can provice, you may want to dedicate a full core to handling the work. For that you can create a psrset containing all CPU from the core handling the interrupt. -r ____________________________________________________________________________________ Roch Bourbonnais Sun Microsystems, Icnc-Grenoble Senior Performance Analyst 180, Avenue De L''Europe, 38330, Montbonnot Saint Martin, France Performance & Availability Engineering http://icncweb.france/~rbourbon http://blogs.sun.com/roller/page/roch Roch.Bourbonnais at Sun.Com (+33).4.76.18.83.20 > > Thanks > > > This message posted from opensolaris.org > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org
Qiang Liu
2006-Apr-08 00:02 UTC
[dtrace-discuss] Re: Re: iostat -xn 5 _donot_ update: how to use DTrace
Thanks, somebody told me it seems to be this bug: id 6292092 please evaluate. This message posted from opensolaris.org
Qiang Liu
2006-Apr-08 00:57 UTC
[dtrace-discuss] Re: Re: iostat -xn 5 _donot_ update: how to use DTrace
Another information to share, we used Dtrace to trace clock interrupt handler clock(), clock_realtime_fire_first(), clock_realtime_fire(), and HBA interrupt handler, we noticed that in one round test our hardware interrupt handler always run on a different logical CPU (say 9) than that the clock handler was using. setsoftint did get called from clock() but clock_realtime_fire() was not called at all. This message posted from opensolaris.org