Neelam
2007-Nov-27 04:29 UTC
[dtrace-discuss] Dtrace probes for voluntary and involuntary context switches
Hi, I am profiling some workloads for the voluntary and involuntary context switches. I am interested in finding out the reasons causing these two types of context switches. As far as I understand, involuntary context switch happens on expiration of time slice or when a higher priority process comes in. While the voluntary switch generally happens when a process is waiting for I/O etc. So to find out what system calls are causing voluntary context switches in my workloads, I printed whenever a system calls is invoked and whenever a context switch happens. I am profiling the system calls and context switched inside critical sections (while some lock is being held). But I see something unexpected. I see * Voluntary context switches occur almost every time due to doorfs() system call. They do occur for a few times due to lwp_park() and very few times due to yield(). * Involuntary happens anytime. (lwp_park(), read(), fstat(), putmsg(), gtime() and sometime without any system call!!) Does anyone have any idea, what could be the reason for this unexpected behavior? Thanks, Neelam -- This message posted from opensolaris.org
max at bruningsystems.com
2007-Nov-27 07:04 UTC
[dtrace-discuss] Dtrace probes for voluntary and involuntary context switches
Hi Neelam, Neelam wrote:> Hi, > > I am profiling some workloads for the voluntary and involuntary context switches. I am interested in finding out the reasons causing these two types of context switches. As far as I understand, involuntary context switch happens on expiration of time slice or when a higher priority process comes in. While the voluntary switch generally happens when a process is waiting for I/O etc. > > So to find out what system calls are causing voluntary context switches in my workloads, I printed whenever a system calls is invoked and whenever a context switch happens. I am profiling the system calls and context switched inside critical sections (while some lock is being held). > > But I see something unexpected. I see > > * Voluntary context switches occur almost every time due to doorfs() > system call. They do occur for a few times due to lwp_park() and very > few times due to yield(). > > * Involuntary happens anytime. (lwp_park(), read(), fstat(), putmsg(), > gtime() and sometime without any system call!!) > > Does anyone have any idea, what could be the reason for this unexpected behavior? >This behavior is not unexpected. In general, threads should not be sleeping while holding locks. What do you think is unexpected? max> Thanks, > Neelam > > > -- > This message posted from opensolaris.org > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org > >
Gavin Maltby
2007-Nov-27 11:06 UTC
[dtrace-discuss] Dtrace probes for voluntary and involuntary context switches
Hi, max at bruningsystems.com wrote:> Hi Neelam, > Neelam wrote: >> Hi, >> >> I am profiling some workloads for the voluntary and involuntary context switches. I am interested in finding out the reasons causing these two types of context switches. As far as I understand, involuntary context switch happens on expiration of time slice or when a higher priority process comes in. While the voluntary switch generally happens when a process is waiting for I/O etc. >> >> So to find out what system calls are causing voluntary context switches in my workloads, I printed whenever a system calls is invoked and whenever a context switch happens. I am profiling the system calls and context switched inside critical sections (while some lock is being held).Look at usr/src/cmd/stat/mpstat/mpstat.c and you''ll see that it figures voluntary and involuntary context switches from kstats in module cpu, class misc, name sys members "pswitch" and "inv_swtch". See kstat -m cpu -n sys output. Armed with that knowledge, the ''sysinfo'' provider is your friend: # dtrace -l -P sysinfo | egrep ''pswitch|inv_swtch'' 75877 sysinfo unix swtch_to pswitch 75878 sysinfo unix swtch_from_zombie pswitch 75879 sysinfo unix swtch pswitch 75899 sysinfo unix preempt inv_swtch So those are probes at the point that corresponding kstat is incremented. Now write a script to aggregate on stacks when we hit these probes. That will produce a lot of distinct stacks since there are myriad ways of getting to these points, so you may want to begin refining on depth of stack to collect or perhaps using only ''caller'', eg: #!/usr/sbin/dtrace -s sysinfo:::pswitch,sysinfo:::inv_swtch { @a[probename, caller] = count(); } END { printa("%s from %a: %@u\n", @a); } Run for a few seconds on my workstation: # /tmp/a.d dtrace: script ''/tmp/a.d'' matched 5 probes ^C CPU ID FUNCTION:NAME 3 2 :END pswitch from genunix`cv_wait_stop+0x100: 1 pswitch from genunix`turnstile_block+0x76b: 1 pswitch from genunix`yield+0x57: 8 pswitch from genunix`cv_wait_sig+0x13d: 10 pswitch from genunix`lwp_cond_wait+0x5c9: 14 pswitch from genunix`sema_p+0x1cb: 27 inv_swtch from genunix`post_syscall+0x59e: 77 pswitch from unix`idle+0xb2: 96 inv_swtch from unix`trap+0x1167: 102 pswitch from unix`preempt+0xda: 129 pswitch from genunix`cv_timedwait+0xcf: 541 pswitch from genunix`cv_wait_sig_swap_core+0x165: 894 pswitch from genunix`cv_wait+0x61: 1408 pswitch from genunix`cv_timedwait_sig+0x194: 3202 pswitch from unix`idle+0xbe: 5523 It''s possible/likely that the ''sched'' provider helps here too, but you''d have to read the docs for that. It''s not really a particularly meaningful question to ask "which system calls are causing voluntary and involuntary context switches", although there are perhaps some system calls (eg the "slow" ones for IO) which will routinely go down the voluntary path. I think the question needs to be posed deeper down in the kernel.>> But I see something unexpected. I see >> >> * Voluntary context switches occur almost every time due to doorfs() >> system call. They do occur for a few times due to lwp_park() and very >> few times due to yield(). >> >> * Involuntary happens anytime. (lwp_park(), read(), fstat(), putmsg(), >> gtime() and sometime without any system call!!)Yes. An involuntary context switch is essentially a case of the kernel deciding to yank one thread (which may be running in the kernel or userland at the time) off of cpu in favour of a higher priority thread that has become runnable on that cpu. For example if you have a userland app that simply does for (;;); thus spinning endless and watch switches of that you''ll very like see lots of involuntary switches (unless you have lots of cpus and little work for them). Gavin -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 3249 bytes Desc: S/MIME Cryptographic Signature URL: <http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20071127/18bfbf07/attachment.bin>
Jim Mauro
2007-Nov-27 23:22 UTC
[dtrace-discuss] Dtrace probes for voluntary and involuntary context switches
> * Involuntary happens anytime. (lwp_park(), read(), fstat(), putmsg(), > gtime() and sometime without any system call!!)Hence the name (involuntary) ... :^) Involuntary context switches result from a thread being preempted, and the preemption will happen for 1 of 2 reasons: 1. The thread used it''s time quantum, in which case the kernel uses the preemption mechanism to kick it off the CPU. 2. A higher priority thread is enqueued on the CPUs run queue. You can track this activity with /usr/demo/dtrace/whopreempt.d. What is it you''re seeing that you find unexpected? Thanks, /jim
Nicolas Williams
2007-Nov-27 23:32 UTC
[dtrace-discuss] Dtrace probes for voluntary and involuntary context switches
On Mon, Nov 26, 2007 at 08:29:44PM -0800, Neelam wrote:> * Voluntary context switches occur almost every time due to doorfs() > system call. They do occur for a few times due to lwp_park() and very > few times due to yield().door_call() does a context switch, yes (an optimized one, but one nonetheless). Solaris uses lots of doors for lots of things... Few apps call yield().> * Involuntary happens anytime. (lwp_park(), read(), fstat(), putmsg(), > gtime() and sometime without any system call!!) > > Does anyone have any idea, what could be the reason for this unexpected behavior?Yes: the kernel is fully preemtible. Non-fully-preemtible kernels only do involuntary context switches at very well defined boundaries: when the process is running in user-land, when the process returns from a system call, ... But Solaris'' is a fully preemtible kernel and there''s only a few points where preemption is not possible, which means that involuntary context switches can happen in surprising places -- surprising if you are used to old Unix kernel designs. Nico --
Neelam
2007-Nov-28 03:25 UTC
[dtrace-discuss] Dtrace probes for voluntary and involuntary context switches
I am already using sysinfo:::pswitch and sysinfo::preempt:inv_swtch to profile voluntary and involuntary context switches. The output for voluntary switches seems to be ok (i see them on system calls like doorfs(), lwp_park(), yield() etc). But I was not really sure about the output for involuntary context switches. Sometimes I see them happening without any system calls which is fine. But many times I see them happening with many other system calls such as read(), fstat() etc and with lwp_park() as well. I was expecting voluntary switches with system calls!! I have another confusion, is lwp_park() always followed by a context switch or not? Thanks, Neelam -- This message posted from opensolaris.org