Hi,

I would like to know which threads are preempted by which on my OpenSolaris machine.
Therefore, I ran a multithreaded program "myprogram" with 32 threads on my 24-core Solaris machine. I made sure that each thread of my program has the same priority (priority zero), so that we can reduce priority inversions (saving preemptions -- system overhead). However, I ran the following script whoprempt.d to see who preempted myprogram threads and got the following output. Unlike what I thought, myprogram threads are preempted (for 2796 times -- last line of the output) by the threads of same myprogram.

Could anyone explain why this happens, please?

DTrace script
=============

#pragma D option quiet

sched:::preempt
{
    self->preempt = 1;
}

sched:::remain-cpu
/self->preempt/
{
    self->preempt = 0;
}

sched:::off-cpu
/self->preempt/
{
    /*
     * If we were told to preempt ourselves, see who we ended up giving
     * the CPU to.
     */
    @[stringof(args[1]->pr_fname), args[0]->pr_pri, execname,
        curlwpsinfo->pr_pri] = count();
    self->preempt = 0;
}

END
{
    printf("%30s %3s %30s %3s %5s\n", "PREEMPTOR", "PRI","||","PREEMPTED", "PRI", "#");
    printa("%30s %3d %30s %3d %5@d\n", @);
}


Output:
======
PREEMPTOR   PRI  ||  PREEMPTED   PRI      #
dtrace        0  ||  myprogram     0      1
dtrace       50  ||  myprogram     0      1
sched        -1  ||  myprogram     0      1
myprogram     0  ||  dtrace        0      1
....
.....
nscd         59  ||  myprogram     0      4
sendmail     59  ||  myprogram     0      4
sched        60  ||  myprogram     0     92
sched        98  ||  myprogram     0    272
sched        99  ||  myprogram     0   2110
myprogram     0  ||  myprogram     0   2796
--
This message posted from opensolaris.org

Big subject!

You haven't said what your 32 threads are doing, how you gave them the same priority, or which scheduling class they are running in.

However, you only have 24 VCPUs and (I assume) 32 active threads, so Solaris will try to share resources evenly, and yes, it will preempt one of your threads to run another. The preemption behaviour, including the time a thread is allowed to run without interruption, will depend on the scheduling class and parameters of each thread.

If you want to reduce preemption, you can move threads to the FX class, set an absolute priority, and tune the time quantum.

What you are seeing is expected.

Hope this helps,
Phil

p.s. if you need any more help with this, please feel free to contact me offline.

On 18/01/2011 06:13, Kishore Kumar Pusukuri wrote:
> [...]
> Unlike what I thought, myprogram threads are preempted (for 2796 times -- last line of the output) by the threads of same myprogram.
>
> Could anyone explain why this happens, please?
> [...]

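To put rough numbers on the time-quantum point above, here is a minimal Python sketch of a toy model, not the Solaris dispatcher. It assumes every thread is compute-bound and of equal priority, so each CPU hands over to a waiting thread whenever a quantum expires. The function name `quantum_preemptions`, the quantum lengths, and the 60-second run length are illustrative assumptions, not values taken from this thread.

```python
def quantum_preemptions(cpus, threads, quantum_ms, run_ms):
    """Estimate quantum-expiry preemptions in a toy round-robin model.

    Assumes all threads are compute-bound and share one priority, so a CPU
    switches to a waiting thread each time the running thread's quantum ends.
    """
    if threads <= cpus:
        # Nobody is waiting for a CPU, so an expired quantum is simply renewed.
        return 0
    # Each CPU sees one quantum expiry (and one hand-over) per quantum_ms.
    return cpus * (run_ms // quantum_ms)


# Hypothetical quantum lengths in milliseconds, over a 60-second run.
for quantum_ms in (20, 200, 1000):
    count = quantum_preemptions(cpus=24, threads=32,
                                quantum_ms=quantum_ms, run_ms=60_000)
    print(f"quantum={quantum_ms:5d} ms  estimated preemptions={count}")
```

In this model a longer quantum lowers the count in direct proportion, which is the intuition behind tuning the quantum in the FX class. Real figures will depend on the dispatcher, the scheduling class, and the workload.
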
Hi Kishore -

If memory serves, the kernel uses the preemption mechanism when a thread has used up its time quantum and must therefore be forced to give up the CPU. If your "myprogram" threads are compute-bound, I would suspect they are being preempted by other myprogram threads of the same priority due to time quantum expiration.

I have a DTrace script that tests for this condition somewhere, but I can't find it. I will poke around.

As an aside, this isn't a ZFS question, or even a DTrace question, and is thus probably more suited for a general performance discussion alias, such as perf-discuss at opensolaris.org.

Thanks
/jim

On Jan 18, 2011, at 1:13 AM, Kishore Kumar Pusukuri wrote:
> [...]
> Unlike what I thought, myprogram threads are preempted (for 2796 times -- last line of the output) by the threads of same myprogram.
>
> Could anyone explain why this happens, please?
> [...]
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
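
The quantum-expiry explanation can be illustrated with a toy round-robin simulation. This is a sketch under stated assumptions, not the Solaris dispatcher: all threads are compute-bound, all share priority 0, and an abstract tick count stands in for time. `simulate`, `quantum`, and `ticks` are hypothetical names and values; the tally key mirrors the aggregation key in the DTrace script (preemptor, priority, preempted, priority).

```python
from collections import Counter, deque


def simulate(cpus=24, threads=32, quantum=10, ticks=1000):
    """Toy round-robin dispatcher: equal priority, all threads compute-bound."""
    run_queue = deque(range(threads))            # thread ids waiting for a CPU
    on_cpu = [run_queue.popleft() for _ in range(min(cpus, threads))]
    left = [quantum] * len(on_cpu)               # ticks left in each quantum
    tally = Counter()

    for _ in range(ticks):
        for cpu in range(len(on_cpu)):
            left[cpu] -= 1
            if left[cpu] > 0:
                continue                         # quantum not yet used up
            left[cpu] = quantum
            if not run_queue:
                continue                         # nobody waiting: keep the CPU
            # Quantum expired and another thread is runnable: preempt.
            run_queue.append(on_cpu[cpu])
            on_cpu[cpu] = run_queue.popleft()
            # Every thread here belongs to "myprogram" at priority 0.
            tally[("myprogram", 0, "myprogram", 0)] += 1
    return tally


if __name__ == "__main__":
    for (winner, wpri, loser, lpri), n in simulate().items():
        print(f"{winner:>12} {wpri:3d} || {loser:>12} {lpri:3d} {n:5d}")
```

With more runnable threads than CPUs, every preemption in this model lands in the single myprogram-by-myprogram bucket, even though no priorities differ. With `threads` at or below `cpus`, the tally stays empty because an expired quantum finds nobody waiting.
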