Hi,

I am trying to measure how long a thread takes to migrate between CPUs, and how often that happens. What I have so far is below; it checks just one process. Am I on the right track?

My aim is to look at one process and consider increasing rechoose_interval on some of our servers.

    #!/usr/sbin/dtrace -s

    sched:::off-cpu
    {
            self->cpu = cpu;
            self->timestamp = timestamp;
    }

    sched:::on-cpu
    /self->cpu != cpu && execname == "mstragent"/
    {
            printf("%s migrated from cpu %d to cpu %d nsec:%d \n", execname, self->cpu, cpu, timestamp - self->timestamp);
            printf("%s migrated from cpu %d to cpu %d ms:%d \n", execname, self->cpu, cpu, (timestamp - self->timestamp) / 1000000);
            self->cpu = 0;
            self->timestamp = 0;
    }

Sample output is as follows:

    dtrace: script './migr.d' matched 3 probes
    CPU     ID                    FUNCTION:NAME
      1    782                    resume:on-cpu mstragent migrated from cpu 0 to cpu 1 nsec:1123786101964690
    mstragent migrated from cpu 0 to cpu 1 ms:1123786101
      0    782                    resume:on-cpu mstragent migrated from cpu 1 to cpu 0 nsec:38049334
    mstragent migrated from cpu 1 to cpu 0 ms:38
      0    782                    resume:on-cpu mstragent migrated from cpu 1 to cpu 0 nsec:39942749
    mstragent migrated from cpu 1 to cpu 0 ms:39
      0    782                    resume:on-cpu mstragent migrated from cpu 1 to cpu 0 nsec:40069916
    mstragent migrated from cpu 1 to cpu 0 ms:40
      0    782                    resume:on-cpu mstragent migrated from cpu 1 to cpu 0 nsec:39936750
    mstragent migrated from cpu 1 to cpu 0 ms:39
      0    782                    resume:on-cpu mstragent migrated from cpu 1 to cpu 0 nsec:39991167
    mstragent migrated from cpu 1 to cpu 0 ms:39
      0    782                    resume:on-cpu mstragent migrated from cpu 1 to cpu 0 nsec:40012583
    mstragent migrated from cpu 1 to cpu 0 ms:40
      0    782                    resume:on-cpu mstragent migrated from cpu 1 to cpu 0 nsec:39956000
    mstragent migrated from cpu 1 to cpu 0 ms:39
      0    782                    resume:on-cpu mstragent migrated from cpu 1 to cpu 0 nsec:39970999
    mstragent migrated from cpu 1 to cpu 0 ms:39

Thanks
Al
--
This message posted from opensolaris.org
Allan wrote:
> #!/usr/sbin/dtrace -s
> sched:::off-cpu
> {
>         self->cpu = cpu;
>         self->timestamp = timestamp;
> }
>
> sched:::on-cpu /self->cpu != cpu && execname == "mstragent"/
> {
>         printf("%s migrated from cpu %d to cpu %d nsec:%d \n", execname, self->cpu, cpu, timestamp - self->timestamp);
>         printf("%s migrated from cpu %d to cpu %d ms:%d \n", execname, self->cpu, cpu, (timestamp - self->timestamp) / 1000000);
>         self->cpu = 0;
>         self->timestamp = 0;
> }
> [snip]

Hi Allan,

Your script is not accounting for CPU 0, so you need to set a thread-local flag variable in the off-cpu probe and predicate the on-cpu probe on that flag.

If you're looking at reducing migrations, the nosteal_nsec variable determines how long a thread should remain in the runq before the scheduler allows it to be stolen by another CPU. Increasing it might help you lower migrations.

Rafael
Hi Rafael,

Rafael Vanoni wrote:
> Your script is not accounting for CPU 0, so you need to set a thread
> local flag variable on the off-cpu probe, and predicate the on-cpu on
> that.

Can you explain this? Why does CPU 0 need special accounting?

> If you're looking at reducing migrations, the nosteal_nsec variable
> determines how long a thread should remain in the runq before the
> scheduler allows it to be stolen by another CPU. Increasing it might
> help you lower migrations.

Thanks for the tip about nosteal_nsec.

max
Hi,

Rafael Vanoni wrote:
> Your script is not accounting for CPU 0, so you need to set a thread
> local flag variable on the off-cpu probe, and predicate the on-cpu on
> that.

Oh. You mean the on-cpu probe firing before the off-cpu probe when the script starts... That would explain the large number Allan gets in his output the first time the probe fires.

max
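The flag Rafael describes, combined with the start-up ordering Max points out, can be sketched roughly as follows. This is only an illustration, not a tested replacement: the variable names (seen, lastcpu, offts) are made up here, and moving the execname test into the off-cpu probe is an assumption about what Allan wants to trace. Because self->seen is set only after an off-cpu event has been observed, an on-cpu event seen first (with self->cpu and self->timestamp still zero) is ignored instead of being reported as a migration from CPU 0 with a huge elapsed time.

```dtrace
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Record where and when an mstragent thread left the CPU. */
sched:::off-cpu
/execname == "mstragent"/
{
        self->seen = 1;           /* an off-cpu event was observed for this thread */
        self->lastcpu = cpu;
        self->offts = timestamp;
}

/* Report only when the thread resumes on a different CPU. */
sched:::on-cpu
/self->seen && self->lastcpu != cpu/
{
        printf("%s migrated from cpu %d to cpu %d after %d ns off-cpu\n",
            execname, self->lastcpu, cpu, timestamp - self->offts);
        @migrations[self->lastcpu, cpu] = count();
}

/* Clear the thread-local state on every on-cpu, migrated or not. */
sched:::on-cpu
/self->seen/
{
        self->seen = 0;
        self->lastcpu = 0;
        self->offts = 0;
}
```

The @migrations aggregation, printed when the script exits, answers the "how often" part of the question per (from, to) CPU pair. Note that the elapsed time is the whole off-CPU period (sleeping plus waiting on a run queue), not the cost of the migration itself.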
Rafael,

Thanks for the update. I will look at implementing the changes.

Can you advise on the benefit of changing nosteal_nsec versus rechoose_interval? My thinking was that if rechoose_interval were increased to a value larger than the time spent migrating, I would benefit by keeping CPU affinity. Or would you set them to the same value?

I am looking at an application that has I/O latency issues: it can't handle the 2-3 ms delay we see when we replicate the data via SRDF. I initially thought the thread was being migrated between CPUs, which would add a further delay on top of the initial I/O, and that is what I was trying to measure. If the thread goes to sleep while the I/O is in progress and then wakes up, would the sleep time still exceed rechoose_interval and cause the migration?

Thanks
Al
Allan wrote:
> Can you advise on the benefit of changing nosteal_nsec versus rechoose_interval? My thinking was that if rechoose_interval were increased to a value larger than the time spent migrating, I would benefit by keeping CPU affinity. Or would you set them to the same value?
>
> I am looking at an application that has I/O latency issues: it can't handle the 2-3 ms delay we see when we replicate the data via SRDF. I initially thought the thread was being migrated between CPUs, which would add a further delay on top of the initial I/O, and that is what I was trying to measure. If the thread goes to sleep while the I/O is in progress and then wakes up, would the sleep time still exceed rechoose_interval and cause the migration?

The rechoose_interval value is set so that a thread that sleeps for a short period of time (< rechoose_interval ticks) will run on the same CPU it last ran on. This is to take advantage of a possibly still warm cache on that CPU. The migration is occurring because there is some other CPU which has a lower-priority running thread, or on which the maximum priority of the threads waiting for that CPU is the lowest. In other words, the system tries to run the thread as soon as possible after it wakes up, first checking whether it is best to run it where it last ran. The only additional cost for migration might be a cross call(?).

You might try running the thread bound to a CPU (via pbind(1) or processor_bind(2)) to see what the difference in performance is. (Or try using processor sets.)

max
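One way to check Allan's question against Max's description is to compare how long the thread was off CPU when it came back on the same CPU versus a different one. The following is a rough sketch under assumptions: the variable names and the aggregation name @offcpu_ns are invented for illustration, and the off-CPU time measured here includes run-queue wait as well as sleep, so it is only an approximation of the sleep time that rechoose_interval is compared against.

```dtrace
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Remember the CPU and time at which an mstragent thread went off CPU. */
sched:::off-cpu
/execname == "mstragent"/
{
        self->seen = 1;
        self->lastcpu = cpu;
        self->offts = timestamp;
}

/* Split the off-CPU time distribution by whether the thread changed CPU. */
sched:::on-cpu
/self->seen/
{
        @offcpu_ns[self->lastcpu != cpu ? "migrated" : "same cpu"] =
            quantize(timestamp - self->offts);
        self->seen = 0;
        self->lastcpu = 0;
        self->offts = 0;
}
```

If the "migrated" distribution sits well above rechoose_interval converted to nanoseconds (the value is in clock ticks, so the conversion depends on the system's tick rate), that would be consistent with the sleep outlasting the interval. Running the same script with the process bound via pbind(1) should leave the "migrated" bucket empty and gives a baseline for comparison.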
Max,

Thanks very much for the input.

Regards
Al
max at bruningsystems.com wrote:
> The rechoose_interval value is set so that a thread that sleeps for a short
> period of time (< rechoose_interval ticks) will run on the same cpu it
> last ran on. This is to take advantage of a possibly still warm cache on
> that cpu. The migration is occurring because there is some other cpu which
> has a lower priority running thread, or the maximum priority of threads
> waiting for the cpu is the lowest value.

That's correct, but these variables are used in different situations for similar purposes.

rechoose_interval is used to help decide on which CPU's runq a thread should be placed. nosteal_nsec is used when an idle CPU checks whether there is any work that can be stolen from another CPU's runq.

This means, in part, that a thread that was kept on a runq to keep its cache warm might still get stolen by another CPU. But the code that does the stealing checks a number of different things, one of which is whether the stealing CPU and the current target CPU share cache. If they don't share cache, then nosteal_nsec is used to check whether the target thread has been on the runq for long enough.

Please note that a number of other things are taken into consideration when placing threads on runqs and when stealing them, and that changing these variables from their default values is at your own risk.

I'm currently working on optimizations to reduce thread migrations, but we don't have a target for integration yet.

Rafael
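Before changing either tunable, it can help to confirm what the running kernel is currently using. A minimal sketch, assuming the kernel exports symbols named rechoose_interval and nosteal_nsec as discussed in this thread (if either symbol is missing on a given release, dtrace will refuse to compile the script):

```dtrace
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Print the current values of the two scheduler tunables and exit. */
BEGIN
{
        /* The backquote prefix refers to a kernel variable by name. */
        printf("rechoose_interval = %d ticks\n", `rechoose_interval);
        printf("nosteal_nsec      = %d ns\n", `nosteal_nsec);
        exit(0);
}
```

This only reads the values. As Rafael notes above, changing them from their defaults is at your own risk.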