Hi All,

We have a Xen server with an 8-core processor, and I can see ~99% iowait on core 0 only:

02:28:49 AM  CPU   %user   %nice    %sys  %iowait    %irq   %soft  %steal   %idle   intr/s
02:28:54 AM  all    0.00    0.00    0.00    12.65    0.00    0.02    2.24   85.08  1359.88
02:28:54 AM    0    0.00    0.00    0.00    96.21    0.00    0.20    3.19    0.40   847.11
02:28:54 AM    1    0.00    0.00    0.00     6.41    0.00    0.00    9.42   84.17   219.56
02:28:54 AM    2    0.00    0.00    0.00     0.00    0.00    0.00    0.00  100.00     2.59
02:28:54 AM    3    0.00    0.00    0.00     0.00    0.00    0.00    2.12   97.88    76.25
02:28:54 AM    4    0.00    0.00    0.00     0.00    0.00    0.00    1.20   98.80   118.56
02:28:54 AM    5    0.00    0.00    0.00     0.00    0.00    0.00    0.00  100.00     3.59
02:28:54 AM    6    0.00    0.00    0.00     0.00    0.00    0.00    2.02   97.98    89.62
02:28:54 AM    7    0.00    0.00    0.00     0.00    0.00    0.00    0.20   99.80     2.59

02:28:54 AM  CPU   %user   %nice    %sys  %iowait    %irq   %soft  %steal   %idle   intr/s
02:28:59 AM  all    0.00    0.00    0.00    12.48    0.00    0.00    2.78   84.74  1317.43
02:28:59 AM    0    0.00    0.00    0.00    98.80    0.00    0.00    0.80    0.40   885.17
02:28:59 AM    1    0.00    0.00    0.00     0.00    0.00    0.00   11.38   88.62   151.30
02:28:59 AM    2    0.00    0.00    0.00     0.00    0.00    0.00    0.20   99.80     2.81
02:28:59 AM    3    0.00    0.00    0.00     0.00    0.00    0.00    7.21   92.79    94.79
02:28:59 AM    4    0.00    0.00    0.00     0.00    0.00    0.00    2.20   97.80   170.34
02:28:59 AM    5    0.00    0.00    0.00     0.00    0.00    0.00    0.00  100.00     4.41
02:28:59 AM    6    0.00    0.00    0.00     0.00    0.00    0.00    0.00  100.00     5.81
02:28:59 AM    7    0.00    0.00    0.00     0.00    0.00    0.00    0.00  100.00     2.81

I have even tried changing the CPUs mapped to the domUs, with no effect:

Name            CPU  CPU Affinity
4pulse            1  1
2music            3  3
Domain-0          0  0
Domain-0          1  1
Domain-0          2  2
Domain-0          3  3
Domain-0          4  4
Domain-0          5  5
Domain-0          6  6
Domain-0          7  7
analshah          6  any cpu
arunvelayudhan    7  any cpu
backup            7  any cpu
crickruns         3  1-3
crickruns         2  1-3
crickruns         1  1-3
crickruns         2  1-3
crickruns         1  1-3
crickruns         1  1-3
dedicatedjv       7  any cpu
yeluthu           4  3-5
yeluthu           3  3-5
yeluthu           3  3-5
yeluthu           3  3-5
yeluthu           3  3-5
yeluthu           3  3-5
freshnfresh       3  any cpu
monitoring        7  any cpu
reporter          6  5-7
reporter          7  5-7
reporter          6  5-7
reporter          6  5-7
reporter          7  5-7
reporter          7  5-7
reporter          5  5-7
reporter          7  5-7
radio03           7  any cpu
saampeter         2  1-2
saampeter         2  1-2

Thanks,
Rajesh
Hi,

I experienced the same thing. As far as I can tell, the reason has nothing to do with which (v)CPUs you assign to the domUs. I think the real reason is that all I/O emulation for the domUs (disk, network and so on) is handled by CPU0 of dom0, even on a multi-core machine.

You can test this by running something I/O-intensive, like a disk benchmark, inside a domU: you will see the dom0 CPU0 utilization rise.

Using PVHVM or the GPLPV drivers only reduces the amount of work, but it is still handled by CPU0 only. Because of that, I leave CPU0 to dom0 alone and only assign the other cores to the domUs.

Note: this is only an assumption based on the behaviour of my own machines. If it isn't true, please correct me.

2012/7/11 Rajesh Kumar <rajesh@hiox.com>:
> We have a Xen server with an 8-core processor, and I can see ~99%
> iowait on core 0 only.
> [...]
Hi,

Thank you. Your response makes a lot of sense to me. I will try to move all domUs away from CPU0.

On 11-Jul-2012, at 7:26 PM, Matthias wrote:
> As far as I can tell, the reason has nothing to do with which (v)CPUs
> you assign to the domUs. I think the real reason is that all I/O
> emulation for the domUs (disk, network and so on) is handled by CPU0
> of dom0, even on a multi-core machine.
> [...]
On Wed, 2012-07-11 at 09:56 -0400, Matthias wrote:
> As far as I can tell, the reason has nothing to do with which (v)CPUs
> you assign to the domUs. I think the real reason is that all I/O
> emulation for the domUs (disk, network and so on) is handled by CPU0
> of dom0, even on a multi-core machine.

This should not be the case. I/O emulation is done by qemu running in domain 0, and should be scheduled on any dom0 vcpu. If it is not, then this is something to investigate. Normally this would require explicit admin action to pin the affinity of the process, though.

One easy thing to look at would be /proc/interrupts, to check that the IRQs associated with the ioreq upcalls for each guest are being properly balanced (if they are not, installing irqbalanced might help).

Ian.
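For example, something along these lines shows the per-CPU distribution of the guest-related event-channel IRQs (the grep pattern is only an illustration; match whatever names show up on your own system):

  grep -E 'qemu-dm|blkif|vif' /proc/interrupts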
Just checked my /proc/interrupts: EVERY Xen-related thing (blkif-backend, the domUs, xhci_hcd, all Xen interfaces, evtchn:xenstored, evtchn:qemu-dm) is bound to CPU0; there are only zeros for the other CPU cores.

The system is Debian wheezy, Xen is the current testing version from hg, and the kernel is openSUSE's 3.4.2 with the Xen patches they ship with it.

I will try your irqbalanced suggestion next.

2012/7/12 Ian Campbell <ian.campbell@citrix.com>:
> One easy thing to look at would be /proc/interrupts, to check that the
> IRQs associated with the ioreq upcalls for each guest are being
> properly balanced (if they are not, installing irqbalanced might help).
> [...]
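For reference, installing and starting it on Debian amounts to roughly the following (assuming the stock package and init script names):

  apt-get install irqbalance
  /etc/init.d/irqbalance start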
After installing and starting irqbalance (from the Debian package), the
situation is exactly the same, even after restarting the domUs:

[snippet from /proc/interrupts]
# cat /proc/interrupts
      CPU0  CPU1  CPU2  CPU3  CPU4  CPU5
 83:  5824430 0 0 0 0 0  Phys-fasteoi  ahci
 84:  1 0 0 0 0 0  Phys-fasteoi  xhci_hcd
 85:  0 0 0 0 0 0  Phys-fasteoi  xhci_hcd
 86:  0 0 0 0 0 0  Phys-fasteoi  xhci_hcd
 87:  0 0 0 0 0 0  Phys-fasteoi  xhci_hcd
 88:  0 0 0 0 0 0  Phys-fasteoi  xhci_hcd
 89:  0 0 0 0 0 0  Phys-fasteoi  xhci_hcd
 90:  0 0 0 0 0 0  Phys-fasteoi  xhci_hcd
 91:  1559973 0 0 0 0 0  Phys-fasteoi  eth0
 92:  29 0 0 0 0 0  Phys-fasteoi  snd_hda_intel
 93:  8202 0 0 0 0 0  Dynamic-fasteoi  evtchn:xenstored[2827]
 94:  14362 0 0 0 0 0  Dynamic-fasteoi  xenbus
 95:  12 0 0 0 0 0  Dynamic-fasteoi  evtchn:xenstored[2827]
 96:  729 0 0 0 0 0  Dynamic-fasteoi  evtchn:xenstored[2827]
 97:  345357 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[3725]
 98:  15239731 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[3725]
 99:  602773 0 0 0 0 0  Dynamic-fasteoi  fw11
100:  25609 0 0 0 0 0  Dynamic-fasteoi  blkif-backend
101:  41651 0 0 0 0 0  Dynamic-fasteoi  fw12
102:  1 0 0 0 0 0  Dynamic-fasteoi  fw14
103:  1 0 0 0 0 0  Dynamic-fasteoi  fw15
104:  1 0 0 0 0 0  Dynamic-fasteoi  fw16
105:  960520 0 0 0 0 0  Dynamic-fasteoi  fw17
106:  408 0 0 0 0 0  Dynamic-fasteoi  evtchn:xenstored[2827]
107:  226434 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[4535]
108:  15237929 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[4535]
109:  77709 0 0 0 0 0  Dynamic-fasteoi  blkif-backend
110:  592222 0 0 0 0 0  Dynamic-fasteoi  fw21
111:  820394 0 0 0 0 0  Dynamic-fasteoi  fw23
112:  205 0 0 0 0 0  Dynamic-fasteoi  evtchn:xenstored[2827]
113:  14548772 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[24468]
114:  14105231 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[24468]
115:  14016654 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[24468]
116:  13903196 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[24468]
117:  298 0 0 0 0 0  Dynamic-fasteoi  evtchn:xenstored[2827]
118:  3070868 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[5658]
119:  15303878 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[5658]
120:  51825 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[5658]
121:  562778 0 0 0 0 0  Dynamic-fasteoi  blkif-backend
122:  1412 0 0 0 0 0  Dynamic-fasteoi  blkif-backend
123:  1 0 0 0 0 0  Dynamic-fasteoi  blkif-backend
124:  102196 0 0 0 0 0  Dynamic-fasteoi  work
125:  1 0 0 0 0 0  Dynamic-fasteoi  usbif-backend
126:  73329 0 0 0 0 0  Dynamic-fasteoi  blkif-backend
127:  28622 0 0 0 0 0  Dynamic-fasteoi  web
128:  272 0 0 0 0 0  Dynamic-fasteoi  evtchn:xenstored[2827]
129:  272913 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[13834]
130:  3040 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[13834]
131:  4550 0 0 0 0 0  Dynamic-fasteoi  evtchn:qemu-dm[13834]
132:  106866 0 0 0 0 0  Dynamic-fasteoi  blkif-backend
133:  25 0 0 0 0 0  Dynamic-fasteoi  dev
dev (the last one) is the domU I tested after restarting it; I ran a bonnie++ benchmark inside it for testing.

The only interrupts that are actually divided across the CPUs are:
 72:  1714878151 61618469 345335242 35614003 27627578 29681845  Dynamic-percpu  timer
 73:  2150718 1139885 4941560 618991 490155 491626  Dynamic-percpu  ipi
RES:  2189003 1105537 4803667 577681 466929 444842  Rescheduling interrupts
CAL:  1846 34636 139105 41611 23420 46919  Function call interrupts
LCK:  133 89 299 70 60 61  Spinlock wakeups
MCP:  1 1 1 1 1 1  Machine check polls
Any other idea how we can make Xen use the other (v)CPUs for its I/O work?
On Thu, 2012-07-12 at 13:04 -0400, Matthias wrote:
> Any other idea how we can make Xen use the other (v)CPUs for its I/O
> work?

Are you sure irqbalanced is actually running? Some versions had a bug and would crash on a Xen system (they crash if there is no IRQ 0, or something like that). Even if it is running, it can take some time for irqbalanced to realise that things are unbalanced and start moving them around.

There are also ways in Linux to balance IRQs manually: you have to muck around with /proc/irq/*/smp_affinity*. Really, irqbalanced should be doing this for you, though.

Ian.
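For example, pinning one of the event-channel IRQs to CPU2 by hand would look roughly like this (IRQ 98 is taken from the listing above purely as an illustration; use the numbers from your own /proc/interrupts):

  # allow IRQ 98 on CPU2 only (mask 4 = binary 100)
  echo 4 > /proc/irq/98/smp_affinity
  # read the mask back to check that it stuck
  cat /proc/irq/98/smp_affinity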
Hi Ian,

sorry, but it's still not working.

I rebooted the server, then manually started irqbalanced just to be sure, then started Xen and the domUs. It is still the same as above: everything I/O-related is done by CPU0 (even after hours of running 5 domUs). I do see some spikes on the other CPUs, but I think that is just the domUs using their assigned vCPUs; xentop shows a distribution of CPU time, too.

I started irqbalance in debug mode once and it complained that my hardware is not NUMA-capable. Might this be an issue? From what I read, Xen should support proper scheduling on non-NUMA hardware, and only the NUMA support is new and might be a bit quirky.

Then I checked the smp_affinity settings. Currently they show the following for a domU:
smp_affinity: 01
smp_affinity_list: 0

I tried to change smp_affinity to 3f (= 111111 in binary, for my 6 vCPUs) and it was changed back to 01 immediately. I thought that was irqbalance going rogue, but this still happens after stopping the daemon.

So my take is: something is setting the IRQs to use only CPU0, and changes I make manually, or which irqbalance makes, are constantly overwritten, which makes irqbalance useless.

Any idea what this could be? Presumably something within Xen?

2012/7/12 Ian Campbell <ian.campbell@citrix.com>:
> Are you sure irqbalanced is actually running? Some versions had a bug
> and would crash on a Xen system (they crash if there is no IRQ 0, or
> something like that).
> [...]
Found a hint somewhere that my problem might be related to the CONFIG_HOTPLUG_CPU kernel option. I am recompiling the kernel right now and will post an update tomorrow if this resolves the issue.

On 13.07.2012 00:27, "Matthias" <matthias.kannenberg@googlemail.com> wrote:
> I tried to change smp_affinity to 3f (= 111111 in binary, for my 6
> vCPUs) and it was changed back to 01 immediately.
> [...]
Some updates:

- Deactivating CONFIG_HOTPLUG_CPU and suspend-to-RAM in the kernel didn't change anything.
- I tried changing smp_affinity to just a single other CPU, and that worked. So I was wrong that something is constantly changing the settings back; Linux is simply discarding every assignment to more than one CPU (1, 2, 4, etc. work; 3, 5, etc. are discarded).

So the problem is that my Linux cannot assign an IRQ to multiple CPUs. I think this is also why irqbalance makes no difference: it tries the same thing.

This evening I will check whether I can reproduce this behaviour with a stock kernel without Xen, or whether it is really Xen-related. But if you have any other idea what could cause this, I'm open to suggestions.

@Rajesh: can you check whether your setup shows the same behaviour, or whether this is a different problem? Simply do a 'cat /proc/interrupts' to check the CPU distribution, and if everything is handled on CPU0, try an 'echo "3" > /proc/irq/<some irq number from the other command>/smp_affinity' and afterwards check whether smp_affinity really contains '3'. (See the short sequence below.)

2012/7/13 Matthias <matthias.kannenberg@googlemail.com>:
> Found a hint somewhere that my problem might be related to the
> CONFIG_HOTPLUG_CPU kernel option.
> [...]
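Roughly this sequence, as a sketch (IRQ 284 is only an example; pick a real number from your own output):

  # pick a Xen-related IRQ, e.g. a blkif-backend or vif line
  grep -E 'blkif-backend|vif|qemu-dm' /proc/interrupts
  # try to allow it on CPU0 and CPU1 (mask 3 = binary 11)
  echo 3 > /proc/irq/284/smp_affinity
  # if the kernel silently discards multi-CPU masks, this still shows the old value
  cat /proc/irq/284/smp_affinity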
Hello,

On 07/13/12 10:14, Matthias wrote:
> So the problem is that my Linux cannot assign an IRQ to multiple CPUs.
> I think this is also why irqbalance makes no difference: it tries the
> same thing.
> [...]

I have some experience with IRQ affinity problems, but I have never seen that exact behaviour. What I have seen:

- One old server that really did balance a single IRQ between all CPUs. (No idea how; I was just happy that it did. I could not reproduce it on the other servers I tried, and no irqbalance was involved.)
- All my other servers (virtualized or not) do NOT balance an IRQ between multiple CPUs even when configured to. I can set smp_affinity for a single IRQ so that it *should* go to multiple CPUs, but it just goes to one of them. (On some servers it was the first CPU in the mask, on some the last; the main difference at the time was Intel vs. AMD. I am not sure whether that still holds on more recent hardware.)

From what irqbalance seems to do on our servers, it just checks CPU load and IRQ load and tries to re-balance them explicitly every so often. The idea is nice, and probably useful on a server whose behaviour changes frequently, but not that great on servers that always have the same IRQ activity.

For our servers with high IRQ activity, we wrote a script that explicitly sets the IRQ smp_affinity to a static layout we want: the heavy interrupts get a CPU to themselves, and the many light IRQs stay on the default. (A rough sketch of such a script is below.)

My suggestions would be:

- Observe the CPU usage for each kind of IRQ on your server.
- It is probably a good idea to pin your domUs away from the CPU that dom0 uses heavily (and maybe pin dom0 to just a few CPUs where you never schedule domUs).
- Balance your IRQs statically.

Now, if you get massive CPU usage from just ONE IRQ, you have a problem: you need to identify what is generating that load and find a way to spread the same work over more distinct IRQs. We had exactly that problem with our gateway; it was rx/tx from the network cards. We changed servers and got network cards that expose more IRQs per interface.

Regards,

-- 
Adrien Urban
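A minimal static-pinning script might look like this (the IRQ numbers and CPU masks are purely illustrative; take the real ones from /proc/interrupts on your own machine):

  #!/bin/sh
  # Rough sketch of static IRQ pinning: give a few heavy IRQs a CPU of their own.

  pin_irq() {
      irq=$1
      mask=$2
      echo "$mask" > "/proc/irq/$irq/smp_affinity" \
          || echo "failed to pin IRQ $irq" >&2
  }

  pin_irq 22  02   # e.g. storage controller (aacraid) -> CPU1
  pin_irq 248 04   # e.g. physical NIC (peth0)         -> CPU2
  # everything else keeps the default mask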
Here is my "cat /proc/interrupts" output
      CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7
  1:  8 0 0 0 0 0 0 0  Phys-irq  i8042
  8:  0 0 0 0 0 0 0 0  Phys-irq  rtc
  9:  0 0 0 0 0 0 0 0  Phys-irq  acpi
 12:  105 0 0 0 0 0 0 0  Phys-irq  i8042
 17:  0 0 0 0 0 0 0 0  Phys-irq  uhci_hcd:usb3
 18:  0 0 0 0 0 0 0 0  Phys-irq  ehci_hcd:usb1, uhci_hcd:usb8
 19:  89 0 0 0 0 0 0 0  Phys-irq  ehci_hcd:usb2, uhci_hcd:usb6
 20:  0 0 0 0 0 0 0 0  Phys-irq  uhci_hcd:usb4
 21:  0 0 0 0 0 0 0 0  Phys-irq  uhci_hcd:usb5, uhci_hcd:usb7
 22:  80057205 0 0 0 0 0 40990 149827  Phys-irq  aacraid
248:  45804593 0 0 0 0 0 0 223  Phys-irq  peth0
249:  0 0 0 0 0 0 0 0  Phys-irq  ahci
256:  34539669 0 0 0 0 0 0 0  Dynamic-irq  timer0
257:  41413620 0 0 0 0 0 0 0  Dynamic-irq  resched0
258:  85 0 0 0 0 0 0 0  Dynamic-irq  callfunc0
259:  0 41166287 0 0 0 0 0 0  Dynamic-irq  resched1
260:  0 933 0 0 0 0 0 0  Dynamic-irq  callfunc1
261:  0 19504489 0 0 0 0 0 0  Dynamic-irq  timer1
262:  0 0 31251343 0 0 0 0 0  Dynamic-irq  resched2
263:  0 0 932 0 0 0 0 0  Dynamic-irq  callfunc2
264:  0 0 11054510 0 0 0 0 0  Dynamic-irq  timer2
265:  0 0 0 27386916 0 0 0 0  Dynamic-irq  resched3
266:  0 0 0 932 0 0 0 0  Dynamic-irq  callfunc3
267:  0 0 0 11444950 0 0 0 0  Dynamic-irq  timer3
268:  0 0 0 0 58130277 0 0 0  Dynamic-irq  resched4
269:  0 0 0 0 910 0 0 0  Dynamic-irq  callfunc4
270:  0 0 0 0 28261822 0 0 0  Dynamic-irq  timer4
271:  0 0 0 0 0 43997038 0 0  Dynamic-irq  resched5
272:  0 0 0 0 0 923 0 0  Dynamic-irq  callfunc5
273:  0 0 0 0 0 15338193 0 0  Dynamic-irq  timer5
274:  0 0 0 0 0 0 48900209 0  Dynamic-irq  resched6
275:  0 0 0 0 0 0 923 0  Dynamic-irq  callfunc6
276:  0 0 0 0 0 0 15407236 0  Dynamic-irq  timer6
277:  0 0 0 0 0 0 0 50136483  Dynamic-irq  resched7
278:  0 0 0 0 0 0 0 900  Dynamic-irq  callfunc7
279:  0 0 0 0 0 0 0 16601995  Dynamic-irq  timer7
280:  15653 3318 8 0 0 0 0 0  Dynamic-irq  xenbus
281:  0 0 0 0 0 0 0 0  Dynamic-irq  console
282:  148849 101244 11648 8306 5326 2336 220 93  Dynamic-irq  blkif-backend
283:  6 0 0 0 0 0 0 0  Dynamic-irq  blkif-backend
284:  65895 47506 23079 29645 23784 11350 246 147  Dynamic-irq  vif1.0
285:  1863994 691202 63261 36091 15258 4937 437 193  Dynamic-irq  blkif-backend
286:  213220 119178 7103 5451 2557 517 19 0  Dynamic-irq  blkif-backend
287:  930007 387710 35444 16389 11486 4540 198 229  Dynamic-irq  vif19.0
288:  635545 409092 69409 40913 24484 8576 741 136  Dynamic-irq  blkif-backend
289:  7137631 1115254 173115 71776 29668 12043 1008 2289  Dynamic-irq  blkif-backend
290:  31508 30372 21679 16886 8057 2840 74 252  Dynamic-irq  vif3.0
291:  5246 16885 29753 36947 28554 12073 406 1  Dynamic-irq  vif4.0
292:  47906 36257 8923 10614 7908 2954 3 6  Dynamic-irq  blkif-backend
293:  1 0 0 0 0 0 0 0  Dynamic-irq  blkif-backend
294:  17554 15779 10298 10228 7353 4006 114 58  Dynamic-irq  blkif-backend
295:  23 0 0 0 0 0 0 0  Dynamic-irq  blkif-backend
296:  21802 13356 19719 24348 19953 8304 113 0  Dynamic-irq  vif5.0
297:  273660 179974 15464 8548 4237 1913 21 34  Dynamic-irq  blkif-backend
298:  3 0 0 0 0 0 0 0  Dynamic-irq  blkif-backend
299:  7652040 1568396 290845 170934 76196 35670 3622 4062  Dynamic-irq  vif6.0
300:  329614 141967 17636 4723 2181 696 43 48  Dynamic-irq  blkif-backend
301:  183112 77217 7104 4301 2413 257 0 2  Dynamic-irq  blkif-backend
302:  328166 181810 18838 8180 4534 1045 256 108  Dynamic-irq  vif7.0
303:  166185 117000 16767 12453 10239 4296 96 16  Dynamic-irq  blkif-backend
304:  18 0 0 0 0 0 0 0  Dynamic-irq  blkif-backend
305:  74460 48748 4299 2838 1670 592 47 18  Dynamic-irq  blkif-backend
306:  17 0 0 0 0 0 0 0  Dynamic-irq  blkif-backend
307:  849646 410362 42829 20108 18286 2718 147 73  Dynamic-irq  blkif-backend
308:  3031 4122 413 118 280 1 0 0  Dynamic-irq  blkif-backend
309:  454672 211984 11278 13235 4483 1949 106 307  Dynamic-irq  blkif-backend
310:  54 2 0 0 1 0 0 0  Dynamic-irq  blkif-backend
311:  151460 100790 10438 8542 3677 1836 130 91  Dynamic-irq  blkif-backend
312:  17 0 0 0 0 0 0 0  Dynamic-irq  blkif-backend
313:  289222 147029 13023 6293 3359 997 32 38  Dynamic-irq  blkif-backend
314:  15919 13241 297 219 45 67 1 234  Dynamic-irq  blkif-backend
315:  171629 139277 11409 8646 6430 2920 82 170  Dynamic-irq  vif17.0
316:  1684637 884290 70555 43332 22190 6997 1874 2528  Dynamic-irq  vif20.0
317:  5997001 1669698 253862 124739 53491 14573 3931 2018  Dynamic-irq  vif13.0
318:  196460 112218 7607 4688 3110 959 203 283  Dynamic-irq  vif18.0
319:  70632 50590 19755 22438 18394 7733 336 156  Dynamic-irq  vif14.0
320:  430659 270256 20899 13806 7980 2998 260 146  Dynamic-irq  vif15.0
NMI:  0 0 0 0 0 0 0 0
LOC:  0 0 0 0 0 0 0 0
ERR:  0
Also, here are a few smp_affinity values:
root@hiox-vps ~]# cat /proc/irq/316/smp_affinity
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000004
[root@hiox-vps ~]# cat /proc/irq/316/smp_affinity
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000004
[root@hiox-vps ~]# cat /proc/irq/312/smp_affinity
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
[root@hiox-vps ~]# cat /proc/irq/317/smp_affinity
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000002
[root@hiox-vps ~]# cat /proc/irq/318/smp_affinity
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000002
[root@hiox-vps ~]# cat /proc/irq/300/smp_affinity
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000002
[root@hiox-vps ~]# cat /proc/irq/260/smp_affinity
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000002
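(Side note, an assumption on my part: if your dom0 kernel also exposes /proc/irq/<n>/smp_affinity_list, it shows the same information as a plain CPU list, which is easier to read than the hex bitmap, e.g.:

  cat /proc/irq/316/smp_affinity_list
)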
On 13-Jul-2012, at 1:44 PM, Matthias wrote:
> @Rajesh: can you check whether your setup shows the same behaviour, or
> whether this is a different problem? Simply do a 'cat /proc/interrupts'
> to check the CPU distribution, and if everything is handled on CPU0,
> try an 'echo "3" > /proc/irq/<irq number>/smp_affinity' and afterwards
> check whether smp_affinity really contains '3'.
> [...]
Hi Rajesh,

thank you for the input, this is quite interesting! It seems that your IRQs are also pinned to fixed CPUs rather than using a "one of the following" mask, but that something is managing or load-balancing your IRQs from time to time and splitting them between cores. So it looks like we have different problems, and my initial suggestions might not be valid in your case.

In the meantime I could pin my problem down further by comparing Rajesh's output with mine and running irqbalance in debug mode: irqbalance actually works on my system, and I am also able to set an IRQ to use all 6 cores (with the 3f mask in smp_affinity), but only for IRQs that are not of type 'Dynamic-fasteoi', which all the Xen-related ones are. When I run irqbalance in debug mode it lists all the IRQs it controls, and I can see that it ignores all the Dynamic-fasteoi ones. Rajesh's Xen IRQs are of type 'Dynamic-irq', and there it works.

From the limited information I have, that was the only difference I could come up with that seems to make sense, even though I can't find any hint of what 'Dynamic-fasteoi' actually is. Neither Google, nor the Xen sources, nor my kernel sources show any trace of it. I would love to get some pointers on that one, because at the moment I'm a little out of ideas.
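For what it's worth, a quick way to compare which interrupt-chip types the two dom0 kernels are using (just a sketch; it counts the type strings that appear in /proc/interrupts):

  grep -oE '(Dynamic|Phys)-[a-z]+' /proc/interrupts | sort | uniq -c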