Hi,
I was running some netperf tests and noticed that all the (network)
interrupts were being serviced by pcpu0, although dom0 was configured to
use all the pcpus [4 vcpus].
Questions:
1) Is there a way dom0 can be configured to process the interrupts using
all 4 pcpus instead of just one?
2) What is the recommended scheduling policy for a network I/O intensive
workload?
Sample output:
 cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  1:          8          0          0          0        Phys-irq  i8042
  8:          1          0          0          0        Phys-irq  rtc
  9:          0          0          0          0        Phys-irq  acpi
 11:          0          0          0          0        Phys-irq  ohci_hcd:usb1
 12:        105          0          0          0        Phys-irq  i8042
 14:     302432          0          0          0        Phys-irq  ide0
 16:        375          0          0          0        Phys-irq  aic7xxx
 17:      34719          0          0          0        Phys-irq  cciss0
 18:      53158          0          0          0        Phys-irq  eth0
 19:          2          0          0          0        Phys-irq  peth1
 20:    1062076          0          0          0        Phys-irq  peth2  <<< -------- Was using this
 21:      25189          0          0          0        Phys-irq  peth3
 22:      18846          0          0          0        Phys-irq  peth4
 23:      18682          0          0          0        Phys-irq  eth5
256:    1456444          0          0          0     Dynamic-irq  timer0
257:      52873          0          0          0     Dynamic-irq  resched0
258:        282          0          0          0     Dynamic-irq  callfunc0
259:          0       7935          0          0     Dynamic-irq  resched1
260:          0      33665          0          0     Dynamic-irq  callfunc1
261:          0     508258          0          0     Dynamic-irq  timer1
262:          0          0       3827          0     Dynamic-irq  resched2
263:          0          0      33835          0     Dynamic-irq  callfunc2
264:          0          0     390316          0     Dynamic-irq  timer2
265:          0          0          0      43953     Dynamic-irq  resched3
266:          0          0          0      33870     Dynamic-irq  callfunc3
267:          0          0          0     311447     Dynamic-irq  timer3
268:       5091          0          0          0     Dynamic-irq  xenbus
269:          0          0          0          0     Dynamic-irq  console
270:      31532          0          0          0     Dynamic-irq  blkif-backend
271:    1107498          0          0          0     Dynamic-irq  vif7.0
NMI:          0          0          0          0
LOC:          0          0          0          0
ERR:          0
MIS:          0
I ran multiple while loops to confirm that dom0 can use all 4 pcpus if
required, so I believe it must be something to do with the way the
interrupts are being handled.
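A minimal sketch of such a busy-loop test, assuming a POSIX shell in
dom0 (one loop per vcpu):

    # Spawn one CPU-bound loop per vcpu; all four pcpus should go busy
    # (observe with top or xm vcpu-list), then clean up.
    for i in 1 2 3 4; do
        while :; do :; done &
        PIDS="$PIDS $!"
    done
    sleep 30
    kill $PIDS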
Thanks in advance,
hmv
Keir Fraser
2006-May-22  16:52 UTC
Re: [Xen-devel] Question Also regarding interrupt balancing
On 22 May 2006, at 17:43, harish wrote:

> Was doing some netperf tests and noticed that all the interrupts (for
> network) were being serviced by pcpu0 although dom0 was configured to
> use all the pcpus [4 vcpus].
>
> 1) Is there a way dom0 can be configured to process the interrupts
> using all 4 pcpus instead of just one?

Running the irqbalance daemon in dom0 should do the trick. If there's no
other load in domain 0, though, irqbalance may decide not to change irq
affinity. There's no way to do fine-grained interrupt balancing (e.g.,
round-robin interrupts).

> 2) What is the recommended scheduling policy for a network I/O
> intensive workload?

If delivering to a domU, you want dom0 and domU running on different
CPUs, or the CPU will be the bottleneck.

 -- Keir
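A minimal way to try that suggestion, assuming irqbalance is packaged
for the dom0 distribution (the package-manager and service commands
below are assumptions and vary by distribution):

    # In dom0: install and start the irqbalance daemon (RHEL-style
    # commands shown; use the distribution's equivalents).
    yum install irqbalance
    service irqbalance start
    # With load running (e.g., netperf), watch whether the NIC's irq
    # counts start accumulating on CPUs other than CPU0:
    watch -n1 "grep peth2 /proc/interrupts"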
Hi Keir,

Apologies for the delayed response. Is it possible to use smp_affinity
to pin the interrupts to specific pcpus? On my machine:

 cat /proc/irq/23/smp_affinity
 0f

I tried

 echo "2" > /proc/irq/23/smp_affinity

with the hope that the interrupts would get routed to pcpu1, but I do
not see the new value taking effect. Am I missing something?

thanks in advance,
harish
Keir Fraser
2006-Jun-10  08:39 UTC
Re: [Xen-devel] Question Also regarding interrupt balancing
On 9 Jun 2006, at 19:39, harish wrote:

> Is it possible to use smp_affinity to pin the interrupts to specific
> pcpus?
>
> I tried
>  echo "2" > /proc/irq/23/smp_affinity
> but do not see the new value taking effect.

That should definitely work and have an effect on /proc/interrupts.

 -- Keir
Hi Keir,

 echo "2" > /proc/irq/23/smp_affinity

does not seem to change the value in smp_affinity;

 cat /proc/irq/23/smp_affinity

still shows 0f. Could there be some bug or configuration problem that
you can think of?

thanks,
hmv
Keir Fraser
2006-Jun-11  09:09 UTC
Re: [Xen-devel] Question Also regarding interrupt balancing
On 10 Jun 2006, at 17:58, harish wrote:

> echo "2" > /proc/irq/23/smp_affinity does not seem to change the
> value in smp_affinity.
>
> Could there be some bug or configuration problem that you can think of?

It was a bug, which I've just fixed in the -unstable and -testing
staging trees. When that reaches the public trees you should find that
writing to smp_affinity has the usual effect, but note:

 1. As when running on native, a request to change affinity is not
    processed until the next interrupt occurs for that irq. If the
    interrupt rate on that irq is very low, you may be able to observe
    the old value in smp_affinity before it is changed.
 2. You cannot change the affinity of CPU-local interrupts (timer,
    resched, callfunc). Requests to do so are silently ignored.
 3. If you try to set a multi-cpu cpumask for affinity, it will be
    changed to a single-cpu cpumask automatically. Linux-on-Xen does not
    automatically balance irq load across cpus -- that has to be done by
    a user-space daemon (e.g., irqbalance).

 -- Keir
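A quick way to see caveat 1 in action: request the new mask, force an
interrupt on that irq, then re-read the mask (a sketch; irq 23 is taken
from the listing above, and the ping target is a hypothetical peer on
that interface):

    # Ask for irq 23 (eth5 in the listing above) to move to CPU1
    # (mask 0x2).
    echo 2 > /proc/irq/23/smp_affinity
    # The change is latched on the next interrupt for that irq, so
    # generate some traffic (peer address is hypothetical):
    ping -c 5 192.168.5.1
    # The mask and the per-CPU counts should now reflect the move:
    cat /proc/irq/23/smp_affinity
    grep eth5 /proc/interrupts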
Hi Keir,

Thanks a lot. I shall sync up to the latest -unstable then, and will
keep you posted on how it goes.

thanks,
harish
Hi Keir,

I used the -unstable tree and noticed the following:

 cat /proc/interrupts | grep eth
  18:       7580          0          0          0        Phys-irq  eth0
  19:          1          0          0          0        Phys-irq  eth1
  20:       1982         78        117          0        Phys-irq  peth2
  21:         18          0          0       1129        Phys-irq  eth3
  22:         67       1077          0          0        Phys-irq  eth4
  23:         12       1135          0          0        Phys-irq  eth5

 cat /proc/irq/20/smp_affinity
 00000001

 echo 2 > /proc/irq/20/smp_affinity   [...works...]
 echo 4 > /proc/irq/20/smp_affinity   [...works...]
 echo 8 > /proc/irq/20/smp_affinity   [...works...]

But a cumulative mask does not work, meaning echo 3, echo 5, echo f,
etc., do not work. Is that a bug or is it by design?

thanks,
harish
Keir Fraser
2006-Jun-13  09:05 UTC
Re: [Xen-devel] Question Also regarding interrupt balancing
On 13 Jun 2006, at 00:42, harish wrote:

> echo 2 > /proc/irq/20/smp_affinity   [...works...]
> echo 4 > /proc/irq/20/smp_affinity   [...works...]
> echo 8 > /proc/irq/20/smp_affinity   [...works...]
>
> But a cumulative mask does not work. Is that a bug or is it by design?

You should find it locks onto the first CPU in the mask that you
specify. As I said, the kernel does not load-balance IRQs, so it
currently does not make sense to specify multi-cpu cpumasks. So this is
by design, for now.

 -- Keir
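In other words, a multi-bit mask collapses to the lowest-numbered CPU it
contains. A sketch of what one would expect to observe (the collapsed
value shown is an assumption based on Keir's description):

    # Request CPUs 1 and 2 (mask 0x6) for irq 20.
    echo 6 > /proc/irq/20/smp_affinity
    # After the next interrupt on irq 20, the kernel keeps only the
    # first CPU in the mask:
    cat /proc/irq/20/smp_affinity    # expected: 00000002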
Hi Keir,

I had tried the following experiment on a 4-way machine:

 1) Pin the physical interrupts for the physical nic to pcpu2
 2) Pin the domU to pcpu3

and ran a quick netperf test. I noticed that the cpu utilization was
around ~50% on pcpu0, although my interrupts were being pinned to pcpu2
and the domU was on pcpu3. That is when I noticed that vif#id.0 has a
dynamic irq which is serviced by pcpu0. Does this irq always run on
pcpu0? Considering that it is dynamic, I understand that we cannot
change the affinity, so I am wondering if there is some other
configuration related to it. Any suggestions/help would be great.

Thanks,
harish
Keir Fraser
2006-Jun-29  17:20 UTC
Re: [Xen-devel] Question Also regarding interrupt balancing
On 29 Jun 2006, at 16:51, harish wrote:

> That is when I noticed that vif#id.0 has a dynamic irq which is
> serviced by pcpu0. Does this irq always run on pcpu0? Considering
> that it is dynamic, I understand that we cannot change the affinity.

Again, we don't load balance in the kernel, but you could change its
affinity to some other single CPU via the proc interface.

 -- Keir
FWD... forgot to cc the group.

I was not clear in my last email, so I am rephrasing the question with
an example:

 peth0 --> pcpu2  [mapped this using the proc interface]
 domU1 --> pcpu3

I noticed that the physical interrupts were getting routed to pcpu2, so
the mapping worked. However, there were a significant number of virtual
interrupts corresponding to vif1.0 [used by domU1], and those interrupts
were serviced by pcpu0. It is these interrupts that I was asking about
in my earlier question.

I thought interrupts of type "Dynamic-irq" in general cannot be set
using the proc interface. Please correct me if I am wrong, but from your
response below it looks like we can. Anyway, I tried setting the
affinity for the virtual interrupts corresponding to vif1.0 via the
/proc interface and it did not work.

How are the virtual interrupts mapped? I have yet to see a test where
the virtual interrupts run on a pcpu other than pcpu0, so I am wondering
if there is an explicit mapping in the code. I hope my question makes
sense now.

-hmv
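For reference, the repinning being attempted would look something like
this: locate the dynamic irq for the vif in /proc/interrupts and write a
single-CPU mask to it (a sketch; irq 271 is taken from the first listing
in this thread, and whether the write takes effect is exactly what is in
question here):

    # Locate the dynamic irq servicing the domU's backend vif.
    grep 'vif1\.0' /proc/interrupts
    # Ask for it to be serviced by pcpu2 (mask 0x4); as with physical
    # irqs, any change is latched on the next interrupt for that irq.
    echo 4 > /proc/irq/271/smp_affinity
    grep 'vif1\.0' /proc/interrupts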