Running into a mysterious situation here...

I have a Xen3 setup running SEDF, measuring bandwidth to an off-node, non-VMM machine. One guest domain, bridged through domain 0, transfers data from memory (/dev/zero) to the non-VMM machine. Another domain runs a purely CPU-bound workload, consuming as much CPU as it can (this avoids the context-switch-heavy, no-interrupt-batching situation that exists otherwise).

On these machines I get roughly 12.3 MB/s with this distribution of CPU consumption (measured with XenMon; these are means from many runs):

dom0: 17.79%, dom1 (net): 11.41%, dom2 (cpu): 69.48%

I get an extremely similar bandwidth reading in another scenario with two net-transferring domains (each one's bandwidth is halved, so I am talking about the *sum* of the two). However, the CPU consumption did not scale very well:

dom0: 19.38%, dom1 (net): 9.61%, dom2 (net): 9.55%, dom3 (cpu): 60.18%

In the first scenario dom1 used around 11.5 percent of the CPU to accomplish what here takes the two net guests roughly 19 percent combined.

Any ideas what this extra CPU work is? There is extra context switching in the second situation (which could account for the dom0 increase), but the disparity for the network-bound guests seems a little too extreme for that alone.

Thanks,
Tim
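A rough back-of-the-envelope sketch in Python of the comparison being made, using only the XenMon means quoted above and assuming the ~12.3 MB/s aggregate holds for both scenarios as described:

# Back-of-the-envelope comparison of CPU cost per MB/s, using the XenMon
# means quoted above.  Aggregate bandwidth is taken as ~12.3 MB/s in both
# scenarios, as reported; the split of that cost between dom0 and the net
# guests is the point of interest.
BANDWIDTH_MBPS = 12.3

scenarios = {
    "one net guest":  {"dom0": 17.79, "net": [11.41]},
    "two net guests": {"dom0": 19.38, "net": [9.61, 9.55]},
}

for name, s in scenarios.items():
    net_total = sum(s["net"])
    io_path = s["dom0"] + net_total   # dom0 plus all net-transferring guests
    print(f"{name}: net guests {net_total:.2f}% CPU "
          f"({net_total / BANDWIDTH_MBPS:.2f}%/MBps), "
          f"dom0+net {io_path:.2f}% ({io_path / BANDWIDTH_MBPS:.2f}%/MBps)")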
Sorry I don't have full concrete answers, but you might be able to get some ideas from either of these papers if you haven't seen them:

XenMon - http://www.hpl.hp.com/techreports/2005/HPL-2005-187.html
Measuring CPU Overhead for IO - http://www.hpl.hp.com/personal/Lucy_Cherkasova/projects/papers/final-perf-study-usenix.pdf

They show how the scheduling parameters can have a pretty large effect on performance, mainly due to wasted context switches between domains.

On 5/31/06, Tim Freeman <tfreeman@mcs.anl.gov> wrote:
> Any ideas what this extra CPU work is? There is extra context
> switching in the second situation (which could account for the dom0
> increase), but the disparity for the network-bound guests seems a little
> too extreme for that alone.
On Fri, 2 Jun 2006 08:51:14 -0400, "Tim Wood" <twwood@gmail.com> wrote:

> They show how the scheduling parameters can have a pretty large effect on
> performance, mainly due to wasted context switches between domains.

Thanks, I am aware of both papers. Since I have the XenMon output, I looked at the ex/s and did not see anything there to explain the leap in terms of context switching. The two I/O-bound guests together are 'charged' roughly 19% of the CPU to process the same amount of packets that the single I/O guest could process with roughly 11.5%.

My assumption is that context switching between any two domains results in increased CPU usage being charged to both of those domains. Isn't that true? (I remember being roughly convinced of this by data at one point, but it would take me some time to find it.)

If so, then in the situation I reported, the two I/O guest domains (dom1/dom2) in scenario 2 would have to show a relative increase in ex/s over dom0 to account for the extra CPU usage seen for dom1/dom2 but not for dom0 (i.e., maybe dom1 and dom2 are being switched between themselves far more often, and that is causing extra load for just these two domains).

But in scenario one, dom0 and dom1 read ~1300 ex/s each, while in scenario two dom0, dom1 and dom2 each read ~380. So it's actually far less (totals including the cpu domain: scenario 1 = ~4000 ex/s, scenario 2 = ~1900 ex/s).

Thanks,
Tim
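To put the two readings side by side, a small sketch (Python; ex/s and CPU figures rounded as quoted in the message above) of the CPU charged to the net guests per execution. If extra context switching were the whole story, the cost per execution should stay roughly flat; on these numbers it roughly triples instead:

# CPU charged to the net-transferring guests per execution, using the
# rounded XenMon numbers quoted in the thread.
scenarios = {
    "scenario 1": {"net_ex_per_sec": 1300,      "net_cpu_pct": 11.41},
    "scenario 2": {"net_ex_per_sec": 380 + 380, "net_cpu_pct": 9.61 + 9.55},
}

for name, s in scenarios.items():
    per_1000_ex = 1000 * s["net_cpu_pct"] / s["net_ex_per_sec"]
    print(f"{name}: {s['net_ex_per_sec']} net-guest ex/s, "
          f"{s['net_cpu_pct']:.2f}% CPU, "
          f"~{per_1000_ex:.1f}% CPU per 1000 executions")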
On Fri, 2 Jun 2006 10:41:31 -0500, Tim Freeman <tfreeman@mcs.anl.gov> wrote:

> But in scenario one, dom0 and dom1 read ~1300 ex/s each, while in scenario
> two dom0, dom1 and dom2 each read ~380. So it's actually far less (totals
> including the cpu domain: scenario 1 = ~4000 ex/s, scenario 2 = ~1900 ex/s).

And this is intuitive to me: the more domains that are running (assuming somewhat fair scheduling parameters), the less frequently each one gets a chance at the CPU and the more virtual interrupts pile up in the meantime. More work to do means a domain doesn't block as quickly (it takes longer to run out of work), and less blocking means fewer context switches for the same workloads. In the data, the overall blocked % does indeed plummet in scenario 2.

Tim
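The batching effect can be sketched the same way: with aggregate bandwidth roughly constant and far fewer executions per second, each execution of the I/O path has to move more data, which is consistent with the domains staying runnable longer and blocking less. A rough Python sketch (it lumps dom0's and the net guests' executions together, so it is only an approximation of the data path):

# Approximate data handled per execution of the I/O path (dom0 plus the
# net guests), using the rounded ex/s figures from the thread and the
# ~12.3 MB/s aggregate bandwidth.
BANDWIDTH_KBPS = 12.3 * 1024

for name, io_ex_per_sec in (("scenario 1 (dom0 + dom1)",        1300 + 1300),
                            ("scenario 2 (dom0 + dom1 + dom2)", 380 * 3)):
    kb_per_ex = BANDWIDTH_KBPS / io_ex_per_sec
    print(f"{name}: ~{io_ex_per_sec} I/O-path ex/s, "
          f"~{kb_per_ex:.1f} KB moved per execution")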