Hi all (and Mr. Dunlap in particular),

I have a question about the credit (and ultimately credit2) scheduler that I hope you can help me with.

I have read the white paper "Scheduler development update" and as much material on the credit scheduler as I can find, but I am still not completely clear on how I should think about the cap.

Example scenario:

* Server hardware: 2 sockets, 8-cores per socket, 2 hardware threads per core (total of 32 hardware threads)
* Test VM: a single virtual machine with a single vCPU, weight=256 and cap=100%

In this scenario, from what I understand, I should be able to load the Test VM with traffic to a maximum of approximately 1/32 of the aggregate compute capacity of the server. The total CPU utilization of the server hardware should be approximately 3.4%, plus the overhead of dom0 (say 1-2). The credits available to any vCPU capped at 100% should be equal to 1/32 of the aggregate compute available for the whole server, correct?

Put simply, is there a way to constrain a VM with 1 vCPU to consume no more than 0.5 of a physical core (hyper-threaded) on the server hardware mentioned below? Does the cap help in that respect?

I have been struggling to understand how the scheduler can deal with the uncertainty that hyperthreading introduces, however. I know this is an issue that you are tackling in the credit2 scheduler, but I would like to know what your thoughts are on this problem (if you are able to share). Any insight or assistance you could offer would be greatly appreciated.

Thanks very much and best regards,

- Mike

Michael Palmeter | Sr. Director of Product Management, Oracle
Oracle Development
200 Oracle Parkway | Redwood Shores, California 94065

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
On 15/11/12 15:43, Michael Palmeter wrote:
>
> Hi all (and Mr. Dunlap in particular),
>

Haha -- please don't call me "Mr"; I prefer "George", but if you want a title, use "Dr" (since I have a PhD). :-)

> Example scenario:
>
> * Server hardware: 2 sockets, 8-cores per socket, 2 hardware threads
> per core (total of 32 hardware threads)
> * Test VM: a single virtual machine with a single vCPU, weight=256
> and cap=100%
>
> In this scenario, from what I understand, I should be able to load the
> Test VM with traffic to a maximum of approximately 1/32 of the
> aggregate compute capacity of the server. The total CPU utilization
> of the server hardware should be approximately 3.4%, plus the overhead
> of dom0 (say 1-2). The credits available to any vCPU capped at 100%
> should be equal to 1/32 of the aggregate compute available for the
> whole server, correct?
>

I think to really be precise, you should say, "1/32nd of the logical cpu time available", where "logical cpu time" simply means, "time processing on one logical CPU". At the moment, that is all that either the credit1 or credit2 schedulers look at.

As I'm sure you're aware, not all "logical cpu time" is equal. If one thread of a hyperthread pair is running but the other is idle, it will get significantly higher performance than if the other thread is busy. How much is highly unpredictable, and depends very much on exactly what units are shared with the other hyperthread, and the workload running on each unit. But even when both threads are busy, it should (in theory) be rare for both threads to get a throughput of 50%; the whole idea of HT is that threads typically get 70-80% of the full performance of the core (so the overall throughput is increased).

But of course, while this is particularly extreme in the case of hyperthreads, it's also true on a smaller scale even without that -- cores share caches, NUMA nodes share memory bandwidth, and so on. No attempt is made to compensate VMs for cache misses or extra memory latency due to sharing either. :-)

> Put simply, is there a way to constrain a VM with 1 vCPU to consume no
> more than 0.5 of a physical core (hyper-threaded) on the server
> hardware mentioned below? Does the cap help in that respect?
>

You can use "cap" to make the VM in question get 50% of logical vcpu time, which on an idle system will give it 0.5 of the capacity of a physical core (if we don't consider Intel's Turbo Boost technology). But if the system becomes busy, it will get less than 0.5 of the processing capacity of a physical core.

> I have been struggling to understand how the scheduler can deal with
> the uncertainty that hyperthreading introduces, however. I know this
> is an issue that you are tackling in the credit2 scheduler, but I
> would like to know what your thoughts are on this problem (if you are
> able to share). Any insight or assistance you could offer would be
> greatly appreciated.
>

At the moment it does not attempt to; the only thing it does is try not to schedule two hyperthreads that share a core if there is an idle core. But if there are more active vcpus than cores, then some will share; and the ones that share a core with another vcpu will be charged the same as the ones that have the core all to themselves.

Could you explain why your question is important to you -- i.e., what are you trying to accomplish? It sounds a bit like you're more concerned with accuracy in reporting, and control of resources, rather than fairness, for instance.

-George
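
The arithmetic in the reply above can be sketched in a few lines of Python. This is only an illustration and not part of the original messages: the function names are hypothetical, the cap is treated as a percentage of one logical CPU as the reply describes, and the 0.7 hyperthreading factor is an assumed value taken from the low end of the "70-80%" rule of thumb, not a measurement. Turbo Boost is ignored.

```python
# Illustrative arithmetic only. The 0.7 hyperthreading factor is an assumed
# rule of thumb taken from the "70-80%" figure in the reply, not a measurement.

SOCKETS = 2
CORES_PER_SOCKET = 8
THREADS_PER_CORE = 2
LOGICAL_CPUS = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE  # 32 hardware threads


def host_share(cap_percent, logical_cpus=LOGICAL_CPUS):
    """Fraction of the host's total logical CPU time that a cap allows.

    The cap is a percentage of ONE logical CPU (100 means one full thread).
    """
    return (cap_percent / 100.0) / logical_cpus


def core_fraction(cap_percent, sibling_busy, ht_factor=0.7):
    """Rough fraction of one physical core's throughput available under a cap.

    If the sibling hyperthread is idle, the thread is assumed to run at the
    full speed of the core; if it is busy, at ht_factor of that speed.
    """
    thread_speed = ht_factor if sibling_busy else 1.0
    return (cap_percent / 100.0) * thread_speed


if __name__ == "__main__":
    print(f"cap=100: {host_share(100):.4%} of all logical CPU time")
    print(f"cap=50, sibling idle: {core_fraction(50, False):.2f} of a core")
    print(f"cap=50, sibling busy: {core_fraction(50, True):.2f} of a core")
```

Under these assumptions, a cap of 100 corresponds to 1/32 of the host's logical CPU time, while the throughput actually obtained depends on what the sibling thread is doing.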
On 15/11/12 18:29, George Dunlap wrote:
>>
>> Put simply, is there a way to constrain a VM with 1 vCPU to consume
>> no more than 0.5 of a physical core (hyper-threaded) on the server
>> hardware mentioned below? Does the cap help in that respect?
>>
>
> You can use "cap" to make the VM in question get 50% of logical vcpu time,

This should be "logical CPU time"...

-George
Thank you for your answer, George.

The origin of my question is more of a business concern than a technical one. Many software products are licensed based on a cost per processor core. It is desirable to sometimes allow customers to pay a fraction of software license costs in exchange for running that software using only a commensurate fraction of available compute power (capacity sub-licensing). If the cap is a means of making a vCPU more-or-less deterministic (in terms of its effective computational capacity) then that would be useful as a programmatic means of enabling capacity sub-licensing. My example below was based on a case where I have a customer that would like to use 'cap' to constrain their single vCPU VM to only ½ of a core worth of compute capacity (logically 1/32 of the compute power) in exchange for only paying 1/32 of the license cost for the physical server.

Below you answered:

"You can use 'cap' to make the VM in question get 50% of logical vcpu time, which on an idle system will give it 0.5 of the capacity of a physical core (if we don't consider Intel's Turbo Boost technology). But if the system becomes busy, it will get less than 0.5 of the processing capacity of a physical core."

Are you saying that cap would be able to CONSTRAIN a vCPU to an effective compute capacity equal to 50% of a physical core, but it does not GUARANTEE effective compute capacity equal to 50% of a physical core?

Can you offer any guidance regarding real-world scheduler overhead (when cap>0 is used) and precision (how variable is actual compute power for a vCPU with a cap of 100%, for example)?

- Mike

Michael Palmeter | Sr. Director of Product Management, Oracle Exalogic Elastic Cloud
Oracle Exalogic Development
200 Oracle Parkway | Redwood Shores, California 94065
On 15/11/12 19:03, Michael Palmeter wrote:
>
> Thank you for your answer, George.
>
> The origin of my question is more of a business concern than a
> technical one. Many software products are licensed based on a cost
> per processor core. It is desirable to sometimes allow customers to
> pay a fraction of software license costs in exchange for running that
> software using only a commensurate fraction of available compute power
> (capacity sub-licensing). If the cap is a means of making a vCPU
> more-or-less deterministic (in terms of its effective computational
> capacity) then that would be useful as a programmatic means of
> enabling capacity sub-licensing. My example below was based on a case
> where I have a customer that would like to use ‘cap’ to constrain
> their single vCPU VM to only ½ of a core worth of compute capacity
> (logically 1/32 of the compute power) in exchange for only paying 1/32
> of the license cost for the physical server.
>

Right -- I've seen the "limit cpu power for licensing purposes" thing before, but I think that only went down to cores, not sub-core.

> Below you answered:
>
> “You can use ‘cap’ to make the VM in question get 50% of logical vcpu
> time, which on an idle system will give it 0.5 of the capacity of a
> physical core (if we don't consider Intel's Turbo Boost technology).
> But if the system becomes busy, it will get less than 0.5 of the
> processing capacity of a physical core.”
>
> Are you saying that cap would be able to CONSTRAIN a vCPU to an
> effective compute capacity equal to 50% of a physical core, but it
> does not GUARANTEE effective compute capacity equal to 50% of a
> physical core?
>

Theoretically, a cap at 50 will give your single-vcpu VM 50% of the time of one hyperthread.

So if C is "typical throughput of a single non-hyperthreaded core running at standard frequency", and we factor out Turbo Boost, then there are two cases to consider:

* The other thread is idle. In that case, the VM will get 0.5C.
* The other thread is busy. In this case, assuming a 0.7 factor, the VM will get 0.5 * (0.7 * C), or about 0.35C.

So the total computing power available to the VM should be <= 0.5C (satisfying the licensing requirements), but on a busy system it may be significantly less than 0.5C (perhaps not so satisfying to the owner of the VM).

I don't think it should be terribly difficult to put a simple "shared hyperthread" multiplier on the credit burned -- if someone at Oracle wanted to help implement this, we'd be happy to point you in the right direction. :-)

If you have Turbo Boost, then (as I understand it) the CPU can raise the clock speed of the processor when threads or cores are idle; the Wikipedia article seems to think some processors can increase the clock speed up to 1.6x over the baseline frequency. That would throw a bit of a wrench in the works, as you might end up with 0.5 * 1.6 * C = 0.8C > 0.5C; however, looking at Intel's website, it looks like only 2- and 4-core processors have Turbo Boost, so maybe on 8-core processors we can punt on that thorny issue for a little while yet. :-)

> Can you offer any guidance regarding real-world scheduler overhead
> (when cap>0 is used) and precision (how variable is actual compute
> power for a vCPU with a cap of 100%, for example)?
>

I have not done extensive testing with the cap; I mainly know the mechanism by which it works. There is no extra accounting done in the scheduler for having a cap: all vcpus are assigned credit every 30ms according to their weight and cap. The difference is that if a non-capped vcpu uses up its credits, it is allowed to go negative; whereas a capped vcpu will be paused until it receives more credits. So there should be no extra hypervisor overhead from using a cap.

The cap fundamentally works by locking out a vcpu for very small amounts of time within the 30ms accounting window.
But this same effect might happen just by having other VMs competing for the cpu; so in theory shouldn''t be any riskier than virtualizing in the first place. Executive summary: Factoring out Turbo Boost, "cap" should be able to set a sub-core upper-bound on processing power. But on a busy system, it may result in the VM getting less than its upper-bound in processing power. However, scheduling is a very complex and dynamic system, and like economics, very simple changes can have unpredictable results. So it''s probably a good idea to do some testing before recommending it to customers. :-) BTW, are you familiar with Xen''s cpupool functionality? The guys at Fujitsu wrote it so that a customer could rent a fixed number of cores to a customer, who could then run as many VMs on those cores as they wanted. I think licensing restrictions had something to do with that as well. More about that here, if you''re interested: http://blog.xen.org/index.php/2012/04/23/xen-4-2-cpupools/ -George --------------090604080201040700060707 Content-Type: text/html; charset="windows-1252" Content-Transfer-Encoding: 8bit <html> <head> <meta content="text/html; charset=windows-1252" http-equiv="Content-Type"> </head> <body text="#000000" bgcolor="#FFFFFF"> <div class="moz-cite-prefix">On 15/11/12 19:03, Michael Palmeter wrote:<br> </div> <blockquote cite="mid:27449f60-0433-4e5f-b1fb-06914b84c6f1@default" type="cite"> <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"> <meta name="Generator" content="Microsoft Word 12 (filtered medium)"> <!--[if !mso]><style>v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} w\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} </style><![endif]--> <style><!-- /* Font Definitions */ @font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 4;} @font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4;} @font-face {font-family:Tahoma; panose-1:2 11 6 4 3 5 4 4 2 4;} @font-face 
{font-family:Verdana; panose-1:2 11 6 4 3 5 4 4 2 4;} /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {margin:0in; margin-bottom:.0001pt; font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black;} a:link, span.MsoHyperlink {mso-style-priority:99; color:blue; text-decoration:underline;} a:visited, span.MsoHyperlinkFollowed {mso-style-priority:99; color:purple; text-decoration:underline;} p.MsoAcetate, li.MsoAcetate, div.MsoAcetate {mso-style-priority:99; mso-style-link:"Balloon Text Char"; margin:0in; margin-bottom:.0001pt; font-size:8.0pt; font-family:"Tahoma","sans-serif"; color:black;} span.BalloonTextChar {mso-style-name:"Balloon Text Char"; mso-style-priority:99; mso-style-link:"Balloon Text"; font-family:"Tahoma","sans-serif";} span.EmailStyle19 {mso-style-type:personal; font-family:"Calibri","sans-serif"; color:windowtext;} span.EmailStyle20 {mso-style-type:personal-reply; font-family:"Calibri","sans-serif"; color:#1F497D;} .MsoChpDefault {mso-style-type:export-only; font-size:10.0pt;} @page WordSection1 {size:8.5in 11.0in; margin:1.0in 1.0in 1.0in 1.0in;} div.WordSection1 {page:WordSection1;} /* List Definitions */ @list l0 {mso-list-id:599684598; mso-list-type:hybrid; mso-list-template-ids:1243763380 67698689 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;} @list l0:level1 {mso-level-number-format:bullet; mso-level-text:\F0B7; mso-level-tab-stop:none; mso-level-number-position:left; text-indent:-.25in; font-family:Symbol;} @list l0:level2 {mso-level-tab-stop:1.0in; mso-level-number-position:left; text-indent:-.25in;} @list l0:level3 {mso-level-tab-stop:1.5in; mso-level-number-position:left; text-indent:-.25in;} @list l0:level4 {mso-level-tab-stop:2.0in; mso-level-number-position:left; text-indent:-.25in;} @list l0:level5 {mso-level-tab-stop:2.5in; mso-level-number-position:left; text-indent:-.25in;} @list l0:level6 {mso-level-tab-stop:3.0in; mso-level-number-position:left; text-indent:-.25in;} @list 
l0:level7 {mso-level-tab-stop:3.5in; mso-level-number-position:left; text-indent:-.25in;} @list l0:level8 {mso-level-tab-stop:4.0in; mso-level-number-position:left; text-indent:-.25in;} @list l0:level9 {mso-level-tab-stop:4.5in; mso-level-number-position:left; text-indent:-.25in;} @list l1 {mso-list-id:1244485665; mso-list-template-ids:2139918180;} @list l1:level1 {mso-level-number-format:bullet; mso-level-text:\F0B7; mso-level-tab-stop:.5in; mso-level-number-position:left; text-indent:-.25in; mso-ansi-font-size:10.0pt; font-family:Symbol;} ol {margin-bottom:0in;} ul {margin-bottom:0in;} --></style><!--[if gte mso 9]><xml> <o:shapedefaults v:ext="edit" spidmax="2050" /> </xml><![endif]--><!--[if gte mso 9]><xml> <o:shapelayout v:ext="edit"> <o:idmap v:ext="edit" data="1" /> </o:shapelayout></xml><![endif]--> <div class="WordSection1"> <p class="MsoNormal"><span style="color:#1F497D">Thank you for your answer, George.<o:p></o:p></span></p> <p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p> <p class="MsoNormal"><span style="color:#1F497D">The origin of my question is more of a business concern than a technical one. Many software products are licensed based on a cost per processor core. It is desirable to sometimes allow customers to pay a fraction of software license costs in exchange for running that software using only a commensurate fraction of available compute power (capacity sub-licensing). If the cap is a means of making a vCPU more-or-less deterministic (in terms of its effective computational capacity) then that would be useful as a programmatic means of enabling capacity sub-licensing. 
My example below was based on a case where I have a customer that would like to use ‘cap’ to constrain their single vCPU VM to only ½ of a core worth of compute capacity (logically 1/32 of the compute power) in exchange for only paying 1/32 of the license cost for the physical server.</span></p> </div> </blockquote> <br> Right -- I''ve seen the "limit cpu power for licensing purposes" thing before, but I think that only went down to cores, not sub-core.<br> <br> <blockquote cite="mid:27449f60-0433-4e5f-b1fb-06914b84c6f1@default" type="cite"> <div class="WordSection1"> <p class="MsoNormal"><span style="color:#1F497D"><o:p></o:p></span></p> <p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p> <p class="MsoNormal"><span style="color:#1F497D">Below you answered:<o:p></o:p></span></p> <p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p> <p class="MsoNormal"><span style="color:#1F497D">“You can use ‘cap’ to make the VM in question get 50% of logical vcpu time, which on an idle system will give it 0.5 of the capacity of a physical core (if we don''t consider Intel''s Turbo Boost technology). But if the system becomes busy, it will get less than 0.5 of the processing capacity of a physical core.”<o:p></o:p></span></p> <p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p> <p class="MsoNormal"><span style="color:#1F497D">Are you saying that cap would be able to CONSTRAIN a vCPU to an effective compute capacity equal to 50% of a physical core, but it does not GUARANTEE effective compute capacity equal to 50% of a physical core? </span></p> </div> </blockquote> <br> Theoretically, a cap at 50 will give your single-vcpu VM 50% of the time of one hyperthread.<br> <br> So if C is "typicall throughput of a single non-hyperthreaded core running at standard requency", and we factor out Turbo Boost, then there are two cases to consider:<br> <br> * Other thread is idle. 
In that case, the VM will get 0.5C.<br> * The other thread is busy. In this case, assuming a 0.7 factor, the VM will get 0.5 * (0.7 * C), or about 0.35C.<br> <br> So the total computing power available to the VM should be &lt;= 0.5C (satisfying the licensing requirements), but on a busy system it may be significantly less than 0.5C (perhaps not so satisfying to the owner of the VM).<br> <br> I don't think it should be terribly difficult to put a simple "shared hyperthread" multiplier on the credit burned -- if someone at Oracle wanted to help implement this, we'd be happy to point you in the right direction. :-)<br> <br> If you have Turbo Boost, then (as I understand it) the CPU can raise the clock speed of the processor when threads or cores are idle; the Wikipedia article seems to think some processors can increase the clock speed up to 1.6x over the baseline frequency. That would throw a bit of a wrench in the works, as you might end up with 0.5 * 1.6 * C = 0.8C &gt; 0.5C; however, looking at Intel's website, it looks like only 2- and 4-core processors have Turbo Boost, so maybe on 8-core processors we can punt on that thorny issue for a little while yet. :-)<br> <br> <blockquote cite="mid:27449f60-0433-4e5f-b1fb-06914b84c6f1@default" type="cite"> <div class="WordSection1"> <p class="MsoNormal"><span style="color:#1F497D">Can you offer any guidance regarding real-world scheduler overhead (when cap&gt;0 is used) and precision (how variable is actual compute power for a vCPU with a cap of 100%, for example)?</span></p> </div> </blockquote> <br> I have not done extensive testing with the cap; I mainly know the mechanism by which it works. There is no extra accounting done in the scheduler for having a cap: all vcpus are assigned credit every 30ms according to their weight and cap. 
The difference is that if a non-capped vcpu uses up its credits, it is allowed to go negative, whereas a capped vcpu will be paused until it receives more credits. So there should be no extra hypervisor overhead from using a cap.<br> <br> The cap fundamentally works by locking out a vcpu for very small amounts of time within the 30ms accounting window. But this same effect might happen just by having other VMs competing for the cpu, so in theory it shouldn't be any riskier than virtualizing in the first place.<br> <br> Executive summary: Factoring out Turbo Boost, "cap" should be able to set a sub-core upper bound on processing power. But on a busy system, it may result in the VM getting less than its upper bound in processing power.<br> <br> However, scheduling is a very complex and dynamic system, and like economics, very simple changes can have unpredictable results. So it's probably a good idea to do some testing before recommending it to customers. :-)<br> <br> BTW, are you familiar with Xen's cpupool functionality? The guys at Fujitsu wrote it so that a provider could rent a fixed number of cores to a customer, who could then run as many VMs on those cores as they wanted. I think licensing restrictions had something to do with that as well. More about that here, if you're interested:<br> <a class="moz-txt-link-freetext" href="http://blog.xen.org/index.php/2012/04/23/xen-4-2-cpupools/">http://blog.xen.org/index.php/2012/04/23/xen-4-2-cpupools/</a><br> <br> -George<br> </body> </html>
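As a rough illustration of the mechanism George describes, here is a toy model in Python. It is not Xen's scheduler code: the 1 ms step size, the function name `simulate_capped_vcpu`, and the `CapResult` type are assumptions made for the sketch, and the 0.7 hyperthread factor is simply the figure assumed in the message above. The model refills credit once per 30ms window in proportion to the cap, pauses the vcpu once its credit runs out, and then converts the resulting share of logical CPU time into throughput in units of C.

```python
from dataclasses import dataclass

ACCOUNTING_PERIOD_MS = 30  # accounting window quoted in the thread


@dataclass
class CapResult:
    logical_share: float  # fraction of one logical CPU's time the vcpu ran
    throughput_c: float   # effective throughput, in units of C


def simulate_capped_vcpu(cap_percent: float, periods: int,
                         sibling_busy: bool,
                         ht_factor: float = 0.7) -> CapResult:
    """Toy model: one always-runnable capped vcpu, stepped in 1 ms slices."""
    credit_per_period = ACCOUNTING_PERIOD_MS * cap_percent / 100.0
    credits = 0.0
    ran_ms = 0
    for _ in range(periods):
        credits += credit_per_period      # credit assigned once per window
        for _ in range(ACCOUNTING_PERIOD_MS):
            if credits >= 1.0:
                credits -= 1.0            # burn 1 ms of credit per ms run
                ran_ms += 1
            # otherwise the capped vcpu stays paused until the next refill
    logical_share = ran_ms / (periods * ACCOUNTING_PERIOD_MS)
    # Assumed model: a busy sibling thread scales throughput by ht_factor.
    per_thread = ht_factor if sibling_busy else 1.0
    return CapResult(logical_share, logical_share * per_thread)


if __name__ == "__main__":
    for busy in (False, True):
        r = simulate_capped_vcpu(50, periods=100, sibling_busy=busy)
        print(f"sibling_busy={busy}: logical share={r.logical_share:.2f}, "
              f"throughput={r.throughput_c:.2f}C")
```

In this model a cap of 50 always yields half of one logical CPU's time. Only the conversion to throughput changes with the sibling thread's state, which is why the cap works as an upper bound but not as a guarantee.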
Thanks very much George. Very helpful indeed.

- Mike

Michael Palmeter | Sr. Director of Product Management, Oracle Exalogic Elastic Cloud
Oracle Exalogic Development
200 Oracle Parkway | Redwood Shores, California 94065

From: George Dunlap [mailto:george.dunlap@eu.citrix.com]
Sent: November 15, 2012 11:53 AM
To: Michael Palmeter
Cc: Ashok Aletty; Dario Faggioli; xen-devel@lists.xen.org
Subject: Re: [Xen-devel] Xen credit scheduler question
On Thu, 2012-11-15 at 19:52 +0000, George Dunlap wrote:
> BTW, are you familiar with Xen's cpupool functionality? The guys at
> Fujitsu wrote it so that a provider could rent a fixed number of cores
> to a customer, who could then run as many VMs on those cores as they
> wanted. I think licensing restrictions had something to do with that
> as well. More about that here, if you're interested:
> http://blog.xen.org/index.php/2012/04/23/xen-4-2-cpupools/

That is true, and I was just about to suggest considering cpupools for this discussion.

However, since it seems you're interested in the difference between 'core' and 'hyperthread', note that cpupools also see hyperthreads as cpus (as does almost every other piece of Xen, the only exception being that small bit of the load balancer George explained). So, if cpu0 and cpu1 are hyperthreads of the same core, and you put them in the same pool, you're back to square one and you've got to take the 0.7 factor into account.

It is probably possible to differentiate, during accounting, the time spent on a (busy?) hyperthread from the time spent on a "regular" core, but not without modifying the scheduler.

Otherwise, if HT is too disruptive, I've seen people turn it off (for a different scope and purpose, i.e., real-time, but still), provided the BIOS offers such an option.

Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
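To make the accounting idea concrete, here is a minimal Python sketch of what a "shared hyperthread" multiplier on credit burn might look like. This is purely hypothetical: neither credit1 nor credit2 does this according to the thread, the function names and the `ht_aware` switch are invented for illustration, and the 0.7 factor is the assumption used earlier in the discussion. The idea is that time spent on a hyperthread whose sibling is busy burns credit at only 0.7 of the normal rate, so a capped vcpu is allowed to run longer in order to reach the same throughput.

```python
def logical_time_share(cap_percent: float, sibling_busy: bool,
                       ht_aware: bool, ht_factor: float = 0.7,
                       period_ms: float = 30.0) -> float:
    """Fraction of one logical CPU's time a capped vcpu may use per window."""
    budget = period_ms * cap_percent / 100.0   # credit, in "full-core ms"
    # Hypothetical multiplier: a busy sibling makes each ms cost less credit.
    burn_rate = ht_factor if (ht_aware and sibling_busy) else 1.0
    run_ms = min(period_ms, budget / burn_rate)  # cannot exceed the window
    return run_ms / period_ms


def effective_throughput(share: float, sibling_busy: bool,
                         ht_factor: float = 0.7) -> float:
    """Throughput in units of C for a given share of logical CPU time."""
    return share * (ht_factor if sibling_busy else 1.0)


if __name__ == "__main__":
    for aware in (False, True):
        share = logical_time_share(50, sibling_busy=True, ht_aware=aware)
        thr = effective_throughput(share, sibling_busy=True)
        print(f"ht_aware={aware}: logical share={share:.3f}, "
              f"throughput={thr:.3f}C")
```

Under these assumptions, a cap of 50 with a busy sibling gives 0.35C without the multiplier and 0.5C with it, at the cost of the vcpu occupying roughly 71% of the logical CPU's time. A cap of 100 still tops out at 0.7C, because the vcpu cannot run for more than the whole window.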