I've been looking into a report that vcpu pinning doesn't get preserved across a save/restore (or migrate) and, debugging this, it's starting to look like the "cpus" config file parameter doesn't work very well -- on 3.1 or unstable! If anything other than a single integer is specified, the code reverts to "any cpu". I think I found this specific problem, but there seems to be some bad bit rot hiding behind it. So before I go any further, I thought I'd ask a few questions:

1) Is the "cpus" parameter expected to work in a config file, or is it somehow deprecated? (I see there is an "xm vcpu-pin" command, so perhaps this is the accepted way to pin CPUs?)

2) Pinning via the "cpus" parameter calls vcpu_set_affinity(), but I've always thought the term "affinity" expresses a preference, not a restriction. If the call to set affinity did get made properly, would the scheduler really restrict the vcpu to certain pcpus? And what happens if the vcpu is ready to schedule but none of the restricted set of pcpus is available?

3) Does "cpus" really have any real-world usage anyhow? E.g. are most uses probably just user misunderstanding where "vcpu_avail" should be used instead?

Thanks,
Dan
> 1) Is the "cpus" parameter expected to work in a config file, or is it
> somehow deprecated? (I see there is an "xm vcpu-pin" command, so
> perhaps this is the accepted way to pin CPUs?)

It's expected to work.

> 2) Pinning via the "cpus" parameter calls vcpu_set_affinity(), but I've
> always thought the term "affinity" expresses a preference, not a
> restriction. If the call to set affinity did get made properly, would
> the scheduler really restrict the vcpu to certain pcpus? And what
> happens if the vcpu is ready to schedule but none of the restricted
> set of pcpus is available?

It's a restriction. Each of the values in the mask is processed modulo the number of physical CPUs.

> 3) Does "cpus" really have any real-world usage anyhow? E.g. are most
> uses probably just user misunderstanding where "vcpu_avail" should be
> used instead?

I'm sure some admins use it to good effect in hand-placing domains on CPUs, especially in a NUMA context. In most cases it's typically best to be fully work-conserving and give Xen's scheduler full flexibility.

There was an extension to the cpus= syntax proposed at one point that I'm not sure ever got checked in. The idea was to allow the cpus= parameter to be a list of strings, enabling a different mask to be specified for each VCPU. This would enable an admin to pin individual VCPUs to CPUs rather than just at a domain level.

I'm not a huge fan of the cpus= mechanism. It would likely be more user-friendly to allow physical CPUs to be put into groups, then allow domains to be assigned to CPU groups. It would also be better if you could specify physical CPUs by a node.socket.core.thread hierarchy rather than by the enumerated CPU number.

Ian
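[A minimal Python sketch of the modulo reduction Ian describes -- illustrative pseudocode, not the actual xend source; the helper name is invented:]

    # Hypothetical helper, not real xend code: fold each requested CPU
    # number into the range of physical CPUs, as described above.
    def cpus_to_mask(cpus, nr_pcpus):
        return set(c % nr_pcpus for c in cpus)

    # On a 2-pcpu host, cpus="0,3" collapses to {0, 1}, i.e. every pcpu.
    print(cpus_to_mask([0, 3], 2))   # -> {0, 1}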
Thanks for the reply, and sorry for the delay in mine... I've been having email problems. Please note the proposal and request for comments below (marked with >>>>> Comments? <<<<<).

> > 1) Is the "cpus" parameter expected to work in a config file, or is
> > it somehow deprecated? (I see there is an "xm vcpu-pin" command, so
> > perhaps this is the accepted way to pin CPUs?)
>
> It's expected to work.

Yes, indeed it does work. There were some syntax variations in the cpus parameter that I didn't quite understand. However, my misunderstanding uncovered another interesting problem. See below.

> > 3) Does "cpus" really have any real-world usage anyhow? E.g. are
> > most uses probably just user misunderstanding where "vcpu_avail"
> > should be used instead?
>
> I'm sure some admins use it to good effect in hand-placing domains on
> CPUs, especially in a NUMA context. In most cases it's typically best
> to be fully work-conserving and give Xen's scheduler full flexibility.

Yeah, I guess if you think of it as "poor man's hard partitioning" it makes a lot of sense. But if you think of it in a utility data center context, true affinity rather than restriction may make more sense. And vcpu_avail should cover most app licensing/pricing concerns.

> > what happens if the vcpu is ready to schedule but none of the
> > restricted set of pcpus is available?
>
> It's a restriction. Each of the values in the mask is processed
> modulo the number of physical CPUs.

The output from "xm vcpu-list" observes the "modulo", but apparently the scheduler does not. For example, on a 2-pcpu system, launching a 2-vcpu guest with cpus="0,3" (noting that 3 mod 2 = 1), "xm vcpu-list" shows that each of the guest's 2 vcpus has "any cpu" in the "CPU Affinity" column, reflecting the fact that 0,3 is modulo the same as 0,1, which is the same as 0-1, which is the same as all. However, the cpu_mask is saved as 0,3 and the scheduler ignores any pcpus other than 0 and 1. This can be observed in "xm vcpu-list" in the above example by seeing that both guest vcpus are sharing processor 0.

So the results displayed by "xm vcpu-list" and the actual scheduler placement are different -- but which one is the bug? Consider: if a 2-vcpu guest running on an 8-pcpu machine has been restricted to cpus="2,3,4,5" and gets migrated to a 4-pcpu system, to which pcpus should the migrated guest be restricted? Using the xm vcpu-list logic it gets all 4 pcpus, but (if cpu_mask were preserved, which it currently isn't) the scheduler logic would give it just two (2 and 3). And suppose this 2-vcpu guest on the 8-pcpu system were restricted to "5-8" and migrated to a 4-pcpu system: it wouldn't get any processor time at all (though xm vcpu-list would say each vcpu's CPU Affinity is "any").

Because affinity/cpu restriction is not currently preserved across save/restore or migration, this is a moot discussion. But if I were to "fix" it so it is preserved, the decision is important.

My opinion: CPU affinity/restriction should NOT be preserved across migration. Or, if it is, it should only be preserved when the source and target have the same number of pcpus (thus allowing save/restore to work OK). Or maybe it should only be preserved for save/restore and not for migration.

>>>>> Comments? <<<<<

Note that vcpu_avail would still work across migration. (Hmmm... I'll have to look to see whether vcpu_avail is currently preserved across save/restore/migration. If not, I will definitely need to find and fix that one.)

> There was an extension to the cpus= syntax proposed at one point that
> I'm not sure ever got checked in. The idea was to allow the cpus=
> parameter to be a list of strings, enabling a different mask to be
> specified for each VCPU. This would enable an admin to pin individual
> VCPUs to CPUs rather than just at a domain level.

It looks like the internal vcpu data structure supports this, and xm vcpu-pin supports it, but afaict there's no way to specify per-vcpu affinity at xm create.

> I'm not a huge fan of the cpus= mechanism. It would likely be more
> user-friendly to allow physical CPUs to be put into groups, then allow
> domains to be assigned to CPU groups. It would also be better if you
> could specify physical CPUs by a node.socket.core.thread hierarchy
> rather than by the enumerated CPU number.

Agreed, though I'll bet that would take major scheduler surgery. And this would also further increase the confusion for migration! I'd also like to see affinity and restriction teased apart, because they are separate concepts with different uses.

Dan
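[A sketch of the discrepancy Dan describes, contrasting the display path, which applies the modulo, with the raw mask the scheduler actually honors -- illustrative Python, not the real xm code; both function names are invented:]

    # Display path: xm vcpu-list reduces the mask modulo nr_pcpus and
    # prints "any cpu" when the reduced set covers every pcpu.
    def display_affinity(cpus, nr_pcpus):
        shown = set(c % nr_pcpus for c in cpus)
        return "any cpu" if shown == set(range(nr_pcpus)) else sorted(shown)

    # Scheduler path: bits for pcpus that do not exist are simply ignored.
    def schedulable_pcpus(cpus, nr_pcpus):
        return sorted(c for c in cpus if c < nr_pcpus)

    # cpus="0,3" on a 2-pcpu host:
    print(display_affinity([0, 3], 2))        # -> "any cpu"
    print(schedulable_pcpus([0, 3], 2))       # -> [0]: both vcpus share pcpu 0

    # cpus="5-8" after migration to a 4-pcpu host:
    print(display_affinity([5, 6, 7, 8], 4))  # -> "any cpu"
    print(schedulable_pcpus([5, 6, 7, 8], 4)) # -> []: no pcpu time at all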
On 9/1/08 18:40, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> My opinion: CPU affinity/restriction should NOT be preserved
> across migration. Or, if it is, it should only be preserved
> when the source and target have the same number of pcpus
> (thus allowing save/restore to work OK). Or maybe it should
> only be preserved for save/restore and not for migration.
>
> >>>>> Comments? <<<<<

I agree with that. Unless save/restore is on the same machine (identified in some way), or at least one with identical CPU topology as far as we can see. Otherwise some higher-level entity needs to be smart enough to work out affinity during restore and issue the correct 'xm' commands (or equivalent).

-- Keir
As a logical consequence:

- the v->cpu_affinity mask should never have bits set for processors that don't exist on the current physical system (although all bits set == "any" is probably an OK exception),

- the modulo behavior currently implemented in "xm vcpu-pin" and the config file "cpus" parameter should be removed, and

- if cpu values beyond the number of physical cpus are specified by "xm vcpu-pin" or "cpus", the xm command should fail.

Agreed?
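[A minimal sketch of the strict checking Dan proposes -- a proposal under discussion, not existing xm behavior; the function is hypothetical:]

    # Hypothetical strict validation: fail instead of applying the modulo.
    def validate_cpus(cpus, nr_pcpus):
        bad = [c for c in cpus if c >= nr_pcpus]
        if bad:
            raise ValueError("cpus refers to non-existent pcpu(s): %s" % bad)
        return set(cpus)

    print(validate_cpus([0, 1], 2))   # -> {0, 1}
    # validate_cpus([0, 3], 2)        # would raise instead of mapping 3 -> 1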
On 10/1/08 18:38, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> As a logical consequence:
>
> - the v->cpu_affinity mask should never have bits set for processors
>   that don't exist on the current physical system (although all bits
>   set == "any" is probably an OK exception),

This is already the case.

> - the modulo behavior currently implemented in "xm vcpu-pin" and the
>   config file "cpus" parameter should be removed, and

Possibly.

> - if cpu values beyond the number of physical cpus are specified by
>   "xm vcpu-pin" or "cpus", the xm command should fail.

Again, possibly. I don't see much wrong with a liberal interpretation of otherwise incorrect cpu config parameters, though. If we tighten things up then we need to make it easier to access CPU topology info from within domain config files.

-- Keir
I have blinders on, since this discussion started with my trying to figure out the syntax and semantics for the "cpus" parameter as used in a config file, but:

> > - the v->cpu_affinity mask should never have bits set for
>
> This is already the case.

No, with the cpus parameter it is currently possible to set bits in the v->cpu_affinity mask for processors that don't exist. Perhaps this is the real bug, then. I will spin a patch to implement the modulo behavior from "xm vcpu-pin" for the parsing of the cpus parameter, and all will be well.

Thanks,
Dan
On 10/1/08 21:10, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> > > - the v->cpu_affinity mask should never have bits set for
> >
> > This is already the case.
>
> No, with the cpus parameter it is currently possible to set bits in
> the v->cpu_affinity mask for processors that don't exist.

Ah yes. But then the offline CPUs get masked out in vcpu_set_affinity(), and the affinity mask is then rejected if the remaining CPU set is empty.

-- Keir
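[A Python rendering of the vcpu_set_affinity() behavior Keir describes; the real code is C in the hypervisor, so this is just the logic:]

    # Offline CPUs are masked out; the request fails only if the
    # intersection ends up empty.
    def vcpu_set_affinity(requested, online):
        effective = requested & online
        if not effective:
            raise ValueError("affinity mask contains no online CPUs")
        return effective

    online = {0, 1}                            # a 2-pcpu host
    print(vcpu_set_affinity({0, 3}, online))   # -> {0}: bit 3 silently dropped
    # vcpu_set_affinity({2, 3}, online)        # raises: empty after masking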
> > No, with the cpus parameter it is currently possible to set bits in
> > the v->cpu_affinity mask for processors that don't exist.
>
> Ah yes. But then the offline CPUs get masked out in
> vcpu_set_affinity(), and the affinity mask is then rejected if the
> remaining CPU set is empty.

I see you are correct that the v->cpu_affinity bits never do get set. But the mask is not rejected -- instead, some bits are silently ignored -- if there are both online and offline cpus in the list. So on a 2p machine, cpus="0,3" will currently set only one bit (bit 0), but xm vcpu-pin domid all "0,3" will set two bits. Whereas cpus="2-3" will cause an error on a 2p, but xm vcpu-pin domid all "2-3" will not.

This would become relevant if the "cpus" parameter were preserved across a migration (rather than v->cpu_affinity), which is what led to my original confusion.

So modulo-izing the cpus parameter code will eliminate this case, but I still wonder if vcpu_set_affinity should reject any mask that has bits set beyond max_pcpu instead of silently ignoring those bits. It seems like an accident waiting to happen, and indeed I got bitten by it. Which is why I proposed tightening the definition of all affinity masks (and strings representing masks) to: "if you try to enable a bit in the cpumask that refers to a non-existent processor, you will get an error".

Thanks,
Dan
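[The two toolstack paths Dan contrasts, side by side -- an illustrative sketch, not the actual xend source; both functions are invented names:]

    # Config-file path: out-of-range bits are dropped, and an error is
    # raised only if nothing survives.
    def config_cpus(cpus, nr_pcpus):
        kept = set(c for c in cpus if c < nr_pcpus)
        if not kept:
            raise ValueError("no usable pcpus in cpus= list")
        return kept

    # xm vcpu-pin path: values are folded modulo nr_pcpus, so nothing
    # is ever rejected.
    def vcpu_pin_cpus(cpus, nr_pcpus):
        return set(c % nr_pcpus for c in cpus)

    print(config_cpus([0, 3], 2))     # -> {0}: one bit set
    print(vcpu_pin_cpus([0, 3], 2))   # -> {0, 1}: two bits set
    # config_cpus([2, 3], 2)          # raises, matching the cpus="2-3" error
    print(vcpu_pin_cpus([2, 3], 2))   # -> {0, 1}: accepted without complaint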
On 10/1/08 22:40, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> So modulo-izing the cpus parameter code will eliminate this case, but
> I still wonder if vcpu_set_affinity should reject any mask that has
> bits set beyond max_pcpu instead of silently ignoring those bits.
>
> Which is why I proposed tightening the definition of all affinity
> masks (and strings representing masks) to: "if you try to enable a
> bit in the cpumask that refers to a non-existent processor, you will
> get an error".

That doesn't play nicely with CPU hotplug (not supported yet, but it could well be in future), where the online_map could be continually changing. The model I'm aiming for in Xen is to remember all the CPUs requested by the toolstack, but only schedule onto the subset that are actually online right now (obviously). The implementation of this is of course quite simple given that CPU hotplug is not supported right now.

-- Keir
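[A sketch of the model Keir is aiming for -- a stated design goal, not current code; the class is invented for illustration:]

    # The full requested mask is remembered verbatim; only its
    # intersection with the live online map is used for scheduling.
    class VcpuAffinity:
        def __init__(self, requested):
            self.requested = set(requested)    # survives hotplug unchanged

        def runnable_on(self, online_map):
            return self.requested & set(online_map)

    aff = VcpuAffinity([2, 3])
    print(aff.runnable_on({0, 1}))         # -> set(): pcpus 2,3 not online yet
    print(aff.runnable_on({0, 1, 2, 3}))   # -> {2, 3} once they are hot-added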
> > So modulo-izing the cpus parameter code will eliminate this case,
> > but I still wonder if vcpu_set_affinity should reject any mask that
> > has bits set beyond max_pcpu instead of silently ignoring those bits.
>
> That doesn't play nicely with CPU hotplug (not supported yet, but it
> could well be in future), where the online_map could be continually
> changing. The model I'm aiming for in Xen is to remember all the CPUs
> requested by the toolstack, but only schedule onto the subset that are
> actually online right now (obviously).

Agreed, but even with CPU hotplug there will be some max_pcpu value on any given machine. That's why I said "non-existent processor" in the proposal even though you said "offline processor".

Dan
On 10/1/08 22:53, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> Agreed, but even with CPU hotplug there will be some max_pcpu value
> on any given machine. That's why I said "non-existent processor"
> in the proposal even though you said "offline processor".

You mean CPUs beyond NR_CPUS? All the cpumask iterators are careful not to return values beyond NR_CPUS, regardless of what stray bits lie beyond that range in the longword bitmap.

-- Keir
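[A Python stand-in for the bounded cpumask iteration Keir describes; the real helpers are C macros in Xen, and the NR_CPUS value here is arbitrary:]

    NR_CPUS = 32

    # Iteration never yields a bit >= NR_CPUS, whatever stray bits sit
    # above that range in the underlying bitmap.
    def for_each_cpu(bitmap):
        return [i for i in range(NR_CPUS) if bitmap & (1 << i)]

    mask = (1 << 40) | (1 << 1)    # stray bit 40 lies beyond NR_CPUS
    print(for_each_cpu(mask))      # -> [1]: bit 40 is never returned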
> You mean CPUs beyond NR_CPUS? All the cpumask iterators are careful
> not to return values beyond NR_CPUS, regardless of what stray bits
> lie beyond that range in the longword bitmap.

I see... you are allowing for any future box to grow to NR_CPUS, and I am assuming that, even with future hot-add processors, Xen will be told by the box the maximum number of processors that will ever be online (call this max_pcpu), and that max_pcpu is probably less than NR_CPUS. So for these NR_CPUS-max_pcpu processors that are "non-existent" (and especially for the foreseeable future on the vast majority of machines, for which max_pcpu = npcpu = constant and npcpu << NR_CPUS), trying to set bits for non-existent processors should not be silently ignored and discarded, but should either be entirely disallowed or, at least, retained and ignored. I would propose "disallowed" for n > max_pcpu, and "retained and ignored" for online_pcpu < n < max_pcpu.

A related aside: for either model of hot-add (yours or mine), the current modulo mechanism in "xm vcpu-pin" is not scalable and imho should be removed now as well, before anybody comes to depend on it.

And lastly, this hot-add discussion reinforces in my mind the difference between affinity, restriction, and pinning, which are all muddled in the current hypervisor and tools.

Dan
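[A sketch of the split policy Dan proposes -- purely hypothetical; max_pcpu here stands for a platform-reported ceiling, not an existing Xen symbol:]

    # Disallow bits that can never correspond to a real pcpu; retain but
    # ignore bits for pcpus that exist but are currently offline.
    def check_affinity(requested, online, max_pcpu):
        if any(c >= max_pcpu for c in requested):
            raise ValueError("cpu beyond max_pcpu can never come online")
        retained = set(requested)          # kept for future hot-adds
        effective = retained & online      # what the scheduler uses now
        return retained, effective

    # 4 pcpus possible, 2 currently online:
    print(check_affinity({1, 3}, {0, 1}, 4))   # -> ({1, 3}, {1})
    # check_affinity({5}, {0, 1}, 4)           # raises: 5 exceeds max_pcpu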
The current hypervisor interface has the advantage of flexibility. You can easily enforce various policies (including strict checking, or modulo arithmetic) in the toolstack on top of the current interface. But you can't (easily) implement the current hypervisor policy in the toolstack on top of strict checking or modulo arithmetic (if one of those policies becomes hardcoded into the hypervisor).

The current interface assumes the lowest levels of the toolstack know what they are doing, and presents a policy that is as permissive as possible.

-- Keir
Sorry to belabo(u)r the point, but I beg to differ: the current hypervisor interface is a strange mixture of flexibility and restriction (and policy and mechanism). Some mask parameters are left alone by vcpu_set_affinity, others are rejected entirely, and still others are silently modified. The advantage of the existing interface is of course that it preserves the downward interface to the schedulers, e.g. schedulers can assume that any bit set represents a schedulable processor. But if the toolstack knows what it is doing, why does vcpu_set_affinity even look at the mask?

IMHO, either:

1) the policy belongs in the tools, in which case the and'ing of the mask should only be done by the scheduler whenever a vcpu is scheduled (thus allowing maximal flexibility for future highly dynamic hot-plug, but ensuring a vcpu never gets scheduled on an offline or non-existent pcpu), or

2) the policy belongs in the hypervisor, in which case any attempt by the tools to allow scheduling (e.g. set affinity) on an offline or non-existent processor should be rejected (in which case the toolstack is immediately notified that its understanding of the current online set is faulty).

Though it could be argued academically that "policy" doesn't belong in the hypervisor, rejecting an attempt by the tools to use a non-available processor isn't much different from rejecting an SSE3 instruction on a non-SSE3 processor. (In other words, it's really processor enforcement mechanism.) So I like #2; #1 would be OK too. I just don't like the current muddle, which has already led to misunderstandings and inconsistent implementations in the current toolchain.

Dan
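[A sketch of Dan's option #1 -- a design proposal, not existing code: the stored mask is left untouched and the scheduler does the ANDing at placement time:]

    # The toolstack-supplied mask is stored verbatim; the scheduler ANDs
    # it with the online map each time it places the vcpu.
    def pick_pcpu(stored_mask, online_map):
        candidates = stored_mask & online_map
        if not candidates:
            return None           # vcpu simply doesn't run this round
        return min(candidates)    # placeholder placement policy

    print(pick_pcpu({2, 3}, {0, 1, 2}))   # -> 2
    print(pick_pcpu({3}, {0, 1, 2}))      # -> None until pcpu 3 is online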
On 11/1/08 00:43, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> Though it could be argued academically that "policy" doesn't belong
> in the hypervisor, rejecting an attempt by the tools to use a
> non-available processor isn't much different from rejecting an SSE3
> instruction on a non-SSE3 processor. (In other words, it's really
> processor enforcement mechanism.) So I like #2; #1 would be OK too.
> I just don't like the current muddle, which has already led to
> misunderstandings and inconsistent implementations in the current
> toolchain.

Yes, probably we should not return an error if ANDing with online_map returns an empty set, and instead we should do some fallback (like ignoring affinity altogether). This is what we would have to do in a CPU hot-unplug case, where the unplugged cpu was the only cpu in some vcpu's affinity map. Either that or fail the CPU hot-unplug, I suppose.

-- Keir
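[A sketch of the fallback Keir suggests -- an idea under discussion at this point, not committed code:]

    # If hot-unplug leaves a vcpu's mask with no online CPU, fall back
    # to "any cpu" rather than leaving the vcpu unschedulable.
    def effective_affinity(requested, online):
        effective = requested & online
        return effective if effective else set(online)

    print(effective_affinity({2, 3}, {0, 1}))   # -> {0, 1}: affinity ignored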