Juergen Gross
2010-Aug-02 06:40 UTC
[Xen-devel] [Patch] use full-size cpumask for vcpu-pin
Hi, attached patch solves a problem with vcpu-pinning and hot-plug of cpus: If a vcpu is unpinned via xl vcpu-pin <domain> <vcpu> all on a system with 64 cpus and later other cpus are plugged in, this vcpu will be restricted to the first 64 cpus of the system. The reason is the allocation of the cpumap for pinning: the size is only for the ACTUAL number of physical cpus in the system, not the possible number. The solution is to allocate a cpumap for up to NR_CPUS. Repairing xm vcpu-pin is much harder and not covered by this patch, but the problem can be avoided by calling xm vcpu-pin <domain> <vcpu> 0-255 instead (all is hard-wired to 0-63 now). Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967 Fujitsu Technology Solutions e-mail: juergen.gross@ts.fujitsu.com Domagkstr. 28 Internet: ts.fujitsu.com D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2010-Aug-10 14:11 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
Juergen Gross writes ("[Xen-devel] [Patch] use full-size cpumask for vcpu-pin"):> The reason is the allocation of the cpumap for pinning: the size is only for > the ACTUAL number of physical cpus in the system, not the possible number. > The solution is to allocate a cpumap for up to NR_CPUS.Thanks for this patch. However, this doesn''t seem to be even slightly backwards-compatible, because it changes the layout of the physinfo sysctl response. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Aug-10 14:27 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
On 10/08/2010 15:11, "Ian Jackson" <Ian.Jackson@eu.citrix.com> wrote:> Juergen Gross writes ("[Xen-devel] [Patch] use full-size cpumask for > vcpu-pin"): >> The reason is the allocation of the cpumap for pinning: the size is only for >> the ACTUAL number of physical cpus in the system, not the possible number. >> The solution is to allocate a cpumap for up to NR_CPUS. > > Thanks for this patch. However, this doesn''t seem to be even slightly > backwards-compatible, because it changes the layout of the physinfo > sysctl response.The domctl and sysctl hypercalls do not need to be backward compatible across major Xen releases. All other hypercalls absolutely do. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Juergen Gross
2010-Aug-11 04:32 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
On 08/10/10 16:11, Ian Jackson wrote:> Juergen Gross writes ("[Xen-devel] [Patch] use full-size cpumask for vcpu-pin"): >> The reason is the allocation of the cpumap for pinning: the size is only for >> the ACTUAL number of physical cpus in the system, not the possible number. >> The solution is to allocate a cpumap for up to NR_CPUS. > > Thanks for this patch. However, this doesn''t seem to be even slightly > backwards-compatible, because it changes the layout of the physinfo > sysctl response.No, it doesn''t change the layout. The new structure member just fills a hole which was already there due to alignment of the next member. Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967 Fujitsu Technology Solutions e-mail: juergen.gross@ts.fujitsu.com Domagkstr. 28 Internet: ts.fujitsu.com D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2010-Aug-13 12:31 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
Juergen Gross writes ("[Xen-devel] [Patch] use full-size cpumask for vcpu-pin"):> attached patch solves a problem with vcpu-pinning and hot-plug of cpus:Thanks. This is a mixed tools/hypervisor patch. We''ve discussed it and it seems good to me. Keir, would you care to apply it, or would you like it to go through the tools tree ? Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Aug-13 12:58 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
On 13/08/2010 13:31, "Ian Jackson" <Ian.Jackson@eu.citrix.com> wrote:> Juergen Gross writes ("[Xen-devel] [Patch] use full-size cpumask for > vcpu-pin"): >> attached patch solves a problem with vcpu-pinning and hot-plug of cpus: > > Thanks. This is a mixed tools/hypervisor patch. We''ve discussed it > and it seems good to me. Keir, would you care to apply it, or would > you like it to go through the tools tree ? > > Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>Actually, now I look at it, I have to NACK the patch. Sorry I didn''t look closely enough earlier. I think the bug can be addressed without any hypervisor changes: when vcpu-pinning, the tools can quite correctly pass a cpumask to Xen just big enough to express just the CPUs in the new affinity set. If the resulting mask is too narrow to address all CPUs in the system, then Xen will pad it out with zeroes. If the resulting mask is too wide, Xen will simply truncate it. All this is done silently at the time of the setvcpuaffinity hypercall. Hence, Juergen''s hypervisor changes are really unnecessary, and a neater tools fix could probably be arrived at without the hypervisor changes. -- Keir> Ian. > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2010-Aug-13 13:11 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
Keir Fraser writes ("Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin"):> Actually, now I look at it, I have to NACK the patch. Sorry I didn''t look > closely enough earlier. I think the bug can be addressed without any > hypervisor changes: when vcpu-pinning, the tools can quite correctly pass a > cpumask to Xen just big enough to express just the CPUs in the new affinity > set. If the resulting mask is too narrow to address all CPUs in the system, > then Xen will pad it out with zeroes. If the resulting mask is too wide, Xen > will simply truncate it. All this is done silently at the time of the > setvcpuaffinity hypercall.The difficulty is, as I understand it, how to express the mask "all CPUs present and future". Since the number of pcpus may be very large. The tools presumably shouldn''t pass a kilobyte of 0xff :-). Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Aug-13 13:15 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
On 13/08/2010 14:11, "Ian Jackson" <Ian.Jackson@eu.citrix.com> wrote:> Keir Fraser writes ("Re: [Xen-devel] [Patch] use full-size cpumask for > vcpu-pin"): >> Actually, now I look at it, I have to NACK the patch. Sorry I didn''t look >> closely enough earlier. I think the bug can be addressed without any >> hypervisor changes: when vcpu-pinning, the tools can quite correctly pass a >> cpumask to Xen just big enough to express just the CPUs in the new affinity >> set. If the resulting mask is too narrow to address all CPUs in the system, >> then Xen will pad it out with zeroes. If the resulting mask is too wide, Xen >> will simply truncate it. All this is done silently at the time of the >> setvcpuaffinity hypercall. > > The difficulty is, as I understand it, how to express the mask "all > CPUs present and future". Since the number of pcpus may be very > large. The tools presumably shouldn''t pass a kilobyte of 0xff :-).Since Xen discards any CPUs which are not online at the time of the setvcpuaffinity hypercall, expressing "all CPUs present and future" is a bit pointless. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Aug-13 13:21 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
On 13/08/2010 14:15, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:> On 13/08/2010 14:11, "Ian Jackson" <Ian.Jackson@eu.citrix.com> wrote: > >> The difficulty is, as I understand it, how to express the mask "all >> CPUs present and future". Since the number of pcpus may be very >> large. The tools presumably shouldn''t pass a kilobyte of 0xff :-). > > Since Xen discards any CPUs which are not online at the time of the > setvcpuaffinity hypercall, expressing "all CPUs present and future" is a bit > pointless.Hm, no, I''m talking rubbish. Argh. Okay, his patch is fine then. :-D You can check it in to xen-unstable-tools.hg. -- Keir> -- Keir >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Aug-13 13:25 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
On 13/08/2010 14:21, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:>> Since Xen discards any CPUs which are not online at the time of the >> setvcpuaffinity hypercall, expressing "all CPUs present and future" is a bit >> pointless. > > Hm, no, I''m talking rubbish. Argh. Okay, his patch is fine then. :-D > > You can check it in to xen-unstable-tools.hg.One suggestion: that we rename the syctl.physinfo field ''max_phys_cpus'' to ''max_possible_cpus''. The ''phys'' is kind of redundant since this is the physinfo sysctl, and ''possible'' provides a better hint that this fields indicates maximum possible supported CPUs now and forever on this boot of the system. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2010-Aug-13 13:29 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
Keir Fraser writes ("Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin"):> On 13/08/2010 14:21, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote: > > Hm, no, I''m talking rubbish. Argh. Okay, his patch is fine then. :-D > > You can check it in to xen-unstable-tools.hg.:-).> One suggestion: that we rename the syctl.physinfo field ''max_phys_cpus'' to > ''max_possible_cpus''. The ''phys'' is kind of redundant since this is the > physinfo sysctl, and ''possible'' provides a better hint that this fields > indicates maximum possible supported CPUs now and forever on this boot of > the system.Right, willdo. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Aug-13 13:30 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
On 13/08/2010 14:25, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:> On 13/08/2010 14:21, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote: > >>> Since Xen discards any CPUs which are not online at the time of the >>> setvcpuaffinity hypercall, expressing "all CPUs present and future" is a bit >>> pointless. >> >> Hm, no, I''m talking rubbish. Argh. Okay, his patch is fine then. :-D >> >> You can check it in to xen-unstable-tools.hg. > > One suggestion: that we rename the syctl.physinfo field ''max_phys_cpus'' to > ''max_possible_cpus''. The ''phys'' is kind of redundant since this is the > physinfo sysctl, and ''possible'' provides a better hint that this fields > indicates maximum possible supported CPUs now and forever on this boot of > the system.Even better, let''s not introduce a new field at all, and let''s always set sysctl.phys_info.max_cpu_id to NR_CPUS-1. Then I don''t think any tools changes are needed. Yeah, that''s what we should do. I will make a patch for that. -- Keir> -- Keir >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Aug-13 14:09 UTC
Re: [Xen-devel] [Patch] use full-size cpumask for vcpu-pin
On 13/08/2010 14:30, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:>> One suggestion: that we rename the syctl.physinfo field ''max_phys_cpus'' to >> ''max_possible_cpus''. The ''phys'' is kind of redundant since this is the >> physinfo sysctl, and ''possible'' provides a better hint that this fields >> indicates maximum possible supported CPUs now and forever on this boot of >> the system. > > Even better, let''s not introduce a new field at all, and let''s always set > sysctl.phys_info.max_cpu_id to NR_CPUS-1. Then I don''t think any tools > changes are needed. > > Yeah, that''s what we should do. I will make a patch for that.Done as xen-unstable:f6e1a597a92f. At this late stage it''s probably a bit too subtle for backport to 4.0.1 unfortunately (I hate to wonder if there could be subtle side effects on xend''s usage of these max_*_id fields). -- Keir> -- Keir > >> -- Keir >> >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel