All,

Jinsong's patches have been in for nearly a month now, but are not in a
shape that would make releasing them in 4.4 or backporting them to the
older trees desirable, so we need to come to a conclusion on which way
to go. Currently it looks like we have three options, but of course I'll
be happy to see other (better!) ones proposed.

1) Stay with what we have.

2) Revert 86d60e85 ("VMX: flush cache when vmentry back to UC guest")
in its entirety plus, perhaps, the change 62652c00 ("VMX: fix cr0.cd
handling") made to vmx_ctxt_switch_to().

3) Apply the attached patch that Andrew and I have been putting
together, with the caveat that it's still incomplete (see below).

The latter two are based on the observation that the amount of cache
flushing we do with what is in the master tree right now is more than
what we did prior to that patch series, but still insufficient. Hence
the revert would get us back to the earlier state (and obviously
eliminate the performance problems that were observed when doing too
eager flushing), whereas applying the extra 5th patch would get us
closer to a proper solution.

The problem with that new patch is that it's - as said - still
incomplete, and completing it by setting
->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory for all writes through
map_domain_page_global() mappings would get rather ugly:
- guest_walk_tables() (accumulating set_ad_bits() results)
- all writes through vcpu_info()
- event_fifo.c would need this scattered around
- hvm_load_segment_selector()'s setting of the accessed bit
- hvm_task_switch()'s handling of the busy bit
- all the nested HVM code's writes through ->nv_vvmcx

Plus the performance effects of it aren't really known yet (all we do
know is that the too eager "flush always prior to VM entry" is causing
unacceptable stalls). Otoh we have had no reports that the lack of
proper cache flushing actually caused any problems, and hence reverting
the partial flushing that was introduced isn't likely to push new
problems onto people. Yet if such a problem were ever to surface, it
can be expected to be very hard to diagnose.

Please, everyone with an opinion here - voice it.

Thanks, Jan
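For concreteness, the pattern the 5th patch relies on looks roughly like
the minimal sketch below. This is not the attached patch itself: only
the hypervisor_access_uc_hvm_memory field name comes from the mail
above; the is_in_uc_mode test and both helper names are assumptions made
purely for illustration.

    /*
     * Sketch only: mark_hypervisor_uc_access() and
     * flush_if_hypervisor_wrote_uc() are made-up names; the
     * is_in_uc_mode test is an assumption.
     */

    /* Would need calling from every hypervisor-side write to guest
     * memory - the list above shows how many such places exist. */
    static inline void mark_hypervisor_uc_access(struct vcpu *v)
    {
        if ( unlikely(v->domain->arch.hvm_domain.is_in_uc_mode) )
            v->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 1;
    }

    /* On the VM entry path: flush only if such a write happened. */
    static void flush_if_hypervisor_wrote_uc(struct vcpu *v)
    {
        if ( unlikely(v->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory) )
        {
            v->arch.hvm_vcpu.hypervisor_access_uc_hvm_memory = 0;
            wbinvd();
        }
    }

The point of the pattern is that the flush becomes conditional on a
hypervisor-side write actually having happened, instead of happening
unconditionally on every VM entry back to a UC guest.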
On 12/02/2013 02:28 PM, Jan Beulich wrote:
> All,
>
> Jinsong's patches have been in for nearly a month now, but are not in
> a shape that would make releasing them in 4.4 or backporting them to
> the older trees desirable, so we need to come to a conclusion on
> which way to go. Currently it looks like we have three options, but
> of course I'll be happy to see other (better!) ones proposed.
>
> 1) Stay with what we have.
>
> 2) Revert 86d60e85 ("VMX: flush cache when vmentry back to UC
> guest") in its entirety plus, perhaps, the change 62652c00 ("VMX:
> fix cr0.cd handling") made to vmx_ctxt_switch_to().
>
> 3) Apply the attached patch that Andrew and I have been putting
> together, with the caveat that it's still incomplete (see below).
>
> [...]

What's missing is a description of the pros and cons of 1 and 2. Do you
have any links to threads describing the problem?

 -George
On 02/12/2013 19:43, George Dunlap wrote:
> On 12/02/2013 02:28 PM, Jan Beulich wrote:
>> [...]
>> 1) Stay with what we have.
>>
>> 2) Revert 86d60e85 ("VMX: flush cache when vmentry back to UC
>> guest") in its entirety plus, perhaps, the change 62652c00 ("VMX:
>> fix cr0.cd handling") made to vmx_ctxt_switch_to().
>>
>> 3) Apply the attached patch that Andrew and I have been putting
>> together, with the caveat that it's still incomplete (see below).
>> [...]
>
> What's missing is a description of the pros and cons of 1 and 2. Do
> you have any links to threads describing the problem?
>
>  -George

1) has the basic XSA-60 fixes and some wbinvd()s, which are a
significant performance issue and insufficient to completely fix the
problem at hand. As a result, 1) is the worst possible option to stay
with as far as Xen is concerned (irrespective of the upcoming 4.4
release).

2) will revert us back to the basic XSA-60 fixes with none of the
wbinvd()s, which fixes the security issue and is no worse than before
from a correctness-with-uncacheable-HVM-domains point of view.

3) as-is is still insufficient to fix the problem noted for 1), and
would currently result in a further performance regression.

FWIW, my vote is for option 2), which will ease the current performance
regression while allowing us time to come up with a proper solution to
the pre-existing problem of Xen and Qemu mappings of a UC domain's
memory.

~Andrew
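For readers without the XSA-60 series at hand, the "basic fixes" that
option 2 keeps amount to roughly the following on a guest write of
CR0.CD - a paraphrase for the sake of the discussion, not the literal
commit, and the helper names other than wbinvd() are made up:

    /* Paraphrase of the basic XSA-60 fix, not the literal commit. */
    static void handle_cd_set(struct vcpu *v)
    {
        /* Force the guest's effective memory type to UC via its PAT,
         * so the guest cannot use CR0.CD to bypass EPT memory types. */
        set_guest_pat_all_uc(v);    /* hypothetical helper */

        wbinvd();                   /* flush possibly dirty lines */
        flush_guest_tlb(v);         /* drop cached memory types */
    }

Option 2 only removes the additional wbinvd() on each VM entry back to
such a guest; the PAT-based containment sketched above stays in place.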
Andrew Cooper wrote:
> [...]
> FWIW, my vote is for option 2), which will ease the current
> performance regression while allowing us time to come up with a
> proper solution to the pre-existing problem of Xen and Qemu mappings
> of a UC domain's memory.
>
> ~Andrew

I also vote for option 2, but would only revert 86d60e85, keeping
62652c00 (the wbinvd at vmx_ctxt_switch_to), since that one is there to
avoid cache pollution when a vcpu migrates to another cpu.

Thanks, Jinsong
>>> On 03.12.13 at 03:21, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 1) has the basic XSA-60 fixes and some wbinvd()s, which are a
> significant performance issue and insufficient to completely fix the
> problem at hand. As a result, 1) is the worst possible option to stay
> with as far as Xen is concerned (irrespective of the upcoming 4.4
> release).

I should add that another aspect here is the current inconsistency of
the code: In a few places we give the appearance of taking care of the
required flushing, but we don't really do so everywhere, so whoever
reads the code in close detail will rightfully say "this can't be
correct". Whereas if we revert the wbinvd()s we're at least back to a
consistent (even if theoretically broken) state.

> 2) will revert us back to the basic XSA-60 fixes with none of the
> wbinvd()s, which fixes the security issue and is no worse than before
> from a correctness-with-uncacheable-HVM-domains point of view.

And keep in mind that at present we have no knowledge of guests
actually needing to run with caching disabled - the only known use case
is that of the guest setting its MTRRs (where the specification
requires caching to be disabled, but the OS [as well as correctness]
doesn't really require that).

> 3) as-is is still insufficient to fix the problem noted for 1), and
> would currently result in a further performance regression.

Hmm, I'm not sure about the performance regression: During the sole
known usage (see above) there shouldn't be any guest memory writes, and
hence the "needs-flushing" flag should rarely if ever get set.

> FWIW, my vote is for option 2), which will ease the current
> performance regression while allowing us time to come up with a
> proper solution to the pre-existing problem of Xen and Qemu mappings
> of a UC domain's memory.

Agreed - my preference too is 2 over 3 over 1.

Jan
>>> On 03.12.13 at 04:06, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
> I also vote for option 2, but would only revert 86d60e85, keeping
> 62652c00 (the wbinvd at vmx_ctxt_switch_to), since that one is there
> to avoid cache pollution when a vcpu migrates to another cpu.

Please explain this in more detail. Both Andrew and I are concerned
about this extra, but pretty pointless (without the same being done in
other cases), wbinvd(). In particular you'd have to explain what its
counterpart was in the code prior to your four patch XSA-60 series.

Jan
Jan Beulich wrote:
>>>> On 03.12.13 at 04:06, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
>> I also vote for option 2, but would only revert 86d60e85, keeping
>> 62652c00 (the wbinvd at vmx_ctxt_switch_to), since that one is there
>> to avoid cache pollution when a vcpu migrates to another cpu.
>
> Please explain this in more detail. Both Andrew and I are concerned
> about this extra, but pretty pointless (without the same being done
> in other cases), wbinvd(). In particular you'd have to explain what
> its counterpart was in the code prior to your four patch XSA-60
> series.
>
> Jan

The wbinvd at vmx_ctxt_switch_to is for a case like this:
1. the vcpu runs at cpu A, flushing the cache at vmx_handle_cd;
2. the vcpu may then be switched out and migrate to cpu B;
3. cpu B may have had its cache lines polluted earlier;
so when the vcpu is scheduled onto cpu B, we need to flush the cache
there too.

Thanks, Jinsong
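In code terms, the 62652c00 hunk under discussion amounts to something
like the sketch below - a paraphrase of the idea, not the literal diff;
the predicate name is hypothetical:

    /* Paraphrase of the idea behind the 62652c00 hunk. */
    static void ctxt_switch_to_uc_check(struct vcpu *v)
    {
        /* If this vCPU runs with caching disabled, the pCPU it is
         * being switched onto may carry lines left by whatever ran
         * here before, so flush before the vCPU touches memory. */
        if ( unlikely(vcpu_runs_uncached(v)) )  /* hypothetical */
            wbinvd();
    }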
>>> On 03.12.13 at 15:30, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
> Jan Beulich wrote:
>> [...]
>
> The wbinvd at vmx_ctxt_switch_to is for a case like this:
> 1. the vcpu runs at cpu A, flushing the cache at vmx_handle_cd;
> 2. the vcpu may then be switched out and migrate to cpu B;
> 3. cpu B may have had its cache lines polluted earlier;
> so when the vcpu is scheduled onto cpu B, we need to flush the cache
> there too.

But you didn't clarify whether/how this case was taken care of _before_
your XSA-60 patches.

Jan
On 12/03/2013 03:09 PM, Jan Beulich wrote:
>>>> On 03.12.13 at 15:30, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
>> The wbinvd at vmx_ctxt_switch_to is for a case like this:
>> 1. the vcpu runs at cpu A, flushing the cache at vmx_handle_cd;
>> 2. the vcpu may then be switched out and migrate to cpu B;
>> 3. cpu B may have had its cache lines polluted earlier;
>> so when the vcpu is scheduled onto cpu B, we need to flush the cache
>> there too.
>
> But you didn't clarify whether/how this case was taken care of
> _before_ your XSA-60 patches.

Is this still about guests doing wbinvd? As Jan said, there is no point
in doing wbinvd piecemeal: if it's not 100% reliable (to the best of
our knowledge), then the more obviously unreliable the better, really.

 -George
Jan Beulich wrote:
> [...]
> But you didn't clarify whether/how this case was taken care of
> _before_ your XSA-60 patches.

I didn't understand your question. What do you mean by 'before my
XSA-60 patches'?

Thanks, Jinsong
>>> On 04.12.13 at 13:04, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
> I didn't understand your question. What do you mean by 'before my
> XSA-60 patches'?

Before your 4-patch series was applied (e.g. consider plain 4.3.1) -
how was the situation handled back then that your change to
vmx_ctxt_switch_to() is intended to deal with?

Jan
On 12/04/2013 12:16 PM, Jan Beulich wrote:
> [...]
> Before your 4-patch series was applied (e.g. consider plain 4.3.1) -
> how was the situation handled back then that your change to
> vmx_ctxt_switch_to() is intended to deal with?

It sounds like Jan is saying: we would only consider a patch that fixes
regressions in functionality caused by the 4-patch XSA-60 series. Was
there the possibility of cacheline pollution in the scenario you
describe above before XSA-60 was fixed? If not, then this is a
regression and we might consider a patch to restore that functionality.
If the above scenario was already possible before the XSA-60 series,
then it's not a regression, and therefore probably not something we
want to accept at this point.

Do I understand you properly, Jan?

 -George
Jan Beulich wrote:
> [...]
> Before your 4-patch series was applied (e.g. consider plain 4.3.1) -
> how was the situation handled back then that your change to
> vmx_ctxt_switch_to() is intended to deal with?

Before my 4 patches (in fact, before the 3rd patch, 62652c00), Xen
simply didn't take care of that case. Note that the wbinvd at
vmx_ctxt_switch_to covers a corner case (only when in UC mode).

Thanks, Jinsong
>>> On 04.12.13 at 16:55, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> [...]
> It sounds like Jan is saying: we would only consider a patch that
> fixes regressions in functionality caused by the 4-patch XSA-60
> series. Was there the possibility of cacheline pollution in the
> scenario you describe above before XSA-60 was fixed? If not, then
> this is a regression and we might consider a patch to restore that
> functionality. If the above scenario was already possible before the
> XSA-60 series, then it's not a regression, and therefore probably not
> something we want to accept at this point.
>
> Do I understand you properly, Jan?

Yes, thanks for wording it in yet another way.

Jan
>>> On 04.12.13 at 17:03, "Liu, Jinsong" <jinsong.liu@intel.com> wrote:
> Before my 4 patches (in fact, before the 3rd patch, 62652c00), Xen
> simply didn't take care of that case. Note that the wbinvd at
> vmx_ctxt_switch_to covers a corner case (only when in UC mode).

That's what my thinking was, and hence the intention to revert it as
being incomplete anyway. Once we find a need to do the flushing, we'd
need to bite the bullet and fix all of the possible cases, not just
some of them.

Are we on the same page now?

Jan
Jan Beulich wrote:
> [...]
> That's what my thinking was, and hence the intention to revert it as
> being incomplete anyway. Once we find a need to do the flushing, we'd
> need to bite the bullet and fix all of the possible cases, not just
> some of them.
>
> Are we on the same page now?

On that point, yes - I don't insist on keeping that wbinvd.

Thanks, Jinsong