Hi,

The following two patches add basic MTRR/PAT support to the hypervisor. When VT-d is enabled, direct I/O and RAM can be mapped with different cache attributes, such as UC or WC, which causes trouble.

xen.patch: exports some of Xen's MTRR data structures.

mtrr_pat.patch: the basic idea is as follows:

a. MTRR/PAT MSRs are per vCPU. The guest MTRR values are initialized in the host, after the guest E820 table is built. The guest PAT is initialized to its default value. The host PAT is initialized to cover all the page attributes.

b. Guest page attributes are virtualized through the shadow page attributes. First, the effective guest page attribute is computed from the combination of the guest MTRR and PAT. Then the shadow page attribute is computed from the effective guest page attribute and the host MTRR. If a conflict occurs (e.g. the effective guest attribute is WB but the host MTRR for this range is UC, so the page cannot be given the attribute the guest asked for), the range is set to UC. Some lookup tables are added to accelerate this procedure.

Signed-off-by: Disheng Su <disheng.su@intel.com>

Best Regards,
Disheng, Su
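In rough C, the combination logic of step (b) looks something like the sketch below. The names and the simplified combination rule are illustrative only, not the patch's actual identifiers -- the real patch encodes the full MTRR/PAT interaction tables from the SDM as lookup tables, whereas this collapses them to a conservative "honour the type when both sides can express it, otherwise fall back to UC" rule.

#include <stdint.h>

/* IA-32 memory-type encodings. */
#define MT_UC 0x00   /* uncacheable     */
#define MT_WC 0x01   /* write-combining */
#define MT_WT 0x04   /* write-through   */
#define MT_WP 0x05   /* write-protected */
#define MT_WB 0x06   /* write-back      */

/* Effective guest type for a page: the guest MTRR type for the range
 * combined with the guest PAT entry selected by the PTE.  This is a
 * conservative collapse of the SDM's interaction table. */
static uint8_t guest_effective_type(uint8_t mtrr_type, uint8_t pat_type)
{
    if ( pat_type == MT_UC || pat_type == MT_WC )
        return pat_type;     /* PAT UC/WC apply regardless of the MTRR */
    if ( mtrr_type == MT_WB || mtrr_type == pat_type )
        return pat_type;     /* a WB MTRR range lets the PAT type stand */
    return MT_UC;            /* any other mix: be conservative */
}

/* Shadow type: grant the guest's effective type only when the host
 * MTRR for the underlying frame can express it; on conflict, fall
 * back to UC (e.g. guest wants WB but the host range is UC). */
static uint8_t shadow_effective_type(uint8_t guest_eff, uint8_t host_mtrr)
{
    if ( host_mtrr == MT_WB || host_mtrr == guest_eff )
        return guest_eff;    /* a WB host range can always be tightened */
    return MT_UC;
}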
The second patch is a bit of a hack-and-slash job. New memory-type virtualisation code should go in a new file, instead of being put in the host MTRR management files (which are mostly unmodified from Linux anyway, and shouldn't deviate more than necessary). There are various other one-line changes in various MTRR source files that are unexplained (e.g., mask changes from |0xf00 to |0x7ff). Initialisation of guest MTRR state will need to be moved to hvmloader -- we do not call setvcpucontext for HVM guests any more, so the current hack won't work with current xen-unstable. Coding style needs cleaning up (it looks like there is some spaces-vs-tabs mixing going on).

-- Keir
Thanks for your comments.

Keir Fraser wrote:
> The second patch is a bit of a hack-and-slash job. New memory-type
> virtualisation code should go in a new file, instead of being put in
> the host MTRR management files (which are mostly unmodified from Linux
> anyway, and shouldn't deviate more than necessary).

How about adding a new file called mtrr.c under xen/arch/x86/hvm/?

> There are various other one-line changes in various MTRR source files
> that are unexplained (e.g., mask changes from |0xf00 to |0x7ff).

OK, will add more comments about the changes.

> Initialisation of guest MTRR state will need to be moved to hvmloader
> -- we do not call setvcpucontext for HVM guests any more, so the
> current hack won't work with current xen-unstable.

The problem here is to initialize the MTRR MSRs of both the BP and the APs, in order to make sure the MTRR MSRs are the same. Can we do that in hvmloader? Another place to do it is ROMBIOS, but that lacks AP initialization logic. So what's your opinion: do it in hvmloader, ROMBIOS, or somewhere else?

Best Regards,
Disheng, Su
A couple of other comments:

1. I don't see any changes to shadow code to look at the _PAGE_PAT bit; only _PAGE_PCD and _PAGE_PWT. Is this correct?

2. Introducing non-WB types for normal RAM is potentially a problem due to attribute mismatches (e.g., mappings in Xen and mappings in qemu-dm). Do you disallow non-WB types for RAM, or handle this in some other way? Disallow would probably work fine except for a few cases like AGP GART.

Given that this is probably only safe for non-RAM pages, and given that usually such mappings will be UC anyway, is there any advantage to this large patch other than accelerated access to a passed-through video framebuffer? The current 'collapse all memory types to UC for non-RAM mappings' is, I believe, always safe, and video framebuffer access is the only case I can think of where there would be a performance loss that we might care about. We could probably fix that with a much smaller patch with a higher chance of acceptance for 3.2.0.

-- Keir
On 8/10/07 09:59, "Su, Disheng" <disheng.su@intel.com> wrote:

>> Initialisation of guest MTRR state will need to be moved to hvmloader
>> -- we do not call setvcpucontext for HVM guests any more, so the
>> current hack won't work with current xen-unstable.
>
> The problem here is to initialize the MTRR MSRs of both the BP and the
> APs, in order to make sure the MTRR MSRs are the same. Can we do that
> in hvmloader? Another place to do it is ROMBIOS, but that lacks AP
> initialization logic. So what's your opinion: do it in hvmloader,
> ROMBIOS, or somewhere else?

The best way, although a bit of a pain to implement, would be to bring up all APs in hvmloader, initialise them, and HLT them. Like a proper BIOS.

I wonder whether full MTRR/PAT emulation is necessary or even particularly desirable, however. I'd like to know what scenarios you are testing where this improves correctness or performance.

-- Keir
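For concreteness, the hvmloader flow being suggested would look roughly like the sketch below. Everything here is illustrative: lapic_write, boot_ap and the trampoline vector are hypothetical names, and the MP-spec delays and IPI retries are omitted.

#include <stdint.h>

#define LAPIC_BASE    0xfee00000UL
#define APIC_ICR_LO   0x300
#define APIC_ICR_HI   0x310
#define AP_TRAMPOLINE 0x10   /* boot code page at 0x10000 >> 12 (assumed) */

static void lapic_write(unsigned int reg, uint32_t val)
{
    *(volatile uint32_t *)(LAPIC_BASE + reg) = val;
}

/* Wake one AP with INIT then SIPI; the trampoline it executes loads
 * MTRR/PAT state to match the BP and then HLTs. */
static void boot_ap(unsigned int apic_id)
{
    lapic_write(APIC_ICR_HI, apic_id << 24);
    lapic_write(APIC_ICR_LO, 0x00004500);                 /* INIT, assert */
    /* ...delay per the MP spec... */
    lapic_write(APIC_ICR_HI, apic_id << 24);
    lapic_write(APIC_ICR_LO, 0x00004600 | AP_TRAMPOLINE); /* SIPI */
}

The point is that once hvmloader can run code on every AP, per-vCPU MSR state such as the MTRRs can be set identically on the BP and APs from inside the guest, without any host-side setvcpucontext hack.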
Keir Fraser wrote:
> A couple of other comments:
> 1. I don't see any changes to shadow code to look at the _PAGE_PAT
> bit; only _PAGE_PCD and _PAGE_PWT. Is this correct?

I think the changes to the shadow code are in the following hunk, which updates the shadow flags according to the guest flags:

@@ -703,6 +713,17 @@ _sh_propagate(struct vcpu *v,
         pass_thru_flags |= _PAGE_NX_BIT;
     sflags = gflags & pass_thru_flags;

+    if ( level == 1 ) {
+        /* XXX: Don't change PAT attributes for VRAM spaces,
+         * as they are shared between qemu and the domain. */
+        sflags |= get_pat_flags(&v->arch.hvm_vcpu.mtrr,
+                                NULL,
+                                v->arch.hvm_vcpu.pat_cr,
+                                gflags,
+                                guest_l1e_get_paddr(*gp),
+                                target_mfn << PAGE_SHIFT);
+    }

As for the changes to _PAGE_PCD and _PAGE_PWT, I am not sure if you are talking about the following hunk, which handles the flags in superpages:

@@ -267,6 +267,10 @@ guest_walk_tables(struct vcpu *v, unsign
          * us reflect l2 changes later without touching the l1s. */
         int flags = (_PAGE_PRESENT|_PAGE_USER|_PAGE_RW|
                      _PAGE_ACCESSED|_PAGE_DIRTY);
+        if ( guest_l2e_get_flags(*gw->l2e) & _PAGE_PWT )
+            flags |= _PAGE_PWT;
+        if ( guest_l2e_get_flags(*gw->l2e) & _PAGE_PCD )
+            flags |= _PAGE_PCD;

> 2. Introducing non-WB types for normal RAM is potentially a problem
> due to attribute mismatches (e.g., mappings in Xen and mappings in
> qemu-dm). Do you disallow non-WB types for RAM, or handle this in some
> other way? Disallow would probably work fine except for a few cases
> like AGP GART.

For RAM assigned to the guest, the patch allows non-WB types. The reason is the following scenario: a PCI-E device sets the no-snoop bit in the TLP header when doing memory transactions to RAM, and the driver/OS accesses that RAM with the UC attribute. In the current implementation, without this patch, the guest uses the WB type, so the PCI-E device may get wrong data, because data updated by the CPU may still be in the cache and the PCI-E device's access is not snooped. This patch virtualizes the cache attribute through the attributes in the shadow page table. I am not sure if this is similar to AGP GART.

Of course, some issues remain with this method:

1) For qemu's Cirrus VGA VRAM, even if the guest sets that memory range to a non-WB type like UC, the attribute in the shadow table will be WB (precisely, it will be the type that domain0/qemu-dm uses), since domain0 and the guest access it at the same time. This is commented in the patch, and we have an internal patch under review.

2) If Xen/domain0 tries to map guest RAM that the guest has set to UC, problems may happen. This issue does exist, but we think it is safe currently, considering that a guest OS will usually only change the attribute of DMA-target RAM, and Xen/domain0 should not map that guest RAM. Of course, we still need to consider this.

Yunhong Jiang
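To illustrate what get_pat_flags ultimately has to produce (a sketch under assumptions -- pat_bits_for_type and the constants below are hypothetical, not the patch's code): the chosen shadow memory type must be expressed as PWT/PCD/PAT bits in the shadow PTE, i.e. as an index into the *host* PAT whose entry holds that type. This relies on the host PAT covering all the page attributes, as the patch description says.

#include <stdint.h>

/* x86 PTE attribute bits (redeclared here for self-containedness). */
#define _PAGE_PWT (1UL << 3)
#define _PAGE_PCD (1UL << 4)
#define _PAGE_PAT (1UL << 7)   /* bit 12 in 2M/4M superpage entries */

/* Hypothetical helper: find a host PAT entry holding 'type' and encode
 * its index into the PTE's PWT/PCD/PAT bits. */
static unsigned long pat_bits_for_type(uint64_t host_pat, uint8_t type)
{
    unsigned int i;

    for ( i = 0; i < 8; i++ )
        if ( ((host_pat >> (i * 8)) & 0xff) == type )
            break;  /* guaranteed to hit if the host PAT covers all types */

    /* PAT index bit 0 -> PWT, bit 1 -> PCD, bit 2 -> PAGE_PAT. */
    return ((i & 1) ? _PAGE_PWT : 0) |
           ((i & 2) ? _PAGE_PCD : 0) |
           ((i & 4) ? _PAGE_PAT : 0);
}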
On 8/10/07 16:29, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> For RAM assigned to the guest, the patch allows non-WB types. The
> reason is the following scenario: a PCI-E device sets the no-snoop
> bit in the TLP header when doing memory transactions to RAM, and the
> driver/OS accesses that RAM with the UC attribute.
>
> In the current implementation, without this patch, the guest uses the
> WB type, so the PCI-E device may get wrong data, because data updated
> by the CPU may still be in the cache and the PCI-E device's access is
> not snooped. This patch virtualizes the cache attribute through the
> attributes in the shadow page table.

Won't WBINVD and CLFLUSH also need to be virtualised?

If there are reservations about how this will interact with mappings in qemu-dm, perhaps this new attribute mechanism should only be enabled for pass-thru domains? We get no benefit for non-pass-thru domains and some concern about correctness.

-- Keir
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:

> Won't WBINVD and CLFLUSH also need to be virtualised?

Why? Can you give me more hints? If the guest tries to flush the cache, we can just let it do so, right?

> If there are reservations about how this will interact with mappings
> in qemu-dm, perhaps this new attribute mechanism should only be
> enabled for pass-thru domains? We get no benefit for non-pass-thru
> domains and some concern about correctness.

I think the interaction with qemu-dm should be OK, since we know from qemu which range is VRAM. One method is for qemu to notify Xen which guest RAM ranges should have their attributes set according to the host type rather than the guest type.

And yes, this patch is mainly for pass-thru domains. But for non-pass-thru domains, I think some work may still be needed. For example, currently we always set _PAGE_PAT/_PAGE_PCD/_PAGE_PWT to 0, and the real meaning of this combination depends on the host PAT and MTRR settings, and is not always WB. Of course, that may be simpler.
On 8/10/07 17:00, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

>> Won't WBINVD and CLFLUSH also need to be virtualised?
>
> Why? Can you give me more hints? If the guest tries to flush the
> cache, we can just let it do so, right?

If a VCPU is migrated across physical CPUs, it can leave multiple dirty caches in its wake.

> And yes, this patch is mainly for pass-thru domains. But for
> non-pass-thru domains, I think some work may still be needed. For
> example, currently we always set _PAGE_PAT/_PAGE_PCD/_PAGE_PWT to 0,
> and the real meaning of this combination depends on the host PAT and
> MTRR settings, and is not always WB. Of course, that may be simpler.

The host PAT is under our control, and we know that 0/0/0 -> WB. The host MTRR is mostly under dom0 control, but it had better not screw with the types for E820_RAM areas, as that could crash the whole system. So we can assume that any RAM regions are WB in the MTRR (i.e., left as the BIOS set them up). So we do *not* need to add any extra logic for non-pass-thru domains, as they map nothing but ordinary RAM. If these basic assumptions are wrong then we would need to worry about PV domain mappings, and Xen's own mappings, as well!

I don't think you're concerned enough about the interaction with qemu-dm's mapcache. Just because a page that is mapped UC for real hardware interaction by a pass-thru domain should not need to be mapped by qemu-dm does not mean that it won't be: 1. qemu-dm may *already* have it mapped WB in its mapcache; or 2. qemu-dm may map it WB in the future, because the mapcache operates at the granularity of multi-page blocks. Also you should be worried about Xen's own full 1:1 WB mapping of all RAM (x86/64 Xen only, of course).

In light of all this, messing with attribute logic for non-pass-thru domains before 3.2.0 is released is not going to fly.

-- Keir
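As an illustration of the "host PAT is under our control" point (hedged -- this is one possible layout, not necessarily the value Xen ends up using): a host PAT can cover every memory type while keeping the architectural meaning of the first entries, so that PAT/PCD/PWT = 0/0/0 still selects WB.

#include <stdint.h>

#define PAT_UC       0x00
#define PAT_WC       0x01
#define PAT_WT       0x04
#define PAT_WP       0x05
#define PAT_WB       0x06
#define PAT_UC_MINUS 0x07

/* Entries 0-3 keep their power-on defaults (WB, WT, UC-, UC), so PTEs
 * with PAT/PCD/PWT = 0/0/0 still mean WB; entries 4-5 add WC and WP. */
#define HOST_PAT ((uint64_t)PAT_WB       <<  0 | \
                  (uint64_t)PAT_WT       <<  8 | \
                  (uint64_t)PAT_UC_MINUS << 16 | \
                  (uint64_t)PAT_UC       << 24 | \
                  (uint64_t)PAT_WC       << 32 | \
                  (uint64_t)PAT_WP       << 40)
/* == 0x0000050100070406: every memory type reachable via some index. */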
On 8/10/07 17:19, "Keir Fraser" <Keir.Fraser@cl.cam.ac.uk> wrote:

> I don't think you're concerned enough about the interaction with
> qemu-dm's mapcache. [...]

There is a missing piece here, to fully explain why this is bad: a page mapped WB permits speculative accesses. Thus data can be pulled into a CPU cache even though no load/store is ever actually executed on that data. This is particularly bad if a store operation is speculated -- then the data may be marked as Modified in the cache (even though it is not actually modified, just because the speculated operation is a store). Then it gets evicted at some random point in the future, and written back over whatever critical command structures happened to be residing there.

Actually it may be the case that if the attribute mismatch involves only host CPUs then you'll be okay. It depends how much cache consistency is maintained despite attribute mismatches. All the real-world problems I can think of involve memory aliasing to multiple physical addresses (e.g., via GART) and/or interaction with I/O devices. But the PRM does quite strongly recommend against attribute mismatches.

-- Keir
Keir Fraser wrote:

> Given that this is probably only safe for non-RAM pages, and given
> that usually such mappings will be UC anyway, is there any advantage
> to this large patch other than accelerated access to a passed-through
> video framebuffer? The current 'collapse all memory types to UC for
> non-RAM mappings' is, I believe, always safe.

Propagating the guest memory type for MMIO only is what I posted last year, which was rejected by you since pass-through device support was not in at that time :-( But later on, with VT-d enabled, we found this is insufficient. We know some audio drivers will use a no-snoop + UC mapping for RAM buffers, and the same goes for video batch buffers in addition to the VRAM buffer. We have to propagate the guest RAM memory type too, to fix the audio/video issues :-(

> Won't WBINVD and CLFLUSH also need to be virtualised?

Basically, a native OS can use the WB + WBINVD scheme for an output buffer, but it can't be used for an input buffer. I doubt a driver will use different memory attributes for input and output buffers. But yes, this is really a good point. So we need to do a WBINVD when a vCPU migrates (for pass-through domains only, of course), although the preferred approach is to pin vCPUs to pCPUs.

> If there are reservations about how this will interact with mappings
> in qemu-dm, perhaps this new attribute mechanism should only be
> enabled for pass-thru domains? We get no benefit for non-pass-thru
> domains and some concern about correctness.

Yes, we can do it this way.

> Also you should be worried about Xen's own full 1:1 WB mapping of all
> RAM (x86/64 Xen only, of course).

For a native OS, will the OS unconditionally map the whole of memory itself? I think not; otherwise the same issue would happen on native.

BTW, if we don't remove those fixed 1:1 mappings, even without the MTRR patch we may hit problems in future when there are complicated drivers in domain0. E.g., a high-end graphics driver in domain0 may use UC for its batch buffer, and likewise audio and other PCIe drivers in future. The processor would see two different memory attributes in its direct page tables. So I think Xen needs to take the same policy as a native OS, though it may be painful. With Xen moving to the client space, these issues will become more common.

> There is a missing piece here, to fully explain why this is bad: a
> page mapped WB permits speculative accesses. [...] This is
> particularly bad if a store operation is speculated --

Do you mean AMD processors support speculative writes? I don't think Intel processors support this.

> then the data may be marked as Modified in the cache [...] Then it
> gets evicted at some random point in the future, and written back
> over whatever critical command structures happened to be residing
> there.

I may be lost here. The MTRR patch only tightens the memory access attributes, never loosens them, so we are moving in a much safer direction, aren't we?

Anyway, we can split the patch into two: one for MMIO only and another for RAM as well. The patches would be almost the same; just one line to eliminate the RAM memory-type propagation.

Thanks,
Eddie
On 9/10/07 05:38, "Dong, Eddie" <eddie.dong@intel.com> wrote:

> But yes, this is really a good point. So we need to do a WBINVD when a
> vCPU migrates (for pass-through domains only, of course), although the
> preferred approach is to pin vCPUs to pCPUs.

Or WBINVD all CPUs when a VCPU executes WBINVD. Or explicitly track dirty caches for each vCPU.

>> We get no benefit for non-pass-thru domains and some concern about
>> correctness.
>
> Yes, we can do it this way.

Good.

> For a native OS, will the OS unconditionally map the whole of memory
> itself? I think not; otherwise the same issue would happen on native.

Yup. This issue was fixed in Linux back in 2.4.

> BTW, if we don't remove those fixed 1:1 mappings, even without the
> MTRR patch we may hit problems in future when there are complicated
> drivers in domain0. E.g., a high-end graphics driver in domain0 may
> use UC for its batch buffer, and likewise audio and other PCIe drivers
> in future. The processor would see two different memory attributes in
> its direct page tables.

Yup. Jan Beulich posted a patch some time ago. At least some variant of that is certainly needed and I plan to put that in before 3.2.0.

> Do you mean AMD processors support speculative writes? I don't think
> Intel processors support this.

A speculative store is never made visible to other processors, of course. That would be hard to roll back! However, the data can be safely prefetched into the cache, and it is then an implementation detail whether the processor 'optimistically' marks the cache line as dirty before the speculated store is actually executed (rather than discarded).

> I may be lost here. The MTRR patch only tightens the memory access
> attributes, never loosens them, so we are moving in a much safer
> direction, aren't we?

Yes, that may be true. I don't think that mismatching attributes can cause system stability issues except in very special circumstances where having all attributes as WB is no safer.

> Anyway, we can split the patch into two: one for MMIO only and another
> for RAM as well. The patches would be almost the same; just one line
> to eliminate the RAM memory-type propagation.

If you need both then we may as well enable both in the same patch. Just make it for pass-thru domains only, clean up the patch as I described before, and I'll be happy. Probably.

-- Keir
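A minimal sketch of the "WBINVD all CPUs" option (hypothetical wiring -- has_passthrough is an assumed helper, and on_each_cpu's four-argument signature follows the Linux-derived one of that era): when a vCPU of a direct-I/O domain executes WBINVD, broadcast the flush so that dirty lines left behind on previously-used pCPUs are written back too.

#include <xen/sched.h>   /* struct vcpu */
#include <xen/smp.h>     /* on_each_cpu() */
#include <asm/system.h>  /* wbinvd() */

static void flush_cache_local(void *unused)
{
    wbinvd();   /* write back and invalidate this CPU's caches */
}

/* Called from the VMEXIT handler when the guest executes WBINVD. */
static void hvm_wbinvd_intercept(struct vcpu *v)
{
    /* has_passthrough() is an assumed predicate for direct-I/O domains;
     * without direct hardware access, WBINVD can safely be a no-op. */
    if ( !has_passthrough(v->domain) )
        return;

    /* Flush every online CPU, including this one (retry=1, wait=1). */
    on_each_cpu(flush_cache_local, NULL, 1, 1);
}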
On 2007-Oct-09 09:23 UTC, Michael A Fetterman wrote (Re: [Xen-devel] [PATCH 0/2] MTRR/PAT virtualization):
>> But yes, this is really a good point. So we need to do a WBINVD when
>> a vCPU migrates (for pass-through domains only, of course), although
>> the preferred approach is to pin vCPUs to pCPUs.
>
> Or WBINVD all CPUs when a VCPU executes WBINVD. Or explicitly track
> dirty caches for each vCPU.

I shudder to think of allowing a guest to cause a WBINVD.

On modern systems (8MB+ of cache, etc.), it can take 4+ milliseconds to execute. I dare say it could, in a worst-case scenario, be even worse if you did it on multiple CPUs or hyperthreads at once. And the CPU is non-interruptible the entire time.
On 9/10/07 10:23, "Michael A Fetterman" <Michael.Fetterman@cl.cam.ac.uk> wrote:

> I shudder to think of allowing a guest to cause a WBINVD.
>
> On modern systems (8MB+ of cache, etc.), it can take 4+ milliseconds
> to execute. I dare say it could, in a worst-case scenario, be even
> worse if you did it on multiple CPUs or hyperthreads at once. And the
> CPU is non-interruptible the entire time.

It only needs to be allowed for guests with direct hardware access. Those WBINVD instructions in AGP drivers and the like are there for a reason, I suspect.

-- Keir
Keir Fraser wrote:

>> BTW, if we don't remove those fixed 1:1 mappings, even without the
>> MTRR patch we may hit problems in future when there are complicated
>> drivers in domain0. E.g., a high-end graphics driver in domain0 may
>> use UC for its batch buffer, and likewise audio and other PCIe
>> drivers in future. The processor would see two different memory
>> attributes in its direct page tables.
>
> Yup. Jan Beulich posted a patch some time ago. At least some variant
> of that is certainly needed and I plan to put that in before 3.2.0.

So we can rely on you for this change, right? :-)

> If you need both then we may as well enable both in the same patch.
> Just make it for pass-thru domains only, clean up the patch as I
> described before, and I'll be happy. Probably.

Sure! How about the initial MTRR values? To take it easy, we could do this when creating a vCPU and leave the rest as future work in the Bochs BIOS. What is your opinion -- in hvmloader from now on?

Thanks,
Eddie
On 9/10/07 16:30, "Dong, Eddie" <eddie.dong@intel.com> wrote:

>> Yup. Jan Beulich posted a patch some time ago. At least some variant
>> of that is certainly needed and I plan to put that in before 3.2.0.
>
> So we can rely on you for this change, right?

Yeah. It'll be specific to PV domains with I/O privileges, though.

-- Keir
On 9/10/07 16:30, "Dong, Eddie" <eddie.dong@intel.com> wrote:

> How about the initial MTRR values? To take it easy, we could do this
> when creating a vCPU and leave the rest as future work in the Bochs
> BIOS. What is your opinion -- in hvmloader from now on?

Yeah, do it in the HVM vcpu initialisation hook for now.

-- Keir