Hi,

The following two patches add basic MTRR/PAT support to the hypervisor. When VT-d is enabled, direct I/O and RAM can be mapped with different cache attributes, such as UC or WC, which causes trouble.

xen.patch: exports some of Xen's MTRR data structures.

mtrr_pat.patch: the basic idea is as follows:

a. MTRR/PAT MSRs are per vCPU. The guest MTRR values are initialized in the host, after the guest E820 table is built. The guest PAT is initialized to its default value. The host PAT is initialized to cover all the page attributes.

b. Guest page attributes are virtualized through the shadow page attributes. First, the effective guest page attribute is computed from the combination of the guest MTRR and PAT. Then the shadow page attribute is computed from the effective guest page attribute and the host MTRR. If a conflict occurs (e.g. the effective guest attribute is WB but the host MTRR for this range is UC, so the page cannot be given the attribute the guest asked for), the range is set to UC. Some lookup tables are added to accelerate this procedure.

Signed-off-by: Disheng Su <disheng.su@intel.com>

Best Regards,
Disheng, Su
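In rough C, the combination logic of step (b) looks something like the sketch below. The names and the simplified combination rule are illustrative only, not the patch's actual identifiers -- the real patch encodes the full MTRR/PAT interaction tables from the SDM as lookup tables, whereas this collapses them to a conservative "honour the type when both sides can express it, otherwise fall back to UC" rule.

#include <stdint.h>

/* IA-32 memory-type encodings. */
#define MT_UC 0x00   /* uncacheable     */
#define MT_WC 0x01   /* write-combining */
#define MT_WT 0x04   /* write-through   */
#define MT_WP 0x05   /* write-protected */
#define MT_WB 0x06   /* write-back      */

/* Effective guest type for a page: the guest MTRR type for the range
 * combined with the guest PAT entry selected by the PTE.  This is a
 * conservative collapse of the SDM's interaction table. */
static uint8_t guest_effective_type(uint8_t mtrr_type, uint8_t pat_type)
{
    if ( pat_type == MT_UC || pat_type == MT_WC )
        return pat_type;     /* PAT UC/WC apply regardless of the MTRR */
    if ( mtrr_type == MT_WB || mtrr_type == pat_type )
        return pat_type;     /* a WB MTRR range lets the PAT type stand */
    return MT_UC;            /* any other mix: be conservative */
}

/* Shadow type: grant the guest's effective type only when the host
 * MTRR for the underlying frame can express it; on conflict, fall
 * back to UC (e.g. guest wants WB but the host range is UC). */
static uint8_t shadow_effective_type(uint8_t guest_eff, uint8_t host_mtrr)
{
    if ( host_mtrr == MT_WB || host_mtrr == guest_eff )
        return guest_eff;    /* a WB host range can always be tightened */
    return MT_UC;
}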
The second patch is a bit of a hack-and-slash job. New memory-type virtualisation code should go in a new file, instead of being put in the host MTRR management files (which are mostly unmodified from Linux anyway, and shouldn't deviate more than necessary). There are various other one-line changes in various MTRR source files that are unexplained (e.g., mask changes from |0xf00 to |0x7ff). Initialisation of guest MTRR state will need to be moved to hvmloader -- we do not call setvcpucontext for HVM guests any more, so the current hack won't work with current xen-unstable. Coding style needs cleaning up (it looks like there is some spaces-vs-tabs mixing going on).

-- Keir
Thanks for your comments.

Keir Fraser wrote:
> The second patch is a bit of a hack-and-slash job. New memory-type
> virtualisation code should go in a new file, instead of being put in
> the host MTRR management files (which are mostly unmodified from Linux
> anyway, and shouldn't deviate more than necessary).

How about adding a new file called mtrr.c under xen/arch/x86/hvm/?

> There are various other one-line changes in various MTRR source files
> that are unexplained (e.g., mask changes from |0xf00 to |0x7ff).

OK, will add more comments about the changes.

> Initialisation of guest MTRR state will need to be moved to hvmloader
> -- we do not call setvcpucontext for HVM guests any more, so the
> current hack won't work with current xen-unstable.

The problem here is to initialize the MTRR MSRs of both the BP and the APs, in order to make sure the MTRR MSRs are the same. Can we do that in hvmloader? Another place to do it is ROMBIOS, but that lacks AP initialization logic. So what's your opinion: do it in hvmloader, ROMBIOS, or somewhere else?

Best Regards,
Disheng, Su
A couple of other comments:

1. I don't see any changes to shadow code to look at the _PAGE_PAT bit; only _PAGE_PCD and _PAGE_PWT. Is this correct?

2. Introducing non-WB types for normal RAM is potentially a problem due to attribute mismatches (e.g., mappings in Xen and mappings in qemu-dm). Do you disallow non-WB types for RAM, or handle this in some other way? Disallow would probably work fine except for a few cases like AGP GART.

Given that this is probably only safe for non-RAM pages, and given that usually such mappings will be UC anyway, is there any advantage to this large patch other than accelerated access to a passed-through video framebuffer? The current 'collapse all memory types to UC for non-RAM mappings' is, I believe, always safe, and video framebuffer access is the only case I can think of where there would be a performance loss that we might care about. We could probably fix that with a much smaller patch with a higher chance of acceptance for 3.2.0.

-- Keir
On 8/10/07 09:59, "Su, Disheng" <disheng.su@intel.com> wrote:

>> Initialisation of guest MTRR state will need to be moved to hvmloader
>> -- we do not call setvcpucontext for HVM guests any more, so the
>> current hack won't work with current xen-unstable.
>
> The problem here is to initialize the MTRR MSRs of both the BP and the
> APs, in order to make sure the MTRR MSRs are the same. Can we do that
> in hvmloader? Another place to do it is ROMBIOS, but that lacks AP
> initialization logic. So what's your opinion: do it in hvmloader,
> ROMBIOS, or somewhere else?

The best way, although a bit of a pain to implement, would be to bring up all APs in hvmloader, initialise them, and HLT them. Like a proper BIOS.

I wonder whether full MTRR/PAT emulation is necessary or even particularly desirable, however. I'd like to know what scenarios you are testing where this improves correctness or performance.

-- Keir
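For concreteness, the hvmloader flow being suggested would look roughly like the sketch below. Everything here is illustrative: lapic_write, boot_ap and the trampoline vector are hypothetical names, and the MP-spec delays and IPI retries are omitted.

#include <stdint.h>

#define LAPIC_BASE    0xfee00000UL
#define APIC_ICR_LO   0x300
#define APIC_ICR_HI   0x310
#define AP_TRAMPOLINE 0x10   /* boot code page at 0x10000 >> 12 (assumed) */

static void lapic_write(unsigned int reg, uint32_t val)
{
    *(volatile uint32_t *)(LAPIC_BASE + reg) = val;
}

/* Wake one AP with INIT then SIPI; the trampoline it executes loads
 * MTRR/PAT state to match the BP and then HLTs. */
static void boot_ap(unsigned int apic_id)
{
    lapic_write(APIC_ICR_HI, apic_id << 24);
    lapic_write(APIC_ICR_LO, 0x00004500);                 /* INIT, assert */
    /* ...delay per the MP spec... */
    lapic_write(APIC_ICR_HI, apic_id << 24);
    lapic_write(APIC_ICR_LO, 0x00004600 | AP_TRAMPOLINE); /* SIPI */
}

The point is that once hvmloader can run code on every AP, per-vCPU MSR state such as the MTRRs can be set identically on the BP and APs from inside the guest, without any host-side setvcpucontext hack.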
Keir Fraser wrote:
> A couple of other comments:
> 1. I don't see any changes to shadow code to look at the _PAGE_PAT
> bit; only _PAGE_PCD and _PAGE_PWT. Is this correct?

I think the changes to the shadow code are in the following hunk, which updates the shadow flags according to the guest flags:

@@ -703,6 +713,17 @@ _sh_propagate(struct vcpu *v,
         pass_thru_flags |= _PAGE_NX_BIT;
     sflags = gflags & pass_thru_flags;

+    if ( level == 1 ) {
+        /* XXX: Don't change PAT attributes for VRAM spaces,
+         * as they are shared between qemu and the domain. */
+        sflags |= get_pat_flags(&v->arch.hvm_vcpu.mtrr,
+                                NULL,
+                                v->arch.hvm_vcpu.pat_cr,
+                                gflags,
+                                guest_l1e_get_paddr(*gp),
+                                target_mfn << PAGE_SHIFT);
+    }

As for the changes to _PAGE_PCD and _PAGE_PWT, I am not sure if you are talking about the following hunk, which handles the flags in superpages:

@@ -267,6 +267,10 @@ guest_walk_tables(struct vcpu *v, unsign
          * us reflect l2 changes later without touching the l1s. */
         int flags = (_PAGE_PRESENT|_PAGE_USER|_PAGE_RW|
                      _PAGE_ACCESSED|_PAGE_DIRTY);
+        if ( guest_l2e_get_flags(*gw->l2e) & _PAGE_PWT )
+            flags |= _PAGE_PWT;
+        if ( guest_l2e_get_flags(*gw->l2e) & _PAGE_PCD )
+            flags |= _PAGE_PCD;

> 2. Introducing non-WB types for normal RAM is potentially a problem
> due to attribute mismatches (e.g., mappings in Xen and mappings in
> qemu-dm). Do you disallow non-WB types for RAM, or handle this in some
> other way? Disallow would probably work fine except for a few cases
> like AGP GART.

For RAM assigned to the guest, the patch allows non-WB types. The reason is the following scenario: a PCI-E device sets the no-snoop bit in the TLP header when doing memory transactions to RAM, and the driver/OS accesses that RAM with the UC attribute. In the current implementation, without this patch, the guest uses the WB type, so the PCI-E device may get wrong data, because data updated by the CPU may still be in the cache and the PCI-E device's access is not snooped. This patch virtualizes the cache attribute through the attributes in the shadow page table. I am not sure if this is similar to AGP GART.

Of course, some issues remain with this method:

1) For qemu's Cirrus VGA VRAM, even if the guest sets that memory range to a non-WB type like UC, the attribute in the shadow table will be WB (precisely, it will be the type that domain0/qemu-dm uses), since domain0 and the guest access it at the same time. This is commented in the patch, and we have an internal patch under review.

2) If Xen/domain0 tries to map guest RAM that the guest has set to UC, problems may happen. This issue does exist, but we think it is safe currently, considering that a guest OS will usually only change the attribute of DMA-target RAM, and Xen/domain0 should not map that guest RAM. Of course, we still need to consider this.

Yunhong Jiang
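To illustrate what get_pat_flags ultimately has to produce (a sketch under assumptions -- pat_bits_for_type and the constants below are hypothetical, not the patch's code): the chosen shadow memory type must be expressed as PWT/PCD/PAT bits in the shadow PTE, i.e. as an index into the *host* PAT whose entry holds that type. This relies on the host PAT covering all the page attributes, as the patch description says.

#include <stdint.h>

/* x86 PTE attribute bits (redeclared here for self-containedness). */
#define _PAGE_PWT (1UL << 3)
#define _PAGE_PCD (1UL << 4)
#define _PAGE_PAT (1UL << 7)   /* bit 12 in 2M/4M superpage entries */

/* Hypothetical helper: find a host PAT entry holding 'type' and encode
 * its index into the PTE's PWT/PCD/PAT bits. */
static unsigned long pat_bits_for_type(uint64_t host_pat, uint8_t type)
{
    unsigned int i;

    for ( i = 0; i < 8; i++ )
        if ( ((host_pat >> (i * 8)) & 0xff) == type )
            break;  /* guaranteed to hit if the host PAT covers all types */

    /* PAT index bit 0 -> PWT, bit 1 -> PCD, bit 2 -> PAGE_PAT. */
    return ((i & 1) ? _PAGE_PWT : 0) |
           ((i & 2) ? _PAGE_PCD : 0) |
           ((i & 4) ? _PAGE_PAT : 0);
}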
On 8/10/07 16:29, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

> For RAM assigned to the guest, the patch allows non-WB types. The
> reason is the following scenario: a PCI-E device sets the no-snoop
> bit in the TLP header when doing memory transactions to RAM, and the
> driver/OS accesses that RAM with the UC attribute.
>
> In the current implementation, without this patch, the guest uses the
> WB type, so the PCI-E device may get wrong data, because data updated
> by the CPU may still be in the cache and the PCI-E device's access is
> not snooped. This patch virtualizes the cache attribute through the
> attributes in the shadow page table.

Won't WBINVD and CLFLUSH also need to be virtualised?

If there are reservations about how this will interact with mappings in qemu-dm, perhaps this new attribute mechanism should only be enabled for pass-thru domains? We get no benefit for non-pass-thru domains and some concern about correctness.

-- Keir
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:

> Won't WBINVD and CLFLUSH also need to be virtualised?

Why? Can you give me more hints? If the guest tries to flush the cache, we can just let it do so, right?

> If there are reservations about how this will interact with mappings
> in qemu-dm, perhaps this new attribute mechanism should only be
> enabled for pass-thru domains? We get no benefit for non-pass-thru
> domains and some concern about correctness.

I think the interaction with qemu-dm should be OK, since we know from qemu which range is VRAM. One method is for qemu to notify Xen which guest RAM ranges should have their attributes set according to the host type rather than the guest type.

And yes, this patch is mainly for pass-thru domains. But for non-pass-thru domains, I think some work may still be needed. For example, currently we always set _PAGE_PAT/_PAGE_PCD/_PAGE_PWT to 0, and the real meaning of this combination depends on the host PAT and MTRR settings, and is not always WB. Of course, that may be simpler.
On 8/10/07 17:00, "Jiang, Yunhong" <yunhong.jiang@intel.com> wrote:

>> Won't WBINVD and CLFLUSH also need to be virtualised?
>
> Why? Can you give me more hints? If the guest tries to flush the
> cache, we can just let it do so, right?

If a VCPU is migrated across physical CPUs, it can leave multiple dirty caches in its wake.

> And yes, this patch is mainly for pass-thru domains. But for
> non-pass-thru domains, I think some work may still be needed. For
> example, currently we always set _PAGE_PAT/_PAGE_PCD/_PAGE_PWT to 0,
> and the real meaning of this combination depends on the host PAT and
> MTRR settings, and is not always WB. Of course, that may be simpler.

The host PAT is under our control, and we know that 0/0/0 -> WB. The host MTRR is mostly under dom0 control, but it had better not screw with the types for E820_RAM areas, as that could crash the whole system. So we can assume that any RAM regions are WB in the MTRR (i.e., left as the BIOS set them up). So we do *not* need to add any extra logic for non-pass-thru domains, as they map nothing but ordinary RAM. If these basic assumptions are wrong then we would need to worry about PV domain mappings, and Xen's own mappings, as well!

I don't think you're concerned enough about the interaction with qemu-dm's mapcache. Just because a page that is mapped UC for real hardware interaction by a pass-thru domain should not need to be mapped by qemu-dm does not mean that it won't be: 1. qemu-dm may *already* have it mapped WB in its mapcache; or 2. qemu-dm may map it WB in the future, because the mapcache operates at the granularity of multi-page blocks. Also you should be worried about Xen's own full 1:1 WB mapping of all RAM (x86/64 Xen only, of course).

In light of all this, messing with attribute logic for non-pass-thru domains before 3.2.0 is released is not going to fly.

-- Keir
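As an illustration of the "host PAT is under our control" point (hedged -- this is one possible layout, not necessarily the value Xen ends up using): a host PAT can cover every memory type while keeping the architectural meaning of the first entries, so that PAT/PCD/PWT = 0/0/0 still selects WB.

#include <stdint.h>

#define PAT_UC       0x00
#define PAT_WC       0x01
#define PAT_WT       0x04
#define PAT_WP       0x05
#define PAT_WB       0x06
#define PAT_UC_MINUS 0x07

/* Entries 0-3 keep their power-on defaults (WB, WT, UC-, UC), so PTEs
 * with PAT/PCD/PWT = 0/0/0 still mean WB; entries 4-5 add WC and WP. */
#define HOST_PAT ((uint64_t)PAT_WB       <<  0 | \
                  (uint64_t)PAT_WT       <<  8 | \
                  (uint64_t)PAT_UC_MINUS << 16 | \
                  (uint64_t)PAT_UC       << 24 | \
                  (uint64_t)PAT_WC       << 32 | \
                  (uint64_t)PAT_WP       << 40)
/* == 0x0000050100070406: every memory type reachable via some index. */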
On 8/10/07 17:19, "Keir Fraser" <Keir.Fraser@cl.cam.ac.uk> wrote:

> I don't think you're concerned enough about the interaction with
> qemu-dm's mapcache. [...]

There is a missing piece here, to fully explain why this is bad: a page mapped WB permits speculative accesses. Thus data can be pulled into a CPU cache even though no load/store is ever actually executed on that data. This is particularly bad if a store operation is speculated -- then the data may be marked as Modified in the cache (even though it is not actually modified, just because the speculated operation is a store). Then it gets evicted at some random point in the future, and written back over whatever critical command structures happened to be residing there.

Actually it may be the case that if the attribute mismatch involves only host CPUs then you'll be okay. It depends how much cache consistency is maintained despite attribute mismatches. All the real-world problems I can think of involve memory aliasing to multiple physical addresses (e.g., via GART) and/or interaction with I/O devices. But the PRM does quite strongly recommend against attribute mismatches.

-- Keir
Keir Fraser wrote:

> Given that this is probably only safe for non-RAM pages, and given
> that usually such mappings will be UC anyway, is there any advantage
> to this large patch other than accelerated access to a passed-through
> video framebuffer? The current 'collapse all memory types to UC for
> non-RAM mappings' is, I believe, always safe.

Propagating the guest memory type for MMIO only is what I posted last year, which was rejected by you since pass-through device support was not in at that time :-( But later on, with VT-d enabled, we found this is insufficient. We know some audio drivers will use a no-snoop + UC mapping for RAM buffers, and the same goes for video batch buffers in addition to the VRAM buffer. We have to propagate the guest RAM memory type too, to fix the audio/video issues :-(

> Won't WBINVD and CLFLUSH also need to be virtualised?

Basically, a native OS can use the WB + WBINVD scheme for an output buffer, but it can't be used for an input buffer. I doubt a driver will use different memory attributes for input and output buffers. But yes, this is really a good point. So we need to do a WBINVD when a vCPU migrates (for pass-through domains only, of course), although the preferred approach is to pin vCPUs to pCPUs.

> If there are reservations about how this will interact with mappings
> in qemu-dm, perhaps this new attribute mechanism should only be
> enabled for pass-thru domains? We get no benefit for non-pass-thru
> domains and some concern about correctness.

Yes, we can do it this way.

> Also you should be worried about Xen's own full 1:1 WB mapping of all
> RAM (x86/64 Xen only, of course).

For a native OS, will the OS unconditionally map the whole of memory itself? I think not; otherwise the same issue would happen on native.

BTW, if we don't remove those fixed 1:1 mappings, even without the MTRR patch we may hit problems in future when there are complicated drivers in domain0. E.g., a high-end graphics driver in domain0 may use UC for its batch buffer, and likewise audio and other PCIe drivers in future. The processor would see two different memory attributes in its direct page tables. So I think Xen needs to take the same policy as a native OS, though it may be painful. With Xen moving to the client space, these issues will become more common.

> There is a missing piece here, to fully explain why this is bad: a
> page mapped WB permits speculative accesses. [...] This is
> particularly bad if a store operation is speculated --

Do you mean AMD processors support speculative writes? I don't think Intel processors support this.

> then the data may be marked as Modified in the cache [...] Then it
> gets evicted at some random point in the future, and written back
> over whatever critical command structures happened to be residing
> there.

I may be lost here. The MTRR patch only tightens the memory access attributes, never loosens them, so we are moving in a much safer direction, aren't we?

Anyway, we can split the patch into two: one for MMIO only and another for RAM as well. The patches would be almost the same; just one line to eliminate the RAM memory-type propagation.

Thanks,
Eddie
On 9/10/07 05:38, "Dong, Eddie" <eddie.dong@intel.com> wrote:

> But yes, this is really a good point. So we need to do a WBINVD when a
> vCPU migrates (for pass-through domains only, of course), although the
> preferred approach is to pin vCPUs to pCPUs.

Or WBINVD all CPUs when a VCPU executes WBINVD. Or explicitly track dirty caches for each vCPU.

>> We get no benefit for non-pass-thru domains and some concern about
>> correctness.
>
> Yes, we can do it this way.

Good.

> For a native OS, will the OS unconditionally map the whole of memory
> itself? I think not; otherwise the same issue would happen on native.

Yup. This issue was fixed in Linux back in 2.4.

> BTW, if we don't remove those fixed 1:1 mappings, even without the
> MTRR patch we may hit problems in future when there are complicated
> drivers in domain0. E.g., a high-end graphics driver in domain0 may
> use UC for its batch buffer, and likewise audio and other PCIe drivers
> in future. The processor would see two different memory attributes in
> its direct page tables.

Yup. Jan Beulich posted a patch some time ago. At least some variant of that is certainly needed and I plan to put that in before 3.2.0.

> Do you mean AMD processors support speculative writes? I don't think
> Intel processors support this.

A speculative store is never made visible to other processors, of course. That would be hard to roll back! However, the data can be safely prefetched into the cache, and it is then an implementation detail whether the processor 'optimistically' marks the cache line as dirty before the speculated store is actually executed (rather than discarded).

> I may be lost here. The MTRR patch only tightens the memory access
> attributes, never loosens them, so we are moving in a much safer
> direction, aren't we?

Yes, that may be true. I don't think that mismatching attributes can cause system stability issues except in very special circumstances where having all attributes as WB is no safer.

> Anyway, we can split the patch into two: one for MMIO only and another
> for RAM as well. The patches would be almost the same; just one line
> to eliminate the RAM memory-type propagation.

If you need both then we may as well enable both in the same patch. Just make it for pass-thru domains only, clean up the patch as I described before, and I'll be happy. Probably.

-- Keir
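A minimal sketch of the "WBINVD all CPUs" option (hypothetical wiring -- has_passthrough is an assumed helper, and on_each_cpu's four-argument signature follows the Linux-derived one of that era): when a vCPU of a direct-I/O domain executes WBINVD, broadcast the flush so that dirty lines left behind on previously-used pCPUs are written back too.

#include <xen/sched.h>   /* struct vcpu */
#include <xen/smp.h>     /* on_each_cpu() */
#include <asm/system.h>  /* wbinvd() */

static void flush_cache_local(void *unused)
{
    wbinvd();   /* write back and invalidate this CPU's caches */
}

/* Called from the VMEXIT handler when the guest executes WBINVD. */
static void hvm_wbinvd_intercept(struct vcpu *v)
{
    /* has_passthrough() is an assumed predicate for direct-I/O domains;
     * without direct hardware access, WBINVD can safely be a no-op. */
    if ( !has_passthrough(v->domain) )
        return;

    /* Flush every online CPU, including this one (retry=1, wait=1). */
    on_each_cpu(flush_cache_local, NULL, 1, 1);
}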
On 2007-Oct-09 09:23 UTC, Michael A Fetterman wrote (Re: [Xen-devel] [PATCH 0/2] MTRR/PAT virtualization):
>> But yes, this is really a good point. So we need to do a WBINVD when
>> a vCPU migrates (for pass-through domains only, of course), although
>> the preferred approach is to pin vCPUs to pCPUs.
>
> Or WBINVD all CPUs when a VCPU executes WBINVD. Or explicitly track
> dirty caches for each vCPU.

I shudder to think of allowing a guest to cause a WBINVD.

On modern systems (8MB+ of cache, etc.), it can take 4+ milliseconds to execute. I dare say it could, in a worst-case scenario, be even worse if you did it on multiple CPUs or hyperthreads at once. And the CPU is non-interruptible the entire time.
On 9/10/07 10:23, "Michael A Fetterman" <Michael.Fetterman@cl.cam.ac.uk> wrote:

> I shudder to think of allowing a guest to cause a WBINVD.
>
> On modern systems (8MB+ of cache, etc.), it can take 4+ milliseconds
> to execute. I dare say it could, in a worst-case scenario, be even
> worse if you did it on multiple CPUs or hyperthreads at once. And the
> CPU is non-interruptible the entire time.

It only needs to be allowed for guests with direct hardware access. Those WBINVD instructions in AGP drivers and the like are there for a reason, I suspect.

-- Keir
Keir Fraser wrote:

>> BTW, if we don't remove those fixed 1:1 mappings, even without the
>> MTRR patch we may hit problems in future when there are complicated
>> drivers in domain0. E.g., a high-end graphics driver in domain0 may
>> use UC for its batch buffer, and likewise audio and other PCIe
>> drivers in future. The processor would see two different memory
>> attributes in its direct page tables.
>
> Yup. Jan Beulich posted a patch some time ago. At least some variant
> of that is certainly needed and I plan to put that in before 3.2.0.

So we can rely on you for this change, right? :-)

> If you need both then we may as well enable both in the same patch.
> Just make it for pass-thru domains only, clean up the patch as I
> described before, and I'll be happy. Probably.

Sure! How about the initial MTRR values? To take it easy, we could do this when creating a vCPU and leave the rest as future work in the Bochs BIOS. What is your opinion -- in hvmloader from now on?

Thanks,
Eddie
On 9/10/07 16:30, "Dong, Eddie" <eddie.dong@intel.com> wrote:

>> Yup. Jan Beulich posted a patch some time ago. At least some variant
>> of that is certainly needed and I plan to put that in before 3.2.0.
>
> So we can rely on you for this change, right?

Yeah. It'll be specific to PV domains with I/O privileges, though.

-- Keir
On 9/10/07 16:30, "Dong, Eddie" <eddie.dong@intel.com> wrote:

> How about the initial MTRR values? To take it easy, we could do this
> when creating a vCPU and leave the rest as future work in the Bochs
> BIOS. What is your opinion -- in hvmloader from now on?

Yeah, do it in the HVM vcpu initialisation hook for now.

-- Keir